Greedy algorithm 叶德仕 yedeshi@zju.edu.cn.


Greedy algorithm’s paradigm An algorithm is greedy if it builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion. Greedy algorithms are proven optimal by showing that: at every step the greedy solution is no worse than that of any other algorithm, or every algorithm can be gradually transformed into the greedy one without hurting its quality.

Interval scheduling Input: a set of intervals on the line, represented by pairs of points (the ends of the intervals); in other words, the i-th interval starts at time si and finishes at fi. Output: the largest set of intervals such that no two of them overlap, i.e., the maximum number of pairwise non-overlapping intervals. Greedy algorithm: select intervals one after another using some rule.

Rule 1 Select the interval which starts earliest (but does not overlap the already chosen intervals). Suboptimal! OPT #4, Algorithm #1

Rule 2 Select the interval which is shortest (but does not overlap the already chosen intervals). Suboptimal! OPT #2, Algorithm #1

Rule 3 Select the interval with the fewest conflicts with the other remaining intervals (but still not overlapping the already chosen intervals). Suboptimal! OPT #4, Algorithm #3

Rule 4 Select the interval which ends first (but still does not overlap the already chosen intervals). Quite a natural idea: we ensure that our resource becomes free as soon as possible while still satisfying one request. Hurray! Exact solution!

(Figure: run of Rule 4; at each step the remaining interval with the smallest finish time f1 is chosen; on the example the algorithm selects 3 intervals.)

Analysis - exact solution Algorithm gives non-overlapping intervals: obvious, since we always choose an interval which does not overlap the previously chosen intervals The solution is exact: Let A be the set of intervals obtained by the algorithm, and OPT be the largest set of pairwise non-overlapping intervals. We show that A must be as large as OPT

Analysis – exact solution cont. Let A = {A1, …, Ak} and OPT = {B1, …, Bm} be sorted by finishing time. By definition of OPT we have k ≤ m. Fact: for every i ≤ k, Ai finishes not later than Bi. Pf. By induction. For i = 1 it holds by definition of a step in the algorithm. Suppose that Ai-1 finishes not later than Bi-1.

Analysis cont. From the definition of a step in the algorithm, Ai is the interval with the earliest finish among those that do not overlap Ai-1. If Bi finished before Ai, then Bi would not overlap Ai-1 (Bi starts after Bi-1 finishes and, by the inductive assumption, Ai-1 finishes not later than Bi-1), so the algorithm would have chosen Bi instead of Ai - a contradiction. (Figure: intervals Ai-1, Bi-1, Ai, Bi on a timeline.)

Analysis cont. Theorem: A is the exact solution. Proof: we show that k = m. Suppose to the contrary that k < m. We have that Ak finishes not later than Bk, hence Bk+1 does not overlap Ak and the algorithm could have added Bk+1 to A, obtaining a bigger solution - a contradiction. (Figure: intervals Ak-1, Bk-1, Ak, Bk, Bk+1 on a timeline.)

Time complexity Sort the intervals according to their right-most ends. For every consecutive interval: if its left-most end is after the right-most end of the last selected interval, then select this interval; otherwise skip it and go to the next interval. Time complexity: O(n log n + n) = O(n log n)
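
The earliest-finish-time rule and the implementation outline above can be sketched in Python as follows (the example intervals are illustrative, not from the slides):

```python
def interval_scheduling(intervals):
    """Greedy interval scheduling: repeatedly pick the request that
    finishes first among those compatible with what we already chose."""
    chosen = []
    last_finish = float("-inf")
    # Sort by right endpoint (finish time); this O(n log n) step dominates.
    for s, f in sorted(intervals, key=lambda iv: iv[1]):
        if s >= last_finish:  # left end not before last selected right end
            chosen.append((s, f))
            last_finish = f
    return chosen

# Illustrative instance: the greedy rule selects 4 compatible intervals.
example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9),
           (6, 10), (8, 11), (8, 12), (2, 13), (12, 14)]
print(interval_scheduling(example))
```

The single pass after sorting mirrors the complexity argument above: O(n log n) for sorting plus O(n) for the scan.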

Planning of schools A collection of towns; we want to place schools in towns. Each school should be in a town, and no one should have to travel more than 30 miles to reach one of them. Model as a graph: nodes are towns, with an edge between two towns that are no farther than 30 miles apart.

Set cover Input. A set of elements B and sets S1, …, Sm ⊆ B. Output. A selection of the Si whose union is B. Cost. Number of sets picked.

Greedy Greedy: at each step, choose the set that covers the largest number of still-uncovered elements. Example: place a school at town a, since this covers the largest number of other towns. Greedy #4, OPT #3

Upper bound Theorem. Suppose B contains n elements and that the optimal cover consists of k sets. Then the greedy algorithm will use at most k ln n sets. Pf. Let nt be the number of elements still not covered after t iterations of the greedy algorithm (n0 = n). Since these remaining elements are covered by the optimal k sets, there must be some set with at least nt/k of them. Therefore, the greedy algorithm ensures that nt+1 ≤ nt − nt/k = nt(1 − 1/k).

Upper bound cont. Then nt ≤ n0(1 − 1/k)^t, and since 1 − x ≤ e^(−x) for all x, with equality if and only if x = 0, we get nt ≤ n0(1 − 1/k)^t < n0(e^(−1/k))^t = n e^(−t/k). At t = k ln n, therefore, nt is strictly less than n e^(−ln n) = 1, which means no elements remain to be covered. Consequently, the approximation ratio is at most ln n.
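
The greedy cover procedure can be sketched as follows (the universe and subsets in the example are illustrative, not the towns instance from the slides):

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly take the set covering the most
    still-uncovered elements; this achieves a ln(n) approximation."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # Pick the subset covering the largest number of uncovered elements.
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("universe not coverable by the given subsets")
        cover.append(best)
        uncovered -= best
    return cover

sets = [{1, 2, 3, 4}, {4, 5}, {5, 6, 7}, {1, 6}]
print(len(greedy_set_cover({1, 2, 3, 4, 5, 6, 7}, sets)))
```

On this instance the greedy choice {1, 2, 3, 4} followed by {5, 6, 7} covers everything with 2 sets.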

Exercise Knapsack problem

Making Change Goal. Given the currency denominations in HK: 1, 2, 5, 10, 20, 50, 100, 500, and 1000, devise a method to pay a given amount to a customer using the fewest number of notes/coins. Cashier's algorithm. At each iteration, add the note/coin of the largest value that does not take us past the amount to be paid.
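
The cashier's algorithm can be sketched directly from the denomination list above (a minimal sketch; for this denomination system the greedy choice happens to be optimal, which is not true of every coin system):

```python
# Denominations from the slide, scanned from largest to smallest.
DENOMINATIONS = [1000, 500, 100, 50, 20, 10, 5, 2, 1]

def make_change(amount):
    """Repeatedly use the largest denomination not exceeding the remainder."""
    used = []
    for d in DENOMINATIONS:
        count, amount = divmod(amount, d)  # how many of d fit, and what's left
        used.extend([d] * count)
    return used

print(make_change(2748))
```

For 2748 this yields ten notes/coins: two 1000s, one 500, two 100s, two 20s, one 5, one 2, and one 1.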

Optimal Offline Caching Cache with capacity to store k items. Sequence of m item requests d1, d2, …, dm. Cache hit: item already in cache when requested. Cache miss: item not in cache when requested; we must bring the requested item into the cache and evict some existing item if the cache is full. ("Cache miss" also refers to the operation of bringing an item into the cache.) Goal. An eviction schedule that minimizes the number of cache misses. Ex: k = 2, initial cache = ab, requests: a, b, c, b, c, a, a, b. The optimal eviction schedule incurs 2 cache misses. (Table: requests and the cache contents after each request under the optimal schedule.)

Optimal Offline Caching: Farthest-In-Future Farthest-in-future. Evict the cached item that is not requested until farthest in the future. Theorem. [Belady, 1966] FF is the optimal eviction schedule. Pf. The algorithm and theorem are intuitive; the proof is subtle. (Figure: current cache a b c d e f; future queries g a b c e d a b b a c d e a f a d e f g h …; the request for g is a cache miss, and FF evicts the cached item whose next request is farthest in the future.)
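
A small simulation of the farthest-in-future rule, written as a sketch (the helper name is mine; it replays the k = 2 example from the previous slide):

```python
def farthest_in_future_misses(requests, k, cache=None):
    """Simulate Belady's farthest-in-future policy and count misses.
    On a miss with a full cache, evict the cached item whose next
    request is farthest away (or that is never requested again)."""
    cache = set(cache or [])
    misses = 0
    for i, item in enumerate(requests):
        if item in cache:
            continue                      # cache hit
        misses += 1                       # cache miss
        if len(cache) >= k:
            def next_use(c):              # position of c's next request
                try:
                    return requests.index(c, i + 1)
                except ValueError:
                    return float("inf")   # never requested again
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return misses

# Slide example: k = 2, cache starts as {a, b}; FF incurs 2 misses.
print(farthest_in_future_misses(list("abcbcaab"), 2, cache="ab"))
```

The linear `requests.index` scan keeps the sketch short; a production version would precompute next-use positions.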

Minimum spanning tree Input: a weighted graph G = (V, E); every edge in E has a positive weight. Output: a spanning tree such that the sum of its edge weights is not bigger than the sum of weights of any other spanning tree. Spanning tree: a subgraph that is connected, contains no cycle, and spans all of V (every two nodes in V are connected by a path). (Figure: example weighted graph with edge weights 1, 2, 3.)

Properties of minimum spanning trees An MST on n nodes has n − 1 edges and at least 2 leaves (a leaf is a node with only one neighbor). MST cycle property: after adding an edge to an MST we obtain exactly one cycle, and all the MST edges in this cycle have weight no bigger than the weight of the added edge. (Figure: the cycle obtained by adding an edge to an MST.)

Optimal substructures MST T: (Other edges of G are not shown.) Remove any edge (u, v) ∈ T. Then, T is partitioned into two subtrees T1 and T2. Theorem. The subtree T1 is an MST of G1 = (V1, E1), the subgraph of G induced by the vertices of T1: V1 = vertices of T1, E1 = { (x, y) ∈ E : x, y ∈ V1 }. Similarly for T2.

Proof of optimal substructure Proof. Cut and paste: w(T) = w(u, v) + w(T1) + w(T2). If T1′ were a lower-weight spanning tree than T1 for G1, then T′ = {(u, v)} ∪ T1′ ∪ T2 would be a lower-weight spanning tree than T for G.

Do we also have overlapping subproblems? Yes. Great, then dynamic programming may work! Yes, but MST exhibits another powerful property which leads to an even more efficient algorithm.

Crucial observation about MST Consider sets of nodes A and V − A. Let F be the set of edges between A and V − A, and let a be the smallest weight of an edge from F. Theorem: Every MST must contain at least one edge of weight a from the set F. (Figure: a cut (A, V − A) in a weighted graph.)

Proof of the observation Let e be the edge in F with the smallest weight; for simplicity assume that there is a unique such edge. Suppose to the contrary that e is not in some MST, and choose one such MST. Add e to this MST: we obtain a cycle in which e is (among) the smallest weights. Since the two ends of e are in different sets A and V − A, there is another edge f in the cycle that is also in F. Remove f from the tree (with the added edge e): we obtain a spanning tree with smaller weight (since f has bigger weight than e). This contradicts the minimality of the chosen MST. (Figure: the cycle created by adding e across the cut.)

Greedy algorithm finding MST Kruskal’s algorithm: Sort all edges according to the weights in non-decreasing order. Choose n − 1 edges one after another as follows: if a newly added edge does not create a cycle with the previously selected ones, we keep it in the (partial) MST; otherwise we discard it. Remark: we always have a partial forest. (Figure: example run on a small weighted graph.)
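
A compact sketch of Kruskal's algorithm using a union-find structure for the cycle test (the 4-node example graph and its weights are illustrative):

```python
def kruskal(n, edges):
    """Kruskal: scan edges in non-decreasing weight order and keep an
    edge iff it joins two different components (i.e., creates no cycle).
    `edges` is a list of (weight, u, v) tuples over nodes 0..n-1."""
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):     # non-decreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return total, mst

edges = [(1, 0, 1), (2, 0, 2), (3, 1, 2), (1, 2, 3), (2, 1, 3)]
print(kruskal(4, edges))
```

With full union-by-rank the cycle tests are effectively constant time, so sorting dominates, matching the O(m log n) bound below.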

Greedy algorithm finding MST Prim’s algorithm: Select a node arbitrarily as the root. Choose n − 1 edges one after another as follows: look at all edges incident to the currently built (partial) tree which do not create a cycle in it, and select one which has the smallest weight. Remark: we always have a connected partial tree. (Figure: example run starting from the root.)
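
A matching sketch of Prim's algorithm with a binary heap as the priority queue (the adjacency lists describe the same illustrative 4-node graph as the Kruskal sketch; stale heap entries are skipped rather than decreased):

```python
import heapq

def prim(adj, root=0):
    """Prim: grow one tree from `root`, always adding the lightest edge
    leaving the current tree; `adj[u]` lists (weight, v) pairs."""
    in_tree = [False] * len(adj)
    heap = [(0, root)]                # (edge weight into the tree, node)
    total = 0
    while heap:
        w, u = heapq.heappop(heap)
        if in_tree[u]:
            continue                  # stale entry: u already connected
        in_tree[u] = True
        total += w
        for wv, v in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (wv, v))
    return total

# Same 4-node graph as in the Kruskal sketch; the MST weight agrees.
adj = [[(1, 1), (2, 2)],
       [(1, 0), (2, 3), (3, 2)],
       [(2, 0), (1, 3), (3, 1)],
       [(1, 2), (2, 1)]]
print(prim(adj))
```

Each edge enters the heap at most once per endpoint, giving the O(m log n) bound stated later.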

Example of Prim (Figure: step-by-step run of Prim’s algorithm on a graph with edge weights 3, 5, 6, 7, 8, 9, 10, 12, 14, 15; starting from the root, each frame adds the smallest-weight edge between the current tree A and V − A until A spans all nodes.)

Why do the algorithms work? This follows from the crucial observation. Kruskal’s algorithm: suppose we add edge {v, w}. This edge has the smallest weight among edges between the set of nodes already connected with v (by a path in the selected subgraph) and the other nodes. Prim’s algorithm: always chooses an edge with the smallest weight among edges between the set of already connected nodes and the free nodes.

Time complexity There are implementations using the union-find data structure (Kruskal’s algorithm) or a priority queue (Prim’s algorithm), achieving time complexity O(m log n), where n is the number of nodes and m is the number of edges.

Best of MST Best to date: Karger, Klein, and Tarjan [1993]. Randomized algorithm. O(V + E) expected time.

Conclusions Greedy algorithms for finding a minimum spanning tree in a graph, both in time O(m log n): Kruskal’s algorithm and Prim’s algorithm. It remains to design the efficient data structures!

Conclusions Greedy algorithms: algorithms constructing solutions step by step using a local rule. An exact greedy algorithm for the interval selection problem runs in time O(n log n), illustrating the “greedy stays ahead” rule. A greedy algorithm may not produce an optimal solution, as in the set cover problem. Matroids can help to prove when greedy leads to an optimal solution. Minimum spanning tree can be solved by the greedy method in O(m log n).

Matroids When does the greedy algorithm yield optimal solutions? Matroids [Hassler Whitney]: A matroid is an ordered pair M = (S, ℓ) satisfying the following conditions. S is a finite nonempty set. ℓ is a nonempty family of subsets of S, called the independent subsets of S, such that if B ∈ ℓ and A ⊆ B, then A ∈ ℓ. We say that ℓ is hereditary if it satisfies this property; note that the empty set is necessarily a member of ℓ. If A ∈ ℓ, B ∈ ℓ, and |A| < |B|, then there is some element x ∈ B − A such that A ∪ {x} ∈ ℓ. We say that M satisfies the exchange property.

Max independent Theorem. All maximal independent subsets in a matroid have the same size. Pf. Suppose to the contrary that A is a maximal independent subset of M and there exists another larger maximal independent subset B of M. Then, the exchange property implies that A is extendible to a larger independent set A ∪ {x} for some x ∈ B - A, contradicting the assumption that A is maximal.

Weighted Matroid We say that a matroid M = (S, ℓ) is weighted if there is an associated weight function w that assigns a strictly positive weight w(x) to each element x ∈ S. The weight function w extends to subsets of S by summation: w(A) = Σx∈A w(x) for any A ⊆ S.

Greedy algorithms on a weighted matroid Many problems for which a greedy approach provides optimal solutions can be formulated in terms of finding a maximum-weight independent subset in a weighted matroid. That is, we are given a weighted matroid M = (S, ℓ), and we wish to find an independent set A ∈ ℓ such that w(A) is maximized. We call such a subset that is independent and has maximum possible weight an optimal subset of the matroid. Because the weight w(x) of any element x ∈ S is positive, an optimal subset is always a maximal independent subset; it always helps to make A as large as possible.

Greedy algorithm GREEDY(M, w) 1. A ← Ø 2. sort S[M] into monotonically decreasing order by weight w 3. for each x ∈ S[M], taken in monotonically decreasing order by weight w(x) 4. do if A ∪ {x} ∈ℓ [M] 5. then A ← A ∪ {x} 6. return A

Lemma Lemma 1. Suppose that M = (S,ℓ) is a weighted matroid with weight function w and that S is sorted into monotonically decreasing order by weight. Let x be the first element of S such that {x} is independent, if any such x exists. If x exists, then there exists an optimal subset A of S that contains x. Pf. If no such x exists, then the only independent subset is the empty set and we're done. Otherwise, let B be any nonempty optimal subset. Assume that x ∉ B; otherwise, we let A = B and we're done. No element of B has weight greater than w(x). To see this, observe that y ∈ B implies that {y} is independent, since B ∈ℓ and ℓ is hereditary. Our choice of x therefore ensures that w(x) ≥ w(y) for any y ∈ B.

Lemma Construct the set A as follows. Begin with A = {x}. By the choice of x, A is independent. Using the exchange property, repeatedly find a new element of B that can be added to A until |A| = |B|, while preserving the independence of A. Then A = (B - {y}) ∪ {x} for some y ∈ B, and so w(A) = w(B) - w(y) + w(x) ≥ w(B). Because B is optimal, A must also be optimal, and because x ∈ A, the lemma is proven.

Lemma Lemma 2. Let M = (S, ℓ) be any matroid. If x is an element of S that is an extension of some independent subset A of S, then x is also an extension of Ø. Pf. Since x is an extension of A, we have that A ∪ {x} is independent. Since ℓ is hereditary, {x} must be independent. Thus, x is an extension of Ø. We next show the contrapositive: if an element is not an option initially, then it cannot be an option later.

Corollary Corollary Let M = (S,ℓ) be any matroid. If x is an element of S such that x is not an extension of Ø, then x is not an extension of any independent subset A of S. Any element that cannot be used immediately can never be used. Therefore, GREEDY cannot make an error by passing over any initial elements in S that are not an extension of Ø, since they can never be used.

Lemma Lemma 3. Let x be the first element of S chosen by GREEDY for the weighted matroid M = (S, ℓ). The remaining problem of finding a maximum-weight independent subset containing x reduces to finding a maximum-weight independent subset of the weighted matroid M′ = (S′, ℓ′), where S′ = {y ∈ S : {x, y} ∈ ℓ}, ℓ′ = {B ⊆ S - {x} : B ∪ {x} ∈ ℓ}, and the weight function for M′ is the weight function for M, restricted to S′. (We call M′ the contraction of M by the element x.)

Proof If A is any maximum-weight independent subset of M containing x, then A′ = A - {x} is an independent subset of M′. Conversely, any independent subset A′ of M′ yields an independent subset A = A′ ∪ {x} of M. Since we have in both cases that w(A) = w(A′) + w(x), a maximum-weight solution in M containing x yields a maximum-weight solution in M′, and vice versa.

Theorem Theorem. If M = (S,ℓ) is a weighted matroid with weight function w, then GREEDY(M, w) returns an optimal subset. Pf. By Corollary, any elements that are passed over initially because they are not extensions of Ø can be forgotten about, since they can never be useful. Once the first element x is selected, Lemma 1 implies that GREEDY does not err by adding x to A, since there exists an optimal subset containing x.

Theorem con. Finally, Lemma 3 implies that the remaining problem is one of finding an optimal subset in the matroid M′ that is the contraction of M by x. After the procedure GREEDY sets A to {x}, all of its remaining steps can be interpreted as acting in the matroid M′ = (S′,ℓ′), because B is independent in M′ if and only if B ∪ {x} is independent in M, for all sets B ∈ℓ′. Thus, the subsequent operation of GREEDY will find a maximum-weight independent subset for M′, and the overall operation of GREEDY will find a maximum-weight independent subset for M.