# Greedy Algorithms: Greed is good. (Some of the time)


Jaruloj Chongstitvatana, Chapter 3: Greedy Algorithms

Outline

- Elements of greedy algorithms
  - Greedy-choice property
  - Optimal substructure
- Minimum spanning tree
  - Kruskal's algorithm
  - Prim's algorithm
- Huffman code
- Activity selection

Introduction

- Concept
  - Make the choice that looks best at each step.
  - The hope is that this sequence of locally best decisions leads to the best overall solution.
- Greedy algorithms do not always yield optimal solutions.
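To make the last point concrete, here is a minimal sketch using coin change. This problem is not part of the slides and is assumed here only for illustration; the function names and the denominations are hypothetical. The greedy rule "take the largest coin that fits" is compared with an exhaustive dynamic-programming count.

```python
def greedy_coins(denominations, amount):
    """Greedy rule (illustrative): repeatedly take the largest coin that fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins if amount == 0 else None


def optimal_coin_count(denominations, amount):
    """Minimum number of coins, found bottom-up by dynamic programming."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for total in range(1, amount + 1):
        for coin in denominations:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
    return best[amount]


# With denominations {1, 3, 4} and amount 6 the greedy choice is not optimal.
print(greedy_coins([1, 3, 4], 6))        # [4, 1, 1]  (three coins)
print(optimal_coin_count([1, 3, 4], 6))  # 2          (3 + 3)
```

For this input the greedy answer uses three coins while two suffice, so the greedy-choice property has to be proved for each problem rather than assumed.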

Elements of Greedy Algorithms

Greedy-choice property

- A globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
- When choices are considered, the choice that looks best in the current problem is chosen, without considering results from subproblems.

Optimal substructure

- A problem has optimal substructure if an optimal solution to the problem is composed of optimal solutions to subproblems.
- This property is important for both greedy algorithms and dynamic programming.

Greedy Algorithm vs. Dynamic Programming

Dynamic programming

- A choice is made at each step.
- The choice made at each step usually depends on the solutions to subproblems.
- Dynamic-programming problems are often solved in a bottom-up manner.

Greedy algorithm

- The best choice is made at each step, and only then is the remaining subproblem solved.
- The choice made by a greedy algorithm may depend on the choices made so far, but it cannot depend on any future choices or on the solutions to subproblems.
- A greedy strategy usually progresses in a top-down fashion, making one greedy choice after another and reducing each given problem instance to a smaller one.

Steps in Designing Greedy Algorithms

1. Determine the optimal substructure of the problem.
2. Develop a recursive solution.
3. Prove that at any stage of the recursion, one of the optimal choices is the greedy choice. Thus, it is always safe to make the greedy choice.
4. Show that all but one of the subproblems induced by having made the greedy choice are empty.
5. Develop a recursive algorithm that implements the greedy strategy.
6. Convert the recursive algorithm to an iterative algorithm.

Shortcut Design

- Cast the optimization problem as one in which, after a choice is made, there is only one subproblem left to be solved.
- Prove that there is always an optimal solution to the original problem that makes the greedy choice, so that the greedy choice is always safe.
- Demonstrate that, having made the greedy choice, what remains is a subproblem with the property that combining an optimal solution to the subproblem with the greedy choice yields an optimal solution to the original problem.

Minimum Spanning Tree

Definitions

- Let $G = (V, E)$ be an undirected graph. $T$ is a minimum spanning tree of $G$ if $T \subseteq E$ is an acyclic subset that connects all of the vertices and whose total weight $w(T) = \sum_{(u,v) \in T} w(u, v)$ is minimized.
- Let $A$ be a subset of some minimum spanning tree. An edge $(u, v)$ is a safe edge for $A$ if $A \cup \{(u, v)\}$ is also a subset of a minimum spanning tree.

Definitions

Let $G = (V, E)$ be an undirected graph.

- A cut $(S, V - S)$ of $G$ is a partition of $V$.
- An edge $(u, v) \in E$ crosses the cut $(S, V - S)$ if one of its endpoints is in $S$ and the other is in $V - S$.
- A cut respects a set $A$ of edges if no edge in $A$ crosses the cut.
- An edge is a light edge crossing a cut if its weight is the minimum among all edges crossing the cut.
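The definitions of crossing, respecting, and light edges translate almost directly into code. The sketch below assumes edges are given as `(u, v, weight)` triples and a cut is given by the vertex set `S`; the function names and the sample graph are hypothetical.

```python
def crosses(edge, S):
    """An edge crosses the cut (S, V - S) if exactly one endpoint is in S."""
    u, v = edge[0], edge[1]
    return (u in S) != (v in S)


def respects(S, A):
    """The cut respects the edge set A if no edge of A crosses it."""
    return not any(crosses(edge, S) for edge in A)


def light_edge(edges, S):
    """Return a minimum-weight edge crossing the cut, or None if none crosses."""
    crossing = [e for e in edges if crosses(e, S)]
    return min(crossing, key=lambda e: e[2]) if crossing else None


# Illustrative graph on vertices a, b, c, d.
edges = [("a", "b", 4), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)]
S = {"a", "b"}
print(respects(S, [("a", "b", 4)]))  # True: (a, b) stays inside S
print(light_edge(edges, S))          # ('b', 'c', 2)
```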

Theorem

Let

- $G = (V, E)$ be a connected, undirected graph with a real-valued weight function $w$ defined on $E$,
- $A$ be a subset of $E$ that is included in some minimum spanning tree for $G$,
- $(S, V - S)$ be any cut of $G$ that respects $A$.

If $(u, v)$ is a light edge crossing $(S, V - S)$, then edge $(u, v)$ is safe for $A$.

[Figure: the cut $(S, V - S)$ with edge set $A$, the light edge $(u, v)$, and another crossing edge $(x, y)$.]

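Proof

Let $T$ be a minimum spanning tree that includes $A$.

- If $(u, v)$ is in $T$, then $(u, v)$ is safe for $A$.
- If $(u, v)$ is not in $T$, then adding $(u, v)$ to $T$ creates a cycle consisting of $(u, v)$ and the path from $u$ to $v$ in $T$. Since $u$ and $v$ lie on opposite sides of the cut, this path contains an edge $(x, y)$ that crosses $(S, V - S)$. Because the cut respects $A$, $(x, y)$ is not in $A$.
- Another spanning tree $T'$ can be created from $T$ by exchanging these two edges.

[Figure: tree $T$ with the cut $(S, V - S)$, edge set $A$, the light edge $(u, v)$, and edge $(x, y)$.]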

Proof (continued)

$T'$ is created from $T$ as follows.

- Add edge $(u, v)$: this creates a cycle.
- Remove edge $(x, y)$: this cuts the cycle.

[Figure: $T$ before the exchange and $T'$ after it.]

Proof (continued)

Thus, $T'$ is a spanning tree that includes $A$. Next, we need to show that $T'$ is a minimum spanning tree.

- From the construction of $T'$, $T' = T - \{(x, y)\} \cup \{(u, v)\}$. Thus, $w(T') = w(T) - w(x, y) + w(u, v)$.
- Since $(u, v)$ is a light edge crossing $(S, V - S)$, $w(u, v) \le w(x, y)$. Thus, $w(T') \le w(T)$.
- Since $T$ is a minimum spanning tree, $w(T') = w(T)$. Then $T'$ is also a minimum spanning tree.

Since $A \cup \{(u, v)\} \subseteq T'$, the edge $(u, v)$ is safe for $A$.

Corollary

Let

- $G = (V, E)$ be a connected, undirected graph with a real-valued weight function $w$ defined on $E$,
- $A$ be a subset of $E$ that is included in some minimum spanning tree for $G$, and
- $C = (V_C, E_C)$ be a connected component (tree) in the forest $G_A = (V, A)$.

If $(u, v)$ is a light edge connecting $C$ to another tree in $G_A$, then $(u, v)$ is safe for $A$.

Proof: The cut $(V_C, V - V_C)$ respects $A$, and $(u, v)$ is a light edge for this cut. Therefore, $(u, v)$ is safe for $A$.

Kruskal's Algorithm

- Concept
  - Start with a forest in which every vertex is its own tree.
  - Repeatedly connect two trees, so that the chosen edges always form a subset of a minimum spanning tree, until all vertices are covered by a single tree.
  - When connecting two trees, choose the edge of least weight.
- Let $C_1$ and $C_2$ denote the two trees that are connected by $(u, v)$. Since $(u, v)$ must be a light edge connecting $C_1$ to some other tree, the corollary implies that $(u, v)$ is a safe edge for $C_1$.
- Kruskal's algorithm is a greedy algorithm, because at each step it adds to the forest an edge of least possible weight.

Example of Kruskal's Algorithm

[Figure: weighted example graph.]

Kruskal's Algorithm

    MST-KRUSKAL(G, w)
        A = ∅
        for each vertex v in V[G]
            do MAKE-SET(v)
        sort the edges of E in nondecreasing order by weight w
        for each edge (u, v) in E, taken in nondecreasing order by weight
            do if FIND-SET(u) ≠ FIND-SET(v)
                then A = A ∪ {(u, v)}
                     UNION(u, v)
        return A
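MST-KRUSKAL can be sketched in Python as follows. The sketch assumes vertices are numbered `0..num_vertices-1` and edges are `(weight, u, v)` triples. The disjoint-set structure (union by rank with path halving) is one common way to implement MAKE-SET, FIND-SET, and UNION and is an assumption here, not something the slides prescribe. The sample graph is hypothetical.

```python
def kruskal(num_vertices, edges):
    """Return the edges (u, v, weight) of a minimum spanning forest."""
    parent = list(range(num_vertices))  # MAKE-SET for every vertex
    rank = [0] * num_vertices

    def find_set(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find_set(x), find_set(y)
        if root_x == root_y:
            return
        if rank[root_x] < rank[root_y]:
            root_x, root_y = root_y, root_x
        parent[root_y] = root_x
        if rank[root_x] == rank[root_y]:
            rank[root_x] += 1

    tree = []
    for weight, u, v in sorted(edges):  # nondecreasing order by weight
        if find_set(u) != find_set(v):  # u and v are in different trees
            tree.append((u, v, weight))
            union(u, v)
    return tree


# Illustrative graph with 4 vertices.
sample_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, sample_edges))  # [(0, 1, 1), (1, 3, 2), (1, 2, 3)]
```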

Kruskal's Algorithm: Complexity

Let $v = |V|$ and $e = |E|$.

- Initializing $A$ and calling MAKE-SET for each vertex: $O(v)$
- Sorting the edges: $O(e \lg e)$
- Main loop (FIND-SET and UNION operations): $O((v + e) \lg v)$
- Since $e \le v^2$, we have $\lg e = O(\lg v)$, so the total running time is $O(e \lg v)$.

Prim's Algorithm

- Prim's algorithm has the property that the edges in the set $A$ always form a single tree.
- The tree starts from an arbitrary root vertex $r$.
- Grow the tree until it spans all the vertices in $V$. At each step, a light edge is added to the tree $A$ that connects $A$ to an isolated vertex of $G_A = (V, A)$.

Example of Prim's Algorithm

[Figure: weighted example graph.]

Prim's Algorithm

    PRIM(G, w, r)
        for each u in V[G]
            do key[u] = ∞
               π[u] = NIL
        key[r] = 0
        Q = V[G]
        while Q ≠ ∅
            do u = EXTRACT-MIN(Q)
               for each v in Adj[u]
                   do if v in Q and w(u, v) < key[v]
                       then π[v] = u
                            key[v] = w(u, v)
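A possible Python rendering of PRIM is sketched below. It assumes the graph is an adjacency dictionary mapping each vertex to a list of `(neighbor, weight)` pairs. Python's `heapq` has no decrease-key operation, so this sketch pushes a new heap entry whenever a key improves and skips stale entries when they are popped. That is a substitution for the priority queue `Q` in the pseudocode, not the same data structure. The sample graph is hypothetical.

```python
import heapq


def prim(adj, root):
    """Return (parent, key): the tree edges as parent links and their weights."""
    key = {u: float("inf") for u in adj}
    parent = {u: None for u in adj}
    key[root] = 0
    in_tree = set()          # vertices already removed from Q
    heap = [(0, root)]
    while heap:
        _, u = heapq.heappop(heap)   # EXTRACT-MIN
        if u in in_tree:
            continue                 # stale entry left by an earlier key update
        in_tree.add(u)
        for v, weight in adj[u]:
            if v not in in_tree and weight < key[v]:
                parent[v] = u
                key[v] = weight
                heapq.heappush(heap, (weight, v))
    return parent, key


# Illustrative undirected graph (each edge listed in both directions).
adj = {
    0: [(1, 1), (2, 4)],
    1: [(0, 1), (2, 3), (3, 2)],
    2: [(0, 4), (1, 3), (3, 5)],
    3: [(1, 2), (2, 5)],
}
parent, key = prim(adj, 0)
print(parent)             # {0: None, 1: 0, 2: 1, 3: 1}
print(sum(key.values()))  # 6
```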

Execution of Prim's Algorithm

[Figure: execution of Prim's algorithm on the example graph.]

Prim's Algorithm: Complexity

Let $v = |V|$ and $e = |E|$.

- Initialization, which also builds the min-heap $Q$: $O(v)$
- Each EXTRACT-MIN takes $O(\lg v)$, so all iterations of the while loop spend $O(v \lg v)$ on it.
- Updating the keys in the inner loop, over all edges: $O(e \lg v)$
- Total: $O(v \lg v + e \lg v) = O(e \lg v)$ for a connected graph.

Huffman Code

Problem

- Find an optimal prefix code representing a set of characters in a file, where each character has a frequency of occurrence.
- Prefix code
  - A code in which no codeword is also a prefix of some other codeword.
  - {0, 110, 101, 111, 1000, 1001} is a prefix code.
  - {0, 110, 101, 111, 1010, 0111} is not a prefix code: 0 is a prefix of 0111, and 101 is a prefix of 1010.
- Optimality
  - The code yields a file with the minimum number of bits.
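The prefix property can be checked mechanically. The sketch below (the function name is hypothetical) relies on the fact that after sorting the codewords lexicographically, a codeword that is a prefix of any other codeword is also a prefix of its immediate successor, so only neighbors need to be compared.

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # In sorted order, a word that prefixes any other word also prefixes
    # the word directly after it, so checking adjacent pairs is enough.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))


print(is_prefix_code(["0", "110", "101", "111", "1000", "1001"]))  # True
print(is_prefix_code(["0", "110", "101", "111", "1010", "0111"]))  # False
```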

Creating Huffman Code: Example

Character frequencies: ก: 36, ข: 17, ค: 17, ง: 15, จ: 10, ฉ: 5

[Figure: Huffman tree for these characters, with internal node weights 15, 30, 34, 64, 100 and edges labeled 0 and 1.]

Optimal Code

- An optimal code for a file is always represented by a full binary tree, in which every nonleaf node has two children.
- If $C$ is the alphabet from which the characters are drawn and all character frequencies are positive, then the tree for an optimal prefix code has exactly $|C|$ leaves, one for each letter of the alphabet, and exactly $|C| - 1$ internal nodes.

Full Binary Trees for Prefix Code

[Figure: full binary trees for 1, 2, 3, and 4 letters (A, B, C), with numbered nodes.]

Full Binary Trees for Prefix Code (continued)

[Figure: full binary trees for 4 and 5 letters (A, B, C, D), with numbered nodes.]

Creating Huffman Code: Example

Character frequencies: ก: 36, ข: 17, ค: 17, ง: 15, จ: 10, ฉ: 5

Merged nodes, in order of creation:

- จฉ: 15
- งจฉ: 30
- ขค: 34
- ขคงจฉ: 64
- กขคงจฉ: 100

Resulting codewords: 0, 100, 101, 110, 1110, 1111

[Figure: the completed Huffman tree with edges labeled 0 and 1.]

Theorem

A full binary tree for an optimal prefix code for $C$ letters has exactly $C$ leaves and exactly $C - 1$ internal nodes.

Proof by induction.

- Basis: $C = 1$. If there is one letter, the binary tree requires only 1 leaf node and 0 internal nodes.
- Induction hypothesis: For $C < n$, the full binary tree for an optimal prefix code for $C$ letters has exactly $C$ leaves and exactly $C - 1$ internal nodes.

Theorem (continued)

Induction step: Let $T$ be a full binary tree for an optimal prefix code for $C$ letters. By the induction hypothesis, $T$ has $C$ leaves and $C - 1$ internal nodes. To create a full binary tree for an optimal prefix code for $C + 1$ letters, we can take $T$ and add another leaf node $L$ by either

- adding a new root node $R$ and making $L$ and the old root of $T$ its children, or
- replacing a leaf node $N$ of $T$ by an internal node and making $L$ and $N$ its children.

In either case, one leaf and one internal node are added, so the number of leaf nodes of the new tree is $C + 1$ and the number of internal nodes is $C$.

Creating Huffman Code: Algorithm

    HUFFMAN(C)
        n = |C|
        Q = C
        for i = 1 to n − 1
            do allocate a new node z
               z.left = x = EXTRACT-MIN(Q)
               z.right = y = EXTRACT-MIN(Q)
               z.f = x.f + y.f
               INSERT(Q, z)
        ► Return the root of the tree.
        return EXTRACT-MIN(Q)

Complexity, using a min-heap for $Q$:

- Building the min-heap: $O(n)$
- Each heap operation: $O(\lg n)$
- The loop runs $n - 1$ times, so the total is $O(n \lg n)$.
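HUFFMAN can be sketched in Python with `heapq` as the min-heap. In this sketch (an illustration under assumptions, not the slides' implementation) an internal node is a `(left, right)` tuple, a leaf is the character itself, and a running counter breaks ties between equal frequencies so that nodes are never compared directly. The frequencies are those of the example slide, with the hypothetical ASCII names `a` to `f` standing in for the six characters.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return a dict mapping each character to its Huffman codeword."""
    tie = count()  # tie-breaker so heap entries never compare nodes
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)                      # O(n) build
    for _ in range(len(freq) - 1):
        fx, _, x = heapq.heappop(heap)       # two least-frequent nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))
    root = heap[0][2]

    codes = {}

    def assign(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                # leaf: a character
            codes[node] = prefix or "0"      # single-letter alphabet gets "0"

    assign(root, "")
    return codes


freq = {"a": 36, "b": 17, "c": 17, "d": 15, "e": 10, "f": 5}
codes = huffman_codes(freq)
cost = sum(freq[ch] * len(codes[ch]) for ch in freq)
print({ch: len(code) for ch, code in codes.items()})
print(cost)  # 243
```

The individual bits depend on how ties and left/right children are assigned, so they may differ from the codewords in the example, but the codeword lengths (1, 3, 3, 3, 4, 4) and the total cost of 243 bits agree with the example tree.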

Greedy-choice Property of Huffman Code

Let $C$ be an alphabet in which each character $c \in C$ has frequency $f[c]$, and let $x$ and $y$ be two characters in $C$ having the lowest frequencies.

Then there exists an optimal prefix code for $C$ in which the codewords for $x$ and $y$ have the same length and differ only in the last bit.

Proof

- The idea of the proof is to
  - take the tree $T$ representing an arbitrary optimal prefix code, and
  - modify it to make a tree representing another optimal prefix code such that the characters $x$ and $y$ appear as sibling leaves of maximum depth in the new tree.
- If we can do this, then their codewords will have the same length and differ only in the last bit.

Proof (continued)

Let $a$ and $b$ be two characters that are sibling leaves of maximum depth in $T$. Assume, without loss of generality, that $f[a] \le f[b]$ and $f[x] \le f[y]$.

Since $f[x]$ and $f[y]$ are the two lowest leaf frequencies, in order, and $f[a]$ and $f[b]$ are two arbitrary frequencies, in order, we have $f[x] \le f[a]$ and $f[y] \le f[b]$.

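Proof (continued)

Exchange the positions of $a$ and $x$ in $T$ to produce $T'$. Writing $B(T)$ for the cost of tree $T$ and $d_T(c)$ for the depth of $c$ in $T$:

$$
\begin{aligned}
B(T) - B(T') &= \sum_{c \in C} f[c]\, d_T(c) - \sum_{c \in C} f[c]\, d_{T'}(c) \\
&= f[x]\, d_T(x) + f[a]\, d_T(a) - f[x]\, d_{T'}(x) - f[a]\, d_{T'}(a) \\
&= f[x]\, d_T(x) + f[a]\, d_T(a) - f[x]\, d_T(a) - f[a]\, d_T(x) \\
&= (f[a] - f[x])(d_T(a) - d_T(x)) \ge 0
\end{aligned}
$$

Both factors are nonnegative, because $f[x] \le f[a]$ and $a$ is a leaf of maximum depth in $T$.

[Figure: tree $T$ and tree $T'$, in which $a$ and $x$ have exchanged positions.]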

Proof (continued)

Then exchange the positions of $b$ and $y$ in $T'$ to produce $T''$. Similarly, this does not increase the cost, and so $B(T') - B(T'') \ge 0$.

Therefore, $B(T'') \le B(T)$. Since $T$ is optimal, $B(T) \le B(T'')$. Then $B(T'') = B(T)$.

Thus, $T''$ is an optimal tree in which $x$ and $y$ appear as sibling leaves of maximum depth.

[Figure: tree $T'$ and tree $T''$, in which $b$ and $y$ have exchanged positions.]

Optimal-substructure Property

Let

- $C$ be a given alphabet with frequency $f[c]$ defined for each character $c \in C$,
- $x$ and $y$ be two characters in $C$ with minimum frequency,
- $C'$ be the alphabet $C$ with characters $x$, $y$ removed and a (new) character $z$ added, so that $C' = C - \{x, y\} \cup \{z\}$; define $f$ for $C'$ as for $C$, except that $f[z] = f[x] + f[y]$, and
- $T'$ be any tree representing an optimal prefix code for the alphabet $C'$.

Then the tree $T$, obtained from $T'$ by replacing the leaf node for $z$ with an internal node having $x$ and $y$ as children, represents an optimal prefix code for the alphabet $C$.

Proof

Show that the cost $B(T)$ can be expressed in terms of the cost $B(T')$ by considering the component costs.

- For each $c \in C - \{x, y\}$, we have $d_T(c) = d_{T'}(c)$, and hence $f[c]\, d_T(c) = f[c]\, d_{T'}(c)$.
- Since $d_T(x) = d_T(y) = d_{T'}(z) + 1$, we have

$$
f[x]\, d_T(x) + f[y]\, d_T(y) = (f[x] + f[y])(d_{T'}(z) + 1) = f[z]\, d_{T'}(z) + (f[x] + f[y])
$$

Thus, $B(T) = B(T') + f[x] + f[y]$. That is, $B(T') = B(T) - f[x] - f[y]$.

Proof (continued)

We now prove the claim by contradiction.

- Suppose $T$ does not represent an optimal prefix code for $C$. Then there exists a tree $T''$ such that $B(T'') < B(T)$.
- Without loss of generality (by the greedy-choice property), $T''$ has $x$ and $y$ as siblings.

Proof (continued)

Let $T'''$ be the tree $T''$ with the common parent of $x$ and $y$ replaced by a leaf $z$ with frequency $f[z] = f[x] + f[y]$. Then

$$
B(T''') = B(T'') - f[x] - f[y] < B(T) - f[x] - f[y] = B(T')
$$

This contradicts the assumption that $T'$ represents an optimal prefix code for $C'$. Thus, $T$ must represent an optimal prefix code for the alphabet $C$.

Interval Scheduling

Problem definition

- Let $S = \{a_1, a_2, \ldots, a_n\}$ be a set of $n$ proposed activities that wish to use the same resource.
- Only one activity can use the resource at a time.
- Each activity $a_i$ has a start time $s_i$ and a finish time $f_i$, where $0 \le s_i < f_i < \infty$.
- If selected, activity $a_i$ takes place during the half-open time interval $[s_i, f_i)$.
- Activities $a_i$ and $a_j$ are compatible if the intervals $[s_i, f_i)$ and $[s_j, f_j)$ do not overlap.
- The activity-selection problem is to select a largest subset of mutually compatible activities.
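The compatibility condition can be written as a short predicate. The sketch assumes an activity is a `(start, finish)` pair; the function names are hypothetical. Because the intervals are half-open, an activity may start exactly when another finishes.

```python
from itertools import combinations


def compatible(a, b):
    """Half-open intervals [s, f) do not overlap if one starts after the other ends."""
    (s_i, f_i), (s_j, f_j) = a, b
    return s_i >= f_j or s_j >= f_i


def mutually_compatible(activities):
    """Every pair of activities in the collection must be compatible."""
    return all(compatible(a, b) for a, b in combinations(activities, 2))


print(compatible((1, 4), (4, 6)))  # True: [1, 4) and [4, 6) only touch
print(compatible((1, 4), (3, 5)))  # False: they overlap on [3, 4)
```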

Example

[Figure: an activity set $S$ and subsets $A_1$, $A_2$, $A_3$.]

Subproblems

- $S_{ij} = \{a_k \in S : f_i \le s_k < f_k \le s_j\}$ is the subset of activities in $S$ that can start after activity $a_i$ finishes and finish before activity $a_j$ starts.
- Add activities $a_0$ and $a_{n+1}$ such that $f_0 = 0$ and $s_{n+1} = \infty$.
- Then $S = S_{0,n+1}$, and $0 \le i, j \le n + 1$.

Optimal Solution

- Let $A_{ij}$ be an optimal solution to $S_{ij}$.
- Let $c[i, j]$ be the number of activities in a maximum-size subset of mutually compatible activities in $S_{ij}$.
- With the activities sorted by finish time, $c[i, j] = 0$ for $i \ge j$ (i.e. $S_{ij} = \emptyset$).
- $c[i, j] = c[i, k] + c[k, j] + 1$ if $a_k$ is an activity in $A_{ij}$.

Then,

$$
c[i, j] =
\begin{cases}
0 & \text{if } S_{ij} = \emptyset \\
\max\limits_{i < k < j,\ a_k \in S_{ij}} \{\, c[i, k] + c[k, j] + 1 \,\} & \text{if } S_{ij} \ne \emptyset
\end{cases}
$$
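Before the greedy shortcut is applied, the recurrence for $c[i, j]$ can be evaluated directly. The following memoized sketch assumes activities are `(start, finish)` pairs with nonnegative start times, sorts them by finish time, and adds the two sentinel activities $a_0$ and $a_{n+1}$. The function name and the sample data are illustrative.

```python
from functools import lru_cache


def max_compatible(activities):
    """Size of a largest set of mutually compatible activities, via c[i, j]."""
    acts = sorted(activities, key=lambda a: a[1])   # sort by finish time
    n = len(acts)
    inf = float("inf")
    # Index 0 is the sentinel a_0 (f_0 = 0); index n + 1 is a_{n+1} (s = inf).
    s = [0] + [a[0] for a in acts] + [inf]
    f = [0] + [a[1] for a in acts] + [inf]

    @lru_cache(maxsize=None)
    def c(i, j):
        best = 0                                    # value when S_ij is empty
        for k in range(i + 1, j):
            if f[i] <= s[k] and f[k] <= s[j]:       # a_k belongs to S_ij
                best = max(best, c(i, k) + c(k, j) + 1)
        return best

    return c(0, n + 1)


sample = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
          (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(max_compatible(sample))  # 4
```

This evaluates every candidate $a_k$ for every subproblem, which is exactly the work the greedy choice avoids.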

Theorem

Consider any nonempty subproblem $S_{ij}$, and let $a_m$ be the activity in $S_{ij}$ with the earliest finish time: $f_m = \min \{ f_k : a_k \in S_{ij} \}$. Then,

- Activity $a_m$ is used in some maximum-size subset of mutually compatible activities of $S_{ij}$.
- The subproblem $S_{im}$ is empty, so that choosing $a_m$ leaves the subproblem $S_{mj}$ as the only one that may be nonempty.

Proof: First part

- Let $A_{ij}$ be a maximum-size subset of mutually compatible activities of $S_{ij}$, sorted in monotonically increasing order of finish time.
- Let $a_k$ be the first activity in $A_{ij}$.
- If $a_k = a_m$, then $a_m$ is used in some maximum-size subset of mutually compatible activities of $S_{ij}$.
- If $a_k \ne a_m$, we construct the subset $A'_{ij} = A_{ij} - \{a_k\} \cup \{a_m\}$.

[Figure: $A_{ij}$ and $A'_{ij}$ within $S$, with $a_k$ replaced by $a_m$.]

Proof: First part (continued)

- The activities in $A'_{ij}$ are disjoint, since the activities in $A_{ij}$ are, $a_k$ is the first activity in $A_{ij}$ to finish, and $f_m \le f_k$.
- Noting that $A'_{ij}$ has the same number of activities as $A_{ij}$, we see that $A'_{ij}$ is a maximum-size subset of mutually compatible activities of $S_{ij}$ that includes $a_m$.

Proof: Second part

Suppose $S_{im}$ is nonempty.

- Then there is an activity $a_k$ such that $f_i \le s_k < f_k \le s_m < f_m$.
- Then $a_k$ is also in $S_{ij}$ and it has an earlier finish time than $a_m$, which contradicts our choice of $a_m$.
- We conclude that $S_{im}$ is empty.

[Figure: timeline in $S$ showing $a_i$, $a_k$, $a_m$, and $a_j$.]

Greedy Solution

$A_{ij} = \{a_k\} \cup A_{kj}$, where $a_k$ is the activity that finishes earliest among the activities in $S_{ij}$.

Recursive Algorithm

    ACTIVITY-SELECTOR(s, f, i, n)
        m = i + 1
        while m ≤ n and s[m] < f[i]      ► Find the first activity in S_{i,n+1}.
            do m = m + 1
        if m ≤ n
            then return {a_m} ∪ ACTIVITY-SELECTOR(s, f, m, n)
            else return ∅
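A direct Python transcription of the recursive procedure is sketched below. It keeps the 1-based indexing of the pseudocode by storing the sentinel activity $a_0$ (with $f_0 = 0$) at index 0, and it assumes the arrays are already sorted by finish time. It returns the indices of the selected activities. The sample arrays are illustrative.

```python
def recursive_activity_selector(s, f, i, n):
    """Return indices of a maximum-size compatible subset of S_{i,n+1}."""
    m = i + 1
    while m <= n and s[m] < f[i]:   # skip activities that start before a_i ends
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []


# Index 0 is the sentinel a_0; activities 1..11 are sorted by finish time.
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selector(s, f, 0, 11))  # [1, 4, 8, 11]
```

The initial call uses `i = 0`, so the search starts from the sentinel and covers the whole problem $S_{0,n+1}$.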

Iterative Algorithm

    GREEDY-ACTIVITY-SELECTOR(s, f)
        n = length[s]
        A = {a_1}
        i = 1
        for m = 2 to n
            do if s[m] ≥ f[i]
                then A = A ∪ {a_m}
                     i = m
        return A
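The iterative version can be sketched in Python as follows. Unlike the pseudocode, which assumes its input is already sorted by finish time, this sketch takes `(start, finish)` pairs in any order and sorts them first; that convenience and the sample data are assumptions added here.

```python
def greedy_activity_selector(activities):
    """Return a maximum-size list of mutually compatible (start, finish) pairs."""
    if not activities:
        return []
    acts = sorted(activities, key=lambda a: a[1])  # earliest finish first
    selected = [acts[0]]
    last_finish = acts[0][1]
    for start, finish in acts[1:]:
        if start >= last_finish:       # compatible with the last chosen activity
            selected.append((start, finish))
            last_finish = finish
    return selected


sample = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
          (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(greedy_activity_selector(sample))  # [(1, 4), (5, 7), (8, 11), (12, 16)]
```

After the sort, which costs $O(n \lg n)$, the selection itself is a single pass over the activities.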