Lecture 2: Greedy Algorithms


1 Lecture 2: Greedy Algorithms
The change-making problem: give change for a specific amount n using the fewest coins of the denominations d1 > d2 > … > dm. For example, suppose the coin denominations in some country are 10 dollars, 5 dollars and 1 dollar. How do you give change of 28 dollars? 2 × 10 dollars + 1 × 5 dollars + 3 × 1 dollar = 28 dollars. Greedy algorithm: (1) use as many 10-dollar coins as possible; (2) use as many 5-dollar coins as possible; (3) pay the remaining amount in 1-dollar coins. One can prove that the number of coins used is minimized. 2019/5/22 CS3335 Design and Analysis of Algorithms/WANG Lusheng
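The three greedy steps can be sketched in a few lines of Python. This is only an illustration: the function name `greedy_change` and the dictionary return format are assumptions made here, not part of the lecture.

```python
def greedy_change(amount, denominations):
    """Return {denomination: count}, always using the largest coin first."""
    counts = {}
    remaining = amount
    for d in sorted(denominations, reverse=True):
        # Use as many coins of value d as fit into what is still owed.
        counts[d], remaining = divmod(remaining, d)
    if remaining != 0:
        raise ValueError("amount cannot be paid exactly with these coins")
    return counts


print(greedy_change(28, [10, 5, 1]))  # {10: 2, 5: 1, 1: 3}
```

For 28 dollars this reproduces the change given on the slide: two 10-dollar coins, one 5-dollar coin and three 1-dollar coins, six coins in total.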

2 The change-making problem:
Strange denominations: 7 dollars, 5 dollars and 1 dollar. Greedy algorithm for 11 dollars: 7 dollars + 4 × 1 dollar = 11 dollars (5 coins are required). A better way: 2 × 5 dollars + 1 dollar = 11 dollars (3 coins are required). Sometimes the greedy algorithm works, sometimes it does not. For denominations 10, 5 and 1 the proof is easy. If 5-dollar coins are used at least twice, then the remaining amount is at least 10 and we use a 10-dollar coin to replace two 5-dollar coins. If 1-dollar coins are used at least five times, then the remaining amount is at least 5 and we use a 5-dollar coin to replace five 1-dollar coins.
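To see the two cases side by side, one can compare the greedy coin count with the true minimum. The sketch below computes the minimum with a small dynamic program, which is a technique swapped in here purely for checking and is not part of this lecture; the function names are hypothetical.

```python
def greedy_coin_count(amount, denominations):
    """Number of coins used by the largest-coin-first rule."""
    count = 0
    for d in sorted(denominations, reverse=True):
        used, amount = divmod(amount, d)
        count += used
    return count  # assumes a 1-dollar coin exists, so nothing is left over


def optimal_coin_count(amount, denominations):
    """Minimum number of coins, found by dynamic programming over amounts."""
    best = [0] + [float("inf")] * amount
    for a in range(1, amount + 1):
        for d in denominations:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[amount]


print(greedy_coin_count(11, [7, 5, 1]))   # 5
print(optimal_coin_count(11, [7, 5, 1]))  # 3
# For 10, 5, 1 the two counts agree on every amount tried here.
print(all(greedy_coin_count(n, [10, 5, 1]) == optimal_coin_count(n, [10, 5, 1])
          for n in range(1, 101)))        # True
```

The last check is only empirical evidence for amounts up to 100; the proof is what establishes the claim for every amount.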

3 Greedy works for denominations 10, 5, and 1
Strategy for proof (the hard part; the strategy can be used for many problems): compare an optimal solution with the solution given by the greedy algorithm, bit by bit or component by component. Let an optimal solution, denoted (x, y, z), have x 10-dollar coins, y 5-dollar coins and z 1-dollar coins. Let the solution obtained from our greedy algorithm, denoted (x’, y’, z’), have x’ 10-dollar coins, y’ 5-dollar coins and z’ 1-dollar coins. Show that the two solutions are the same.

4 Greedy works for denominations 10, 5, and 1 (Fun part, not tested)
Theorem: The greedy algorithm works for denominations 10, 5 and 1. Proof: We compare x and x’ (number of 10-dollar coins), y and y’ (number of 5-dollar coins), and z and z’ (number of 1-dollar coins), one by one. Comparison of x and x’: since x’ comes from the greedy algorithm, x’ ≥ x. If x’ > x, then y × 5 + z × 1 ≥ 10 (because x × 10 + y × 5 + z × 1 = x’ × 10 + y’ × 5 + z’ × 1). Thus we can modify the optimal solution (x, y, z) by using one more 10-dollar coin to replace 10 dollars that are expressed by (1) two 5-dollar coins, (2) one 5-dollar coin plus five 1-dollar coins, or (3) ten 1-dollar coins. The new solution (x+1, y’’, z’’) contains fewer coins than (x, y, z). Contradiction, because by assumption (x, y, z) is optimal. Thus x’ > x cannot be true, and therefore x’ = x. Similarly, we can show y’ = y and z’ = z.

5 The 0-1 Knapsack problem:
N items, where the i-th item is worth vi dollars and weighs wi pounds; vi and wi are integers. A thief can carry at most W pounds (W is an integer). How can the thief take as valuable a load as possible? An item cannot be divided into pieces. The fractional knapsack problem: the same setting, but the thief can take fractions of items.

6 Solve the fractional Knapsack problem:
Greedy on the value per pound vi/wi: each time, take the item with maximum vi/wi. If taking the whole item would exceed W, take a fraction of it. Proof of correctness (the hard part): Let X = i1, i2, …, ik be the items taken in an optimal solution. Consider the item j with the highest vj/wj. If j is not fully used in X, remove some items (possibly fractions of items) of the same total weight and add item j in their place (since fractional items are allowed, we can do this). The total value does not decrease. One more item selected by greedy has been added to X. Repeating the process, X is changed to contain all items selected by greedy WITHOUT decreasing the total value taken by the thief.
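A minimal sketch of the greedy rule for the fractional problem follows. The function name and the (weight, value) tuple layout are assumptions for illustration; the sample items and W = 10 are borrowed from the 0-1 knapsack counterexample in this lecture.

```python
def fractional_knapsack(items, capacity):
    """items: list of (weight, value). Returns (total_value, picks),
    where picks is a list of (weight, value, fraction_taken)."""
    total = 0.0
    picks = []
    remaining = capacity
    # Highest value per pound first.
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / weight)  # whole item, or what still fits
        total += value * fraction
        remaining -= weight * fraction
        picks.append((weight, value, fraction))
    return total, picks


items = [(6, 12), (5, 9), (5, 9), (3, 3), (3, 3)]
total, picks = fractional_knapsack(items, 10)
print(round(total, 2), picks)
```

With these items the sketch takes all of (6, 12) and four fifths of one (5, 9) item, for a total value of about 19.2, which is more than any 0-1 solution can reach with the same capacity.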

7 The 0-1 knapsack problem cannot be solved by greedy
Counterexample (moderate part): W = 10. Items found: (6 pounds, 12 dollars), (5 pounds, 9 dollars), (5 pounds, 9 dollars), (3 pounds, 3 dollars), (3 pounds, 3 dollars). If we first take (6, 12) according to the greedy algorithm, then the solution is (6, 12), (3, 3), with total value 12 + 3 = 15. However, a better solution is (5, 9), (5, 9), with total value 18. To show that a statement does not hold, we only have to give a counterexample. To show that a theorem is true, we have to give a proof.
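The counterexample can be replayed in code. In the sketch below, `greedy_01` applies the value-per-pound rule to whole items, and `optimal_01` finds the true optimum with a standard dynamic program over capacities; the dynamic program is brought in here only as a checker and is not the method of this lecture. Both function names are hypothetical.

```python
def greedy_01(items, capacity):
    """Take whole items in order of value per pound, skipping any that do not fit."""
    total_value = 0
    remaining = capacity
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def optimal_01(items, capacity):
    """Best total value for integer weights, by dynamic programming."""
    best = [0] * (capacity + 1)
    for weight, value in items:
        # Go downwards so that each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


items = [(6, 12), (5, 9), (5, 9), (3, 3), (3, 3)]
print(greedy_01(items, 10))   # 15
print(optimal_01(items, 10))  # 18
```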

8 Huffman codes
Binary character code: each character is represented by a unique binary string. A data file can be coded in two ways:

                      a    b    c    d    e     f
frequency (%)         45   13   12   16   9     5
fixed-length code     000  001  010  011  100   101
variable-length code  0    101  100  111  1101  1100

The first way needs 100 × 3 = 300 bits. The second way needs 45 × 1 + 13 × 3 + 12 × 3 + 16 × 3 + 9 × 4 + 5 × 4 = 224 bits.

9 Variable-length code
Need some care to read the code (codewords: a=0, b=00, c=01, d=11). Where to cut? 00 can be read as either aa or b. Prefixes of 0011: 0, 00, 001 and 0011. Prefix codes: no codeword is a prefix of some other codeword (prefix-free). Prefix codes are simple to encode and decode.
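The prefix-free condition is easy to test mechanically. The sketch below uses a hypothetical helper `is_prefix_free`. The second code checked is a=0, b=101, c=100, d=111, e=1101, f=1100, which is assumed here to be the lecture's variable-length code; it is consistent with the codeword lengths 1, 3, 3, 3, 4, 4 used in the bit count.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)


print(is_prefix_free(["0", "00", "01", "11"]))                    # False
print(is_prefix_free(["0", "101", "100", "111", "1101", "1100"]))  # True
```

The first code fails because 0 is a prefix of both 00 and 01, which is exactly why 00 can be read as aa or as b.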

10 Using the codewords in the table to encode and decode
Encode: abc = 0.101.100 = 0101100 (just concatenate the codewords). Decode: 001011101 = 0.0.101.1101 = aabe. Both use the variable-length codewords a=0, b=101, c=100, d=111, e=1101, f=1100.
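Encoding and decoding with a prefix code can be sketched as follows. The dictionary `CODE` holds the codewords a=0, b=101, c=100, d=111, e=1101, f=1100, assumed here to be the lecture's variable-length code; the function names are hypothetical. Decoding reads bits left to right and emits a character as soon as the bits read so far form a codeword, which is safe only because the code is prefix-free.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def encode(text, code=CODE):
    """Concatenate the codewords of the characters."""
    return "".join(code[ch] for ch in text)


def decode(bits, code=CODE):
    """Cut the bit string into codewords, scanning from the left."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # a full codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


print(encode("abc"))        # 0101100
print(decode("001011101"))  # aabe
```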

11 Trees for the fixed-length and variable-length codewords
Encode: abc = 0.101.100 = 0101100 (just concatenate the codewords). Decode: 001011101 = 0.0.101.1101 = aabe (use the right-hand binary tree in the figure).
[Figure: two binary trees over the leaves a:45, b:13, c:12, d:16, e:9, f:5. Left: tree for the fixed-length codewords, with root 100. Right: tree for the variable-length codewords, with internal nodes 100, 55, 25, 30 and 14.]

12 Binary tree
Every nonleaf node has two children. The fixed-length code in our example is not optimal. The total number of bits required to encode a file is B(T) = Σ_{c∈C} f(c) · dT(c), where f(c) is the frequency (number of occurrences) of c in the file and dT(c) denotes the depth of c’s leaf in the tree T.
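Because the depth of a character's leaf equals the length of its codeword, the total number of bits can be computed directly from a code table. The sketch below does this for the two codes of the running example; the variable-length codewords a=0, b=101, c=100, d=111, e=1101, f=1100 are assumed here, and `total_bits` is a hypothetical name.

```python
FREQ = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
VARIABLE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def total_bits(freq, code):
    """Sum of frequency times leaf depth, where depth = codeword length."""
    return sum(freq[c] * len(code[c]) for c in freq)


print(total_bits(FREQ, FIXED))     # 300
print(total_bits(FREQ, VARIABLE))  # 224
```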

13 Constructing an optimal code
Formal definition of the problem: Input: a set of characters C = {c1, c2, …, cn}, where each c ∈ C has frequency f[c]. Output: a binary tree representing codewords so that the total number of bits required for the file is minimized. Huffman proposed a greedy algorithm to solve the problem.

14 Constructing an optimal code: steps (a) and (b)
[Figure: (a) the initial leaves f:5, e:9, c:12, b:13, d:16, a:45; (b) f:5 and e:9 are merged under a new node of frequency 14.]

15 Constructing an optimal code: steps (c) and (d)
[Figure: (c) c:12 and b:13 are merged under a new node of frequency 25; (d) the node 14 and d:16 are merged under a new node of frequency 30.]

16 Constructing an optimal code: steps (e) and (f)
[Figure: (e) the nodes 25 and 30 are merged under a new node of frequency 55; (f) a:45 and the node 55 are merged under the root of frequency 100, giving the final tree.]

17 Huffman's algorithm (pseudocode)
HUFFMAN(C)
1  n := |C|
2  Q := C
3  for i := 1 to n-1 do
4      z := ALLOCATE_NODE()
5      x := left[z] := EXTRACT_MIN(Q)
6      y := right[z] := EXTRACT_MIN(Q)
7      f[z] := f[x] + f[y]
8      INSERT(Q, z)
9  return EXTRACT_MIN(Q)

18 The Huffman Algorithm
This algorithm builds the tree T corresponding to the optimal code in a bottom-up manner. C is a set of n characters, and each character c in C has a defined frequency f[c]. Q is a priority queue, keyed on f, used to identify the two least-frequent objects to merge together. The result of the merger is a new object (internal node) whose frequency is the sum of the frequencies of the two objects.
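The bottom-up construction can be sketched in Python with the standard `heapq` module as the priority queue. This is one possible rendering rather than the lecture's own code: the function name, the use of nested tuples for internal nodes, and the tie-breaking counter are all assumptions made here. The first extracted node becomes the left child (bit 0) and the second the right child (bit 1).

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a Huffman tree for a non-empty {char: frequency} dict
    and return {char: codeword}."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _step in range(len(freq) - 1):
        fx, _, x = heapq.heappop(heap)  # least frequent object
        fy, _, y = heapq.heappop(heap)  # second least frequent object
        # New internal node: frequency is the sum, children are (x, y).
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    root = heap[0][2]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a character
            codes[node] = prefix or "0"  # single-character alphabet gets "0"

    walk(root, "")
    return codes


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
print(huffman_codes(freq))
# {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}
```

When several frequencies are equal, different tie-breaking rules can give different trees, but each of them has the same minimum total number of bits.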

19 Time complexity
Lines 4-8 are executed n-1 times. Each heap operation in Lines 4-8 takes O(lg n) time. Total time required is O(n lg n). Note: the details of the heap operations will not be tested. The time complexity O(n lg n) should be remembered.

20 Another example
[Figure: the initial leaves e:4, a:6, c:6, b:9, d:11; then e:4 and a:6 are merged under a new node of frequency 10.]

21 Another example (continued)
[Figure: c:6 and b:9 are merged under a new node of frequency 15; then the node 10 and d:11 are merged under a new node of frequency 21.]

22 Another example (continued)
[Figure: the nodes 15 and 21 are merged under the root of frequency 36, giving the final tree.]

23 Correctness of Huffman’s Greedy Algorithm (Fun Part, not required)
Again, we use our general strategy. Let x and y be the two characters in C having the lowest frequencies (the first two characters selected by the greedy algorithm). We will show two properties:
1. There exists an optimal solution Topt (a binary tree representing codewords) such that x and y are siblings in Topt.
2. Let z be a new character with frequency f[z] = f[x] + f[y] and let C’ = (C - {x, y}) ∪ {z}. Let T’ be an optimal tree for C’. Then we can get Topt from T’ by replacing the leaf z with an internal node z whose two children are x and y.

24 Proof of Property 1
[Figure: Topt, in which b and c are sibling leaves at the lowest level, and Tnew, in which x and y have taken the places of b and c.]
Look at the lowest siblings in Topt, say b and c. Exchange x with b and y with c. B(Topt) - B(Tnew) ≥ 0 since f[x] and f[y] are the smallest. Since Topt is optimal, Tnew is optimal too, and x and y are siblings in it. Property 1 is proved.
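The inequality for a single exchange can be written out term by term. In the sketch below, T_1 is a hypothetical name for the tree obtained from Topt by swapping only x and b, and d_T(c) is the depth of c's leaf in Topt. Only the costs of x and b change, so:

```latex
\begin{align*}
B(T_{opt}) - B(T_1)
  &= f[x]\,d_T(x) + f[b]\,d_T(b) - f[x]\,d_T(b) - f[b]\,d_T(x) \\
  &= \bigl(f[b] - f[x]\bigr)\bigl(d_T(b) - d_T(x)\bigr) \;\ge\; 0 .
\end{align*}
```

Both factors are non-negative: f[b] ≥ f[x] because x has the lowest frequency, and d_T(b) ≥ d_T(x) because b is a leaf at the lowest level. The same calculation applies to the second exchange, of y with c.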

25 Proof of Property 2
Let z be a new character with frequency f[z] = f[x] + f[y] and let C’ = (C - {x, y}) ∪ {z}. Let T’ be an optimal tree for C’. Then we can get Topt from T’ by replacing the leaf z with an internal node z whose two children are x and y.
Proof: Let T be the tree obtained from T’ by replacing z with the three nodes z, x and y. B(T) = B(T’) + f[x] + f[y] … (1) (the codes for x and y are each 1 bit longer than the code for z). Now prove that T is optimal (T = Topt) by contradiction. If T is not optimal, then B(T) > B(Topt) … (2). From Property 1, x and y are siblings in Topt. Thus we can delete x and y from Topt and get another tree T’’ for C’. B(T’’) = B(Topt) - f[x] - f[y] < B(T) - f[x] - f[y] = B(T’), using (2) for the inequality and (1) for the last equality. Thus B(T’’) < B(T’), contradicting the assumption that T’ is optimal for C’.

