CS3381 Des & Anal of Alg (2001-2002 SemA) City Univ of HK / Dept of CS / Helena Wong 5. Greedy Algorithms

2 Greedy Algorithms
http://www.cs.cityu.edu.hk/~helena

3 Greedy Algorithm Design

Comparison:

Dynamic Programming
  At each step, the choice is determined based on solutions of subproblems.
  Bottom-up approach.
  Subproblems are solved first.
  Can be slower, more complex.

Greedy Algorithms
  At each step, we quickly make the choice that currently looks best: a locally optimal (greedy) choice.
  Top-down approach.
  The greedy choice can be made first, before solving further subproblems.
  Usually faster, simpler.

4 Huffman Codes

For compressing data (a sequence of characters).
Widely used.
Very efficient (savings of 20-90%).
Use a table to keep the frequencies of occurrence of the characters.
Output a binary string, e.g. "Today's weather is nice" -> "001 0110 0 0 100 1000 1110"

5 Huffman Codes

Char   Frequency   Fixed-length codeword   Variable-length codeword
'a'    45000       000                     0
'b'    13000       001                     101
'c'    12000       010                     100
'd'    16000       011                     111
'e'     9000       100                     1101
'f'     5000       101                     1100

Example: a file of 100,000 characters, containing only 'a' to 'f'.

Fixed-length code: 3 * 100,000 = 300,000 bits. E.g. "abc" = "000001010".
Variable-length code: 1*45000 + 3*13000 + 3*12000 + 3*16000 + 4*9000 + 4*5000 = 224,000 bits. E.g. "abc" = "0101100".
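
The two totals above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the names FREQ, FIXED, VARIABLE, file_size_bits and encode are made up for illustration, and the data is copied from the table.

```python
# Frequencies and codewords copied from the slide's table.
FREQ = {"a": 45000, "b": 13000, "c": 12000, "d": 16000, "e": 9000, "f": 5000}
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
VARIABLE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def file_size_bits(freq, code):
    """Total bits: each character's count times the length of its codeword."""
    return sum(count * len(code[ch]) for ch, count in freq.items())


def encode(text, code):
    """Concatenate the codewords of the characters in text."""
    return "".join(code[ch] for ch in text)


print(file_size_bits(FREQ, FIXED))      # fixed-length total
print(file_size_bits(FREQ, VARIABLE))   # variable-length total
print(encode("abc", FIXED), encode("abc", VARIABLE))
```

For this file the variable-length code saves 76,000 of 300,000 bits, roughly 25%.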

6 Huffman Codes

The coding schemes can be represented by trees (a file of 100,000 characters; frequencies in thousands). Each leaf is a character with its frequency, each internal node holds the sum of the frequencies below it, and each edge is labelled 0 or 1.

Fixed-length codewords: 'a' 000, 'b' 001, 'c' 010, 'd' 011, 'e' 100, 'f' 101

100
  0: 86
    0: 58
      0: a:45
      1: b:13
    1: 28
      0: c:12
      1: d:16
  1: 14
    0: 14
      0: e:9
      1: f:5

Not a full binary tree.

Variable-length codewords: 'a' 0, 'b' 101, 'c' 100, 'd' 111, 'e' 1101, 'f' 1100

100
  0: a:45
  1: 55
    0: 25
      0: c:12
      1: b:13
    1: 30
      0: 14
        0: f:5
        1: e:9
      1: d:16

A full binary tree: every nonleaf node has 2 children.
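
The "full binary tree" property can be tested directly from a codeword table, without drawing the tree. This is an illustrative Python sketch, not from the slides: is_full_binary_tree is a hypothetical helper, and it assumes the table is a prefix code, so that internal nodes are exactly the proper prefixes of the codewords.

```python
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
VARIABLE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def is_full_binary_tree(code):
    """True if every internal node of the code tree has both a 0-child and a 1-child."""
    words = set(code.values())
    # Internal nodes are the proper prefixes of the codewords ("" is the root).
    internal = {w[:i] for w in words for i in range(len(w))}
    nodes = internal | words
    return all(p + "0" in nodes and p + "1" in nodes for p in internal)


print(is_full_binary_tree(FIXED))
print(is_full_binary_tree(VARIABLE))
```

In the fixed-length tree the node reached by "1" has no 1-child, which is why the check fails there.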

7 Huffman Codes

Char   Frequency   Codeword
'a'    45000       0
'b'    13000       101
'c'    12000       100
'd'    16000       111
'e'     9000       1101
'f'     5000       1100

100
  0: a:45
  1: 55
    0: 25
      0: c:12
      1: b:13
    1: 30
      0: 14
        0: f:5
        1: e:9
      1: d:16

To find an optimal code for a file:

1. The coding must be unambiguous.
   Consider codes in which no codeword is also a prefix of another codeword.
   => Prefix codes
   Prefix codes are unambiguous. Once the codewords are decided, it is easy to compress (encode) and decompress (decode).
   E.g. "abc" is coded as "0101100".

2. The file size must be the smallest.
   => Can be represented by a full binary tree.
   => Usually less frequent characters are at the bottom.
   Let C be the alphabet (e.g. C = {'a','b','c','d','e','f'}).
   For each character c, the number of bits to encode all of c's occurrences = freq_c * depth_c.
   File size: $B(T) = \sum_{c \in C} \mathrm{freq}_c \cdot \mathrm{depth}_c$
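
To make "prefix codes are unambiguous" concrete, here is a small Python sketch, not from the slides, with hypothetical helper names is_prefix_free and decode. It checks the prefix property of the table and decodes a bit string by emitting a character as soon as the bits read so far match a codeword.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = sorted(code.values())
    # In sorted order, a word that is a prefix of any other word is
    # immediately followed by a word that starts with it.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, code):
    """Decode a bit string; relies on the code being prefix-free."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:      # complete codeword: no longer codeword starts with it
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


print(is_prefix_free(CODE))
print(decode("0101100", CODE))
```

With a table such as {'a': '0', 'b': '01'} the check fails, and the early match on "0" would make this decoder misread the codeword for 'b'.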

8 Huffman Codes

How do we find the optimal prefix code?
Huffman code (1952) was invented to solve it.
A greedy approach.

Q: a min-priority queue. Each step removes the two lowest-frequency nodes and inserts their merged node.

Start:          f:5  e:9  c:12  b:13  d:16  a:45
After merge 1:  c:12  b:13  14 (f:5, e:9)  d:16  a:45
After merge 2:  14 (f:5, e:9)  d:16  25 (c:12, b:13)  a:45
After merge 3:  25 (c:12, b:13)  30 (14, d:16)  a:45
After merge 4:  a:45  55 (25, 30)
After merge 5:  100 (a:45, 55)
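
The sequence of queue states can be reproduced with a few lines of Python. This is only a sketch on bare frequencies (no tree is built), using the standard heapq module as the min-priority queue; merge_trace is a name invented for the example.

```python
import heapq


def merge_trace(freqs):
    """Return the sorted queue contents at the start and after each merge."""
    queue = list(freqs)
    heapq.heapify(queue)
    trace = [sorted(queue)]
    while len(queue) > 1:
        x = heapq.heappop(queue)        # lowest frequency
        y = heapq.heappop(queue)        # second lowest frequency
        heapq.heappush(queue, x + y)    # merged node goes back into the queue
        trace.append(sorted(queue))
    return trace


for step in merge_trace([5, 9, 12, 13, 16, 45]):
    print(step)
```

The merged values appear in the order 14, 25, 30, 55, 100, matching the internal nodes of the tree.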

9 Huffman Codes

HUFFMAN(C)
  Build Q from C
  For i = 1 to |C|-1
      Allocate a new node z
      z.left = x = EXTRACT_MIN(Q)
      z.right = y = EXTRACT_MIN(Q)
      z.freq = x.freq + y.freq
      Insert z into Q in correct position.
  Return EXTRACT_MIN(Q)

Q: a min-priority queue

Start:          f:5  e:9  c:12  b:13  d:16  a:45
After merge 1:  c:12  b:13  14 (f:5, e:9)  d:16  a:45
After merge 2:  14 (f:5, e:9)  d:16  25 (c:12, b:13)  a:45
...

If Q is implemented as a binary min-heap, with n = |C|:
  "Build Q from C" is O(n)
  "EXTRACT_MIN(Q)" is O(lg n)
  "Insert z into Q" is O(lg n)
The loop runs n-1 times and each iteration costs O(lg n), so HUFFMAN(C) is O(n lg n).

How is it "greedy"?
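
The pseudocode translates almost line for line into Python. The sketch below is one possible rendering, not the lecturer's code: it assumes the standard heapq module as the binary min-heap, represents nodes as plain tuples, and adds a hypothetical codewords helper that labels left edges 0 and right edges 1.

```python
import heapq
from itertools import count


def huffman(freqs):
    """Build a Huffman tree from a {char: frequency} mapping and return its root.

    A leaf is (char, None, None); an internal node is (None, left, right).
    """
    tie = count()  # tie-breaker so the heap never has to compare node tuples
    heap = [(f, next(tie), (ch, None, None)) for ch, f in freqs.items()]
    heapq.heapify(heap)                          # Build Q from C
    for _ in range(len(freqs) - 1):
        fx, _tx, x = heapq.heappop(heap)         # z.left  = x = EXTRACT_MIN(Q)
        fy, _ty, y = heapq.heappop(heap)         # z.right = y = EXTRACT_MIN(Q)
        heapq.heappush(heap, (fx + fy, next(tie), (None, x, y)))  # insert z
    return heapq.heappop(heap)[2]                # Return EXTRACT_MIN(Q)


def codewords(node, prefix=""):
    """Walk the tree and return a {char: codeword} table."""
    ch, left, right = node
    if ch is not None:
        return {ch: prefix or "0"}               # lone-character alphabet gets "0"
    table = codewords(left, prefix + "0")
    table.update(codewords(right, prefix + "1"))
    return table


FREQ = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}  # in thousands
CODES = codewords(huffman(FREQ))
print(CODES)
print(sum(FREQ[ch] * len(CODES[ch]) for ch in FREQ))  # file size in thousands of bits
```

Because these six frequencies produce no ties, the result reproduces the variable-length table from the earlier slides. With tied frequencies, other equally cheap codes are possible.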

