Data Structure and Algorithms


Data Structure and Algorithms
Dr. Maheswari Karthikeyan
Lecture 10, 27/04/2013

Greedy Algorithms

Greedy Algorithms
- Similar to dynamic programming, but a simpler approach
- Also used for optimization problems
- Idea: when we have a choice to make, make the one that looks best right now
- Make a locally optimal choice in the hope of getting a globally optimal solution
- Greedy algorithms do not always yield an optimal solution
- When the problem has certain general characteristics, greedy algorithms do give optimal solutions
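To make the "locally optimal choice" idea concrete, here is a minimal sketch (added for illustration, not part of the original lecture): greedy coin change over the coin set {25, 10, 5, 1}, where taking the largest coin that still fits is globally optimal; for arbitrary coin sets the same rule can fail, which is why the greedy choice must be proved correct.

# Greedy coin change: an illustrative sketch, not from the slides.
# At each step take the largest coin that still fits -- the locally optimal choice.
# Optimal for this canonical coin set, but not for arbitrary coin sets.
def greedy_change(amount, coins=(25, 10, 5, 1)):
    chosen = []
    for coin in sorted(coins, reverse=True):   # consider the largest coins first
        while amount >= coin:                  # keep taking this coin while it fits
            chosen.append(coin)
            amount -= coin
    return chosen

print(greedy_change(63))   # [25, 25, 10, 1, 1, 1]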

Designing Greedy Algorithms
- Cast the optimization problem as one in which we make a choice and are left with only one subproblem to solve
- Prove the GREEDY-CHOICE property: there is always an optimal solution to the original problem that makes the greedy choice
- Prove the OPTIMAL SUBSTRUCTURE property: the greedy choice + an optimal solution to the resulting subproblem leads to an optimal solution to the original problem

Dynamic Programming vs. Greedy Algorithms
Dynamic programming:
- We make a choice at each step
- The choice depends on solutions to subproblems
- Bottom-up solution, from smaller to larger subproblems
Greedy algorithm:
- Make the greedy choice and THEN solve the subproblem arising after the choice is made
- The choice we make may depend on previous choices, but not on solutions to subproblems
- Top-down solution, problems decrease in size

Huffman Codes
- A widely used technique for data compression
- Assume the data to be a sequence of characters
- We are looking for an effective way of storing the data
- Binary character code: uniquely represents each character by a binary string

Fixed-Length Codes
E.g.: a data file containing 100,000 characters

  Character               a    b    c    d    e    f
  Frequency (thousands)   45   13   12   16   9    5

- 3 bits are needed per character: a = 000, b = 001, c = 010, d = 011, e = 100, f = 101
- Requires 100,000 × 3 = 300,000 bits

Huffman Codes
Use the frequencies of occurrence of the characters to build an optimal way of representing each character

  Character               a    b    c    d    e    f
  Frequency (thousands)   45   13   12   16   9    5

Variable-Length Codes
E.g.: the same data file containing 100,000 characters
- Assign short codewords to frequent characters and long codewords to infrequent characters
- a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100
- Requires (45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4) × 1,000 = 224,000 bits

  Character               a    b    c    d    e    f
  Frequency (thousands)   45   13   12   16   9    5
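As a quick sanity check of the two totals above (a small sketch added here, not part of the slides; variable names are mine), both costs can be computed directly from the frequency table:

# Verify the 300,000-bit and 224,000-bit figures from the frequency table.
freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}         # in thousands of characters
fixed_length = {c: 3 for c in freq}                                  # 3 bits per character
variable_length = {'a': 1, 'b': 3, 'c': 3, 'd': 3, 'e': 4, 'f': 4}   # codeword lengths

fixed_bits = 1000 * sum(freq[c] * fixed_length[c] for c in freq)
variable_bits = 1000 * sum(freq[c] * variable_length[c] for c in freq)
print(fixed_bits, variable_bits)   # 300000 224000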

Prefix Codes
- Prefix codes: codes for which no codeword is also a prefix of some other codeword
- A better name would be "prefix-free codes"
- We can achieve optimal data compression using prefix codes
- We will restrict our attention to prefix codes
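The prefix-free property is easy to check mechanically; the small helper below is an illustrative sketch (the function name is mine, not from the slides):

# Check that no codeword is a prefix of another codeword.
def is_prefix_free(codewords):
    words = list(codewords)
    for i, u in enumerate(words):
        for j, v in enumerate(words):
            if i != j and v.startswith(u):   # u is a prefix of a different codeword v
                return False
    return True

print(is_prefix_free(['0', '101', '100', '111', '1101', '1100']))  # True
print(is_prefix_free(['0', '01', '11']))                           # False: '0' is a prefix of '01'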

Encoding with Binary Character Codes
- Concatenate the codewords representing each character of the file
- E.g.: a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100
  abc = 0 · 101 · 100 = 0101100
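Encoding is just a table lookup followed by concatenation; a minimal sketch (names are mine, not from the slides):

# Encode a string by concatenating one codeword per character.
code = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}

def encode(text, code):
    return ''.join(code[ch] for ch in text)

print(encode('abc', code))   # 0101100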

Decoding with Binary Character Codes
- Prefix codes simplify decoding
- No codeword is a prefix of another, so the codeword that begins an encoded file is unambiguous
- Approach:
  - Identify the initial codeword
  - Translate it back to the original character
  - Repeat the process on the remainder of the file
- E.g.: a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100
  001011101 = 0 · 0 · 101 · 1101 = aabe
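The approach above translates directly into code. In the sketch below (names are mine, not from the slides) we scan the bits and, because the code is prefix-free, the first codeword that matches the accumulated buffer is always the right one:

# Decode a bit string by repeatedly matching the codeword at the front.
code = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
reverse = {w: ch for ch, w in code.items()}   # codeword -> character

def decode(bits, reverse):
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in reverse:          # the initial codeword is unambiguous
            out.append(reverse[buf])
            buf = ''
    return ''.join(out)

print(decode('001011101', reverse))   # aabe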

Prefix Code Representation
- A binary tree whose leaves are the given characters
- Binary codeword: the path from the root to the character, where 0 means "go to the left child" and 1 means "go to the right child"
- Length of the codeword: the length of the path from the root to the character's leaf (the depth of the node)

[Figure: two code trees over the leaves a:45, b:13, c:12, d:16, e:9, f:5 (frequencies in thousands): the tree of the optimal variable-length code (root 100 with subtrees a:45 and 55, where 55 splits into 25 = c:12 + b:13 and 30 = 14 + d:16, and 14 = f:5 + e:9), and the tree of the fixed-length code (internal node weights 100, 86, 58, 28, 14).]
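A sketch of this representation in code (the Node class and function names are mine, not from the slides): build the optimal tree for the running example by hand and read each codeword off the root-to-leaf path, 0 for left and 1 for right.

# Represent a prefix code as a binary tree and recover codewords from root-to-leaf paths.
class Node:
    def __init__(self, char=None, left=None, right=None):
        self.char, self.left, self.right = char, left, right

# The optimal tree for the running example, built by hand.
tree = Node(left=Node('a'),
            right=Node(left=Node(left=Node('c'), right=Node('b')),
                       right=Node(left=Node(left=Node('f'), right=Node('e')),
                                  right=Node('d'))))

def codewords(node, prefix=''):
    if node.char is not None:                 # leaf: the path so far is the codeword
        return {node.char: prefix}
    table = {}
    table.update(codewords(node.left, prefix + '0'))
    table.update(codewords(node.right, prefix + '1'))
    return table

print(codewords(tree))   # {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}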

Optimal Codes
- An optimal code is always represented by a full binary tree: every non-leaf node has two children
- The fixed-length code is therefore not optimal; the variable-length code is
- How many bits are required to encode a file?
  - Let C be the alphabet of characters
  - Let f(c) be the frequency of character c
  - Let dT(c) be the depth of c's leaf in the tree T corresponding to a prefix code (i.e., the length of c's codeword)
  - The cost of the tree T, i.e., the number of bits required to encode the file, is
    B(T) = Σ_{c in C} f(c) · dT(c)

Constructing a Huffman Code
- A greedy algorithm that constructs an optimal prefix code, called a Huffman code
- Assume that:
  - C is a set of n characters
  - Each character c has a frequency f(c)
- The tree T is built in a bottom-up manner
- Idea:
  - Start with a set of |C| leaves
  - At each step, merge the two least frequent objects; the frequency of the new node = the sum of the two frequencies
  - Use a min-priority queue Q, keyed on f, to identify the two least frequent objects

[Figure: the initial set of |C| = 6 leaves for the running example: f:5, e:9, c:12, b:13, d:16, a:45]

Example
[Figure: step-by-step construction of the Huffman tree for the running example.
Step 1: merge f:5 and e:9 into a node of frequency 14.
Step 2: merge c:12 and b:13 into a node of frequency 25.
Step 3: merge the 14 node and d:16 into a node of frequency 30.
Step 4: merge the 25 node and the 30 node into a node of frequency 55.
Step 5: merge a:45 and the 55 node into the root of frequency 100.]

Building a Huffman Code
Alg.: HUFFMAN(C)
  n ← |C|
  Q ← C                                  // build the min-priority queue: O(n)
  for i ← 1 to n – 1                     // n – 1 merges, each with O(lg n) queue operations
      do allocate a new node z
         left[z] ← x ← EXTRACT-MIN(Q)
         right[z] ← y ← EXTRACT-MIN(Q)
         f[z] ← f[x] + f[y]
         INSERT(Q, z)
  return EXTRACT-MIN(Q)                  // the root of the Huffman tree

Running time: O(n lg n)
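The same algorithm as a runnable Python sketch (not the slides' code; the Node class and the tie-breaking counter are my additions), using heapq as the min-priority queue:

# Huffman construction with a binary min-heap as the priority queue.
import heapq
from itertools import count

class Node:
    def __init__(self, freq, char=None, left=None, right=None):
        self.freq, self.char, self.left, self.right = freq, char, left, right

def huffman(frequencies):
    """Build a Huffman tree from a {character: frequency} dict and return its root."""
    tick = count()                                        # tie-breaker so equal frequencies never compare Nodes
    heap = [(f, next(tick), Node(f, c)) for c, f in frequencies.items()]
    heapq.heapify(heap)                                   # O(n)
    for _ in range(len(frequencies) - 1):                 # n - 1 merges
        fx, _, x = heapq.heappop(heap)                    # the two least frequent nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tick), Node(fx + fy, left=x, right=y)))
    return heap[0][2]                                     # root of the Huffman tree

def codewords(node, prefix=''):
    if node.char is not None:
        return {node.char: prefix or '0'}                 # single-character edge case
    table = {}
    table.update(codewords(node.left, prefix + '0'))
    table.update(codewords(node.right, prefix + '1'))
    return table

root = huffman({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
print(codewords(root))   # e.g. {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}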

Example
Calculate the Huffman code for the following characters:

  Character   Frequency
  'a'         12
  'b'         2
  'c'         7
  'd'         13
  'e'         14
  'f'         85

Example (continued)

  Letter   Code
  'a'      001
  'b'      0000
  'c'      0001
  'd'      010
  'e'      011
  'f'      1

Decode the bits 1100010000010010
The string is ffcbdd (1 · 1 · 0001 · 0000 · 010 · 010)
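A short self-contained check of this decoding, using the same buffer-matching idea as before (added for verification, not part of the slides):

# Decode 1100010000010010 with the code table above; prints "ffcbdd".
code = {'a': '001', 'b': '0000', 'c': '0001', 'd': '010', 'e': '011', 'f': '1'}
reverse = {w: ch for ch, w in code.items()}

decoded, buf = [], ''
for bit in '1100010000010010':
    buf += bit
    if buf in reverse:
        decoded.append(reverse[buf])
        buf = ''
print(''.join(decoded))   # ffcbdd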