The Greedy Method and Text Compression

Slides:



Advertisements
Similar presentations
Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
Advertisements

The Greedy Method1. 2 Outline and Reading The Greedy Method Technique (§5.1) Fractional Knapsack Problem (§5.1.1) Task Scheduling (§5.1.2) Minimum Spanning.
Greedy Algorithms Amihood Amir Bar-Ilan University.
Chapter 5 Fundamental Algorithm Design Techniques.
Merge Sort 4/15/2017 6:09 PM The Greedy Method The Greedy Method.
Compression & Huffman Codes
Optimal Merging Of Runs
© 2004 Goodrich, Tamassia Greedy Method and Compression1 The Greedy Method and Text Compression.
CPSC 411, Fall 2008: Set 4 1 CPSC 411 Design and Analysis of Algorithms Set 4: Greedy Algorithms Prof. Jennifer Welch Fall 2008.
© 2004 Goodrich, Tamassia Greedy Method and Compression1 The Greedy Method and Text Compression.
Chapter 9: Greedy Algorithms The Design and Analysis of Algorithms.
CSC 213 Lecture 18: Tries. Announcements Quiz results are getting better Still not very good, however Average score on last quiz was 5.5 Every student.
6/26/2015 7:13 PMTries1. 6/26/2015 7:13 PMTries2 Outline and Reading Standard tries (§9.2.1) Compressed tries (§9.2.2) Suffix tries (§9.2.3) Huffman encoding.
1 The Greedy Method CSC401 – Analysis of Algorithms Lecture Notes 10 The Greedy Method Objectives Introduce the Greedy Method Use the greedy method to.
CPSC 411, Fall 2008: Set 4 1 CPSC 411 Design and Analysis of Algorithms Set 4: Greedy Algorithms Prof. Jennifer Welch Fall 2008.
Lecture 1: The Greedy Method 主講人 : 虞台文. Content What is it? Activity Selection Problem Fractional Knapsack Problem Minimum Spanning Tree – Kruskal’s Algorithm.
Advanced Algorithm Design and Analysis (Lecture 5) SW5 fall 2004 Simonas Šaltenis E1-215b
Data Compression1 File Compression Huffman Tries ABRACADABRA
Lecture Objectives  To learn how to use a Huffman tree to encode characters using fewer bytes than ASCII or Unicode, resulting in smaller files and reduced.
Data Structures Week 6: Assignment #2 Problem
© 2004 Goodrich, Tamassia Tries1. © 2004 Goodrich, Tamassia Tries2 Preprocessing Strings Preprocessing the pattern speeds up pattern matching queries.
Introduction to Algorithms Chapter 16: Greedy Algorithms.
Prof. Amr Goneid, AUC1 Analysis & Design of Algorithms (CSCE 321) Prof. Amr Goneid Department of Computer Science, AUC Part 8. Greedy Algorithms.
5-1-1 CSC401 – Analysis of Algorithms Chapter 5--1 The Greedy Method Objectives Introduce the Brute Force method and the Greedy Method Compare the solutions.
GREEDY ALGORITHMS UNIT IV. TOPICS TO BE COVERED Fractional Knapsack problem Huffman Coding Single source shortest paths Minimum Spanning Trees Task Scheduling.
The Greedy Method. The Greedy Method Technique The greedy method is a general algorithm design paradigm, built on the following elements: configurations:
Huffman Codes Juan A. Rodriguez CS 326 5/13/2003.
Greedy Algorithms Analysis of Algorithms.
Spring 2008The Greedy Method1. Spring 2008The Greedy Method2 Outline and Reading The Greedy Method Technique (§5.1) Fractional Knapsack Problem (§5.1.1)
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 18.
Sorting With Priority Queue In-place Extra O(N) space
Tries 4/16/2018 8:59 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and.
HUFFMAN CODES.
CSC317 Greedy algorithms; Two main properties:
Tries 07/28/16 11:04 Text Compression
Tries 5/27/2018 3:08 AM Tries Tries.
CSCE 411 Design and Analysis of Algorithms
Greedy Technique.
Greedy Method 6/22/2018 6:57 PM Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015.
Applied Algorithmics - week7
Merge Sort 7/29/ :21 PM The Greedy Method The Greedy Method.
The Greedy Method and Text Compression
Heaps 8/2/2018 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser,
Tries 9/14/ :13 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and.
Introduction to Algorithms`
Greedy Algorithm.
Pattern Matching 9/14/2018 3:36 AM
13 Text Processing Hongfei Yan June 1, 2016.
Optimal Merging Of Runs
Analysis & Design of Algorithms (CSCE 321)
Optimal Merging Of Runs
Merge Sort 11/28/2018 2:18 AM The Greedy Method The Greedy Method.
The Greedy Method Spring 2007 The Greedy Method Merge Sort
Merge Sort 11/28/2018 2:21 AM The Greedy Method The Greedy Method.
Merge Sort 11/28/2018 8:16 AM The Greedy Method The Greedy Method.
Greedy Algorithms Many optimization problems can be solved more quickly using a greedy approach The basic principle is that local optimal decisions may.
Huffman Coding CSE 373 Data Structures.
Merge Sort Dynamic Programming
Merge Sort 1/17/2019 3:11 AM The Greedy Method The Greedy Method.
Greedy Algorithms TOPICS Greedy Strategy Activity Selection
Data Structure and Algorithms
Greedy Algorithms Alexandra Stefan.
Tries 2/23/2019 8:29 AM Tries 2/23/2019 8:29 AM Tries.
Algorithm Design Techniques Greedy Approach vs Dynamic Programming
Merge Sort 5/2/2019 7:53 PM The Greedy Method The Greedy Method.
Podcast Ch23d Title: Huffman Compression
Algorithms CSCI 235, Spring 2019 Lecture 30 More Greedy Algorithms
Huffman Coding Greedy Algorithm
Algorithms CSCI 235, Spring 2019 Lecture 31 Huffman Codes
Analysis of Algorithms CS 477/677
Presentation transcript:

The Greedy Method and Text Compression 9/11/2018 8:34 PM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Wiley, 2014 The Greedy Method and Text Compression Greedy Method

Greedy Method A general algorithm design with the following elements: configurations: different choices, collections, or values to find (e.g. paths, spanning trees) objective function: a score assigned to configurations (e.g. path distance, total edge weight) Greedy Method

Greedy Method A general algorithm design with the following elements: configurations: different choices, collections, or values to find (e.g. paths, spanning trees) objective function: a score assigned to configurations (e.g. path distance, total edge weight) best when problems have the greedy-choice property: a globally-optimal solution can be found by a series of (best/greedy) local improvements from a starting configuration. (e.g. choose the smallest D[v] value in Dijkstra, choose the smallest weighted edge in Kruksal) Greedy Method

Text Compression Given a string X, efficiently encode X into a smaller string Y Saves memory and/or bandwidth A good approach: Huffman encoding Compute frequency f(c) for each character c. Encode high-frequency characters with short code words Use an optimal encoding tree to determine the code words Greedy Method

Encoding Tree Example A code is a mapping of each character of an alphabet to a binary code-word 00 010 011 10 11 a b c d e Greedy Method

Encoding Tree Example A code is a mapping of each character of an alphabet to a binary code-word A prefix code is a binary code such that no code-word is the prefix of another code-word E.g 00 is not a prefix of 010 or 011 00 010 011 10 11 a b c d e Greedy Method

Encoding Tree Example a b c d e 1 A code is a mapping of each character of an alphabet to a binary code-word A prefix code is a binary code such that no code-word is the prefix of another code-word An encoding tree represents a prefix code Each external node stores a character The code word of a character is given by the path from the root to the external node storing the character (0 for a left child and 1 for a right child) a b c d e 1 00 010 011 10 11 a b c d e Greedy Method

Encoding a string into bits string = abeac 00 010 11 00 011 [spaces added] ie 000101100011 a b c d e 1 00 010 011 10 11 a b c d e Greedy Method

Decoding bits into a string 000101100011 000101100011 -> a 000101100011 -> b 000101100011 -> e 000101100011 -> c Not ambiguous because each code-word is not a prefix of another code-word “Prefix code” a b c d e 1 Greedy Method

Encoding Tree Optimization Given a text string X, find a prefix code for the characters of X that yields a small encoding for X Frequent characters should have long code-words Rare characters should have short code-words Example X = abracadabra T1 encodes X into 29 bits T2 encodes X into 24 bits T1 T2 c a r d b a c d b r Greedy Method

Hoffman algorithm X = abracadabra Frequencies a b c d r 5 2 1 c a b d 4 6 11 X = abracadabra Frequencies a b c d r 5 2 1 c a b d r 2 5 4 6 c a r d b 5 2 1 c a r d b 2 5 c a b d r 2 5 4 Greedy Method

Extended Huffman Tree Example Greedy Method

Huffman’s Algorithm Greedy Method

Huffman’s Algorithm Greedy choice/step Greedy Method

Huffman’s Algorithm Given a string X Huffman’s algorithm constructs a prefix code that minimizes the size of the encoding of X optimal when each symbol (e.g. letter) has a separate code Greedy Method

Time Complexity Total: ? n = size of X d = # of distinct characters in X Find frequency of each character ? d insertions into priority queue (Q) 2 * (d-1) removeMin operations from Q d-1 insertions into Q Total: ? Greedy Method

Time Complexity Total: O(n + d + d log d) –> O(n + d log d) n = size of X d = # of distinct characters in X Find frequency of each character O(n) d insertions into priority queue (Q) O(d) [bottom up heap construction] 2 * (d-1) removeMin operations from Q O(d log d) d-1 insertions into Q Total: O(n + d + d log d) –> O(n + d log d) Greedy Method

Skipping the rest Greedy Method

The Fractional Knapsack Problem (not in book) Given: A set S of n items, with each item i having bi - a positive benefit wi - a positive weight Goal: Choose items with maximum total benefit but with weight at most W. If we are allowed to take fractional amounts, then this is the fractional knapsack problem. In this case, we let xi denote the amount we take of item i Objective: maximize Constraint: Greedy Method

Example Given: A set S of n items, with each item i having bi - a positive benefit wi - a positive weight Goal: Choose items with maximum total benefit but with weight at most W. 10 ml “knapsack” Solution: 1 ml of 5 2 ml of 3 6 ml of 4 1 ml of 2 Items: 1 2 3 4 5 Weight: 4 ml 8 ml 2 ml 6 ml 1 ml Benefit: $12 $32 $40 $30 $50 Value: 3 4 20 5 50 ($ per ml) Greedy Method

The Fractional Knapsack Algorithm Greedy choice: Keep taking item with highest value (benefit to weight ratio) Since Run time: O(n log n). Why? Correctness: Suppose there is a better solution there is an item i with higher value than a chosen item j, but xi<wi, xj>0 and vi<vj If we substitute some i with j, we get a better solution How much of i: min{wi-xi, xj} Thus, there is no better solution than the greedy one Algorithm fractionalKnapsack(S, W) Input: set S of items w/ benefit bi and weight wi; max. weight W Output: amount xi of each item i to maximize benefit w/ weight at most W for each item i in S xi  0 vi  bi / wi {value} w  0 {total weight} while w < W remove item i w/ highest vi xi  min{wi , W - w} w  w + min{wi , W - w} Greedy Method

Task Scheduling (not in book) Given: a set T of n tasks, each having: A start time, si A finish time, fi (where si < fi) Goal: Perform all the tasks using a minimum number of “machines.” 1 9 8 7 6 5 4 3 2 Machine 1 Machine 3 Machine 2 Greedy Method

Task Scheduling Algorithm Greedy choice: consider tasks by their start time and use as few machines as possible with this order. Run time: O(n log n). Why? Correctness: Suppose there is a better schedule. We can use k-1 machines The algorithm uses k Let i be first task scheduled on machine k Machine i must conflict with k-1 other tasks But that means there is no non-conflicting schedule using k-1 machines Algorithm taskSchedule(T) Input: set T of tasks w/ start time si and finish time fi Output: non-conflicting schedule with minimum number of machines m  0 {no. of machines} while T is not empty remove task i w/ smallest si if there’s a machine j for i then schedule i on machine j else m  m + 1 schedule i on machine m Greedy Method

Example Given: a set T of n tasks, each having: A start time, si A finish time, fi (where si < fi) [1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8] (ordered by start) Goal: Perform all tasks on min. number of machines Machine 3 Machine 2 Machine 1 1 2 3 4 5 6 7 8 9 Greedy Method