Greedy algorithms David Kauchak cs161 Summer 2009.

Administrative Thank your TAs

Kruskal’s and Prim’s Both make a greedy decision about the next edge to add

Greedy algorithm structure A locally optimal decision can be made, resulting in a subproblem that does not depend on that local decision

Interval scheduling Given n activities A = [a_1, a_2, ..., a_n], where each activity a_i has a start time s_i and a finish time f_i. Schedule as many of these activities as possible such that they don’t conflict. Which activities conflict? Two activities conflict if their time intervals overlap.

Simple recursive solution Enumerate all possible solutions and find which schedules the most activities

Simple recursive solution Is it correct? Yes: it takes the max over all possible solutions. Running time? O(n!)

Can we do better? Dynamic programming: O(n^2). Greedy solution: is there a way to make a local decision?

Overview of a greedy approach Greedily pick an activity to schedule. Add that activity to the answer. Remove that activity and all conflicting activities; call what remains A’. Repeat on A’ until A’ is empty.

Greedy options Select the activity that starts the earliest, i.e. argmin{s_1, s_2, ..., s_n}? Non-optimal: one long, early-starting activity can block many shorter ones.

Greedy options Select the shortest activity, i.e. argmin{f_1 - s_1, f_2 - s_2, ..., f_n - s_n}? Non-optimal: a short activity can straddle two others that could both have been scheduled.

Greedy options Select the activity with the smallest number of conflicts? Also non-optimal, though the counterexample is more elaborate.

Greedy options Select the activity that ends the earliest, i.e. argmin{f_1, f_2, ..., f_n}? Stepping through the examples, this choice schedules a maximum number of activities every time (note that there may be multiple optimal solutions).

Efficient greedy algorithm Once you’ve identified a reasonable greedy heuristic: prove that it always gives the correct answer, then develop an efficient solution.

Is our greedy approach correct? “Stays ahead” argument: show that no matter what other solution someone provides you, the solution provided by your algorithm always “stays ahead”, in that no other choice could do better

Is our greedy approach correct? “Stays ahead” argument: let r_1, r_2, ..., r_k be the solution found by our approach and o_1, o_2, ..., o_k be any other optimal solution. Show that our approach “stays ahead” of the other solution.

Stays ahead Compare the first activities of each solution: finish(r_1) ≤ finish(o_1), since our algorithm picks the activity that finishes earliest.

Stays ahead We therefore have at least as much time as any other solution to schedule the remaining tasks 2, ..., k.

An efficient solution
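The efficient solution can be sketched in Python as follows (a sketch, not the lecture’s own code; activities are assumed to be (start, finish) pairs and the function name is my own):

```python
def select_activities(activities):
    """Greedy activity selection: repeatedly take the activity that
    finishes earliest among those that don't conflict with what we
    have already chosen."""
    # Sort by finish time: Theta(n log n)
    ordered = sorted(activities, key=lambda a: a[1])
    chosen = []
    last_finish = float("-inf")
    for start, finish in ordered:      # single Theta(n) pass
        if start >= last_finish:       # no conflict with last chosen activity
            chosen.append((start, finish))
            last_finish = finish
    return chosen

# Example with eight activities:
print(select_activities(
    [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]))
# → [(1, 4), (5, 7), (8, 11)]
```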

Running time? Sorting by finish time: Θ(n log n). One pass over the sorted activities: Θ(n). Overall: Θ(n log n). Better than the O(n!) brute force and the O(n^2) dynamic program.

Scheduling all intervals Given n activities, we need to schedule all activities. Minimize the number of resources required.

Greedy approach? The best we could ever do is to use as many resources as the maximum number of activities that conflict during any single time period

Calculating max conflicts efficiently Sort all start and finish times. Sweep through them in order, incrementing a counter at each start time and decrementing it at each finish time; the maximum value the counter reaches is the maximum number of conflicts. (The original slides step through an example where the maximum is 3.)

Correctness? We can do no better than the max number of conflicts, and this procedure exactly counts the max number of conflicts.

Runtime? O(2n log 2n + n) = O(n log n): sort the 2n start/finish times, then make one linear sweep.
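The sweep can be sketched as follows (a sketch under the same (start, finish)-pair assumption; names my own):

```python
def max_conflicts(activities):
    """Sweep over all 2n start/finish events in sorted order, tracking
    how many activities are simultaneously active."""
    events = []
    for start, finish in activities:
        events.append((start, 1))     # an activity begins
        events.append((finish, -1))   # an activity ends
    # Sort by time; on ties, process finishes (-1) before starts (+1)
    # so back-to-back activities don't count as conflicting.
    events.sort(key=lambda e: (e[0], e[1]))
    active = best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best
```

`max_conflicts` returns the number of resources needed; scheduling greedily by start time then achieves it.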

Horn formulas Horn formulas are a particular form of boolean logic formula. They are one approach to allowing a program to do logical reasoning. Boolean variables represent events: x = the murder took place in the kitchen, y = the butler is innocent, z = the colonel was asleep at 8 pm.

Implications Left-hand side is an AND of any number of positive literals; right-hand side is a single literal. (z AND y) ⇒ x: if the colonel was asleep at 8 pm and the butler is innocent, then the murder took place in the kitchen.

Implications An implication with an empty left-hand side simply asserts its right-hand side: ⇒ x means the murder took place in the kitchen.

Negative clauses An OR of any number of negative literals. u = the constable is innocent, t = the colonel is innocent, y = the butler is innocent. (NOT u) OR (NOT t) OR (NOT y): not everyone is innocent.

Goal Given a Horn formula (i.e. a set of implications and negative clauses), determine whether the formula is satisfiable (i.e. whether there is an assignment of true/false to the variables that is consistent with all of the formula).

Goal Implications tell us to set some variables to true; negative clauses encourage us to make them false.

A brute force solution Try each setting of the boolean variables and see if any of them satisfies the formula. For n variables, how many settings are there? 2^n
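The brute force idea can be sketched as follows (a sketch with a representation of my own choosing: an implication is a (lhs_tuple, rhs_var) pair, a fact “⇒ x” is ((), "x"), and a negative clause is a set of variables of which at least one must be false):

```python
from itertools import product

def brute_force_sat(variables, implications, negative_clauses):
    """Try all 2^n true/false settings of the variables (exponential!)."""
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        # (lhs all true) implies (rhs true), for every implication
        ok_imp = all(not all(assign[v] for v in lhs) or assign[rhs]
                     for lhs, rhs in implications)
        # each negative clause needs at least one variable to be false
        ok_neg = all(any(not assign[v] for v in clause)
                     for clause in negative_clauses)
        if ok_imp and ok_neg:
            return assign
    return None   # unsatisfiable
```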

A greedy solution? Walking through an example: start with all variables false (w=0, x=0, y=0, z=0). The implications successively force x, then y, then w to true. With w=1, x=1, y=1, z=0, a negative clause is still violated: not satisfiable.

A greedy solution Set every variable to false. Set the right-hand side of each implication of the form “⇒ x” (empty left-hand side) to true. While there is an implication whose left-hand-side variables are all true, set its right-hand-side variable to true. Finally, see if all of the negative clauses are satisfied.
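Those steps can be sketched in Python (same hypothetical representation as before: implications are (lhs_tuple, rhs_var) pairs, facts have an empty lhs and are handled by the same loop, negative clauses are sets of variables):

```python
def horn_satisfiable(implications, negative_clauses):
    """Greedy Horn satisfiability: only set variables that are forced
    to be true; leave everything else false."""
    true_vars = set()
    changed = True
    while changed:                      # keep firing implications
        changed = False
        for lhs, rhs in implications:
            # lhs entirely true (vacuously so for facts) forces rhs true
            if rhs not in true_vars and set(lhs) <= true_vars:
                true_vars.add(rhs)
                changed = True
    # every negative clause needs at least one variable left false
    if all(any(v not in true_vars for v in clause)
           for clause in negative_clauses):
        return true_vars                # a satisfying assignment
    return None                         # not satisfiable
```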

Correctness of greedy solution Two parts: If our algorithm returns an assignment, is it a valid assignment? If our algorithm does not return an assignment, does an assignment exist?

Correctness of greedy solution If our algorithm returns an assignment, is it a valid assignment? Yes: we explicitly check all negative clauses, and we don’t stop until every implication whose lhs variables are all true also has a true rhs.

Correctness of greedy solution If our algorithm does not return an assignment, does an assignment exist? No. Our algorithm is “stingy”: it only sets those variables that have to be true in any satisfying assignment, and all others remain false. If even this minimal assignment violates a negative clause, every assignment does.

Running time? O(nm), where n = number of variables and m = number of formulas: each pass over the m implications either sets at least one new variable to true or terminates, and there can be at most n such passes.

Knapsack problems: Greedy or not? 0-1 Knapsack: a thief robbing a store finds n items worth v_1, v_2, ..., v_n dollars weighing w_1, w_2, ..., w_n pounds, where v_i and w_i are integers. The thief can carry at most W pounds in the knapsack. Which items should the thief take to maximize the value carried? Fractional knapsack problem: same as above, but the thief happens to be in the bulk section of the store and can carry fractional portions of the items. For example, the thief could take 20% of item i for a weight of 0.2·w_i and a value of 0.2·v_i.
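The fractional version is the one that yields to a greedy approach: take items in order of value per pound, splitting the last one. A sketch (items as (value, weight) pairs; names my own):

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: take items in decreasing order of
    value per pound, taking a fraction of the last item if needed."""
    # Sort by value density, best first
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)       # whole item, or what fits
        total += value * take / weight     # proportional value
        capacity -= take
    return total

# A thief with a 50-pound knapsack:
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # → 240.0
```

The same greedy rule fails for 0-1 knapsack, where a dynamic-programming solution is needed instead.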

Data compression Given a file containing some data over a fixed alphabet Σ (e.g. A, B, C, D), we would like to pick a binary character code that minimizes the number of bits required to represent the data: A C A D A A D B … → minimize the size of the encoded file.

Compression algorithms

Simplifying assumption over general compression?

Frequency only The problem formulation makes the simplifying assumption that we only have character frequency information for a file: A C A D A A D B … =

Symbol  Frequency
A       70
B       3
C       20
D       37

Fixed length code Use ⌈log2 |Σ|⌉ bits for each character: A = 00, B = 01, C = 10, D = 11

Fixed length code How many bits to encode the file? 2×70 + 2×3 + 2×20 + 2×37 = 260 bits. Can we do better?

Variable length code What about A = 0, B = 01, C = 10, D = 1? 1×70 + 2×3 + 2×20 + 1×37 = 153 bits

Decoding a file A = 0, B = 01, C = 10, D = 1. What characters does a bit sequence beginning 0 1 … represent? It’s ambiguous: the prefix 01 could be A D or B.

Variable length code What about A = 0, B = 100, C = 101, D = 11? How many bits to encode the file? 1×70 + 3×3 + 3×20 + 2×37 = 213 bits (an 18% reduction from the fixed length code).
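The arithmetic can be checked directly from the frequency table and the codes above:

```python
# Frequencies and codes from the example above
freqs = {"A": 70, "B": 3, "C": 20, "D": 37}
codes = {"A": "0", "B": "100", "C": "101", "D": "11"}

fixed_bits = 2 * sum(freqs.values())  # 2 bits for each of the 130 characters
variable_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(fixed_bits, variable_bits)  # → 260 213
```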

Prefix codes A prefix code is a set of codes where no codeword is a prefix of some other codeword. A = 0, B = 100, C = 101, D = 11 is a prefix code; A = 0, B = 01, C = 10, D = 1 is not (0 is a prefix of 01).

Prefix tree We can represent a prefix code as a full binary tree where each leaf is a symbol and the path from the root (0 = left, 1 = right) spells its codeword: A = 0, B = 100, C = 101, D = 11.

Decoding using a prefix tree To decode, traverse the tree from the root, following the edge labeled by each bit read, until a leaf node is reached; output that symbol and restart at the root. Decoding the example bit sequence in the slides this way yields B A D C A B.
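Because the code is prefix-free, the same traversal can be sketched with a simple left-to-right scan (the bit string below is my own example encoding B A D C A B with the codes from the slides):

```python
def decode(bits, codes):
    """Decode a bit string with a prefix code by scanning left to right;
    prefix-freeness means the first codeword match is always correct."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:       # reached a leaf of the prefix tree
            out.append(inverse[current])
            current = ""             # restart at the root
    return "".join(out)

codes = {"A": "0", "B": "100", "C": "101", "D": "11"}
print(decode("1000111010100", codes))  # → BADCAB
```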

Determining the cost of a file Label each leaf of the prefix tree with its symbol’s frequency (A: 70, B: 3, C: 20, D: 37). What if we also label each internal node with the sum of its children? Then the cost of the file equals the sum of the weights of all non-root nodes: as we move down the tree, one bit gets read for every non-root node we enter.

Determining the cost of a file In the example: 60 times we see a prefix that starts with a 1; of those, 37 times we see an additional 1 (D) and the remaining 23 times an additional 0; of those 23, 20 times we see a final 1 (C) and 3 times a final 0 (B). The other 70 times we see a 0 by itself (A). Summing the non-root nodes: leaves 70 + 3 + 20 + 37 = 130, internal nodes 60 + 23 = 83, for 213 bits total.

A greedy algorithm? Given file frequencies, can we come up with a prefix-free encoding (i.e. build a prefix tree) that minimizes the number of bits?

Heap Put every symbol in a min-heap keyed by frequency: B 3, C 20, D 37, A 70.

Heap Merge the two smallest, B and C, into a node BC with weight 23; merging with this node will incur an additional cost of 23. Heap: BC 23, D 37, A 70.

Heap Merge BC and D into BCD with weight 60. Heap: BCD 60, A 70.

Heap Merge BCD and A into ABCD with weight 130, the root of the prefix tree.
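The merging procedure can be sketched with Python’s heapq (a sketch, not the lecture’s code; which child gets 0 vs 1 may differ from the slides, but the codeword lengths, and hence the 213-bit cost, come out the same):

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman prefix code from {symbol: frequency} using a
    min-heap. The counter breaks frequency ties so that tuple
    comparison never has to compare the code dictionaries."""
    tiebreak = count()
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two smallest frequencies
        f2, _, right = heapq.heappop(heap)
        # prepend a bit: left subtree gets 0, right subtree gets 1
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({"A": 70, "B": 3, "C": 20, "D": 37})
print({s: len(c) for s, c in codes.items()})  # lengths: A 1, D 2, B 3, C 3
```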

Is it correct? The algorithm selects the two symbols with the smallest frequencies first (call them f_1 and f_2). Consider a tree that did not do this: some symbol with frequency f_i > f_1 occupies one of the deepest positions instead. Swap f_1 and f_i: the frequencies don’t change, but the cost decreases since f_1 < f_i. Contradiction.

Runtime? 1 call to MakeHeap, 2(n−1) calls to ExtractMin, and n−1 calls to Insert: O(n log n).

Non-optimal greedy algorithms All the greedy algorithms we’ve looked at today give the optimal answer. Some of the most common greedy algorithms generate good, but non-optimal, solutions: set cover, clustering, hill-climbing, relaxation.