# COMP 482: Design and Analysis of Algorithms

## Presentation on theme: "COMP 482: Design and Analysis of Algorithms"— Presentation transcript:

COMP 482: Design and Analysis of Algorithms
Spring 2013 Lecture 7 Prof. Swarat Chaudhuri

Recap: Interval Scheduling
Job j starts at sj and finishes at fj. Two jobs compatible if they don't overlap. Goal: find maximum subset of mutually compatible jobs. a Activity selection = interval scheduling OPT = B, E, H Note: smallest job (C) is not in any optimal solution, job that starts first (A) is not in any optimal solution. b c d e f g h Time 1 2 3 4 5 6 7 8 9 10 11

Recap: Interval Scheduling
Algorithm. Greedy algorithm; objective is Earliest Finish Time. Argument for optimality: Greedy stays ahead Also applicable to Truck Driver’s problem (selection of breakpoints) job ir+1 finishes before jr+1 Greedy: i1 i1 ir ir+1 Proof idea: greedy algorithm stays ahead. Ideally you may want to show that Greedy = OPT, but this won’t work as there are many OPT. So we show that the size of the solution in OPT is same as in Greedy. OPT: j1 j2 jr jr+1 . . . why not replace job jr+1 with job ir+1?

Variant: 24-7 Interval Scheduling
You have a processor that can operate People submit requests to run daily jobs on the processor. Each such job comes with a start time and an end time; if the job is accepted it must run continuously for the period between the start and end times, EVERY DAY. (Note that some jobs can start before midnight and end after midnight.) Given a list of n such jobs, your goal is to accept as many jobs as possible (regardless of length), subject to the constraint that the processor can run at most one job at any given point of time. Give an algorithm to do this. For example, here you have four jobs (6pm, 6am), (9pm, 4am), (3am, 2pm), (1pm, 7pm). The optimal solution is to pick the second and fourth jobs.

Solution Let I1,…,In be the n intervals. We call an Ij-restricted solution one that contains the interval Ij. Here’s an algorithm, for fixed j, to compute an Ij-restricted solution of maximum size. Let x be a point in Ij. First delete Ij and all intervals that overlap it. The remaining intervals do not contain the point x, so we can cut the timeline at x and produce an instance of the Interval Scheduling Problem from class. This takes O(n) time assuming intervals are sorted by ending time. Now, the algorithm for the full problem is to compute an Ij-restricted solution of maximum size for each j = 1,…,n. This takes a total of O(n2) time. We now pick the largest of these solutions and claim that it is the optimal. Why? Consider the optimal solution to the full problem. Suppose this produces a set of intervals S. There must be SOME Ij in S, so the solution is an optimal Ij-restricted solution. But then our algorithm would find it. The jobs can be viewed as forming a circle on a 24-hour clock

Recap: Interval Partitioning
Lecture j starts at sj and finishes at fj. Goal: find minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room. Ex: This schedule uses 4 classrooms to schedule 10 lectures. e j c d g b h a f i 9 9:30 10 10:30 11 11:30 12 12:30 1 1:30 2 2:30 3 3:30 4 4:30 Time

Recap: Interval Partitioning
Lecture j starts at sj and finishes at fj. Goal: find minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room. Greedy algorithm. Consider lectures in increasing order of start time. Assign lecture to any compatible existing classroom. If not possible, open new classroom. c d f j b g i a e h 9 9:30 10 10:30 11 11:30 12 12:30 1 1:30 2 2:30 3 3:30 4 4:30 Time

Interval Partitioning: Lower Bound on Optimal Solution
Def. The depth of a set of open intervals is the maximum number that contain any given time. Key observation. Number of classrooms needed by any algorithm  depth. Argument for optimality. Show that depth  number of classrooms opened by greedy algorithm. c d f j b g i a e h 9 9:30 10 10:30 11 11:30 12 12:30 1 1:30 2 2:30 3 3:30 4 4:30 Time

Interval Partitioning: Greedy Analysis
Observation. Greedy algorithm never schedules two incompatible lectures in the same classroom. Argument for optimality. Pf. Let d = number of classrooms that the greedy algorithm allocates. Classroom d is opened because we needed to schedule a job, say j, that is incompatible with all d-1 other classrooms. Since we sorted by start time, all these incompatibilities are caused by lectures that start no later than sj. Thus, we have d lectures overlapping at time sj + . Key observation  all schedules use  d classrooms. ▪

Greedy Analysis Strategies
Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's. Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound. Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality. myopic greed: "at every iteration, the algorithm chooses the best morsel it can swallow, without worrying about the future" Objective function. Does not explicitly appear in greedy algorithm! Hard, if not impossible, to precisely define "greedy algorithm."

Q1: Coin Changing Goal. Given currency denominations: 1, 5, 10, 25, 100, devise a method to pay amount to customer using fewest number of coins. Ex: 34¢. Cashier's algorithm. At each iteration, add coin of the largest value that does not take us past the amount to be paid. Ex: \$2.89. most of us solve this kind of problem every day without thinking twice, unconsciously using an obvious greedy algorithm

Q1: Coin-Changing: Greedy Algorithm
Cashier's algorithm. At each iteration, add coin of the largest value that does not take us past the amount to be paid. Q1. Is cashier's algorithm optimal? Sort coins denominations by value: c1 < c2 < … < cn. S   while (x  0) { let k be largest integer such that ck  x if (k = 0) return "no solution found" x  x - ck S  S  {k} } return S coins selected takes O(n log n) + |S| where S is the number of coins taken by greedy Note: can be sped up somewhat by using integer division

Coin-Changing: Analysis of Greedy Algorithm
Theorem. Greed is optimal for U.S. coinage: 1, 5, 10, 25, 100. Pf. (by induction on x) Consider optimal way to change ck  x < ck+1 : greedy takes coin k. We claim that any optimal solution must also take coin k. if not, it needs enough coins of type c1, …, ck-1 to add up to x table below indicates no optimal solution can do this Problem reduces to coin-changing x - ck cents, which, by induction, is optimally solved by greedy algorithm. ▪ P = pennies, N = nickels, D = dimes, Q = quarters k ck All optimal solutions must satisfy Max value of coins 1, 2, …, k-1 in any OPT 1 1 P  4 - 2 5 N  1 4 3 10 N + D  2 4 + 5 = 9 4 25 Q  3 = 24 5 100 no limit = 99

Coin-Changing: Analysis of Greedy Algorithm
Observation. Greedy algorithm is sub-optimal for US postal denominations: 1, 10, 21, 34, 70, 100, 350, 1225, 1500. Counterexample. 140¢. Greedy: 100, 34, 1, 1, 1, 1, 1, 1. Optimal: 70, 70. note: may not even lead to *feasible* solution if c_1 > 1! Ex. C = {7, 8, 9}, x = 15

Greedy Analysis Strategies
Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's. Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound. Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality. myopic greed: "at every iteration, the algorithm chooses the best morsel it can swallow, without worrying about the future" Objective function. Does not explicitly appear in greedy algorithm! Hard, if not impossible, to precisely define "greedy algorithm."

4.2 Scheduling to Minimize Lateness

Scheduling to Minimizing Lateness
Minimizing lateness problem. Single resource processes one job at a time. Job j requires tj units of processing time and is due at time dj. If j starts at time sj, it finishes at time fj = sj + tj. Lateness: j = max { 0, fj - dj }. Goal: schedule all jobs to minimize maximum lateness L = max j. Ex: Note: finding a maximal size subset of jobs that meet deadline is NP-hard. dj 6 tj 3 1 8 2 9 4 14 5 15 lateness = 2 lateness = 0 max lateness = 6 d3 = 9 d2 = 8 d6 = 15 d1 = 6 d5 = 14 d4 = 9 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Minimizing Lateness: Greedy Algorithms
Greedy template. Consider jobs in some order. [Shortest processing time first] Consider jobs in ascending order of processing time tj. [Earliest deadline first] Consider jobs in ascending order of deadline dj. [Smallest slack] Consider jobs in ascending order of slack dj - tj.

Minimizing Lateness: Greedy Algorithms
Greedy template. Consider jobs in some order. [Shortest processing time first] Consider jobs in ascending order of processing time tj. [Smallest slack] Consider jobs in ascending order of slack dj - tj. 100 1 10 2 tj counterexample dj 2 1 10 tj counterexample dj

Minimizing Lateness: Greedy Algorithm
Greedy algorithm. Earliest deadline first. Sort n jobs by deadline so that d1  d2  …  dn t  0 for j = 1 to n Assign job j to interval [t, t + tj] sj  t, fj  t + tj t  t + tj output intervals [sj, fj] max lateness = 1 d5 = 14 d2 = 8 d6 = 15 d1 = 6 d4 = 9 d3 = 9 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Minimizing Lateness: No Idle Time
Observation. There exists an optimal schedule with no idle time. Observation. The greedy schedule has no idle time. d = 4 d = 6 d = 12 1 2 3 4 5 6 7 8 9 10 11 d = 4 d = 6 d = 12 1 2 3 4 5 6 7 8 9 10 11 "machine not working for some reason, yet work still to be done"

Minimizing Lateness: Inversions
Def. An inversion in schedule S is a pair of jobs i and j such that: di < dj but j scheduled before i. Observation. Greedy schedule has no inversions. Observation. All schedules with no idle time and no inversions have same maximum lateness. Observation. If a schedule (with no idle time) has an inversion, it has one with a pair of inverted jobs scheduled consecutively. inversion before swap j i Recall: jobs sorted in ascending order of due dates Second claim: only differs in way jobs with identical deadlines are scheduled (draw deadline plot) last claim: the only thing here is that jobs w/ same deadline can be reordered

Minimizing Lateness: Inversions
Def. An inversion in schedule S is a pair of jobs i and j such that: di < dj but j scheduled before i. Claim. Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness. Pf. Let  be the lateness before the swap, and let  ' be it afterwards. 'k = k for all k  i, j 'i  i If job j is late: inversion fi before swap j i after swap i j f'j

Minimizing Lateness: Analysis of Greedy Algorithm
Theorem. Greedy schedule S is optimal. Pf. Define S* to be an optimal schedule that has the fewest number of inversions, and let's see what happens. Can assume S* has no idle time. If S* has no inversions, then S = S*. If S* has an inversion, let i-j be an adjacent inversion. swapping i and j does not increase the maximum lateness and strictly decreases the number of inversions this contradicts definition of S* ▪

Q2: Subsequences Suppose you have a collection of possible events (e.g., possible transactions) and a sequence S of n events. A given event may occur multiple times—e.g., you could have an event “buy Google stock” multiple times in a log of transactions. A sequence S’ is a subsequence of a sequence S if there’s a way to delete certain events from S such that the remaining sequence equals S’. For example, the reason to do this could be pattern-matching. Give an algorithm that takes two sequences of events—S’ of length m and S of length n—and decides in time O(m+n) whether S’ is a subsequence of S.

Solution Greedy algorithm: Let the i-th event of S be S(i). Find the first event in S that matches S’(1), then the second event in S that matches S’(2), and so on. The running time is O(m+n). It is easy to show that if the algorithm finds a match, then S’ is in fact a subsequence of S. More difficult direction: if the algorithm does not find a match, then no match exists. The proof of this is by contradiction. Suppose S’ matches the subsequence S(l1).S(l2)…S(lm). Suppose GREEDY produces the sequence S(k1).S(k2)…. Show that greedy can produce a match all the way up to S(km) and also ki ≤ li for all i. This is done in a way similar to the proof in interval scheduling.