 # Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3.

## Presentation on theme: "Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3."— Presentation transcript:

Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3.

2 Where to Build a Road? Given (x,y) coordinates of N houses, where should you build road parallel to x-axis to minimize construction cost of building driveways? 2 9 4 5 8 7 6 3 11 1210 1

3 Where to Build a Road? Given (x,y) coordinates of N houses, where should you build road parallel to x-axis to minimize construction cost of building driveways? n n 1 = nodes on top. n n 2 = nodes on bottom. 2 9 4 5 8 7 1 6 3 11 Decreases total cost by (n 2 -n 1 )   1210

4 Where to Build a Road? Given (x,y) coordinates of N houses, where should you build road parallel to x-axis to minimize construction cost of building driveways? n n 1 = nodes on top. n n 2 = nodes on bottom. 2 9 4 5 8 7 1 6 3 11 1210

5 Where to Build a Road? Given (x,y) coordinates of N houses, where should you build road parallel to x-axis to minimize construction cost of building driveways? n n 1 = nodes on top. n n 2 = nodes on bottom. 2 9 4 5 8 7 1 6 3 11 1210

6 Where to Build a Road? Given (x,y) coordinates of N houses, where should you build road parallel to x-axis to minimize construction cost of building driveways? Solution: put street at median of y coordinates. 2 9 4 5 8 7 1 6 3 11 1210

7 Order Statistics Given N linearly ordered elements, find i th smallest element. n Min: i = 1 n Max: i = N n Median: i =  (N+1) / 2  and  (N+1) / 2  n O(N) for min or max. n O(N log N) comparisons by sorting. n O(N log i) with heaps. Can we do in worst-case O(N) comparisons? n Surprisingly, yes. (Blum, Floyd, Pratt, Rivest, Tarjan, 1973) Assumption to make presentation cleaner. n All items have distinct values.

8 Select Similar to quicksort, but throw away useless "half" at each iteration. n Select i th smallest element from a 1, a 2,..., a N. x  Partition(N, a 1, a 2,..., a N ) k  rank(x) if (i == k) return x else if (i < k) b[]  all items of a[] less than x return Select(i th, k-1, b 1, b 2,..., b k-1 ) else if (i > k) c[]  all items of a[] greater than x return Select((i-k) th, N-k, c 1, c 2,..., c N-k ) Select (i th, N, a 1, a 2,..., a N ) x = partition element Want to choose x so that x is (roughly) the ith largest.

9 Partition Partition(). n Divide N elements into  N/5  groups of 5 elements each, plus extra. 22 45 14 28 29 44 39 09 23 10 52 50 05 06 38 11 15 03 26 37 53 25 54 40 02 12 16 30 19 55 13 41 48 01 18 43 17 47 46 24 20 33 32 31 34 04 07 51 49 35 27 21 08 36 N = 54

10 22 45 14 28 29 44 39 09 23 10 52 50 05 06 38 11 15 03 26 37 53 25 54 40 02 12 16 30 19 55 13 41 48 01 18 43 17 47 46 24 20 33 32 31 34 04 07 51 49 35 27 21 08 36 Partition Partition(). n Divide N elements into  N/5  groups of 5 elements each, plus extra. n Brute force sort each of the 5-element groups. 22 45 29 28 14 10 44 39 23 09 06 52 50 38 05 11 37 26 15 03 25 54 53 40 02 16 53 30 19 12 13 48 41 18 01 24 47 46 43 17 31 34 33 32 20 07 51 49 35 04

11 28 Partition Partition(). n Divide N elements into  N/5  groups of 5 elements each, plus extra. n Brute force sort each of the 5-element groups. Find x = "median of medians" by Select() on  N/5  medians. 22 45 29 14 10 44 39 23 09 06 52 50 38 05 11 37 26 15 03 25 54 53 40 02 16 53 30 19 12 13 48 41 18 01 24 47 46 43 17 31 34 33 32 20 07 51 49 35 04 27 21 08 36

12 Select if (N is small) use mergesort Divide a[] into groups of 5, and let m 1, m 2,..., m N/5 be list of medians. x  Select(N/10, m 1, m 2,..., m N/5 ) k  rank(x) if (i == k) // Case 1 return x else if (i < k) // Case 2 b[]  all items of a[] less than x return Select(i th, k-1, b 1, b 2,..., b k-1 ) else if (i > k) // Case 3 c[]  all items of a[] greater than x return Select((i-k) th, N-k, c 1, c 2,..., c N-k ) Select (i th, N, a 1, a 2,..., a N ) median of medians

13 22 45 29 28 14 10 44 39 23 09 06 52 50 38 05 11 37 26 15 03 25 54 53 40 02 16 53 30 19 12 13 48 41 18 01 24 47 46 43 17 31 34 33 32 20 07 51 49 35 04 27 21 08 36 median of medians Selection Analysis Crux of proof: delete roughly 30% of elements by partitioning. n At least 1/2 of 5 element medians  x – at least   N / 5  / 2  =  N / 10  medians  x

14 22 45 29 28 14 10 44 39 23 09 06 52 50 38 05 11 37 26 15 03 25 54 53 40 02 16 53 30 19 12 13 48 41 18 01 24 47 46 43 17 31 34 33 32 20 07 51 49 35 04 27 21 08 36 median of medians Selection Analysis Crux of proof: delete roughly 30% of elements by partitioning. n At least 1/2 of 5 element medians  x – at least   N / 5  / 2  =  N / 10  medians  x n At least 3  N / 10  elements  x.

15 22 45 29 28 14 10 44 39 23 09 06 52 50 38 05 11 37 26 15 03 25 54 53 40 02 16 53 30 19 12 13 48 41 18 01 24 47 46 43 17 31 34 33 32 20 07 51 49 35 04 27 21 08 36 Selection Analysis Crux of proof: delete roughly 30% of elements by partitioning. n At least 1/2 of 5 element medians  x – at least   N / 5  / 2  =  N / 10  medians  x n At least 3  N / 10  elements  x. n At least 3  N / 10  elements  x. median of medians

16 Selection Analysis Crux of proof: delete roughly 30% of elements by partitioning. n At least 1/2 of 5 element medians  x – at least   N / 5  / 2  =  N / 10  medians  x n At least 3  N / 10  elements  x. n At least 3  N / 10  elements  x.  Select() called recursively (Case 2 or 3) with at most N - 3  N / 10  elements. C(N) = # comparisons on a file of size N. Now, solve recurrence. n Apply master theorem? n Assume N is a power of 2? n Assume C(N) is monotone non-decreasing?

17 Selection Analysis Analysis of selection recurrence. n T(N) = # comparisons on a file of size  N. n T(N) is monotone, but C(N) is not! Claim: T(N)  20cN. n Base case: N < 50. n Inductive hypothesis: assume true for 1, 2,..., N-1. n Induction step: for N  50, we have: For n  50, 3  N / 10   N / 4.

18 Linear Time Selection Postmortem Practical considerations. n Constant (currently) too large to be useful. n Practical variant: choose random partition element. – O(N) expected running time ala quicksort. n Open problem: guaranteed O(N) with better constant. Quicksort. n Worst case O(N log N) if always partition on median. n Justifies practical variants: median-of-3, median-of-5.

Download ppt "Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3."

Similar presentations