Linear Time Selection. Dr. Yingwu Zhu. Chapter 9, p 213-220.

1 Linear Time Selection
Dr. Yingwu Zhu, Chapter 9, p 213-220

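2 Terminology
Order statistic: the i-th order statistic of a set of n elements is the i-th smallest element.
Median: if n is odd, the (n+1)/2-th smallest element. If n is even, there are two medians, the floor((n+1)/2)-th (lower) and the ceil((n+1)/2)-th (upper) smallest elements; "median" usually means the lower median.
Selection problem: given a set of n elements and an index i such that 1 <= i <= n, find the i-th smallest element.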

3 Where to Build a Road?
Given (x,y) coordinates of N houses, where should you build a road parallel to the x-axis to minimize the construction cost of building driveways?
Chazelle: Amazingly enough, the road costs nothing to build. It is the driveways that cost money. The cost of a driveway is proportional to its distance to the road. (Obviously, the driveways will be perpendicular to the road.)
[Figure: 12 numbered houses plotted in the plane.]

4 Where to Build a Road?
Given (x,y) coordinates of N houses, where should you build a road parallel to the x-axis to minimize the construction cost of building driveways?
n1 = nodes on top. n2 = nodes on bottom.
Moving the road down by a small distance decreases the total cost by (n2 - n1) times that distance.
[Figure: 12 numbered houses with a candidate road.]

5 Where to Build a Road?
Given (x,y) coordinates of N houses, where should you build a road parallel to the x-axis to minimize the construction cost of building driveways?
n1 = nodes on top. n2 = nodes on bottom.
[Figure: 12 numbered houses with a candidate road.]

6 Where to Build a Road?
Given (x,y) coordinates of N houses, where should you build a road parallel to the x-axis to minimize the construction cost of building driveways?
n1 = nodes on top. n2 = nodes on bottom.
[Figure: 12 numbered houses with a candidate road.]

7 Where to Build a Road?
Given (x,y) coordinates of N houses, where should you build a road parallel to the x-axis to minimize the construction cost of building driveways?
Solution: put the street at the median of the y coordinates.
[Figure: 12 numbered houses with the road at the median y coordinate.]
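The claim that the median minimizes total driveway length can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names and the sample y coordinates are made up for illustration, and sorting is used only for brevity where a selection routine would do.

```python
def driveway_cost(ys, road_y):
    """Total driveway length when the road is the horizontal line y = road_y."""
    return sum(abs(y - road_y) for y in ys)


def best_road(ys):
    """Return the lower median of the y coordinates (a cost-minimizing road)."""
    ordered = sorted(ys)  # sorting for clarity; selection would avoid the full sort
    return ordered[(len(ordered) - 1) // 2]


# Hypothetical y coordinates of seven houses.
ys = [1, 2, 4, 7, 11, 12, 20]
road = best_road(ys)

# Compare against every integer road position between the extreme houses.
costs = {c: driveway_cost(ys, c) for c in range(min(ys), max(ys) + 1)}
print(road, driveway_cost(ys, road), min(costs.values()))
```

The cost at the median matches the minimum over all candidate positions, which is what the n1 versus n2 argument on the earlier slides predicts.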

8 Order Statistics
Given N linearly ordered elements, find the i-th smallest element.
Min: i = 1. Max: i = N. Median: i = floor((N+1)/2) and ceil((N+1)/2).
O(N) for min or max. O(N log N) comparisons by sorting. O(N log i) with heaps.
Can we do it in worst-case O(N) comparisons? Surprisingly, yes. (Blum, Floyd, Pratt, Rivest, Tarjan, 1973)
Assumption to make the presentation cleaner: all items have distinct values.
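One way to get the O(N log i) heap bound mentioned above is to keep a max-heap of the i smallest elements seen so far. This is a sketch under that assumption (the slides do not say which heap strategy they mean); `select_with_heap` is a name chosen here, and values are negated because Python's `heapq` is a min-heap, so the sketch works for numbers only.

```python
import heapq


def select_with_heap(items, i):
    """Return the i-th smallest element (1-indexed) in O(N log i) time."""
    if not 1 <= i <= len(items):
        raise ValueError("i must satisfy 1 <= i <= len(items)")
    # Max-heap (via negation) holding the i smallest values seen so far.
    heap = [-v for v in items[:i]]
    heapq.heapify(heap)
    for v in items[i:]:
        if v < -heap[0]:                 # v beats the current i-th smallest
            heapq.heapreplace(heap, -v)  # pop the largest, push v: O(log i)
    return -heap[0]


data = [22, 45, 14, 28, 29, 44, 39, 9, 23, 10]
print(select_with_heap(data, 1), select_with_heap(data, 5), select_with_heap(data, 10))
```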

9 Randomized Select
Similar to quicksort, but throw away the useless "half" at each iteration.
Rand_Select(A, L, H, i): select the i-th smallest element from A[L..H].
  if (L == H) return A[L]
  x <- random_partition(A, L, H)    (x = pivot position)
  k <- x - L + 1
  if (i == k) return A[x]
  else if (i < k) return Rand_Select(A, L, x-1, i)
  else return Rand_Select(A, x+1, H, i-k)
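The pseudocode above can be sketched in Python as follows. The partition scheme (Lomuto, with a uniformly random pivot swapped to the end) is an assumption, since the slides do not specify how random_partition works; indices are 0-based for the array and 1-based for the rank i, and the input list is rearranged in place.

```python
import random


def random_partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot; return the pivot's final index."""
    p = random.randint(lo, hi)          # both endpoints inclusive
    a[p], a[hi] = a[hi], a[p]           # move the pivot out of the way
    pivot = a[hi]
    store = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[store], a[j] = a[j], a[store]
            store += 1
    a[store], a[hi] = a[hi], a[store]   # put the pivot in its final position
    return store


def rand_select(a, lo, hi, i):
    """Return the i-th smallest (1-indexed) element of a[lo..hi]."""
    if lo == hi:
        return a[lo]
    x = random_partition(a, lo, hi)     # x = pivot position
    k = x - lo + 1                      # rank of the pivot within a[lo..hi]
    if i == k:
        return a[x]
    if i < k:
        return rand_select(a, lo, x - 1, i)
    return rand_select(a, x + 1, hi, i - k)


data = [22, 45, 14, 28, 29, 44, 39, 9, 23, 10]
print(rand_select(list(data), 0, len(data) - 1, 5))
```

Passing a copy (`list(data)`) keeps the caller's list intact, since partitioning reorders elements.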

10 Summary
The expected running time: O(n)
The worst-case running time: O(n^2). However, because the pivot is chosen at random, no particular input elicits the worst-case behavior.

11 The worst-case linear time selection: partition
Divide N elements into floor(N/5) groups of 5 elements each, plus extra.
Example with N = 54 (one group per line, last line is the extra):
22 45 14 28 29
44 39 09 23 10
52 50 05 06 38
11 15 03 26 37
53 25 54 40 02
12 16 30 19 55
13 41 48 01 18
43 17 47 46 24
20 33 32 31 34
04 07 51 49 35
27 21 08 36

12 The worst-case linear time selection: partition
Divide N elements into floor(N/5) groups of 5 elements each, plus extra.
Brute force sort each of the 5-element groups.
[Figure: the N = 54 example, before and after sorting each 5-element group.]

13 The worst-case linear time selection: partition
Divide N elements into floor(N/5) groups of 5 elements each, plus extra.
Brute force sort each of the 5-element groups.
Find x = "median of medians" by Select() on the floor(N/5) medians.
[Figure: the sorted groups of the N = 54 example with the median of medians marked. Note on the figure: 32 is also a median the algorithm could return.]

14 The worst-case linear time selection: partition
Divide N elements into floor(N/5) groups of 5 elements each, plus extra.
Brute force sort each of the 5-element groups.
Find x = "median of medians" by Select() on the floor(N/5) medians.
Partition the array using x.
[Figure: the sorted groups of the N = 54 example with the median of medians marked. Note on the figure: 32 is also a median the algorithm could return.]

15 The worst-case linear time selection: partition
Divide N elements into floor(N/5) groups of 5 elements each, plus extra.
Brute force sort each of the 5-element groups.
Find x = "median of medians" by Select() on the floor(N/5) medians.
Partition the array using x, then follow the 3 cases in Rand_Select().
[Figure: the sorted groups of the N = 54 example with the median of medians marked. Note on the figure: 32 is also a median the algorithm could return.]

16 Selection Analysis
Crux of proof: delete roughly 30% of elements by partitioning.
At least 1/2 of the 5-element medians are <= x: at least floor(floor(N/5) / 2) = floor(N/10) medians <= x.
[Figure: the sorted groups of the N = 54 example with the median of medians marked. Source: CLR Instructor's Manual.]

17 Selection Analysis
Crux of proof: delete roughly 30% of elements by partitioning.
At least 1/2 of the 5-element medians are <= x: at least floor(floor(N/5) / 2) = floor(N/10) medians <= x.
At least 3 floor(N/10) elements <= x.
[Figure: the sorted groups of the N = 54 example with the median of medians marked.]

18 Selection Analysis
Crux of proof: delete roughly 30% of elements by partitioning.
At least 1/2 of the 5-element medians are <= x: at least floor(floor(N/5) / 2) = floor(N/10) medians <= x.
At least 3 floor(N/10) elements <= x.
At least 3 floor(N/10) elements >= x.
[Figure: the sorted groups of the N = 54 example with the median of medians marked.]

19 Selection Analysis
Crux of proof: delete roughly 30% of elements by partitioning.
At least 1/2 of the 5-element medians are <= x: at least floor(floor(N/5) / 2) = floor(N/10) medians <= x.
At least 3 floor(N/10) elements <= x.
At least 3 floor(N/10) elements >= x.
Therefore Select() is called recursively (Case 2 or 3) with at most N - 3 floor(N/10) elements.
C(N) = # comparisons on a file of size N.
Now, solve the recurrence. Apply the master theorem? Assume N is a power of 2? Assume C(N) is monotone non-decreasing?
It would be nice to assume N is a power of 5 and a power of 10 to eliminate floors, but unfortunately there are not many such integers :)
CLRS assumes that C(N) is monotone. Theirs is (I think). Ours is not.
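The counting argument can be sanity-checked empirically. The sketch below is not from the slides: it picks the pivot as the lower median of the floor(N/5) group medians (using sorting for brevity, so it is not itself linear time) and counts how many elements fall on each side. The helper names are chosen here for illustration.

```python
import random


def median_of_medians_pivot(a):
    """Lower median of the medians of the floor(N/5) full 5-element groups."""
    full = len(a) - len(a) % 5
    medians = sorted(sorted(a[j:j + 5])[2] for j in range(0, full, 5))
    return medians[(len(medians) - 1) // 2]


def side_counts(a):
    """Return (#elements <= x, #elements >= x) for the pivot x."""
    x = median_of_medians_pivot(a)
    return sum(v <= x for v in a), sum(v >= x for v in a)


random.seed(0)
for n in (10, 54, 100, 999):
    a = random.sample(range(10 * n), n)   # distinct values, as the slides assume
    below, above = side_counts(a)
    print(n, 3 * (n // 10), below, above)
```

Each printed row shows N, the guaranteed bound 3 floor(N/10), and the actual counts on both sides, which should never fall below the bound.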

20 Selection Analysis
Analysis of selection recurrence.
T(N) = # comparisons on a file of size <= N. T(N) is monotone, but C(N) is not!
Claim: T(N) <= 20cN.
Base case: N < 50.
Inductive hypothesis: assume true for 1, 2, ..., N-1.
Induction step: for N >= 50, we have: [equations not preserved in this transcript]
For N >= 50, 3 floor(N/10) >= floor(N/4).
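One way to carry out the induction step is sketched below. It assumes the recurrence has the form T(N) <= T(floor(N/5)) + T(N - 3 floor(N/10)) + cN for N >= 50, where cN covers sorting the groups and partitioning; that form follows from the analysis on the preceding slides but is an assumption here. The sketch bounds the floor directly with floor(N/10) >= (N - 9)/10 instead of going through floor(N/4), so it is not necessarily the chain of inequalities shown on the original slide.

```latex
\begin{align*}
T(N) &\le T(\lfloor N/5 \rfloor) + T(N - 3\lfloor N/10 \rfloor) + cN \\
     &\le 20c\,\lfloor N/5 \rfloor + 20c\,\bigl(N - 3\lfloor N/10 \rfloor\bigr) + cN
        && \text{(inductive hypothesis)} \\
     &\le 20c \cdot \frac{N}{5} + 20c\left(\frac{7N}{10} + \frac{27}{10}\right) + cN
        && \text{(since } \lfloor N/10 \rfloor \ge (N-9)/10\text{)} \\
     &= 19cN + 54c \\
     &\le 20cN && \text{for } N \ge 54 .
\end{align*}
```

For the remaining cases 50 <= N <= 53 we have floor(N/5) = 10 and floor(N/10) = 5, so the second line equals 21cN - 100c, which is at most 20cN because N <= 100.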

21 Linear Time Selection Postmortem
Practical considerations.
Constant (currently) too large to be useful.
Practical variant: choose a random partition element. O(N) expected running time, à la quicksort.
Open problem: guaranteed O(N) with a better constant.
Quicksort.
Worst case O(N log N) if we always partition on the median.
Justifies practical variants: median-of-3, median-of-5.

