Published by Georgina Stanley. Modified over 4 years ago.

1
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics

2
Order statistics
The i-th order statistic in a set of n elements is the i-th smallest element.
The minimum is thus the 1st order statistic.
The maximum is the n-th order statistic.
The median is the (n/2)-th order statistic.
If n is even, there are 2 medians (the lower median at position n/2 and the upper median at position n/2 + 1).
How can we calculate order statistics?
What is the running time?

3
Order statistics – selection problem
Select the i-th smallest of n elements.
Naive algorithm: Sort.
– Worst-case running time Θ(n log n) using merge sort or heapsort (not quicksort).
We will show:
– A practical randomized algorithm with Θ(n) expected running time
– A cool algorithm of theoretical interest only, with Θ(n) worst-case running time
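The naive algorithm can be sketched in a few lines of Python. This is only an illustration: the function name `naive_select` and the sample data are assumptions, not part of the lecture, and Python's built-in `sorted` stands in for merge sort or heapsort.

```python
def naive_select(items, i):
    """Return the i-th smallest element (1-indexed) by sorting first."""
    if not 1 <= i <= len(items):
        raise ValueError("i must be between 1 and len(items)")
    # Sorting dominates the cost: O(n log n) comparisons.
    return sorted(items)[i - 1]


data = [7, 10, 5, 8, 11, 3, 2, 13]    # illustrative input
print(naive_select(data, 1))          # minimum: prints 2
print(naive_select(data, len(data)))  # maximum: prints 13
print(naive_select(data, 6))          # 6th smallest: prints 10
```

Sorting does more work than the selection problem needs, since it fixes the position of every element rather than just one.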

4
Recall: Quicksort
The function Partition gives us the rank k of the pivot.
If we are lucky, k = i. Done!
If not, we at least get a smaller subarray to work with:
– k > i: the i-th smallest is in the left subarray
– k < i: the i-th smallest is in the right subarray
Divide and conquer:
– If we are lucky, k is close to n/2, or the desired element is in the smaller subarray
– If unlucky, the desired element is in the larger subarray (possible size n-1)
[Figure: subarray from p to q with the pivot x at position r; k is the rank of the pivot.]

5
Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    ⊳ i-th smallest of A[p..q]
    if p = q and i > 1 then error!
    r ← RAND-PARTITION(A, p, q)
    k ← r – p + 1    ⊳ k = rank(A[r])
    if i = k then return A[r]
    if i < k
        then return RAND-SELECT(A, p, r – 1, i)
        else return RAND-SELECT(A, r + 1, q, i – k)
[Figure: A[p..q] after partitioning, with elements ≤ A[r] to the left of position r and elements ≥ A[r] to the right; k = r – p + 1.]

6
Randomized Partition
Randomly choose an element as pivot.
– Every time we need to do a partition, throw a die to decide which element to use as the pivot
– Each element has probability 1/n of being selected

Rand-Partition(A, p, q) {
    d = random();                        // draw a random number in [0, 1)
    index = p + floor((q - p + 1) * d);  // p <= index <= q
    swap(A[p], A[index]);
    return Partition(A, p, q);           // now use A[p] as pivot; return its final index
}
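Putting the randomized partition and the selection routine together, a minimal Python sketch might look as follows. The names `rand_partition` and `rand_select` mirror the pseudocode, but the partition scheme used here (a single left-to-right scan) is an assumption; the slides do not say which Partition routine they rely on. Indices are 0-based for the array and 1-based for the rank i.

```python
import random


def rand_partition(a, p, q):
    """Partition a[p..q] around a uniformly random pivot; return its final index."""
    index = random.randint(p, q)       # p <= index <= q, each with probability 1/n
    a[p], a[index] = a[index], a[p]    # move the chosen pivot to the front
    pivot = a[p]
    i = p                              # invariant: a[p+1..i] holds elements < pivot
    for j in range(p + 1, q + 1):
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]            # put the pivot in its final position
    return i


def rand_select(a, p, q, i):
    """Return the i-th smallest element of a[p..q] (i is 1-indexed). Reorders a."""
    if not 1 <= i <= q - p + 1:
        raise ValueError("i out of range")
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1                      # rank of the pivot within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)


data = [7, 10, 5, 8, 11, 3, 2, 13]     # illustrative input
print(rand_select(list(data), 0, len(data) - 1, 6))  # prints 10
```

The result does not depend on the random choices; only the running time does.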

7
Example
Select the i = 6th smallest:
    7 10 5 8 11 3 2 13    (pivot = 7)
Partition:
    3 2 5 7 11 8 10 13    (k = 4)
Select the 6 – 4 = 2nd smallest recursively.

8
Complete example: select the 6th smallest element.
    7 10 5 8 11 3 2 13    i = 6
    3 2 5 7 11 8 10 13    k = 4, so i = 6 – 4 = 2 in the right part
    10 8 11 13            k = 3, i = 2 < k, so continue in the left part
    8 10                  k = 2, i = 2 = k, so the answer is 10
Note: here we always used the first element as the pivot to do the partition (instead of rand-partition).
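The walk-through can be replayed in code. The sketch below always uses the first element as the pivot, as the note says, and records the rank k found at each level. The two-pointer partition in `partition_first` is an assumption chosen because it reproduces the intermediate arrays of the example; the function names are hypothetical.

```python
def partition_first(a, p, q):
    """Partition a[p..q] around a[p] by scanning inward from both ends."""
    pivot = a[p]
    i, j = p + 1, q
    while True:
        while i <= j and a[i] <= pivot:   # skip elements already on the correct side
            i += 1
        while i <= j and a[j] > pivot:
            j -= 1
        if i > j:
            break
        a[i], a[j] = a[j], a[i]           # swap a misplaced pair
    a[p], a[j] = a[j], a[p]               # move the pivot between the two parts
    return j


def select_trace(a, p, q, i, ranks):
    """Select the i-th smallest of a[p..q], appending each pivot rank k to ranks."""
    if not 1 <= i <= q - p + 1:
        raise ValueError("i out of range")
    r = partition_first(a, p, q)
    k = r - p + 1
    ranks.append(k)
    if i == k:
        return a[r]
    if i < k:
        return select_trace(a, p, r - 1, i, ranks)
    return select_trace(a, r + 1, q, i - k, ranks)


a = [7, 10, 5, 8, 11, 3, 2, 13]
ranks = []
print(select_trace(a, 0, len(a) - 1, 6, ranks))  # prints 10
print(ranks)                                     # prints [4, 3, 2]
```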

9
Intuition for analysis
(All our analyses today assume that all elements are distinct.)
Lucky: T(n) = T(9n/10) + Θ(n) = Θ(n)    (master theorem, CASE 3)
Unlucky: T(n) = T(n – 1) + Θ(n) = Θ(n²)    (arithmetic series)
Worse than sorting!
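Both bounds can also be checked by unrolling the recurrences. The following is a sketch under simplifying assumptions that are not in the slides: the Θ(n) term is taken to be exactly n, and T(0) = 0.

```latex
% Lucky case: the subproblem shrinks by a constant factor, giving a geometric series.
T(n) = T\!\left(\tfrac{9n}{10}\right) + n
     \le n \sum_{j=0}^{\infty} \left(\tfrac{9}{10}\right)^{j}
     = 10\,n = \Theta(n)

% Unlucky case: the subproblem shrinks by one element, giving an arithmetic series.
T(n) = T(n-1) + n
     = \sum_{j=1}^{n} j
     = \frac{n(n+1)}{2} = \Theta(n^{2})
```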

10
Running time of randomized selection
For an upper bound, assume the i-th element always falls in the larger side of the partition.
The expected running time is an average over all cases:
T(n) ≤ T(max(0, n–1)) + n    if 0 : n–1 split,
       T(max(1, n–2)) + n    if 1 : n–2 split,
       ⋮
       T(max(n–1, 0)) + n    if n–1 : 0 split.
Expectation: average over the n equally likely splits.
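Written out, the average over the n equally likely splits takes the following form. This is the standard way to state the bound and is offered as a sketch; the notation E[T(n)] for the expected running time is an assumption, since the slide text uses T(n) throughout.

```latex
% Each split k : n-k-1 occurs with probability 1/n.
E[T(n)] \le \frac{1}{n} \sum_{k=0}^{n-1} E\big[T(\max(k,\; n-k-1))\big] + n

% Every size k >= floor(n/2) appears at most twice in the sum above.
E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + n
```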

11
Substitution method
Want to show T(n) = O(n). So need to prove T(n) ≤ cn for n > n₀.
Assume: T(k) ≤ ck for all k < n.
Then T(n) ≤ cn if c ≥ 4.
Therefore, T(n) = O(n).
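The inductive step can be reconstructed as follows. It is a sketch that assumes the recurrence has the form T(n) ≤ (2/n) Σ T(k) + n with k running from ⌊n/2⌋ to n – 1 (reading T as the expected running time), and it uses the fact that the sum of k over that range is at most 3n²/8.

```latex
T(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} c\,k + n
     \le \frac{2c}{n} \cdot \frac{3n^{2}}{8} + n
     = \frac{3cn}{4} + n
     = cn - \left(\frac{cn}{4} - n\right)
     \le cn \quad \text{if } c \ge 4
```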

12
Summary of randomized selection
Works fast: linear expected time.
Excellent algorithm in practice.
But, the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.

13
Worst-case linear-time selection
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
       then recursively SELECT the i-th smallest element in the lower part
       else recursively SELECT the (i–k)-th smallest element in the upper part
   (Step 4 is the same as in RAND-SELECT.)
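The four steps can be sketched in Python as below. This is an illustration rather than the lecture's implementation: the function name `select` is hypothetical, the version builds new lists instead of partitioning in place, and it keeps the median of a leftover group of fewer than 5 elements, whereas the slides count only the ⌊n/5⌋ full groups. Like the analysis, it assumes all elements are distinct.

```python
def select(items, i):
    """Return the i-th smallest (1-indexed) of a list of distinct items."""
    if not 1 <= i <= len(items):
        raise ValueError("i out of range")
    if len(items) <= 5:
        return sorted(items)[i - 1]          # small input: solve directly

    # Step 1: split into groups of 5 and take each group's median.
    groups = [items[j:j + 5] for j in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively pick the median of the group medians as pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k is the rank of x.
    lower = [v for v in items if v < x]
    upper = [v for v in items if v > x]
    k = len(lower) + 1

    # Step 4: same case analysis as in RAND-SELECT.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)


data = [(37 * j) % 101 for j in range(101)]  # illustrative: a permutation of 0..100
print(select(data, 51))                      # the median, prints 50
```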

14
Choosing the pivot

15
1. Divide the n elements into groups of 5.

16
Choosing the pivot
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
[Figure: the 5-element groups, ordered from lesser to greater.]

17
Choosing the pivot
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
[Figure: the 5-element groups, ordered from lesser to greater, with the pivot x marked.]

18
Analysis
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
[Figure: the 5-element groups, ordered from lesser to greater, with the pivot x marked.]

19
Analysis (Assume all elements are distinct.)
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x.
[Figure: the 5-element groups, ordered from lesser to greater, with the pivot x marked.]

20
Analysis
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x.
Similarly, at least 3⌊n/10⌋ elements are ≥ x.
[Figure: the 5-element groups, ordered from lesser to greater, with the pivot x marked.]

21
Analysis
At least 3⌊n/10⌋ elements are ≤ x ⇒ at most n – 3⌊n/10⌋ elements are > x.
At least 3⌊n/10⌋ elements are ≥ x ⇒ at most n – 3⌊n/10⌋ elements are < x.
The recursive call to SELECT in Step 4 is executed on at most n – 3⌊n/10⌋ elements.
Need "at most" for worst-case runtime.
[Figure: possible positions for the pivot, at least 3⌊n/10⌋ elements away from either end.]

22
Analysis
Use the fact that ⌊a/b⌋ > a/b – 1:
n – 3⌊n/10⌋ < n – 3(n/10 – 1) = 7n/10 + 3 ≤ 3n/4 if n ≥ 60.
The recursive call to SELECT in Step 4 is executed on at most 7n/10 + 3 elements.
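The chain of inequalities is easy to spot-check numerically. The helper below is a hypothetical illustration; it multiplies every term by 20 so that the comparison uses exact integer arithmetic.

```python
def bounds_hold(n):
    """Check n - 3*floor(n/10) < 7n/10 + 3 <= 3n/4, with all terms scaled by 20."""
    size = n - 3 * (n // 10)                 # worst-case size of the Step 4 subproblem
    return 20 * size < 14 * n + 60 <= 15 * n


print(all(bounds_hold(n) for n in range(60, 10_000)))  # prints True
print(bounds_hold(59))                                 # prints False
```

The check fails at n = 59 only because 7n/10 + 3 ≤ 3n/4 needs n ≥ 60; the first inequality holds for every n.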

23
Developing the recurrence
SELECT(i, n)    [T(n)]
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.    [Θ(n)]
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.    [T(n/5)]
3. Partition around the pivot x. Let k = rank(x).    [Θ(n)]
4. if i = k then return x
   elseif i < k
       then recursively SELECT the i-th smallest element in the lower part
       else recursively SELECT the (i–k)-th smallest element in the upper part    [T(7n/10 + 3)]
So T(n) = T(n/5) + T(7n/10 + 3) + Θ(n).

24
Solving the recurrence
Assumption: T(k) ≤ ck for all k < n.
Since 7n/10 + 3 ≤ 3n/4 if n ≥ 60, substitution gives T(n) ≤ cn if c ≥ 20 and n ≥ 60.
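The substitution can be written out step by step. This is a reconstruction under the assumption that the recurrence is T(n) = T(n/5) + T(7n/10 + 3) + n, with the Θ(n) term simplified to n.

```latex
T(n) \le \frac{cn}{5} + c\left(\frac{7n}{10} + 3\right) + n
     \le \frac{cn}{5} + \frac{3cn}{4} + n          % uses 7n/10 + 3 <= 3n/4 for n >= 60
     = \frac{19cn}{20} + n
     = cn - \left(\frac{cn}{20} - n\right)
     \le cn \quad \text{if } c \ge 20 \text{ and } n \ge 60
```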

25
Conclusions
Since the work at each level of recursion shrinks by a constant factor (19/20), the work per level forms a geometric series dominated by the linear work at the root.
In practice, this algorithm runs slowly, because the constant in front of n is large.
The randomized algorithm is far more practical.
Exercise: Try to divide into groups of 3 or 7.
Exercise: Think about an application in sorting.
