1 Ch. 7 - QuickSort Quick but not Guaranteed

2 Another Divide-and-Conquer sorting algorithm… As it turns out, MERGESORT and HEAPSORT, although $O(n \lg n)$ in their time complexity, have fairly large constants and tend to move data around more than desirable (e.g., with HEAPSORT, equal-key items may not maintain their relative position from input to output). We introduce another algorithm with better constants, but a flaw: its worst case is $O(n^2)$. Fortunately, the worst case is “rare enough” that the speed advantages hold an overwhelming amount of the time… and it is $O(n \lg n)$ on average. (6/13/2015, 91.404)

3 Like in MERGESORT, we use Divide-and-Conquer:
1. Divide: partition A[p..r] into two subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is ≤ A[q], and each element of A[q+1..r] is ≥ A[q]. Compute q as part of this partitioning.
2. Conquer: sort the subarrays A[p..q-1] and A[q+1..r] by recursive calls to QUICKSORT.
3. Combine: the partitioning and recursive sorting leave us with a sorted A[p..r] – no work needed here.
An obvious difference from MERGESORT is that we do most of the work in the divide stage, with no work at the combine one.

4 The Pseudo-Code (listing not preserved in this transcript)

5 (listing not preserved in this transcript)
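The pseudo-code for slides 4 and 5 is missing from this transcript. As a stand-in, here is a minimal Python sketch. It assumes the last-element pivot scheme that the loop invariant on slide 6 describes (i starts at p-1, j scans p..r-1, pivot x = A[r]). The function names `partition` and `quicksort` and the 0-based indexing are choices made for this sketch. The line numbers cited on later slides (l. 3-6, l. 4) refer to the original listing, not to this code.

```python
def partition(A, p, r):
    """Partition A[p..r] (inclusive) around x = A[r]; return the pivot's final index q."""
    x = A[r]                  # pivot: last element of the subarray
    i = p - 1                 # A[p..i] holds the elements <= x found so far
    for j in range(p, r):     # j scans A[p..r-1]
        if A[j] <= x:         # one comparison against the pivot per iteration
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]   # put the pivot between the two regions
    return i + 1


def quicksort(A, p=0, r=None):
    """Sort A[p..r] in place: divide with partition, conquer recursively, no combine step."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)    # the pivot A[q] is excluded from both recursive calls
        quicksort(A, q + 1, r)


data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data)
print(data)   # [1, 2, 3, 4, 5, 6, 7, 8]
```

All the work happens in `partition`; once both recursive calls return, A[p..r] is already sorted.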

6 Proof of Correctness: PARTITION
We look for a loop invariant and observe that, at the beginning of each iteration of the loop (l. 3-6), for any array index k:
1. If p ≤ k ≤ i, then A[k] ≤ x;
2. If i+1 ≤ k ≤ j-1, then A[k] > x;
3. If k = r, then A[k] = x;
4. If j ≤ k ≤ r-1, then we don’t know anything about A[k].

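7 The Invariant
Initialization. Before the first iteration: i = p-1, j = p. There are no values between p and i, and no values between i+1 and j-1, so the first two conditions are trivially satisfied; the initial assignment x = A[r] satisfies 3.
Maintenance. Two cases:
– 1. A[j] > x. Only j advances, so the entry joins the “greater than x” region and condition 2 still holds.
– 2. A[j] ≤ x. i advances, A[i] and A[j] are swapped, and then j advances, so the entry joins the “≤ x” region and conditions 1 and 2 still hold.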

8 The Invariant
Termination. j = r. Every entry in the array is in one of the three sets described by the invariant. We have partitioned the values in the array into three sets: those less than or equal to x, those greater than x, and a singleton containing x.
Running time of PARTITION on A[p..r] is $\Theta(n)$, where n = r – p + 1.
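The invariant can also be checked mechanically. The sketch below is an illustration, not the original listing: it assumes the last-element pivot scheme implied by the initialization i = p-1, j = p, and asserts conditions 1 to 3 at the start of every iteration and again at termination. The name `partition_checked` is made up for this example.

```python
def partition_checked(A, p, r):
    """Last-element-pivot partition of A[p..r] that asserts the loop invariant as it runs."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        # Invariant at the start of each iteration:
        assert all(A[k] <= x for k in range(p, i + 1))   # 1. A[p..i] <= x
        assert all(A[k] > x for k in range(i + 1, j))    # 2. A[i+1..j-1] > x
        assert A[r] == x                                 # 3. A[r] = x
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    # Termination (j = r): every entry is in one of the three sets.
    assert all(A[k] <= x for k in range(p, i + 1))
    assert all(A[k] > x for k in range(i + 1, r))
    A[i + 1], A[r] = A[r], A[i + 1]   # move the pivot between the two regions
    return i + 1


values = [2, 8, 7, 1, 3, 5, 6, 4]
q = partition_checked(values, 0, len(values) - 1)
print(q, values)   # 3 [2, 1, 3, 4, 7, 5, 6, 8]
```

The loop body does a constant amount of work for each of the r - p positions scanned, which is where the $\Theta(n)$ running time comes from.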

9 QUICKSORT: Performance – a quick look
We first look at (apparent) worst-case partitioning: $T(n) = T(n-1) + T(0) + \Theta(n) = T(n-1) + \Theta(n)$. It is easy to show, using substitution, that $T(n) = \Theta(n^2)$.
We next look at (apparent) best-case partitioning: $T(n) = 2T(n/2) + \Theta(n)$. It is also easy to show (case 2 of the Master Theorem) that $T(n) = \Theta(n \lg n)$.
Since the disparity between the two is substantial, we need to look further…
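To see where the quadratic bound comes from, the worst-case recurrence can be unrolled. This is a sketch that assumes the $\Theta(n)$ term is exactly $cn$ for some constant $c > 0$ and that $T(0)$ is a constant; both are simplifications made for the illustration.

```latex
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(0) + c\sum_{k=1}^{n} k \\
     &= T(0) + \frac{c\,n(n+1)}{2} = \Theta(n^2).
\end{align*}
```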

10 QUICKSORT: Performance – Balanced Partitioning

11 QUICKSORT: Performance – the Average Case
As long as the number of “good splits” is bounded below by a fixed percentage of all the splits, we maintain logarithmic depth and so $O(n \lg n)$ time complexity.
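A simplified worked example of the logarithmic-depth claim: assume every split sends at most a fraction $\alpha$ of the elements to the larger side, with $1/2 \le \alpha < 1$ a fixed constant, and that each level of the recursion tree costs at most $cn$. The parameters $\alpha$ and $c$ are introduced here for illustration only. After $d$ levels the largest subproblem has at most $\alpha^d n$ elements, so:

```latex
\begin{align*}
\alpha^{d}\, n \le 1 \;&\Longleftrightarrow\; d \ge \log_{1/\alpha} n = \frac{\lg n}{\lg(1/\alpha)}, \\
T(n) \;&\le\; cn \left( \left\lceil \log_{1/\alpha} n \right\rceil + 1 \right) = O(n \lg n).
\end{align*}
```

For instance, with $\alpha = 9/10$ the depth is about $\log_{10/9} n \approx 6.58 \lg n$: deeper than the perfectly balanced tree by a constant factor, but still logarithmic.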

12 QUICKSORT: Performance – Randomized QUICKSORT
We would like to ensure that the choice of pivot does not critically impair the performance of the sorting algorithm. The discussion to this point indicates that randomizing the choice of the pivot should provide us with good behavior (if at all possible with the data-set we are trying to sort). We introduce RANDOMIZED-PARTITION.

13 QUICKSORT: Performance – Randomized QUICKSORT
The recursive procedure changes accordingly, calling RANDOMIZED-PARTITION in place of PARTITION. Every call to RANDOMIZED-PARTITION introduces the (constant) extra overhead of a call to RANDOM.
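The randomized listings are not in this transcript either, so the following is a hedged Python sketch of the idea rather than the slides' own code. It assumes RANDOMIZED-PARTITION picks a uniformly random index in p..r, swaps that element into position r, and then runs the usual last-element partition. The function names are chosen for this example, and Python's `random.randint` stands in for RANDOM.

```python
import random


def partition(A, p, r):
    """Last-element-pivot partition of A[p..r]; returns the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1


def randomized_partition(A, p, r):
    """Swap a uniformly chosen element into A[r], then partition as before."""
    k = random.randint(p, r)      # inclusive on both ends; the constant extra overhead
    A[k], A[r] = A[r], A[k]
    return partition(A, p, r)


def randomized_quicksort(A, p=0, r=None):
    """Sort A[p..r] in place using random pivots."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)


items = [9, 4, 7, 1, 8, 2, 2, 5]
randomized_quicksort(items)
print(items)   # [1, 2, 2, 4, 5, 7, 8, 9]
```

The sorted result is the same on every run; only the sequence of pivots, and hence the running time, varies.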

14 QUICKSORT: Performance – Rigorous Worst Case Analysis
Since we do not, a priori, have any idea of what the splits of the subarrays will be, we have to represent a possible “worst case” (we already have an $\Omega(n^2)$ lower bound from the “bad split” example, so it could be worse… although we hope not). The worst case leads to the recurrence $T(n) = \max_{0 \le q \le n-1} (T(q) + T(n-q-1)) + \Theta(n)$, where we remember that the pivot does not appear at the next level (down) of the recursion.

15 QUICKSORT: Performance – Rigorous Worst Case Analysis
We have to come up with a “guess”, and the basis for the guess is our likely “bad split” case: it tells us we cannot hope for a worst case any better than quadratic, i.e. $\Omega(n^2)$. So we just hope it is no worse… Guess $T(n) \le cn^2$ for some $c > 0$ and start doing algebra for the induction:
$T(n) \le \max_{0 \le q \le n-1} (T(q) + T(n-q-1)) + \Theta(n) \le \max_{0 \le q \le n-1} (cq^2 + c(n-q-1)^2) + \Theta(n)$.
Differentiate $cq^2 + c(n-q-1)^2$ twice with respect to q, to obtain $4c > 0$ for all values of q.

16 QUICKSORT: Performance – Rigorous Worst Case Analysis
Since the expression represents a quadratic curve, concave up, it reaches its maximum over 0 ≤ q ≤ n – 1 at one of the endpoints q = 0 and q = n – 1. As we evaluate, we find
$\max_{0 \le q \le n-1} (cq^2 + c(n-q-1)^2) + \Theta(n) = c \max_{0 \le q \le n-1} (q^2 + (n-q-1)^2) + \Theta(n) \le c(n-1)^2 + \Theta(n) = cn^2 - 2cn + c + \Theta(n) \le cn^2$
by choosing c large enough that the term $c(2n-1)$ dominates the $\Theta(n)$ term.

17 QUICKSORT: Performance – Expected RunTime
Understanding partitioning.
1. Each time PARTITION is called, it selects a pivot element, and this pivot element is never included in successive calls: the total number of calls to PARTITION is at most n.
2. Each call to PARTITION costs O(1) plus an amount of time proportional to the number of iterations of the for loop.
3. Each iteration of the for loop performs a comparison (in line 4), comparing the pivot to another element in A.
4. We need to count the number of times l. 4 is executed.

18 QUICKSORT: Performance – Expected RunTime
Lemma 7.1. Let X be the number of comparisons performed in l. 4 of PARTITION over the entire execution of QUICKSORT on an n-element array. Then the running time of QUICKSORT is O(n + X).
Proof: the observations on the previous slide.
We need to find X, the total number of comparisons performed over all calls to PARTITION.
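To make X concrete, the sketch below counts the pivot comparisons directly. It is an illustration under assumptions: it uses a deterministic last-element pivot, and an explicit stack instead of recursion so that badly unbalanced inputs do not hit Python's recursion limit. The name `quicksort_count` is made up for this example.

```python
def quicksort_count(A):
    """Sort A in place with last-element pivots; return X, the number of pivot comparisons."""
    comparisons = 0
    stack = [(0, len(A) - 1)]          # pending subarrays A[p..r]
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue                   # zero or one element: nothing to do
        x, i = A[r], p - 1
        for j in range(p, r):
            comparisons += 1           # the comparison of A[j] with the pivot
            if A[j] <= x:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[r] = A[r], A[i + 1]
        stack.append((p, i))           # left part, pivot at i + 1 excluded
        stack.append((i + 2, r))       # right part
    return comparisons


n = 200
print(quicksort_count(list(range(n))), n * (n - 1) // 2)   # 19900 19900
```

On already-sorted input every split is a “bad split”, so X reaches n(n-1)/2, which matches the quadratic worst case.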

19 QUICKSORT: Performance – Expected RunTime
1. Rename the elements of A as $z_1, z_2, \ldots, z_n$, so that $z_i$ is the i-th smallest element of A.
2. Define the set $Z_{ij} = \{z_i, z_{i+1}, \ldots, z_j\}$.
3. Question: how many times does the algorithm compare $z_i$ and $z_j$?
4. Answer: at most once – notice that all elements in every (sub)array are compared to the pivot once, and will never be compared to the pivot again (since the pivot is removed from the recursion).
5. Define $X_{ij} = I\{z_i \text{ is compared to } z_j\}$, the indicator variable of this event. Comparisons are over the full run of the algorithm.

20 QUICKSORT: Performance – Expected RunTime
6. Since each pair is compared at most once, we can write $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$.
7. Taking expectations of both sides: $E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$.
8. We need to compute $\Pr\{z_i \text{ is compared to } z_j\}$.
9. We will assume all the $z_i$ are distinct.
10. For any pair $z_i$, $z_j$, once a pivot x is chosen so that $z_i < x < z_j$, $z_i$ and $z_j$ can never be compared from then on (why?).

21 QUICKSORT: Performance – Expected RunTime
11. If $z_i$ is chosen as a pivot before any other item in $Z_{ij}$, then $z_i$ will be compared to every other item in $Z_{ij}$.
12. Same for $z_j$.
13. $z_i$ and $z_j$ are compared if and only if the first element to be chosen as a pivot from $Z_{ij}$ is either $z_i$ or $z_j$.
14. What is that probability? Until an element of $Z_{ij}$ is chosen as a pivot, the whole of $Z_{ij}$ is in the same partition, so every element of $Z_{ij}$ is equally likely to be the first one chosen as a pivot.

22 QUICKSORT: Performance – Expected RunTime
15. Because $Z_{ij}$ has j – i + 1 elements, and because pivots are chosen randomly and independently, the probability that any given element is the first one chosen as a pivot is 1/(j-i+1). It follows that:
16. $\Pr\{z_i \text{ is compared to } z_j\} = \Pr\{z_i \text{ or } z_j \text{ is first pivot chosen from } Z_{ij}\} = \Pr\{z_i \text{ is first pivot chosen from } Z_{ij}\} + \Pr\{z_j \text{ is first pivot chosen from } Z_{ij}\} = 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1)$, where the probabilities add because the two events are mutually exclusive.
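The probability 2/(j-i+1) can be sanity-checked numerically. The sketch below is not part of the slides: it sums 2/(j-i+1) over all pairs with exact fractions and compares the result with an independent computation, the expected-comparison recurrence for a uniformly random pivot on distinct keys (n - 1 comparisons, then two subproblems of sizes k and n-1-k, each k equally likely). Both function names are made up for this example.

```python
from fractions import Fraction
from functools import lru_cache


def expected_comparisons(n):
    """Sum of 2/(j - i + 1) over all pairs 1 <= i < j <= n, as an exact fraction."""
    return sum(Fraction(2, j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))


@lru_cache(maxsize=None)
def expected_by_recurrence(n):
    """Expected comparisons when the pivot rank is uniform over the n distinct keys."""
    if n <= 1:
        return Fraction(0)
    split_cost = sum(expected_by_recurrence(k) + expected_by_recurrence(n - 1 - k)
                     for k in range(n))
    return (n - 1) + split_cost / n


print(expected_comparisons(3), expected_by_recurrence(3))   # 8/3 8/3
```

For n = 3, the first partition always makes 2 comparisons, and one more is needed unless the middle element is the pivot (probability 1/3), giving 2 + 2/3 = 8/3.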

23 QUICKSORT: Performance – Expected RunTime
17. Replacing the right-hand side in 7 and grinding through some algebra gives $E[X] = O(n \lg n)$, and the result follows.
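The algebra for step 17 is not in this transcript. One way to carry it out, offered as a sketch: start from $E[X]$ as the sum over all pairs $i < j$ of $\Pr\{z_i \text{ is compared to } z_j\} = 2/(j-i+1)$ from step 16, substitute $k = j - i$, and use the harmonic-series bound $\sum_{k=1}^{n} 1/k = O(\lg n)$.

```latex
\begin{align*}
E[X] &= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} \\
     &= \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1} \qquad (k = j - i) \\
     &< \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} \\
     &= \sum_{i=1}^{n-1} O(\lg n) \\
     &= O(n \lg n).
\end{align*}
```

Combined with Lemma 7.1, the expected running time is $O(n + n \lg n) = O(n \lg n)$.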

