Data Structures & Algorithms

QuickSort

QuickSort: one of the top 20 algorithms
Invented by C. A. R. Hoare in 1960. Widely implemented and widely studied.
Desirable features:
  Works in place (needs only a small auxiliary stack)
  N lg N time on average
  Very short inner loop
Downsides:
  Not stable
  N^2 time in the worst case
  Susceptible to bad implementation

QuickSort: a divide-and-conquer method
Partition the array into two parts, small elements and large elements, then sort each part independently.
Partitioning is the central component. After partitioning around position i:
  Element a[i] is in its final place
  Element a[j] <= a[i] for all j < i
  Element a[j] >= a[i] for all j > i
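The three partitioning conditions can be checked mechanically. The sketch below is only an illustration: isPartitioned is a hypothetical helper name, and int keys in a std::vector are assumed in place of the slides' Item arrays.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the slides): returns true when position i
// satisfies the partitioning conditions on the subarray a[l..r].
bool isPartitioned(const std::vector<int>& a, int l, int r, int i) {
    assert(l <= i && i <= r);
    for (int j = l; j < i; ++j)
        if (a[j] > a[i]) return false;   // everything to the left must be <= a[i]
    for (int j = i + 1; j <= r; ++j)
        if (a[j] < a[i]) return false;   // everything to the right must be >= a[i]
    return true;                         // hence a[i] is already in its final place
}
```

For example, in {2, 1, 3, 5, 4} position 2 satisfies the conditions, while position 1 does not because the 2 to its left is larger than the 1.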

QuickSort: the recursive skeleton
Partition the array, then sort each part independently.
  void quickSort(Item a[], int l, int r)
  {
    if (r <= l) return;              // zero or one element: nothing to do
    int i = partition(a, l, r);      // a[i] is now in its final place
    quickSort(a, l, i-1);            // sort the small elements
    quickSort(a, i+1, r);            // sort the large elements
  }

QuickSort: the partition procedure is crucial to success.
If the two parts are about equal in size, quickSort runs in N lg N time and is fast.
First cut: pick a[r] to be the pivot element, so that quickSort(a, l, r) partitions around a[r] and then recurses on a[l..i-1] and a[i+1..r].

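QuickSort: the partition procedure, with a[r] as the pivot element.
  int partition(Item a[], int l, int r)
  {
    int i = l-1, j = r;
    Item v = a[r];                   // pivot
    for (;;) {
      while (a[++i] < v) ;           // scan from the left for an element >= v
      while (a[--j] > v)             // scan from the right for an element <= v
        if (j == l) break;           // do not run off the left end
      if (i >= j) break;             // scans have crossed
      exch(a[i], a[j]);
    }
    exch(a[i], a[r]);                // put the pivot into its final place
    return i;
  }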
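To try the two routines together, here is a minimal self-contained sketch. It assumes Item is char and that exch is a plain swap (the slides leave both unspecified), and it uses std::vector instead of a raw array so that bounds can be asserted.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Item = char;  // assumption: the slides do not say what Item is

// exch as used on the slides, assumed here to be a plain swap.
void exch(Item& x, Item& y) { std::swap(x, y); }

// Partition a[l..r] around the pivot a[r]; requires at least two elements.
int partition(std::vector<Item>& a, int l, int r) {
    assert(0 <= l && l < r && r < static_cast<int>(a.size()));
    int i = l - 1, j = r;
    Item v = a[r];
    for (;;) {
        while (a[++i] < v) { }                  // stops at a[r] at the latest
        while (v < a[--j]) if (j == l) break;   // stops at a[l] at the latest
        if (i >= j) break;
        exch(a[i], a[j]);
    }
    exch(a[i], a[r]);
    return i;
}

void quickSort(std::vector<Item>& a, int l, int r) {
    if (r <= l) return;
    int i = partition(a, l, r);
    quickSort(a, l, i - 1);
    quickSort(a, i + 1, r);
}
```

On the letters A L G O R I T H M, the first call partition(a, 0, 8) puts M at index 5 and returns 5, which agrees with the partitioning trace on the slides.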

QuickSort: partitioning trace on A L G O R I T H M (the pivot is the rightmost element of each subarray)
  A L G O R I T H M    partition a[0..8], pivot M
  A L G H R I T O M    exchange O and H
  A L G H I R T O M    exchange R and I
  A L G H I M T O R    pivot M into place, return 5
  A H G L I M T O R    partition a[0..4], pivot I: exchange L and H
  A H G I L M T O R    pivot I into place, return 3
  A G H I L M T O R    partition a[0..2], pivot G: pivot into place, return 1
  A G H I L M O T R    partition a[6..8], pivot R: exchange T and O
  A G H I L M O R T    pivot R into place; the array is sorted

QuickSort Property 7.1: QuickSort uses about N^2/2 comparisons in the worst case.
Proof: On a sorted file of length N, QuickSort makes partitions of size N-1 and 0, because the pivot is already in place. Partitioning a file of length N takes about N comparisons, so the total is N + (N-1) + (N-2) + ... + 1 = (N+1)N/2.
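The worst-case count can be observed directly by instrumenting the comparison. The sketch below assumes int keys and the a[r]-pivot partition from the slides; the global counter and the names lessThan and comparisonsOnSortedInput are hypothetical, added only for this experiment.

```cpp
#include <cassert>
#include <utility>
#include <vector>

long long comparisons = 0;  // global counter, for illustration only

// Counting stand-in for the '<' key comparison.
bool lessThan(int x, int y) { ++comparisons; return x < y; }

int partition(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (lessThan(a[++i], v)) { }
        while (lessThan(v, a[--j])) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

void quickSort(std::vector<int>& a, int l, int r) {
    if (r <= l) return;
    int i = partition(a, l, r);
    quickSort(a, l, i - 1);
    quickSort(a, i + 1, r);
}

// Sorts the already-sorted keys 0..n-1 and returns the number of comparisons.
long long comparisonsOnSortedInput(int n) {
    assert(n >= 0);
    std::vector<int> a(n);
    for (int k = 0; k < n; ++k) a[k] = k;
    comparisons = 0;
    quickSort(a, 0, n - 1);
    return comparisons;
}
```

For n = 1000 the count lands slightly above N^2/2 = 500000, because this particular partition also compares the pivot with itself and makes one extra comparison from the right on every call. Recursion depth is N on this input, so keep n modest when experimenting.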

QuickSort Property 7.2: QuickSort uses about 2N ln N comparisons in the average case.
Proof: On a random file, the left partition is equally likely to have any size k between 0 and N-1, with the right partition of size N-k-1. The expected cost is then
  $C(N) = N + 1 + \frac{1}{N} \sum_{k=0}^{N-1} \bigl( C(k) + C(N-k-1) \bigr)$ for $N > 1$, with $C(0) = C(1) = 0$.
Each C(k) is counted twice (once on the left, once on the right), so
  $C(N) = N + 1 + \frac{2}{N} \sum_{k=0}^{N-1} C(k)$.
Multiplying by N and subtracting the same identity for N-1 (telescoping), we get, for N > 2,
  $N\,C(N) = (N+1)\,C(N-1) + 2N$.
Dividing by N(N+1), we get
  $\frac{C(N)}{N+1} = \frac{C(N-1)}{N} + \frac{2}{N+1} = \frac{C(2)}{3} + \sum_{k=3}^{N} \frac{2}{k+1}$,
which is about 2 ln N using an integral approximation. Hence C(N) is about 2N ln N.
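The recurrence is easy to evaluate numerically, which gives a sanity check on the approximation: since C(N)/(N+1) is about 2 ln N, C(N) itself grows like 2N ln N. The sketch below uses a hypothetical function name, averageComparisons, and the folded form of the recurrence with a running prefix sum.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Evaluates C(n) from C(m) = m + 1 + (2/m) * (C(0) + ... + C(m-1)),
// with C(0) = C(1) = 0.
double averageComparisons(int n) {
    assert(n >= 0);
    std::vector<double> c(n + 1, 0.0);
    double prefix = 0.0;                 // running sum C(0) + ... + C(m-1)
    for (int m = 2; m <= n; ++m) {
        prefix += c[m - 1];
        c[m] = m + 1 + 2.0 * prefix / m;
    }
    return c[n];
}

// The leading-term estimate 2 N ln N, for comparison (requires n >= 1).
double leadingTerm(int n) {
    assert(n >= 1);
    return 2.0 * n * std::log(static_cast<double>(n));
}
```

The small cases can be checked by hand: C(2) = 3 and C(3) = 6. For large N the ratio averageComparisons(N) / leadingTerm(N) approaches 1 slowly from below, because the lower-order terms of the exact solution are negative.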

QuickSort: stack size can be an issue with any recursive algorithm.
QuickSort has a maximum stack size proportional to lg N for random files, but for degenerate cases it can be proportional to N. (Which inputs are the degenerate ones?)
The policy of putting the larger partition on the stack, and sorting the smaller one first, keeps the stack small: at most about lg N entries, even in the worst case.
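The larger-partition-on-the-stack policy is easiest to see in a nonrecursive version. This is a sketch under assumptions: int keys, the a[r]-pivot partition from the slides, and a hypothetical function name, quickSortExplicitStack, that returns the largest stack size it observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

int partition(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) { }
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Sorts a and returns the maximum number of subarrays waiting on the stack.
std::size_t quickSortExplicitStack(std::vector<int>& a) {
    std::stack<std::pair<int, int>> pending;   // deferred (l, r) subarrays
    std::size_t maxDepth = 0;
    int l = 0, r = static_cast<int>(a.size()) - 1;
    for (;;) {
        while (r > l) {
            int i = partition(a, l, r);
            if (i - l > r - i) {               // left part is larger: defer it
                pending.push({l, i - 1});
                l = i + 1;
            } else {                           // right part is larger (or equal)
                pending.push({i + 1, r});
                r = i - 1;
            }
            maxDepth = std::max(maxDepth, pending.size());
        }
        if (pending.empty()) break;
        l = pending.top().first;
        r = pending.top().second;
        pending.pop();
    }
    return maxDepth;
}
```

Because the subarray that keeps being worked on is always the smaller half, its size at least halves with every push, so the stack never holds more than about lg N entries. On an already sorted file, where the recursive version goes N levels deep, this version never has more than one entry waiting.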

QuickSort: small subarrays are another issue. This is generally the case with recursive algorithms: they often generate many small subproblems, so efficiency on small cases affects the overall running time. For QuickSort, the approach is to use InsertionSort for small subarrays. How small? A typical cutoff is fewer than 10 elements or so.
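A minimal sketch of the cutoff idea follows. The constant CUTOFF = 10 is an assumption taken from the "less than 10 or so" remark, int keys are assumed, and the partition is the a[r]-pivot version from the slides.

```cpp
#include <cassert>
#include <utility>
#include <vector>

const int CUTOFF = 10;  // assumption: subarrays shorter than this use insertion sort

int partition(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) { }
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Straight insertion sort on a[l..r]; does nothing when the range is empty.
void insertionSort(std::vector<int>& a, int l, int r) {
    for (int i = l + 1; i <= r; ++i) {
        int v = a[i];
        int j = i;
        while (j > l && v < a[j - 1]) { a[j] = a[j - 1]; --j; }
        a[j] = v;
    }
}

void quickSort(std::vector<int>& a, int l, int r) {
    if (r - l + 1 < CUTOFF) {        // small subarray: hand over to insertion sort
        insertionSort(a, l, r);
        return;
    }
    int i = partition(a, l, r);
    quickSort(a, l, i - 1);
    quickSort(a, i + 1, r);
}
```

The best cutoff depends on the machine and the key type, so treat the value as a tuning parameter rather than a fixed rule.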

QuickSort: the most critical issue is partitioning the input array into more or less equal parts. For QuickSort, the approach is to use median-of-three partitioning. The idea is that the array is sampled to find a good pivot: one random sample is probably good, but the median of three samples is usually much better. Why not 5? Or 10? Or 1000? Sampling also costs time, so strike a balance; three turns out to work pretty well.
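One simple way to realize median-of-three is sketched below. It samples the left, middle, and right elements (a common choice, assumed here since the slides do not say which three elements are sampled), moves their median into a[r], and then reuses the a[r]-pivot partition. The helper name compExch is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Compare-exchange: afterwards x <= y.
void compExch(int& x, int& y) { if (y < x) std::swap(x, y); }

int partition(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) { }
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

void quickSort(std::vector<int>& a, int l, int r) {
    if (r <= l) return;
    int mid = l + (r - l) / 2;
    compExch(a[l], a[mid]);
    compExch(a[l], a[r]);
    compExch(a[mid], a[r]);          // now a[l] <= a[mid] <= a[r]
    std::swap(a[mid], a[r]);         // the median of the three becomes the pivot
    int i = partition(a, l, r);
    quickSort(a, l, i - 1);
    quickSort(a, i + 1, r);
}
```

With this sampling, an already sorted file is split near the middle at every level instead of degenerating into partitions of size N-1 and 0.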

QuickSort: duplicate keys can also be pesky!
QuickSort does not do badly on duplicates; it just does not recognize the case where all the elements in a subarray are equal. Hence, there is room for improvement. The goal is to partition the array into three parts: less than the pivot, equal to the pivot, and greater than the pivot (that is, expand the middle part of size 1 that was the pivot to include all the elements equal to the pivot).

QuickSort: the approach by Bentley and McIlroy (1993) keeps keys equal to the pivot that are found in the left subarray on the far left, and those found in the right subarray on the far right, until the location for the pivot is found. Then the equal keys are swapped from the two ends into the middle, so that they sit next to the pivot.
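A sketch of three-way partitioning in this style is given below. It follows the description above (equal keys parked at both ends, then swapped in next to the pivot) but is an illustration with int keys and a hypothetical function name, not a transcription of the authors' code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

void quickSort3Way(std::vector<int>& a, int l, int r) {
    if (r <= l) return;
    assert(0 <= l && r < static_cast<int>(a.size()));
    int v = a[r];                    // pivot
    int i = l - 1, j = r;            // scanning indices, as in ordinary partitioning
    int p = l - 1, q = r;            // a[l..p] and a[q..r-1] hold keys equal to v
    for (;;) {
        while (a[++i] < v) { }
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
        if (a[i] == v) { ++p; std::swap(a[p], a[i]); }   // park on the far left
        if (a[j] == v) { --q; std::swap(a[q], a[j]); }   // park on the far right
    }
    std::swap(a[i], a[r]);           // pivot into its final position
    j = i - 1;
    i = i + 1;
    for (int k = l; k <= p; ++k, --j) std::swap(a[k], a[j]);       // left equals to the middle
    for (int k = r - 1; k >= q; --k, ++i) std::swap(a[k], a[i]);   // right equals to the middle
    quickSort3Way(a, l, j);          // keys less than the pivot
    quickSort3Way(a, i, r);          // keys greater than the pivot
}
```

When every key in a subarray is equal, all of them end up in the middle part and both recursive calls receive empty ranges, so no further work is done on that subarray.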

[Figure: QuickSort, approach by Bentley and McIlroy (1993). Legend: key same as pivot; key distinct from pivot; item in final position.]

Medians and Selection
A common problem is to find the median. This can be done easily by sorting, but that is expensive. Finding the median is a special case of the selection problem: pick the kth smallest element in a set.
Since we must examine every element anyway (the k-1 that are smaller and the N-k that are larger), it is not much harder to return all the k smallest elements of a file.

Medians and Selection: faster ways to do selection
  SelectionSort, but stop after the first k elements are sorted. Takes time proportional to kN, which is OK for small k.
  Other methods run in time N log k (later...)
  QuickSort can do it in linear time on average for all values of k!
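The first option, selection sort stopped after k passes, can be sketched as follows. The function name selectSmallest is hypothetical and int keys are assumed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// After the call, a[0..k-1] holds the k smallest keys in ascending order.
// Each of the k passes scans the rest of the array, so the time is about k*N.
void selectSmallest(std::vector<int>& a, int k) {
    int n = static_cast<int>(a.size());
    assert(0 <= k && k <= n);
    for (int i = 0; i < k; ++i) {
        int min = i;
        for (int j = i + 1; j < n; ++j)
            if (a[j] < a[min]) min = j;
        std::swap(a[i], a[min]);     // the (i+1)th smallest key is now at a[i]
    }
}
```

For example, on {5, 3, 8, 1, 9, 2} with k = 3 the first three positions become 1, 2, 3, and a[k-1] is the kth smallest key.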

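Medians and Selection: selection by partitioning
Here k is an index into a[] (the element that belongs at a[k] is wanted), so it is passed down unchanged.
  void select(Item a[], int l, int r, int k)
  {
    if (r <= l) return;
    int i = partition(a, l, r);            // a[i] is in its final place
    if (i > k) select(a, l, i-1, k);       // a[k] lies in the left part
    if (i < k) select(a, i+1, r, k);       // a[k] lies in the right part
  }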
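A self-contained sketch of partition-based selection is shown below. It assumes int keys and treats k as a zero-based index into the array, which is why k is passed to the recursive calls unchanged; the routine is named quickSelect here (a hypothetical rename) to avoid clashing with the POSIX select function.

```cpp
#include <cassert>
#include <utility>
#include <vector>

int partition(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) { }
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Rearranges a[l..r] so that a[k] holds the key that would be there if the
// range were sorted, with smaller keys to its left and larger to its right.
void quickSelect(std::vector<int>& a, int l, int r, int k) {
    if (r <= l) return;
    int i = partition(a, l, r);
    if (i > k) quickSelect(a, l, i - 1, k);   // a[k] is in the left part
    if (i < k) quickSelect(a, i + 1, r, k);   // a[k] is in the right part
}
```

To find the median of an array of N keys, call quickSelect(a, 0, N-1, N/2) and read a[N/2]. Only one side is ever recursed into, which is where the linear average time comes from.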

[Figure: Selection by repeated partitioning. Panel labels: Select k; Select k; Select k-i2; Select k-i2; Select k-i2-i4.]

Medians and Selection Property 7.4: QuickSort-based selection runs in linear time on average. The full proof is complicated, but roughly, each partitioning step should divide a large array about in half, and only one half is processed further, so the total time is about N + N/2 + N/4 + ... = 2N.

Summary
Covered the QuickSort algorithm:
  analysis (best, worst, average cases)
  pivot selection
  small arrays
  duplicate keys
Covered the selection problem:
  QuickSort approach to the solution
