Sorting
Popular algorithms: Selection sort*, Insertion sort*, Bubble sort*, Quick sort*, Comb sort, Shell sort, Heap sort*, Merge sort*, Counting sort, Radix sort, Bucket sort, Tim sort
Many algorithms for sorting in parallel also exist.
Selection Sort Sorting Algorithm #1: Selection sort Very easy to understand and implement. Not very efficient.
Selection Sort

// Selection sort in Java
public static void sort(int[] a) {
    int minPos, temp;
    for (int i = 0; i <= a.length - 2; i++) {
        // Find the position of the value that belongs in position i
        minPos = i;
        for (int j = i + 1; j <= a.length - 1; j++)
            if (a[j] < a[minPos])
                minPos = j;
        // Swap the values in positions i and minPos
        temp = a[i];
        a[i] = a[minPos];
        a[minPos] = temp;
    }
}

Running time? Θ(n^2) in the best, worst, and average cases
Insertion Sort Sorting Algorithm #2: Insertion sort Also very easy to understand and implement. Also not very efficient.
Insertion Sort
Initial version:

public static void insertionSort(int[] a) {
    int j, temp;
    for (int i = 1; i <= a.length - 1; i++) {
        j = i;
        // Always walks all the way down to position 0, swapping where needed
        while (j >= 1) {
            if (a[j] < a[j-1]) {
                temp = a[j-1];
                a[j-1] = a[j];
                a[j] = temp;
            }
            j = j - 1;
        }
    }
}

Running time? Θ(n^2) in the best, worst, and average cases
Analysis is the same as for Selection sort
Insertion Sort
Second version:

// This one stops the inner loop as soon as a[j] is in place
public static void insertionSort(int[] a) {
    int j, temp;
    for (int i = 1; i <= a.length - 1; i++) {
        j = i;
        while ((j >= 1) && (a[j] < a[j-1])) {
            temp = a[j-1];
            a[j-1] = a[j];
            a[j] = temp;
            j = j - 1;
        }
    }
}

Running time?
Θ(n^2) in the worst case (list in reverse order)
Θ(n) in the best case (list already sorted)
Insertion Sort
More technically, assuming the list is already sorted: on the ith iteration of the outer loop, as i goes from 1 to a.length-1, the inner loop condition is tested exactly once and fails, so the loop body never executes. This gives a total of n-1 tests of the inner loop condition, which is Θ(n). Again, note that we only counted the work done by the inner loop.
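The best-case and worst-case counts can be checked by instrumenting the second version. The following is a minimal sketch, not part of the original slides: the class name InsertionSortCount and the method sortAndCount are hypothetical, and the counter records every evaluation of the inner loop condition.

```java
class InsertionSortCount {
    // Sorts a in place (second version of insertion sort) and returns
    // how many times the inner loop condition was evaluated.
    static long sortAndCount(int[] a) {
        long tests = 0;
        for (int i = 1; i <= a.length - 1; i++) {
            int j = i;
            while (true) {
                tests++; // one evaluation of (j >= 1) && (a[j] < a[j-1])
                if (!(j >= 1 && a[j] < a[j - 1])) {
                    break;
                }
                int temp = a[j - 1];
                a[j - 1] = a[j];
                a[j] = temp;
                j = j - 1;
            }
        }
        return tests;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println("sorted:   " + sortAndCount(sorted));
        System.out.println("reversed: " + sortAndCount(reversed));
    }
}
```

For a sorted list of n = 8 values the count is one test per outer iteration (n - 1 = 7); for the reversed list it grows quadratically, since the ith outer iteration needs i + 1 tests.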
Insertion Sort
Third version:

// Another slight improvement in efficiency:
// shift values up instead of swapping, then drop v into place once
public static void insertionSort(int[] a) {
    int j, v;
    for (int i = 1; i <= a.length - 1; i++) {
        j = i;
        v = a[j];
        while ((j >= 1) && (v < a[j-1])) {
            a[j] = a[j-1];
            j = j - 1;
        }
        a[j] = v;
    }
}

Running time?
Θ(n^2) in the worst case (list in reverse order)
Θ(n) in the best case (list already sorted)
Bubble Sort Sorting Algorithm #3: Bubble sort Also very easy to understand and implement. Also not very efficient. Several minor variations and enhancements are possible.
Bubble Sort
Initial version:

public static void bubbleSort1(int[] a) {
    int temp;
    for (int i = 1; i <= a.length - 1; i++) {
        // After pass i, the i largest values are in their final positions
        for (int j = 0; j < a.length - i; j++) {
            if (a[j] > a[j+1]) {
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
            }
        }
    }
}

Running time? Θ(n^2) in the best, worst, and average cases
Analysis is the same as for Selection sort
Bubble Sort
Second version: (fewer bubbles)

// This version stops when a pass occurs with no swaps.
public static void bubbleSort2(int[] a) {
    int i, temp;
    boolean doMore;
    i = 1;
    doMore = true;
    while ((i <= a.length - 1) && (doMore)) {
        doMore = false;
        for (int j = 0; j < a.length - i; j++)
            if (a[j] > a[j+1]) {
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
                doMore = true;  // a swap occurred, so another pass is needed
            }
        i = i + 1;
    }
}

Running time?
Θ(n^2) in the worst case (list in reverse order)
Θ(n) in the best case (list already sorted)
Bubble Sort
More technically, assuming the list is already sorted: on the 1st iteration of the outer loop, the inner loop executes exactly n-1 times and makes no swaps, so the outer loop executes only once. Again, note that we only counted the number of iterations of the inner loop.
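The same kind of check works for the early-exit bubble sort. This is a sketch under stated assumptions: the class name BubbleSortCount and the method sortAndCount are hypothetical, and the flag doMore is set to true whenever a swap occurs, which is what the stopping rule "stop when a pass occurs with no swaps" requires.

```java
class BubbleSortCount {
    // Sorts a in place with early-exit bubble sort and returns the
    // total number of inner loop iterations.
    static long sortAndCount(int[] a) {
        long iterations = 0;
        int i = 1;
        boolean doMore = true;
        while (i <= a.length - 1 && doMore) {
            doMore = false;
            for (int j = 0; j < a.length - i; j++) {
                iterations++;
                if (a[j] > a[j + 1]) {
                    int temp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = temp;
                    doMore = true; // assumed: a swap means another pass is needed
                }
            }
            i = i + 1;
        }
        return iterations;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println("sorted:   " + sortAndCount(sorted));
        System.out.println("reversed: " + sortAndCount(reversed));
    }
}
```

With n = 8, the sorted list costs a single pass of n - 1 = 7 iterations, while the reversed list needs every pass, for 7 + 6 + ... + 1 = 28 iterations.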
Quick Sort
Sorting Algorithm #4: Quick sort
Proposed by C.A.R. Hoare in 1962.
Divide-and-conquer algorithm.
More efficient than selection, insertion, or bubble sort, on average.
Worst case is just as bad: Θ(n^2).
Very practical. The others can be very practical too for the right data, i.e., almost-sorted input.
Quick Sort
Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
Conquer: Recursively sort the two subarrays.
Combine: Trivial.
Key: Linear-time partitioning subroutine.
[Figure: array layout after partitioning: elements ≤ x, then x, then elements ≥ x]
Quick Sort
Partitioning algorithm from the book:

PARTITION(A, p, q)        // operates on A[p . . q]
    x ← A[p]              // pivot = A[p]
    i ← p
    for j ← p + 1 to q
        do if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] ↔ A[j]
    exchange A[p] ↔ A[i]
    return i
Quick Sort
Example of partitioning (pivot x = 6). Each row shows the array after the next exchange; i marks the end of the "≤ x" region and j scans to the right:

6 10 13  5  8  3  2 11
6  5 13 10  8  3  2 11
6  5  3 10  8 13  2 11
6  5  3  2  8 13 10 11
2  5  3  6  8 13 10 11
Quick Sort

QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q-1)
             QUICKSORT(A, q+1, r)

Initial call: QUICKSORT(A, 1, n)
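The pseudocode translates to Java fairly directly. The following is a minimal sketch, assuming 0-based array indexing (so the initial call covers 0 to a.length - 1 rather than 1 to n); the class name QuickSortSketch and the helper swap are hypothetical.

```java
class QuickSortSketch {
    // Partitions a[p..q] around the pivot x = a[p] and returns the
    // final index of the pivot.
    static int partition(int[] a, int p, int q) {
        int x = a[p];
        int i = p; // a[p+1..i] holds the values <= x seen so far
        for (int j = p + 1; j <= q; j++) {
            if (a[j] <= x) {
                i = i + 1;
                swap(a, i, j);
            }
        }
        swap(a, p, i); // move the pivot between the two regions
        return i;
    }

    // Sorts a[p..r] in place.
    static void quickSort(int[] a, int p, int r) {
        if (p < r) {
            int q = partition(a, p, r);
            quickSort(a, p, q - 1);
            quickSort(a, q + 1, r);
        }
    }

    static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        int[] a = {6, 10, 13, 5, 8, 3, 2, 11};
        quickSort(a, 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Calling partition on the example array 6 10 13 5 8 3 2 11 leaves the pivot 6 at index 3, with 2 5 3 to its left and 8 13 10 11 to its right.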
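Quick Sort
What would partition do in the worst case?
3 5 8 10 11 13 21 35
With the input already sorted, the pivot A[p] is the smallest value, so the lower subarray is empty and the upper subarray holds the other n-1 values.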
Quick Sort: Worst-Case Recursion Tree
T(n) = T(0) + T(n-1) + cn
[Figure: recursion tree of height h = n. The root costs cn and has children T(0) = Θ(1) and c(n-1); that node has children Θ(1) and c(n-2); and so on down to a Θ(1) leaf.]
T(n) = Θ(n) + Θ(n^2) = Θ(n^2)
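The two terms of the total can be read off the tree. As a short worked derivation (a sketch, treating the per-level costs as cn, c(n-1), ..., c and counting roughly n leaves of constant cost):

```latex
\begin{aligned}
T(n) &= \sum_{k=1}^{n} c\,k \;+\; \sum_{k=1}^{n} \Theta(1) \\
     &= c\,\frac{n(n+1)}{2} \;+\; \Theta(n) \\
     &= \Theta(n^2)
\end{aligned}
```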
Quick Sort
Best case analysis: If we get lucky, partition splits the array evenly:
T(n) = 2T(n/2) + Θ(n) = Θ(n lg n) (same as Merge sort)
What if the split is 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n) = Θ(n lg n) (left as an exercise; use a recursion tree)
In fact, any split by a constant proportion will lead to Θ(n lg n).
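One way to approach the exercise, given as a sketch rather than a full proof: in the recursion tree for the 1/10 : 9/10 split, every full level costs at most cn. The shortest root-to-leaf path follows the n/10 branch and has about log base 10 of n levels; the longest follows the 9n/10 branch and has about log base 10/9 of n levels. Bounding the total between these two gives:

```latex
\begin{aligned}
c\,n \log_{10} n \;\le\; T(n) \;&\le\; c\,n \log_{10/9} n + O(n) \\
\Rightarrow\quad T(n) &= \Theta(n \lg n)
\end{aligned}
```

Both logarithms differ from lg n only by a constant factor, which is why the base does not affect the Θ bound.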
Quick Sort
What if partitioning alternates between best (L) and worst (U) cases?
L(n) = 2U(n/2) + Θ(n)
U(n) = L(n-1) + Θ(n)
Solution:
L(n) = 2(L(n/2 - 1) + Θ(n/2)) + Θ(n)
     = 2L(n/2 - 1) + Θ(n)
     = Θ(n lg n) (left as an exercise)
Heap Sort
In this context, the term heap has nothing to do with memory organization!
Heap properties:
Forms an almost-complete binary tree, i.e., completely filled on all levels, except possibly the lowest, which is filled from the left up to some point.
The value at any given node is greater than or equal to the values of both its children (max-heap).
The root will have the largest value.
[Figure: example max-heap containing the values 1, 2, 3, 4, 7, 8, 9, 10, 14, 16, with 16 at the root]
Heap Sort
An important operation on heaps is Max-Heapify, which pushes a value down the tree if it violates the heap property.
In such a case, Max-Heapify will swap the value with the larger of its two children and then repeat.
[Figure: the value 5 at the root violates the heap property; it is swapped first with 14, then with 8.]
Heap Sort
Sometimes Max-Heapify will push a value all the way to the leaf level.
[Figure: a small value at the root is swapped down one level at a time until it reaches a leaf.]
Heap Sort
[Figure: the example heap and its array representation, with array positions numbered 1 to 12.]
For the above: A.length = 12, A.heap-size = 10
Heap Sort The Max-Heapify procedure can then be specified as:
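The slide's own pseudocode was not preserved in this text. As a stand-in, here is a minimal Java sketch of the operation as described above, assuming 0-based array storage in which the children of position i sit at 2i+1 and 2i+2. The class name MaxHeapifySketch, the parameter heapSize, and the sample array are assumptions made for illustration.

```java
class MaxHeapifySketch {
    // Pushes the value at index i down the heap stored in a[0..heapSize-1]
    // until neither child is larger. Assumes both subtrees of i are
    // already max-heaps.
    static void maxHeapify(int[] a, int i, int heapSize) {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int largest = i;
        if (left < heapSize && a[left] > a[largest]) {
            largest = left;
        }
        if (right < heapSize && a[right] > a[largest]) {
            largest = right;
        }
        if (largest != i) {
            // Swap with the larger child, then repeat one level down.
            int temp = a[i];
            a[i] = a[largest];
            a[largest] = temp;
            maxHeapify(a, largest, heapSize);
        }
    }

    public static void main(String[] args) {
        // Hypothetical layout: a max-heap except for the 5 at the root.
        int[] a = {5, 14, 10, 8, 7, 9, 3, 2, 4, 1};
        maxHeapify(a, 0, a.length);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

In the sample array the 5 is swapped with 14 and then with 8, after which both of its children (2 and 4) are smaller and the procedure stops.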
Heap Sort
The Heap-Sort algorithm works by:
    Building a heap
    “Removing” the value at the root of the heap
    Replacing the root value with the value in the highest numbered position
    Re-heapifying the array, starting at the root
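The steps above can be sketched in Java as follows. This is an illustrative version, not the slides' own code: it assumes 0-based arrays, "removes" the root by swapping it with the last heap position and shrinking the heap by one, and uses an iterative push-down helper. The names HeapSortSketch and siftDown are hypothetical.

```java
class HeapSortSketch {
    static void heapSort(int[] a) {
        int n = a.length;
        // Build a max-heap: push down every internal node, last one first.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }
        // Move the root (the maximum) to the highest numbered heap
        // position, shrink the heap, and re-heapify from the root.
        for (int end = n - 1; end >= 1; end--) {
            int temp = a[0];
            a[0] = a[end];
            a[end] = temp;
            siftDown(a, 0, end);
        }
    }

    // Pushes a[i] down within the heap a[0..heapSize-1].
    static void siftDown(int[] a, int i, int heapSize) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < heapSize && a[left] > a[largest]) largest = left;
            if (right < heapSize && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            int temp = a[i];
            a[i] = a[largest];
            a[largest] = temp;
            i = largest;
        }
    }

    public static void main(String[] args) {
        int[] a = {2, 4, 1, 7, 8, 9, 3, 14, 10, 16};
        heapSort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Because each removed maximum lands just past the end of the shrinking heap, the array ends up in ascending order with no extra storage.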
Heap Sort
Build-Max-Heap Analysis:
One call to Max-Heapify costs O(lg n) time.
Build-Max-Heap makes O(n) such calls.
Total is O(n lg n) cost.
This gives an upper bound, but one that is not tight.
How to Compare Algorithms in Efficiency
Empirical Analysis:
    Experimentation: wall-clock time, CPU time
    Requires many different inputs
    Can you predict performance before implementing the algorithm?
Theoretical Analysis:
    Approximation by counting important operations
    Mathematical functions based on input size (N)
How Fast/Slow Can It Get? (10 GHz, assume 10^10 operations/sec)

N         N log2 N    N^2            2^N
10        33          100            1,024 (about 10^-7 sec)
100       664         10,000         1.3 x 10^30 (4 x 10^12 years)
1,000     9,966       1,000,000      Forever??
10,000    132,877     100,000,000    Eternity??
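The entries in the table can be reproduced with a few lines of arithmetic. The following is a rough sketch assuming exactly 10^10 operations per second and a year of 365.25 days; the class name GrowthTable and its methods are hypothetical.

```java
class GrowthTable {
    static final double OPS_PER_SEC = 1e10;                 // assumed machine speed
    static final double SECONDS_PER_YEAR = 365.25 * 24 * 3600;

    // N * log2(N), rounded to the nearest whole number of operations.
    static long nLog2N(long n) {
        return Math.round(n * (Math.log(n) / Math.log(2)));
    }

    // Years needed to perform the given number of operations.
    static double yearsFor(double operations) {
        return operations / OPS_PER_SEC / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        long[] sizes = {10, 100, 1_000, 10_000};
        for (long n : sizes) {
            System.out.printf("N=%d  N log2 N=%d  N^2=%d  2^N=%.3g%n",
                    n, nLog2N(n), n * n, Math.pow(2, n));
        }
        System.out.printf("2^100 operations take about %.1e years%n",
                yearsFor(Math.pow(2, 100)));
    }
}
```

For N = 1,000 and above, 2^N no longer fits in a double and prints as Infinity, which is a fair summary of the last two rows.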