How fast can we sort?
  Insertion Sort: O(n^2)
  Merge Sort, Quick Sort (expected), Heap Sort: O(n lg n)
  Can we sort faster than O(n lg n)?
Comparison-Based Sort
  Can prove that any comparison-based sort MUST make Ω(n lg n) comparisons in the worst case.
  Comparison-based sort: the algorithm directly compares elements of the array to one another and makes its decisions based on <, =, >.
  This means that heapsort and mergesort are asymptotically optimal among comparison-based sorts.
  Not always possible to prove that your algorithm is the best possible!
  Proof idea:
    Use a decision tree to represent the comparisons of a sorting algorithm.
    The actual sorting algorithm maps to the decision tree.
Decision Tree Example
  Three elements to sort: a1, a2, a3
  Decision tree must have at least n! leaves, one for each permutation of the input (here 3! = 6)
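The three-element case can be written out as explicit nested comparisons. This is a minimal sketch, not the tree from the original slide: the function name `sort3` and the recorded `path` are illustrative assumptions. Each distinct path of comparison outcomes corresponds to one leaf of the decision tree.

```python
from itertools import permutations

def sort3(a1, a2, a3):
    """Sort three values with explicit comparisons.

    Returns (sorted tuple, path), where path records the outcome of each
    comparison made, i.e. the route taken through the decision tree.
    """
    path = []

    def less_eq(x, y, label):
        result = x <= y
        path.append((label, result))
        return result

    if less_eq(a1, a2, "a1<=a2"):
        if less_eq(a2, a3, "a2<=a3"):
            out = (a1, a2, a3)
        elif less_eq(a1, a3, "a1<=a3"):
            out = (a1, a3, a2)
        else:
            out = (a3, a1, a2)
    else:
        if less_eq(a1, a3, "a1<=a3"):
            out = (a2, a1, a3)
        elif less_eq(a2, a3, "a2<=a3"):
            out = (a2, a3, a1)
        else:
            out = (a3, a2, a1)
    return out, tuple(path)

# One leaf (distinct comparison path) per permutation of the input.
leaves = {sort3(*p)[1] for p in permutations((1, 2, 3))}
print(len(leaves), max(len(path) for path in leaves))
```

For distinct inputs the six permutations reach six different leaves, and the longest path uses three comparisons.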
Decision Tree
  For any comparison-based sorting algorithm:
    One tree exists for each input length n
    The algorithm “splits” at each decision/comparison, unwinding the actual execution into a tree path
    The tree covers all possible execution traces for inputs of length n
  Height of the tree is the number of comparisons the algorithm makes in the worst case
  There must be at least n! leaves, one for each permutation
  But we also have a binary tree; a binary tree of height h has no more than $2^h$ leaves
  This means that $n! \le 2^h$
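Height of Tree
  $n! \le 2^h$
  $\lg(n!) \le \lg(2^h) = h$
  Stirling's approximation says: $n! > (n/e)^n$
  Giving us: $h \ge \lg(n!) > \lg\left((n/e)^n\right) = n \lg n - n \lg e$
  This means h is Ω(n lg n)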
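To get a feel for the bound, the quantity ceil(lg(n!)) can be computed directly and compared with n lg n - n lg e. A small sketch; the helper names `min_comparisons` and `stirling_bound` are assumptions made for illustration.

```python
import math

def min_comparisons(n):
    """Decision-tree lower bound on worst-case comparisons: ceil(lg(n!))."""
    return math.ceil(math.log2(math.factorial(n)))

def stirling_bound(n):
    """The weaker closed-form bound n lg n - n lg e, from n! > (n/e)^n."""
    return n * math.log2(n) - n * math.log2(math.e)

for n in (3, 4, 5, 10, 100):
    print(n, min_comparisons(n), round(stirling_bound(n), 2))
```

For n = 3 the bound is 3 comparisons, which matches the height of a three-element decision tree.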
Better than O(n lg n)?
  This means any comparison-based sorting algorithm must take Ω(n lg n) time in the worst case
  But we can do better if we use a non-comparison-based sorting algorithm!
    Count Sort
    Radix Sort
    Bucket Sort
    Postman Sort
Count Sort
  May run in O(n) time
  But requires assumptions about the size and nature of the input
  Input: A[1..n] where A[i] is an integer in the range 1..k
  Output: B[1..n], the input values in sorted order
  Uses: C[1..k] auxiliary storage
  Idea: Using the random access of the array C, count the number of times each input value appears, then collect the values together in order
Count Sort Algorithm
  Count-Sort(A,n)
    for i=1 to k do C[i]=0        ; Initialize to 0
    for j=1 to n do C[A[j]]++     ; Count
    j=1
    for i=1 to k do
      if (C[i]>0) then
        for z=1 to C[i] do
          B[j]=i
          j++
  Ex:
    A=[ ]   ; Initial values
    C=[ ]   ; Initialize C to 0
    C=[ ]   ; After count loop
    B=[ ]   ; After last loop
  Runtime: O(n+k). If k is close to n, then runs in O(n) time.
  A bad example would be a list such as: A[1,2, ]
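The pseudocode above translates almost line for line into Python. This is a sketch under the same assumption (integer values in 1..k); the name `count_sort` and the sample input are illustrative, not from the slides.

```python
def count_sort(a, k):
    """Sort a list of integers in the range 1..k by counting occurrences."""
    counts = [0] * (k + 1)              # counts[v] = occurrences of v; index 0 unused
    for value in a:                     # count loop: O(n)
        counts[value] += 1
    out = []
    for value in range(1, k + 1):       # collect loop: O(n + k)
        out.extend([value] * counts[value])
    return out

print(count_sort([3, 1, 4, 1, 5, 2, 2], 5))
```

Note that the output is rebuilt from the counts alone, so any extra data attached to the input elements would be lost.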
Stable Count Sort
  A sorting algorithm is stable if occurrences of the same value appear in the same order in the output as they do in the input
  Ties are broken by the rule that whichever appeared first in the input appears first in the output
  Desired for various algorithms that sort records by a key (e.g. a zip code)
  Example:
    A[3,5a,9,2,4,5b,6]
    Sorts to: A[2,3,4,5a,5b,6,9], not A[2,3,4,5b,5a,6,9]
  Prior algorithm was not stable, but we can modify it to make it stable
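Stability is easy to observe with Python's built-in `sorted`, which is documented to be stable. In this sketch the (key, tag) tuples are an assumed encoding of the 5a / 5b example; only the key takes part in the sort.

```python
# Each record is (key, tag); the tag distinguishes the two 5s.
records = [(3, ""), (5, "a"), (9, ""), (2, ""), (4, ""), (5, "b"), (6, "")]

# Sort on the key only. Because sorted() is stable, 5a stays ahead of 5b.
by_key = sorted(records, key=lambda r: r[0])
labels = [f"{key}{tag}" for key, tag in by_key]
print(labels)
```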
Stable Count Sort
  Key idea: fetch from the original array to make it stable.
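One common way to realize this idea is the prefix-sum variant of counting sort: turn the counts into final positions, then copy records out of the original array from right to left. This is a sketch of that standard technique and may differ in detail from the algorithm on the original slide; `stable_count_sort` and its `key` parameter are assumed names.

```python
def stable_count_sort(records, k, key=lambda r: r):
    """Stable counting sort of records whose key(r) is an integer in 1..k."""
    counts = [0] * (k + 1)
    for r in records:
        counts[key(r)] += 1
    # Prefix sums: counts[v] becomes the number of records with key <= v.
    for v in range(1, k + 1):
        counts[v] += counts[v - 1]
    out = [None] * len(records)
    # Fetch from the original array, right to left, so equal keys keep their order.
    for r in reversed(records):
        counts[key(r)] -= 1
        out[counts[key(r)]] = r
    return out

records = [(3, ""), (5, "a"), (9, ""), (2, ""), (4, ""), (5, "b"), (6, "")]
print(stable_count_sort(records, 9, key=lambda r: r[0]))
```

Because whole records are moved rather than rebuilt from counts, any attached data travels with its key.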
Radix Sort
  Works like the punch card readers of the early 1900s
  Only works on input items made up of digits
    E.g. numbers, letters
  Idea: Sort on the least significant digit first
  Array A of n elements, each element d digits
  Radix-Sort(A,d,n)
    for i = 1 to d                              ; digit 1 is the least significant
      Stable-Sort(A) with digit i as the key
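A least-significant-digit radix sort for non-negative integers might look like the sketch below. As an assumption made for brevity, the stable per-digit pass distributes values into per-digit lists in arrival order rather than calling a separate Stable-Sort routine; the `base` parameter is also an addition.

```python
def radix_sort(a, d, base=10):
    """Sort non-negative integers of at most d digits, least significant digit first."""
    for i in range(d):
        divisor = base ** i
        # Stable pass on digit i: values keep their arrival order within each digit.
        buckets = [[] for _ in range(base)]
        for value in a:
            buckets[(value // divisor) % base].append(value)
        a = [value for bucket in buckets for value in bucket]
    return a

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
```

If the per-digit pass were not stable, a later pass could undo the ordering established by earlier digits.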
Radix Sort Notes
  Internal sort must be stable to preserve the order of elements.
  Based on the observation that if all the lower-order digits are already sorted, then we only need to sort on the highest-order digit.
  Good choice to use if each digit is small in magnitude
    If k is the maximum value of a digit
    If using stable count sort internally
      Stable-Count-Sort takes Θ(k+n) time for one pass
      With d passes total, runtime is Θ(dk+dn)
      Runtime is Θ(n) if d is a constant and k is O(n)
Bucket Sort
  Bucket sort works similarly to count sort, but uses a “bucket” to hold a range of inputs. This allows it to operate on real numbers.
  Runs in O(n) expected time with the following assumptions:
    Input elements are in the interval [0..1].
      If not, in many cases we can divide by some “max” value to scale the input to be between 0 and 1
    Input elements are evenly distributed
      Uniform probability over [0..1]
Bucket Sort Algorithm
  Idea:
    Divide [0..1] into n equal-sized “buckets”
    Put each of the n inputs into the proper bucket; some buckets may be empty and others may have more than 1 element
    Sort each bucket using insertion sort
    To produce output, go through the buckets in order, listing the elements in each
  Algorithm:
    Bucket-Sort(A,n)
      for i=1 to n do                           ; implicitly make buckets 0..n-1
        Insert A[i] into bucket B[floor(n*A[i])]
      for i=0 to n-1 do
        sort bucket B[i] using insertion sort   ; < O(n) if small bucket
      concatenate buckets B[0], B[1], .. B[n-1] in order
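The algorithm above can be sketched in Python as follows. The `min(..., n - 1)` guard is an assumption added so that an input equal to exactly 1.0 lands in the last bucket instead of indexing past it; the function names and sample data are illustrative.

```python
def insertion_sort(items):
    """In-place insertion sort; fast when the list is very short."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current

def bucket_sort(a):
    """Sort real numbers in [0, 1] using n equal-width buckets."""
    n = len(a)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]            # buckets 0..n-1
    for x in a:
        index = min(int(n * x), n - 1)          # guard (assumption): x == 1.0 goes to the last bucket
        buckets[index].append(x)
    out = []
    for bucket in buckets:                      # visit buckets in order
        insertion_sort(bucket)
        out.extend(bucket)
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```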
Bucket Sort Notes
  Expected runtime:
    If every element in A comes from [0..1] with equal probability, then the probability that an element e is in bucket B[i] is 1/n
    Expect on average each bucket to have a single element
    Call to insertion sort for each bucket will be extremely fast; essentially constant time
    Rest of the algorithm runs in O(n)
  Overall expected runtime O(n) if the uniform probability assumption holds
  Worst case O(n^2) if everything ends up in one bucket
Postman Sort
  Algorithm proposed by Robert Ramey, August 1992 issue of C Programming Journal
  Analysis left as a homework exercise
  From the article:
    "When a postal clerk receives a huge bag of letters he distributes them into other bags by state. Each bag gets sent to the indicated state. Upon arrival, another clerk distributes the letters in his bag into other bags by city. So the process continues until the bags are the size one man can carry and deliver. This is the basis for my sorting method which I call the postman's sort."
Postman Sort Idea
  Given a large list of records to sort alphabetically
  Make one pass, reading each record into one of 26 lists depending on the first letter in the field
    First list contains all records starting with “A”
    Second list contains all records starting with “B”, etc.
  Recurse on each of the 26 sublists if it contains more than one element, but this time using the next letter as the field to split up the sublists
  In-order traversal of the resulting tree gives a sorted list
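The idea above can be sketched recursively in Python. This is an illustration only, not Ramey's published implementation: it assumes lowercase a-z words, and it places any word that has run out of letters ahead of longer words sharing its prefix, a rule the slides do not spell out.

```python
def postman_sort(words, pos=0):
    """Distribute lowercase words into 26 lists by the letter at index pos, then recurse."""
    if len(words) <= 1:
        return list(words)
    finished = []                           # words with no letter at pos (assumed to sort first)
    bins = [[] for _ in range(26)]          # one list per letter a..z
    for word in words:
        if len(word) <= pos:
            finished.append(word)
        else:
            bins[ord(word[pos]) - ord("a")].append(word)
    out = finished
    for sublist in bins:                    # in-order traversal of the sublists
        out.extend(postman_sort(sublist, pos + 1))
    return out

print(postman_sort(["bob", "bill", "boy", "cow", "dog"]))
```

No two words are ever compared with each other; ordering comes entirely from which list each word is dropped into.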
Postman Example
  A = [bob, bill, boy, cow, dog]
    b: A = [bob, bill, boy]
      i: A = [bill]
      o: A = [bob, boy]
        b: A = [bob]
        y: A = [boy]
        Empty for all other letters
      Empty for all other letters
    c: A = [cow]
    d: A = [dog]
    Empty for all other letters
  Runtime? Left as a homework exercise