CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan


1 CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan davek@cs

2 Outline Sorting: The Problem Space Sorting by Comparison –Lower bound for comparison sorts –Insertion Sort –Heap Sort –Merge Sort –Quick Sort External Sorting Comparison of Sorting by Comparison

3 Sorting: The Problem Space General problem: Given a set of N orderable items, put them in order. Without (significant) loss of generality, assume: –Items are integers –Ordering is ≤ (Most sorting problems map to the above in linear time.)

4 Lower Bound for Sorting by Comparison Sorting by Comparison –Only information available to us is the set of N items to be sorted –Only operation available to us is pairwise comparison between 2 items What is the best running time we can possibly achieve?

5 Decision Tree Analysis of Sorting by Comparison

6 Max depth of decision tree How many permutations are there of N numbers? How many leaves does the tree have? What’s the shallowest tree with a given number of leaves? What is therefore the worst running time (number of comparisons) by the best possible sorting algorithm?
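The questions on this slide can be answered numerically. The sketch below assumes the usual decision-tree argument: N! leaves (one per permutation) and a binary tree of depth d holding at most 2^d leaves. The function name `comparison_lower_bound` is made up for illustration.

```python
import math


def comparison_lower_bound(n: int) -> int:
    """Lower bound on worst-case comparisons needed to sort n distinct items."""
    leaves = math.factorial(n)  # one leaf per possible permutation
    # A binary tree of depth d has at most 2**d leaves, so d >= ceil(log2(leaves)).
    # For a positive integer x, (x - 1).bit_length() equals ceil(log2(x)).
    return (leaves - 1).bit_length()


if __name__ == "__main__":
    for n in (3, 4, 5, 10):
        print(n, comparison_lower_bound(n))
```

This is only a lower bound; it does not claim that an algorithm achieving exactly this many comparisons exists for every N.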

7 Lower Bound for log(n!) Stirling’s approximation:
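The formula on this slide did not survive in the transcript. What follows is the standard textbook form of Stirling's approximation and an elementary bound that reaches the same conclusion; it is offered as a reconstruction, not as the slide's exact wording.

```latex
% Stirling's approximation (standard form)
N! \approx \sqrt{2\pi N}\,\left(\frac{N}{e}\right)^{N}
\quad\Longrightarrow\quad
\log_2 (N!) \approx N\log_2 N - N\log_2 e + \tfrac{1}{2}\log_2(2\pi N) = \Theta(N \log N)

% Elementary bound without Stirling: keep only the largest half of the terms
\log_2 (N!) = \sum_{i=1}^{N} \log_2 i
\;\ge\; \sum_{i=\lceil N/2 \rceil}^{N} \log_2 i
\;\ge\; \frac{N}{2}\,\log_2\frac{N}{2}
= \Omega(N \log N)
```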

8 Insertion Sort Basic idea: After the kth pass, ensure that the first k+1 elements are sorted. On the kth pass, swap the (k+1)th element to the left as necessary. Start: 7283596. After Pass 1: 2783596. After Pass 2: 2783596. During Pass 3: 2738596. After Pass 3: 2378596. What if array is initially sorted? What if array is initially reverse sorted?
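A minimal sketch of the pass structure described on this slide, using adjacent swaps. The name `insertion_sort_passes` and the choice to return a snapshot after every pass are illustrative assumptions, made so the trace can be compared with the example array.

```python
def insertion_sort_passes(items):
    """Sort a copy of items; return a snapshot of the array after each pass."""
    a = list(items)
    snapshots = []
    for k in range(1, len(a)):
        j = k
        # Swap the (k+1)th element leftward until it is in order.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        snapshots.append(list(a))
    return snapshots


if __name__ == "__main__":
    for number, state in enumerate(insertion_sort_passes([7, 2, 8, 3, 5, 9, 6]), 1):
        print("After pass", number, state)
```

On already sorted input the inner loop never runs, so each pass costs one comparison; on reverse-sorted input every element travels all the way to the left.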

9 Why Insertion Sort is Slow Inversion: a pair (i,j) such that i < j but Array[i] > Array[j]. An array of size N can have Θ(N²) inversions –average number of inversions in a random set of elements is N(N-1)/4. Insertion Sort only swaps adjacent elements –each swap removes only 1 inversion!
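To see the inversion argument concretely, the sketch below counts inversions by brute force and separately counts the adjacent swaps insertion sort performs; the two numbers should agree. Both function names are made up for this illustration.

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] by brute force."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def insertion_sort_swaps(items):
    """Run insertion sort on a copy and return the number of adjacent swaps."""
    a = list(items)
    swaps = 0
    for k in range(1, len(a)):
        j = k
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # removes exactly one inversion
            swaps += 1
            j -= 1
    return swaps


if __name__ == "__main__":
    data = [7, 2, 8, 3, 5, 9, 6]
    print(count_inversions(data), insertion_sort_swaps(data))
```

A reverse-sorted array of size N has N(N-1)/2 inversions, which is where the quadratic worst case comes from.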

10 HeapSort Sorting via Priority Queue (Heap) Basic idea: Shove items into a priority queue, take them out smallest to largest. Worst Case: Best Case:
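A minimal sketch of the "shove items in, take them out" idea, assuming Python's standard `heapq` module as the priority queue. This version builds a separate list rather than sorting in place, which is a simplification and not necessarily what the lecture had in mind.

```python
import heapq


def heap_sort(items):
    """Return a new sorted list by draining a binary min-heap."""
    heap = list(items)
    heapq.heapify(heap)  # builds the heap in O(N)
    # Each heappop is O(log N), so the N pops cost O(N log N) in total.
    return [heapq.heappop(heap) for _ in range(len(heap))]


if __name__ == "__main__":
    print(heap_sort([7, 2, 8, 3, 5, 9, 6]))
```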

11 MergeSort Merging Cars by key [Aggressiveness of driver]. Most aggressive goes first.
MergeSort (Table [1..n])
  Split Table in half
  Recursively sort each half
  Merge two sorted halves together
Merge (T1[1..n], T2[1..n])
  i1=1, i2=1
  While i1<=n and i2<=n
    If T1[i1] < T2[i2]
      Next is T1[i1]
      i1++
    Else
      Next is T2[i2]
      i2++
    End If
  End While
  Append whatever remains of T1 or T2
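The MergeSort and Merge pseudocode on this slide can be sketched in runnable form as follows. This is one possible rendering: it uses 0-based lists, copies the leftover tail once either input runs out, and uses `<=` instead of `<` so that equal keys keep their original order (a choice made here, not stated on the slide).

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One input is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(table):
    """Split in half, sort each half recursively, merge the halves."""
    if len(table) <= 1:
        return list(table)
    mid = len(table) // 2
    return merge(merge_sort(table[:mid]), merge_sort(table[mid:]))


if __name__ == "__main__":
    print(merge_sort([7, 2, 8, 3, 5, 9, 6]))
```

The `merged` list is the extra memory that MergeSort needs beyond the input.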

12 MergeSort Analysis Running Time –Worst case? –Best case? –Average case? Other considerations besides running time?

13 QuickSort Basic idea: Pick a “pivot”. Divide into less-than & greater-than pivot. Sort each side recursively.

14 QuickSort Partition 7283596 Pick pivot: 7. Partition with cursors, < scanning from the left and > scanning from the right. 7283596: 2 goes to less-than.

15 QuickSort Partition (cont’d) 7263598: 6 and 8 swap between less-than and greater-than. 7263598: 3 and 5 go to less-than, 9 goes to greater-than. 7263598: Partition done. Recursively sort each side.
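A sketch of a two-cursor partition that reproduces the trace on these slides, assuming the pivot is the first element (7) and that the pivot is swapped into its final position once the cursors cross. That last step is not shown in the transcript, so treat it as an assumption. The name `partition` is illustrative.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] in place around a[lo]; return the pivot's final index."""
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= j and a[i] <= pivot:  # left cursor skips less-than items
            i += 1
        while i <= j and a[j] >= pivot:  # right cursor skips greater-than items
            j -= 1
        if i > j:  # cursors crossed: partition is complete
            break
        a[i], a[j] = a[j], a[i]  # out-of-place pair, e.g. 8 and 6 in the example
    a[lo], a[j] = a[j], a[lo]  # put the pivot between the two sides
    return j


if __name__ == "__main__":
    data = [7, 2, 8, 3, 5, 9, 6]
    print(partition(data, 0, len(data) - 1), data)
```

After the single swap of 8 and 6 the array reads 7263598, matching the slide; the final pivot swap then places 7 between the two sides.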

16 Analyzing QuickSort Picking pivot: constant time. Partitioning: linear time. Recursion: time for sorting left partition (say of size i) + time for right (size N-i-1). T(1) = b; T(N) = T(i) + T(N-i-1) + cN, where i is the number of elements smaller than the pivot.

17 QuickSort Worst case Pivot is always smallest element, so i = 0. T(N) = T(i) + T(N-i-1) + cN becomes $T(N) = T(N-1) + cN = T(N-2) + c(N-1) + cN = T(N-k) + c\sum_{j=0}^{k-1}(N-j) = O(N^2)$

18 Optimizing QuickSort Choosing the Pivot –Randomly choose pivot Good theoretically and practically, but call to random number generator can be expensive –Pick pivot cleverly “Median-of-3” rule takes Median(first, middle, last). Works well in practice. Cutoff –Use simpler sorting technique below a certain problem size (Weiss suggests using insertion sort, with a cutoff limit of 5-20)
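The two optimizations on this slide can be combined as in the sketch below. The cutoff value of 10 is an arbitrary pick from the 5-20 range mentioned above, and the helper names are made up. The partition step is a plain two-cursor scheme, not necessarily the one in Weiss.

```python
CUTOFF = 10  # assumed value; the slide only gives a 5-20 range


def _insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place; used for small subproblems."""
    for k in range(lo + 1, hi + 1):
        item = a[k]
        j = k
        while j > lo and a[j - 1] > item:
            a[j] = a[j - 1]
            j -= 1
        a[j] = item


def _median_of_3(a, lo, hi):
    """Order a[lo], a[mid], a[hi] in place and return the index of the median."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    return mid


def quicksort(a, lo=0, hi=None):
    """Sort list a in place using median-of-3 pivots and an insertion-sort cutoff."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        _insertion_sort(a, lo, hi)
        return
    mid = _median_of_3(a, lo, hi)
    a[lo], a[mid] = a[mid], a[lo]  # move the pivot to the front
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= j and a[i] <= pivot:
            i += 1
        while i <= j and a[j] >= pivot:
            j -= 1
        if i > j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]  # pivot lands at index j
    quicksort(a, lo, j - 1)
    quicksort(a, j + 1, hi)
```

With median-of-3, already sorted input no longer triggers the worst case that a first-element pivot would.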

19 QuickSort Best Case Pivot is always the median element. T(N) = T(i) + T(N-i-1) + cN; T(N) = 2T(N/2 - 1) + cN, which solves to O(N log N)
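To compare the best case with the worst case (pivot always the smallest element, so T(N) = T(N-1) + cN), the recurrences can be evaluated directly. This sketch assumes b = c = 1 by default, restricts the best case to N a power of two, and simplifies the best-case subproblem size from N/2 - 1 to N/2. Function names are illustrative.

```python
def worst_case_cost(n, b=1, c=1):
    """Evaluate T(1) = b, T(N) = T(N-1) + cN iteratively."""
    t = b
    for size in range(2, n + 1):
        t += c * size
    return t


def best_case_cost(n, b=1, c=1):
    """Evaluate T(1) = b, T(N) = 2T(N/2) + cN for N a power of two."""
    if n == 1:
        return b
    return 2 * best_case_cost(n // 2, b, c) + c * n


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, worst_case_cost(n), best_case_cost(n))
```

The worst-case values grow like N²/2 while the best-case values grow like N log₂ N, so the gap widens quickly as N increases.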

20 QuickSort Average Case Assume all partition sizes are equally likely, each with probability 1/N. Details: Weiss pp. 278-279.
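The slide's own equations are missing from the transcript. Under the stated assumption (each left-partition size i from 0 to N-1 has probability 1/N), averaging the recurrence T(N) = T(i) + T(N-i-1) + cN gives the standard form below; the symbols match the earlier analysis slides.

```latex
T(N) = \frac{1}{N}\sum_{i=0}^{N-1}\bigl[\,T(i) + T(N-i-1)\,\bigr] + cN
     = \frac{2}{N}\sum_{j=0}^{N-1} T(j) + cN
\quad\Longrightarrow\quad
T(N) = O(N \log N)
```

The second equality holds because T(i) and T(N-i-1) each range over T(0), ..., T(N-1) exactly once as i runs from 0 to N-1.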

21 External Sorting When you just ain’t got enough RAM … –e.g. Sort 10 billion numbers with 1 MB of RAM. –Databases need to be very good at this MergeSort Good for Something! –Basis for most external sorting routines –Can sort any number of records using a tiny amount of main memory in extreme case, only need to keep 2 records in memory at any one time!

22 External MergeSort Split input into two tapes. Each group of 1 record is sorted by definition, so merge groups of 1 into groups of 2, again split between two tapes. Merge groups of 2 into groups of 4. Repeat until data entirely sorted: log N passes.
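The pass structure can be simulated in memory. In this sketch, plain Python lists stand in for the two tapes, which is an assumption made for illustration only; real external sorting would stream records from storage. The function name is made up.

```python
import heapq


def two_tape_merge_sort(records):
    """Return (sorted records, number of merge passes), simulating two tapes."""
    runs = [[r] for r in records]  # each group of 1 record is already sorted
    passes = 0
    while len(runs) > 1:
        # Deal the current runs alternately onto two "tapes".
        tape_a, tape_b = runs[0::2], runs[1::2]
        merged = []
        for index, run_a in enumerate(tape_a):
            run_b = tape_b[index] if index < len(tape_b) else []
            # heapq.merge streams its inputs, looking only at the head of
            # each run, much like reading one record at a time from each tape.
            merged.append(list(heapq.merge(run_a, run_b)))
        runs = merged
        passes += 1
    return (runs[0] if runs else []), passes


if __name__ == "__main__":
    print(two_tape_merge_sort([7, 2, 8, 3, 5, 9, 6, 1]))
```

Run lengths double on every pass, so N records need about log₂ N passes (rounded up).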

23 Better External MergeSort Suppose main memory can hold M records. Initially read in groups of M records and sort them (e.g. with QuickSort). Number of passes reduced to log(N/M). k-way mergesort reduces number of passes to log_k(N/M) –Requires 2k output devices (e.g. mag tapes). But wait, there’s more … Polyphase merge does a k-way mergesort using only k+1 output devices (plus kth-order Fibonacci numbers!)
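A sketch of the improved scheme: initial runs of M records sorted in memory, then repeated k-way merges. Lists again stand in for tapes, Python's built-in `sorted` replaces the QuickSort mentioned on the slide, and the parameter names `m` and `k` follow the slide's M and k. Polyphase merge is not attempted here.

```python
import heapq


def kway_external_sort(records, m, k):
    """Return (sorted records, number of merge passes) for run size m and fan-in k."""
    if m < 1 or k < 2:
        raise ValueError("need m >= 1 and k >= 2")
    # Run formation: sort each group of m records in main memory.
    runs = [sorted(records[i:i + m]) for i in range(0, len(records), m)]
    passes = 0
    while len(runs) > 1:
        # Merge up to k runs at a time; heapq.merge does a streaming k-way merge.
        runs = [list(heapq.merge(*runs[i:i + k])) for i in range(0, len(runs), k)]
        passes += 1
    return (runs[0] if runs else []), passes


if __name__ == "__main__":
    data = list(range(100, 0, -1))
    for k in (2, 3):
        result, passes = kway_external_sort(data, m=10, k=k)
        print(k, passes, result[:5])
```

With 100 records and M = 10 there are 10 initial runs, so 2-way merging takes 4 passes and 3-way merging takes 3, in line with log_k(N/M) rounded up.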

24 Sorting by Comparison Summary Sorting algorithms that only compare adjacent elements are Θ(N²) worst case, but may be Θ(N) best case. HeapSort: Θ(N log N) both best and worst case –Suffers from two test-ops per data move. MergeSort: Θ(N log N) running time –Suffers from extra-memory problem. QuickSort: Θ(N²) worst case, Θ(N log N) best and average case –In practice, median-of-3 almost always gets us Θ(N log N) –Big win comes from {sorting in place, one test-op, few swaps}! Any comparison-based sorting algorithm is Ω(N log N). External sorting: MergeSort with Θ(log(N/M)) passes.

