Copyright © 2003-2011 Curt Hill Sorting Ordering an array.


1 Sorting: Ordering an array

2 Considered Topics
Simple sort schemes
  – Algorithms with some code
More complicated sort schemes
Performance considerations
  – Time to sort in terms of the number of records N
  – How the number of compares and moves relates to the size of N

3 Selection Sort
Basic idea:
Scan the entire array
Find the smallest element
Move it to the top
Remove the top from further consideration
Repeat until the entire array is sorted

4 How it works
[Figure: the array 8 3 1 9 14 2 6 with the top element (8) and the least element (1) marked; after they are exchanged the array reads 1 3 8 9 14 2 6, split into the sorted part of the array and the unsorted part of the array.]

5 Code
    // Selection sort: ascending order, in place.
    void sort(int ar[], int size){
      int temp;                       // index of the least element found so far
      for(int i=0;i<size-1;i++){
        temp = i;
        for (int j=i+1;j<size;j++)
          if(ar[temp]>ar[j])
            temp = j;
        if(temp!=i) {                 // swap the least element into slot i
          int val = ar[i];
          ar[i] = ar[temp];
          ar[temp] = val;
        }
      } // outer for
    }

6 Best and Worst Cases
This is an unusual algorithm in that the best and worst cases are almost the same
Best case
  – Already sorted
  – No moves are needed
  – All the compares are still done
Worst case
  – Inversely sorted
  – Same number of compares
  – At most N-1 exchanges

7 How it performs
The first element is compared with all the other elements
  – N-1 compares
The second element is compared with the remaining elements
  – N-2 compares
Compares: (N-1) + (N-2) + … + 1 = N(N-1)/2 ≈ N^2/2
Moves: at most N-1
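The compare and move counts above can be checked by instrumenting the sort. The following is a minimal sketch, not the author's code: `SortCounts` and `selection_sort_counted` are hypothetical names introduced here, and a `std::vector<int>` is assumed in place of the raw array.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the slides): tallies the work done by one sort.
struct SortCounts {
    long compares = 0;
    long swaps = 0;
};

// Selection sort, ascending, counting compares and exchanges as it goes.
SortCounts selection_sort_counted(std::vector<int>& ar) {
    SortCounts counts;
    const int size = static_cast<int>(ar.size());
    for (int i = 0; i < size - 1; i++) {
        int least = i;                       // index of the least element so far
        for (int j = i + 1; j < size; j++) {
            counts.compares++;
            if (ar[least] > ar[j]) least = j;
        }
        if (least != i) {                    // exchange only when needed
            std::swap(ar[i], ar[least]);
            counts.swaps++;
        }
    }
    return counts;
}
```

For the seven-element example array the compare count is 7·6/2 = 21 whatever the initial order, while the number of exchanges depends on the input and never exceeds N-1.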

8 Comparing running times
Mostly we are not concerned with many of the little issues of this analysis
  – It is N-1 instead of N
  – There is a factor of N^2/2 instead of N^2
When we have two different factors we always take the most expensive
  – N^2 compares instead of N moves
Thus selection sort is O(N^2)

9 Common Os
Constant time: O(c) or O(1)
  – Hashing is constant time
Logarithmic time: O(log2 N)
  – Binary and tree searches
Linear time: O(N)
  – File scans, bad searches
N log N: O(N log2 N), no other name
  – Good sorts
N squared: O(N^2)
  – Bad sorts
Polynomial: O(N^x)
  – Expensive but doable
Exponential: O(e^N)
  – Intractable

10 Bubble Sort
Basic idea:
Start at the top
Compare adjacent elements
Exchange them if out of order
Repeat until a pass has no exchanges

11 First Pass
[Figure: the array 8 3 1 9 14 2 6 shown at successive steps of the first pass.]
Small items bubble up slowly
  – One position per pass
Large items sink quickly
  – They keep descending until they find a larger item or hit bottom

12 Code
    // Bubble sort: ascending order, in place.
    void sort (int ar[], int size){
      bool swapped;
      do {
        swapped = false;
        for(int j = 0;j<size-1;j++)
          if(ar[j] > ar[j+1]){        // adjacent pair is out of order
            int temp = ar[j];
            ar[j] = ar[j+1];
            ar[j+1] = temp;
            swapped = true;
          } // if
      } while(swapped);               // stop after a pass with no exchanges
    }

13 How it performs
Bubble sort makes many moves, but always a short distance
It also does many redundant compares
O(N^2)
Big O notation makes this comparable with selection sort
  – Usually much worse
  – Have to be creative to make a worse sort

14 Best and Worst Cases
Best case
  – Already sorted
  – One pass through does no exchanges and quits
Worst case
  – Inversely sorted
  – The smallest element moves up only one position per pass
  – N-1 passes
  – The case where all elements are sorted except that the smallest is in the last slot is almost as bad

15 Bubble Again
Consider two symmetric cases, sorted with one exception: the largest or the smallest item is as far from its place as possible
  – One takes two passes, the other N-1
The problem is the direction of the scan
  – Items going in that direction move fast
  – Items going in the other direction move slowly
This suggests a fix
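The asymmetry between the two cases can be observed by counting passes. This is a minimal sketch under stated assumptions: the sort is ascending, scans top to bottom, and the hypothetical function `bubble_sort_passes` (not from the slides) also counts the final pass that makes no exchanges.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Bubble sort, ascending, scanning top to bottom.
// Returns the number of passes made, including the last pass that
// finds nothing to exchange and so ends the loop.
int bubble_sort_passes(std::vector<int>& ar) {
    const int size = static_cast<int>(ar.size());
    int passes = 0;
    bool swapped;
    do {
        swapped = false;
        passes++;
        for (int j = 0; j < size - 1; j++) {
            if (ar[j] > ar[j + 1]) {         // adjacent pair out of order
                std::swap(ar[j], ar[j + 1]);
                swapped = true;
            }
        }
    } while (swapped);
    return passes;
}
```

With six elements, `{9, 1, 2, 3, 4, 5}` (largest at the top) finishes in two passes, while `{2, 3, 4, 5, 6, 1}` (smallest at the bottom) reports six: N-1 passes that each lift the 1 by one slot, plus the confirming pass counted here.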

16 Shaker Sort
Basic idea is the same as bubble sort
Scan top to bottom in odd passes
Scan bottom to top in even passes

17 First and Second Passes
[Figure: the array 8 3 1 9 14 2 6 during the first two passes.]
First pass goes top to bottom
Second pass goes bottom to top
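The alternating scans might be coded as follows. This is a sketch rather than the author's implementation: the name `shaker_sort`, the use of `std::vector<int>`, and the shrinking `low`/`high` bounds (which skip items already in their final place) are assumptions added here.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Shaker sort, ascending: bubble sort with alternating scan directions.
void shaker_sort(std::vector<int>& ar) {
    int low = 0;
    int high = static_cast<int>(ar.size()) - 1;
    bool swapped = true;
    while (swapped && low < high) {
        swapped = false;
        // Odd pass: top to bottom, sinking the largest remaining item.
        for (int j = low; j < high; j++) {
            if (ar[j] > ar[j + 1]) {
                std::swap(ar[j], ar[j + 1]);
                swapped = true;
            }
        }
        high--;                      // the bottom slot is now final
        if (!swapped) break;
        swapped = false;
        // Even pass: bottom to top, lifting the smallest remaining item.
        for (int j = high; j > low; j--) {
            if (ar[j - 1] > ar[j]) {
                std::swap(ar[j - 1], ar[j]);
                swapped = true;
            }
        }
        low++;                       // the top slot is now final
    }
}
```

On an input that is sorted except for the smallest item in the last slot, the first upward pass carries that item all the way to the top.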

18 How it performs
Insignificantly different from bubble sort
The worst case occurs very infrequently
The extra work to handle it complicates every run
O(N^2)

19 The Previous Problems
The problem with both of these is the short distance things are moved
They usually move in the right direction but seldom far enough
One fix is to compare non-adjacent elements
How?

20 Shell Sort
Start with a gap g, where 1 ≤ g ≤ N
Do a sort pass comparing elements separated by the gap and exchanging if needed
Decrease the gap in each pass
  – Do not divide the gap size by 2
When the gap is one it is a bubble sort, but most of the large-distance moving has been done
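The gapped passes can be sketched as below. Several things here are assumptions rather than the author's code: the name `shell_sort`, the gap sequence 1, 4, 13, 40, … (chosen only because it avoids halving), and the decision to repeat each gapped pass until it makes no exchanges. Many textbooks use a gapped insertion step instead of exchanges; this version follows the exchange-based description above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Shell sort, ascending, using exchange passes over elements `gap` apart.
void shell_sort(std::vector<int>& ar) {
    const int size = static_cast<int>(ar.size());
    int gap = 1;
    while (gap < size / 3) gap = 3 * gap + 1;    // assumed sequence: 1, 4, 13, 40, ...
    for (; gap >= 1; gap /= 3) {                 // 13 -> 4 -> 1 -> stop
        bool swapped;
        do {
            swapped = false;
            for (int j = 0; j + gap < size; j++) {
                if (ar[j] > ar[j + gap]) {       // pair `gap` apart is out of order
                    std::swap(ar[j], ar[j + gap]);
                    swapped = true;
                }
            }
        } while (swapped);
    }
}
```

The last round runs with a gap of one, which is a plain bubble sort and so guarantees a sorted result; the earlier rounds only reduce how much work that final round has left.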

21 First Pass
[Figure: the array 8 3 1 9 14 2 6 during the first pass, Gap = 3.]
First: 8 and 1 exchanged
Third: 14 and 2 exchanged
Fourth: 6 and 14 exchanged

22 How it performs
The analysis is extremely difficult
Empirically it is about O(N^1.25)
This makes it better than bubble or selection sort for all but insignificant table sizes
The break-even point between O(N^1.25) and O(N log2 N) is about size = 65,000; however, the constant factor on Shell sort is large, so the real break-even point is much smaller
Still inferior to the N log N sorts for large tables
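The quoted break-even size can be checked directly. Ignoring constant factors (an assumption; as noted above, they matter in practice), the two growth rates meet for large N where:

```latex
\begin{align*}
N^{1.25} &= N \log_2 N \\
N^{0.25} &= \log_2 N \\
\text{with } N = 2^{16} = 65536:\quad (2^{16})^{0.25} &= 2^{4} = 16 = \log_2 2^{16}
\end{align*}
```

So the curves cross at N = 65,536, which rounds to the figure of 65,000.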

23 Insertion Sort
Partition the array into two pieces
  – The first element and all the rest
The first part of the array is already sorted
Remove the first unsorted item
Insert it into the correct location in the sorted part

24 How it works
[Figure: the array 8 3 1 9 14 2 6 split into the sorted part of the array and the unsorted part of the array; 3 is removed from the unsorted part and then inserted into the sorted part.]
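The remove-and-insert step can be sketched as follows. The function name `insertion_sort` and the use of `std::vector<int>` are assumptions; the slides give no code for this sort.

```cpp
#include <cassert>
#include <vector>

// Insertion sort, ascending. ar[0..i-1] is the sorted part, ar[i..] the unsorted part.
void insertion_sort(std::vector<int>& ar) {
    const int size = static_cast<int>(ar.size());
    for (int i = 1; i < size; i++) {
        const int item = ar[i];              // remove the first unsorted item
        int j = i - 1;
        while (j >= 0 && ar[j] > item) {
            ar[j + 1] = ar[j];               // shift larger sorted items down one slot
            j--;
        }
        ar[j + 1] = item;                    // insert into the opened slot
    }
}
```

On sorted input the inner loop never runs, which is the best case; on inversely sorted input every insertion shifts the whole sorted part.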

25 How it performs
Best case is sorted
Worst case is inversely sorted
Yet another O(N^2)
N-1 insertions, each of which may shift many elements

26 Merge Sort
Merge increasingly larger sorted runs into a single much larger run
Start with runs of 1
Merge two runs into a temporary area
Copy it back

27 Pass 1
[Figure: the array 8 3 1 9 14 2 6. Start: each element is a run of 1 (Run 1 through Run 7). End of pass 1: runs of 2 (Run 1 through Run 4).]

28 Pass 2
[Figure: the array at the start of pass 2 in runs of 2 (Run 1 through Run 4) and at the end of the pass in runs of 4 (Run 1 and Run 2).]

29 Important points on Merge
An item in a run is never compared with another element of the same run
It generalizes to files nicely
Requires extra copy space equal to the size of the longest run
  – In the last pass that is the entire array
First of the O(N log2 N) sorts
The insertion process generates many moves
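The pass structure (runs of 1, then 2, then 4, and so on) can be sketched as a bottom-up merge sort. This is an illustration, not the author's code: the name `merge_sort` is an assumption, and for simplicity the temporary area is allocated at full array size up front rather than sized to the longest run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Bottom-up merge sort, ascending.
void merge_sort(std::vector<int>& ar) {
    const std::size_t n = ar.size();
    std::vector<int> temp(n);                           // temporary merge area
    for (std::size_t run = 1; run < n; run *= 2) {      // run length: 1, 2, 4, ...
        for (std::size_t lo = 0; lo < n; lo += 2 * run) {
            const std::size_t mid = std::min(lo + run, n);
            const std::size_t hi = std::min(lo + 2 * run, n);
            std::size_t a = lo, b = mid, out = lo;
            // Merge the runs ar[lo..mid) and ar[mid..hi) into temp.
            while (a < mid && b < hi)
                temp[out++] = (ar[b] < ar[a]) ? ar[b++] : ar[a++];
            while (a < mid) temp[out++] = ar[a++];      // leftover of first run
            while (b < hi) temp[out++] = ar[b++];       // leftover of second run
        }
        std::copy(temp.begin(), temp.end(), ar.begin());  // copy the pass back
    }
}
```

Each pass compares an item only with items from the neighbouring run, never with items of its own run, and the number of passes is about log2 N.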

30 Quick Sort
A complicated but very fast sort
  – Usually the best of the in-memory sorts
Never compares two items twice
Always moves things in the right direction
Usually moves them a relatively long distance

31 Algorithm I
The first item is called the pivot
  – It will end up as the middle element
From the top look for an item that is larger than the pivot
From the bottom look for an item that is smaller than the pivot
The two items are each in the wrong “half” of the table
  – Recall the pivot will be the middle item
Exchange the two

32 Algorithm II
When the searches collide, move the pivot there
Now have three partitions:
  – Lower: sort it by itself
  – Pivot: nothing more needs to be done
  – Higher: sort it by itself
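Algorithm I and II together can be sketched as below. This is one possible reading of the description, not the author's code: the name `quick_sort`, the index-based interface, and the choice to let items equal to the pivot stay where they are found are all assumptions.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Quick sort of ar[low..high], ascending, with the first item as the pivot.
void quick_sort(std::vector<int>& ar, int low, int high) {
    if (low >= high) return;                 // zero or one item: nothing to do
    const int pivot = ar[low];
    int top = low + 1;                       // search moving down from the top
    int bottom = high;                       // search moving up from the bottom
    while (true) {
        while (top <= bottom && ar[top] <= pivot) top++;        // find a larger item
        while (top <= bottom && ar[bottom] >= pivot) bottom--;  // find a smaller item
        if (top > bottom) break;             // the searches have collided
        std::swap(ar[top], ar[bottom]);      // both were in the wrong "half"
    }
    std::swap(ar[low], ar[bottom]);          // move the pivot to the collision point
    quick_sort(ar, low, bottom - 1);         // lower partition
    quick_sort(ar, bottom + 1, high);        // higher partition
}

// Convenience overload for sorting a whole vector.
void quick_sort(std::vector<int>& ar) {
    quick_sort(ar, 0, static_cast<int>(ar.size()) - 1);
}
```

On the array 8 3 1 9 14 2 6 the first partitioning step exchanges 9 with 6, then 14 with 2, and finally the pivot 8 with 2, giving 2 3 1 6 8 14 9: a lower partition, the pivot, and a higher partition.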

33 Quick Sort
[Figure: the array 8 3 1 9 14 2 6. Start: the pivot is 8 and both searches start looking; each pair that is found is exchanged (1st exchange, 2nd exchange); when the searches are done the pivot is exchanged into place, leaving three partitions.]

34 Performance
A pair of distinct keys is never compared twice
The trick is partitioning the array into two separate pieces that never interact again
(N/2)^2 + (N/2)^2 < N^2
  – 20^2 = 400
  – 10^2 + 10^2 = 200
O(N log2 N)
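The halving argument can be turned into a rough count. As a sketch, assume every pivot splits its partition into two equal halves (the best case, which the algorithm does not guarantee) and that partitioning N items costs about N compares:

```latex
\begin{align*}
T(N) &= 2\,T(N/2) + N \\
     &= 4\,T(N/4) + 2N \\
     &= \dots \\
     &= 2^{k}\,T(N/2^{k}) + kN \\
     &= N\,T(1) + N \log_2 N \qquad (k = \log_2 N)
\end{align*}
```

Under those assumptions the total work grows as N log2 N, in line with the stated O(N log2 N).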

35 More on Performance
A happy accident is that the pivot may be placed in a CPU register
  – It is the only value compared to the entire array
  – This makes it free and quick to access
Notice the recursive nature of this algorithm
  – The array is partitioned into two pieces
  – These are usually different sizes, and the sort is recursively invoked on them

36 Best and Worst Case
It does better on an unsorted file than on a sorted one
  – Counter-intuitive
The worst case is the sorted or inversely sorted file
  – The chosen pivot divides the table into two, not three, partitions
  – O(N^2) in this case

37 Improvements 1
The worst case makes one think about choosing a different pivot
Any searching for a pivot will slow the average process with a search
The case of an already sorted array is extremely unlikely if every ordering is equally likely
  – For 10 elements
  – 2 chances in 3,628,800 (10!) for it to be already sorted or inversely sorted

38 Improvements 2
The partitioning scheme is complicated enough that it does worse than simple sorts on very small arrays: 6-12 entries
  – Recursion to sort a table of length 3 is wasteful in memory and CPU cycles
The only real improvement is to use a simpler sort when the partition size gets small
  – If the partition is small, just use a simple N^2 sort
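Switching to a simple sort for small partitions might look like the sketch below. All names (`kSmallPartition`, `insertion_sort_range`, `hybrid_quick_sort`) are hypothetical, the cutoff of 10 is merely a value inside the 6-12 range mentioned above, and insertion sort is an assumed choice for the simple N^2 sort.

```cpp
#include <cassert>
#include <utility>
#include <vector>

const int kSmallPartition = 10;   // assumed cutoff; tune by measurement

// Simple N^2 sort for ar[low..high], ascending.
void insertion_sort_range(std::vector<int>& ar, int low, int high) {
    for (int i = low + 1; i <= high; i++) {
        const int item = ar[i];
        int j = i - 1;
        while (j >= low && ar[j] > item) {
            ar[j + 1] = ar[j];               // shift larger items down one slot
            j--;
        }
        ar[j + 1] = item;
    }
}

// Quick sort that hands small partitions to the simple sort.
void hybrid_quick_sort(std::vector<int>& ar, int low, int high) {
    if (high - low + 1 <= kSmallPartition) {
        insertion_sort_range(ar, low, high); // also covers empty partitions
        return;
    }
    const int pivot = ar[low];               // first item is the pivot
    int top = low + 1;
    int bottom = high;
    while (true) {
        while (top <= bottom && ar[top] <= pivot) top++;
        while (top <= bottom && ar[bottom] >= pivot) bottom--;
        if (top > bottom) break;             // the searches have collided
        std::swap(ar[top], ar[bottom]);
    }
    std::swap(ar[low], ar[bottom]);          // pivot goes to its final slot
    hybrid_quick_sort(ar, low, bottom - 1);
    hybrid_quick_sort(ar, bottom + 1, high);
}

// Convenience overload for sorting a whole vector.
void hybrid_quick_sort(std::vector<int>& ar) {
    hybrid_quick_sort(ar, 0, static_cast<int>(ar.size()) - 1);
}
```

The recursion now bottoms out at partitions of ten items or fewer instead of at single items, which removes most of the very small recursive calls.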

39 Two more thoughts
Virtual memory can disrupt sorting when pieces of the array are paged out
  – True for any sort
  – If possible, fix the pages in memory
Quick sort could use threads
  – Spawn a thread for one of the partitions if it is of sufficient size
  – It would need to be large to make the thread overhead worthwhile

40 Heap Sort
Builds a binary tree in the array
The positions of the left and right sub-trees are implicit rather than needing pointers
Also an O(N log2 N) sort
Rather complicated
Will not be shown
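The slides do not show heap sort. For readers who want to see how the tree positions are implicit, here is a minimal sketch using the common layout in which the children of slot i are slots 2i+1 and 2i+2. The names `sift_down` and `heap_sort` and this particular layout are assumptions, not the author's code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Restore the max-heap property for the sub-tree rooted at `root`,
// considering only the slots ar[0..end-1].
void sift_down(std::vector<int>& ar, int root, int end) {
    while (true) {
        int child = 2 * root + 1;                    // left child, by position alone
        if (child >= end) return;                    // root is a leaf
        if (child + 1 < end && ar[child + 1] > ar[child])
            child++;                                 // pick the larger child
        if (ar[root] >= ar[child]) return;           // heap property holds
        std::swap(ar[root], ar[child]);
        root = child;
    }
}

// Heap sort, ascending, in place and without recursion.
void heap_sort(std::vector<int>& ar) {
    const int size = static_cast<int>(ar.size());
    for (int i = size / 2 - 1; i >= 0; i--)          // build the heap bottom-up
        sift_down(ar, i, size);
    for (int end = size - 1; end > 0; end--) {
        std::swap(ar[0], ar[end]);                   // largest item goes to its final slot
        sift_down(ar, 0, end);                       // re-heap the remaining items
    }
}
```

No pointers and no extra array are involved: the tree exists only in the index arithmetic.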

41 Heap Sort Performance
Slowest of the O(N log2 N) sorts
Advantages:
  – Does not need the recursion of quicksort
  – Does not need the extra space of mergesort
  – Worst case is still O(N log2 N), unlike quicksort

42 Summary
Several sorts with varying performance:
  – N^2: Selection, Bubble, Shaker, Insertion
  – N^1.25: Shell
  – N log2 N: Merge, Quick, Heap

