Sorting. Typically, most organizations spend a large percentage of computing time (25% - 50%) on sorting. Internal sorting: the list is small enough to sort entirely in main memory.


1 Sorting

2 Typically, most organizations spend a large percentage of computing time (25% - 50%) on sorting.
Internal sorting: the list is small enough to sort entirely in main memory.
- bubble sort
- insertion sort
- quick sort
- shell sort
- heap sort
- bucket sort
- radix sort

3 Divide-And-Conquer Sorting
Small instance.
- n <= 1 elements.
- n <= 10 elements.
- We'll use n <= 1 for now.
Large instance.
- Divide into k >= 2 smaller instances.
- k = 2, 3, 4, ... ?
- What does each smaller instance look like?
- Sort smaller instances recursively.
- How do you combine the sorted smaller instances?

4 Insertion Sort
k = 2.
The first n - 1 elements (a[0:n-2]) define one of the smaller instances; the last element (a[n-1]) defines the second smaller instance.
a[0:n-2] is sorted recursively.
a[n-1] is a small instance.
[Figure: array a with positions a[0], a[n-2], and a[n-1] marked.]

5 Insertion Sort
Combining: insert a[n-1] into the sorted a[0:n-2].
Complexity is $O(n^2)$.
Usually implemented nonrecursively.
[Figure: array a with positions a[0], a[n-2], and a[n-1] marked.]

6 Insertion Sort
For each j between 1 and n-1, insert list[j] into the already sorted subfile list[0], ..., list[j-1].
Left Out of Order (LOO): a measure of relative disorder.
- $R_i$ is LOO iff $R_i < \max_{0 \le j < i} \{R_j\}$.
- The insertion step is executed only for those records that are LOO.
Time complexity
- Best case: $O(n)$; worst case: $O(n^2)$.
- Does little work if the list is nearly sorted.

7 Insertion Sort
Example)
- n = 5
- input sequence: (5, 4, 3, 2, 1)
- every record after the first ($R_1$, $R_2$, $R_3$, $R_4$) is LOO

8 Insertion Sort
Example)
- n = 5
- input sequence: (2, 3, 4, 5, 1)
- only $R_4$ is LOO
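The LOO definition can be checked mechanically. The following is a minimal sketch, assuming plain `int` keys instead of the slides' `element` records and reading the definition as "smaller than the largest key among the records before it"; the name `count_loo` is hypothetical.

```c
/* Count the records that are left out of order (LOO).
   Assumption: a[i] is LOO when it is smaller than the largest
   key among a[0..i-1]; a[0] has no predecessors. */
int count_loo(const int a[], int n)
{
    if (n <= 0)
        return 0;
    int count = 0;
    int max = a[0];              /* largest key seen so far */
    for (int i = 1; i < n; i++) {
        if (a[i] < max)
            count++;             /* insertion sort must move this record */
        else
            max = a[i];
    }
    return count;
}
```

The count is the number of times the insertion step does real work, which is why a nearly sorted list is cheap to sort.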

9 Insertion Sort
    void insertion_sort(element list[], int n)
    {
        /* perform an insertion sort of the list */
        int i, j;
        element next;
        for (i = 1; i < n; i++) {
            next = list[i];   /* next card to insert */
            /* shift larger records one position to the right */
            for (j = i - 1; (j >= 0) && (next.key < list[j].key); j--)
                list[j + 1] = list[j];
            list[j + 1] = next;
        }
    }
[Figure: list[] holding the current cards (2, 4, 5, 3, 1, ...) with indices i and j marked; next is the next card.]

10 Selection Sort
k = 2.
To divide a large instance into two smaller instances, first find the largest element.
The largest element defines one of the smaller instances; the remaining n-1 elements define the second smaller instance.
[Figure: array a with positions a[0], a[n-2], and a[n-1] marked.]

11 Selection Sort
The second smaller instance is sorted recursively.
Append the first smaller instance (largest element) to the right end of the sorted smaller instance.
Complexity is $O(n^2)$.
Usually implemented nonrecursively.
[Figure: array a with positions a[0], a[n-2], and a[n-1] marked.]
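The nonrecursive form mentioned above might look like the following. This is a minimal sketch that assumes plain `int` keys rather than the slides' `element` records; the name `selection_sort` is hypothetical.

```c
#include <stddef.h>

/* Sort a[0..n-1] in ascending order. Each round finds the largest
   element of the unsorted prefix a[0..size-1] and moves it to the
   right end of that prefix. */
void selection_sort(int a[], size_t n)
{
    for (size_t size = n; size > 1; size--) {
        size_t max = 0;                    /* index of the largest so far */
        for (size_t i = 1; i < size; i++)
            if (a[i] > a[max])
                max = i;
        int tmp = a[max];                  /* append largest to the right end */
        a[max] = a[size - 1];
        a[size - 1] = tmp;
    }
}
```

The outer loop plays the role of the recursion: each iteration handles an instance one element smaller than the last.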

12 Bubble Sort
Compare list[i] and list[i+1] for i = 0, 1, ..., n-2; if they are out of order, swap list[i] and list[i+1].
Each pass bubbles the largest remaining key up to the end, so the size of the unsorted part decreases by 1 after each pass.
If no swap occurred on the previous pass, the list is already sorted.

13 Bubble Sort
bubble_sort example
list:         15  4  8  3 50  9 20
after pass 1:  4  8  3 15  9 20 50
after pass 2:  4  3  8  9 15 20 50
after pass 3:  3  4  8  9 15 20 50   (last swap occurred)
after pass 4:  3  4  8  9 15 20 50

14 Bubble Sort
    void bubble_sort(element list[], int n)
    {
        /* perform a bubble sort on the list */
        int i, j;
        int flag = 1;   /* nonzero while the previous pass swapped something */
        for (i = n - 1; flag > 0; i--) {
            flag = 0;
            for (j = 0; j < i; j++)
                if (list[j].key > list[j + 1].key) {
                    swap(&list[j], &list[j + 1]);
                    flag = 1;
                }
        }
    }
[Figure: list[] (2, 4, 5, 3, 1, ...) with positions j and j+1 marked.]

15 Bubble Sort
If on any pass the last swap occurs at the j-th and (j+1)-th positions, then set i to that last value of j.
Time complexity: worst case $O(n^2)$.
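The last-swap refinement can be sketched as follows, assuming plain `int` keys instead of `element` records; the name `bubble_sort_last_swap` and the variable `bound` are hypothetical.

```c
/* Bubble sort that shrinks the unsorted part to the position of the
   last swap: everything to the right of that position did not move
   during the pass, so it is already in its final place. */
void bubble_sort_last_swap(int a[], int n)
{
    int bound = n - 1;               /* compare a[j], a[j+1] for j < bound */
    while (bound > 0) {
        int last = 0;                /* j of the last swap in this pass */
        for (int j = 0; j < bound; j++) {
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
                last = j;
            }
        }
        bound = last;                /* 0 when no swap occurred: done */
    }
}
```

A pass without swaps leaves `last` at 0, so the same variable also covers the "already sorted" early exit.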

16 Bubble Sort
Bubble sort may also be viewed as a k = 2 divide-and-conquer sorting method.
Insertion sort, selection sort and bubble sort divide a large instance into one smaller instance of size n - 1 and another one of size 1.
All three sort methods take $O(n^2)$ time.

17 Divide And Conquer Divide-and-conquer algorithms generally have best complexity when a large instance is divided into smaller instances of approximately the same size. When k = 2 and n = 24, divide into two smaller instances of size 12 each. When k = 2 and n = 25, divide into two smaller instances of size 13 and 12, respectively.

18 Quick Sort
Divide and conquer.
Two phases: split and control.
Uses recursion, so a stack is needed.
Best average time: $O(n \log_2 n)$.

19 Quick Sort
Small instance has n <= 1. Every small instance is a sorted instance.
To sort a large instance, select a pivot element from out of the n elements.
Partition the n elements into 3 groups: left, middle and right.
- The middle group contains only the pivot element.
- All elements in the left group are <= pivot.
- All elements in the right group are >= pivot.
Sort left and right groups recursively.
Answer is sorted left group, followed by middle group, followed by sorted right group.

20 Choice of Pivot
Pivot is the leftmost element in the list that is to be sorted.
- When sorting a[6:20], use a[6] as the pivot.
- Text implementation does this.
Randomly select one of the elements to be sorted as the pivot.
- When sorting a[6:20], generate a random number r in the range [6, 20]. Use a[r] as the pivot.

21 Choice of Pivot
Median-of-Three rule. From the leftmost, middle, and rightmost elements of the list to be sorted, select the one with median key as the pivot.
- When sorting a[6:20], examine a[6], a[13] ((6+20)/2), and a[20]. Select the element with median (i.e., middle) key.
- If a[6].key = 30, a[13].key = 2, and a[20].key = 10, a[20] becomes the pivot.
- If a[6].key = 3, a[13].key = 2, and a[20].key = 10, a[6] becomes the pivot.
- If a[6].key = 30, a[13].key = 25, and a[20].key = 10, a[13] becomes the pivot.
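The median-of-three rule can be sketched as a function that returns the index of the chosen pivot. This assumes the keys are held in a plain `int` array; the name `median_of_three` is hypothetical.

```c
/* Return the index (left, middle, or right) whose key is the median
   of key[left], key[(left + right) / 2], and key[right]. */
int median_of_three(const int key[], int left, int right)
{
    int mid = (left + right) / 2;
    int a = key[left], b = key[mid], c = key[right];

    if ((a <= b && b <= c) || (c <= b && b <= a))
        return mid;      /* middle key lies between the other two */
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return left;     /* leftmost key lies between the other two */
    return right;
}
```

The caller would then swap the element at the returned index with the leftmost element before running a partition that expects the pivot on the left.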

22 Choice of Pivot
When the pivot is picked at random or when the median-of-three rule is used, we can use the quick sort code of the text provided we first swap the leftmost element and the chosen pivot.
[Figure: the chosen pivot is swapped with the leftmost element.]

23 Partitioning Into Three Groups
Sort a = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3].
Leftmost element (6) is the pivot.
When another array b is available:
- Scan a from left to right (omit the pivot in this scan), placing elements <= pivot at the left end of b and the remaining elements at the right end of b.
- The pivot is placed at the remaining position of b.

24 Partitioning Example Using Additional Array
a = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3]
b = [2, 5, 4, 1, 3, 6, 7, 9, 10, 11, 8]
Sort left and right groups recursively.
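Partitioning with a second array can be sketched as below, assuming plain `int` keys and a pivot at a[0]; the name `partition_with_buffer` is hypothetical.

```c
/* Partition a[0..n-1] (n >= 1) around the pivot a[0] into b[0..n-1].
   Elements <= pivot fill b from the left end, the others fill b from
   the right end, and the pivot takes the one remaining slot.
   Returns the pivot's index in b. */
int partition_with_buffer(const int a[], int b[], int n)
{
    int pivot = a[0];
    int lo = 0;          /* next free slot at the left end of b */
    int hi = n - 1;      /* next free slot at the right end of b */

    for (int i = 1; i < n; i++) {
        if (a[i] <= pivot)
            b[lo++] = a[i];
        else
            b[hi--] = a[i];
    }
    b[lo] = pivot;       /* lo == hi here */
    return lo;
}
```

Because the right group is filled from the right end, its elements appear in b in the reverse of their order in a.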

25 In-place Partitioning
1. Find leftmost element (bigElement) > pivot.
2. Find rightmost element (smallElement) < pivot.
3. Swap bigElement and smallElement provided bigElement is to the left of smallElement.
4. Repeat.

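26 In-Place Partitioning
[Figure: in-place partitioning of positions 0 to n-1 with pivot X at position 0. The first element greater than the pivot (found from the left) is swapped with the first element smaller than the pivot (found from the right). After the pass, X is in its sorted position and X' is the new pivot.]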

27 In-Place Partitioning
[Figure: partitioning continues with new pivots X' and X''; each pivot X ends up in its sorted position.]

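28 In-Place Partitioning Example
a = [6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3]: pivot 6, bigElement 8, smallElement 3; swap.
a = [6, 2, 3, 5, 11, 10, 4, 1, 9, 7, 8]: pivot 6, bigElement 11, smallElement 1; swap.
a = [6, 2, 3, 5, 1, 10, 4, 11, 9, 7, 8]: pivot 6, bigElement 10, smallElement 4; swap.
a = [6, 2, 3, 5, 1, 4, 10, 11, 9, 7, 8]: pivot 6, bigElement 10, smallElement 4.
bigElement is not to left of smallElement, terminate process. Swap pivot and smallElement.
a = [4, 2, 3, 5, 1, 6, 10, 11, 9, 7, 8]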

29 Quick Sort
Example) quicksort
- input file: 10 records (26, 5, 37, 1, 61, 11, 59, 15, 48, 19)
- [Figure: simulation of quicksort on this input.]

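30 Quick Sort
    void quicksort(element list[], int left, int right)
    {
        int pivot, i, j;
        element temp;
        if (left < right) {
            i = left;
            j = right + 1;
            pivot = list[left].key;
            do {
                /* scan right for a key >= pivot; the bound check keeps i
                   inside the sublist when the pivot is its largest key */
                do i++; while (i <= right && list[i].key < pivot);
                /* scan left for a key <= pivot; stops at list[left] at the latest */
                do j--; while (list[j].key > pivot);
                if (i < j) SWAP(list[i], list[j], temp);
            } while (i < j);
            SWAP(list[left], list[j], temp);   /* swap the pivot with the j-th cell */
            quicksort(list, left, j - 1);
            quicksort(list, j + 1, right);
        }
    }
[Figure: list[] = (6, 2, 8, 5, 11, 10, 4, 1, 9, 7, 3) with left, right, i, j, and the pivot marked.]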

31 Complexity
Time complexity
- Average case: $O(n \log_2 n)$.
- Suppose each split produces two parts of "equal size", and let $T(n)$ be the average time to sort n records:
  $T(n) \le c \cdot n + 2 \cdot T(n/2)$
  $\le c \cdot n + 2(c \cdot n/2 + 2 \cdot T(n/4))$
  $\le 2 \cdot c \cdot n + 4 \cdot T(n/4)$
  $\cdots$
  $\le c \cdot n \cdot \log_2 n + n \cdot T(1) = O(n \log_2 n)$
- Worst case: $O(n^2)$, when the input list is already sorted.

32 Complexity
T(n) is maximum when either |left| = 0 or |right| = 0 following each partitioning, that is, when the pivot is always the smallest or largest element.
For the worst-case time, T(n) = T(n-1) + cn, n > 1.
Use repeated substitution to get $T(n) = O(n^2)$.
The best case arises when |left| and |right| are equal (or differ by 1) following each partitioning.
So the best-case complexity is O(n log n).
To improve performance, define a small instance to be one with n <= 15 (say) and sort small instances using insertion sort.
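The small-instance refinement can be sketched as follows. This assumes plain `int` keys, the leftmost element as pivot, and a cutoff constant named `SMALL`; `hybrid_quicksort` and `insertion_sort_range` are hypothetical names, not code from the text.

```c
#define SMALL 15   /* assumed cutoff, taken from the "n <= 15 (say)" above */

/* Insertion sort on the closed range a[left..right]. */
static void insertion_sort_range(int a[], int left, int right)
{
    for (int i = left + 1; i <= right; i++) {
        int next = a[i];
        int j = i - 1;
        while (j >= left && a[j] > next) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = next;
    }
}

/* Quick sort on a[left..right] that hands small instances to insertion sort. */
void hybrid_quicksort(int a[], int left, int right)
{
    if (right - left + 1 <= SMALL) {
        insertion_sort_range(a, left, right);
        return;
    }
    int pivot = a[left];
    int i = left, j = right + 1;
    for (;;) {
        do { i++; } while (i <= right && a[i] < pivot);  /* find bigElement */
        do { j--; } while (a[j] > pivot);                /* find smallElement */
        if (i >= j)
            break;
        int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }
    a[left] = a[j];      /* put the pivot between the two groups */
    a[j] = pivot;
    hybrid_quicksort(a, left, j - 1);
    hybrid_quicksort(a, j + 1, right);
}
```

The cutoff does not change the worst case, which is still quadratic for an unlucky pivot sequence; it only trims the overhead of recursing on tiny sublists.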

33 Optimal Sorting Time
How quickly can we sort a list of n objects?
The best possible time: $O(n \log_2 n)$.
[Figure: decision tree for sorting a list $(X_0, X_1, X_2)$. Internal nodes compare $K_0 \le K_1$, $K_1 \le K_2$, and $K_0 \le K_2$, each with Yes and No branches. The six Stop leaves are the permutations [0,1,2], [0,2,1], [2,0,1], [1,0,2], [1,2,0], and [2,1,0].]

34 Optimal Sorting Time
Theorem) Any decision tree that sorts n distinct elements has a height of at least $\log_2(n!) + 1$.
- A decision tree for n elements has n! leaves.
- The number of leaves of a binary tree of height k is at most $2^{k-1}$.
- So the height of the decision tree is at least $\log_2(n!) + 1$.
Theorem) Any algorithm that sorts by comparisons only must have a worst case computing time of $\Omega(n \log_2 n)$.
- $n! \ge (n/2)^{n/2}$
- $\log_2(n!) \ge (n/2) \log_2(n/2) = \Omega(n \log_2 n)$

35 Merge Sort k = 2 First ceil(n/2) elements define one of the smaller instances; remaining floor(n/2) elements define the second smaller instance. Each of the two smaller instances is sorted recursively. The sorted smaller instances are combined using a process called merge. Complexity is O(n log n). Usually implemented nonrecursively.

36 Merge Two Sorted Lists
A = (2, 5, 6)  B = (1, 3, 8, 9, 10)  C = ()
Compare the smallest elements of A and B and merge the smaller into C.
A = (2, 5, 6)  B = (3, 8, 9, 10)  C = (1)

37 Merge Two Sorted Lists
A = (5, 6)  B = (3, 8, 9, 10)  C = (1, 2)
A = (5, 6)  B = (8, 9, 10)  C = (1, 2, 3)
A = (6)  B = (8, 9, 10)  C = (1, 2, 3, 5)

38 Merge Two Sorted Lists
A = ()  B = (8, 9, 10)  C = (1, 2, 3, 5, 6)
When one of A and B becomes empty, append the other list to C.
O(1) time is needed to move an element into C.
Total time is O(n + m), where n and m are, respectively, the number of elements initially in A and B.
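The merge walk-through above corresponds to a short loop. This is a minimal sketch that assumes the lists are plain `int` arrays and that the caller supplies an output array of n + m slots; the name `merge_sorted` is hypothetical.

```c
/* Merge sorted a[0..n-1] and b[0..m-1] into c[0..n+m-1]. */
void merge_sorted(const int a[], int n, const int b[], int m, int c[])
{
    int i = 0, j = 0, k = 0;

    /* move the smaller front element into c until one list is empty */
    while (i < n && j < m)
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* append whatever remains of the other list */
    while (i < n)
        c[k++] = a[i++];
    while (j < m)
        c[k++] = b[j++];
}
```

Every iteration moves exactly one element into c, which is where the O(n + m) total comes from.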

39 Merge Sort
Downward (divide) pass, one recursion level per line:
[8, 3, 13, 6, 2, 14, 5, 9, 10, 1, 7, 12, 4]
[8, 3, 13, 6, 2, 14, 5] [9, 10, 1, 7, 12, 4]
[8, 3, 13, 6] [2, 14, 5] [9, 10, 1] [7, 12, 4]
[8, 3] [13, 6] [2, 14] [5] [9, 10] [1] [7, 12] [4]
[8] [3] [13] [6] [2] [14] [9] [10] [7] [12]

40 Merge Sort
Upward (merge) pass, one recursion level per line:
[8] [3] [13] [6] [2] [14] [9] [10] [7] [12]
[3, 8] [6, 13] [2, 14] [5] [9, 10] [1] [7, 12] [4]
[3, 6, 8, 13] [2, 5, 14] [1, 9, 10] [4, 7, 12]
[2, 3, 5, 6, 8, 13, 14] [1, 4, 7, 9, 10, 12]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14]

41 Time Complexity Let t(n) be the time required to sort n elements. t(0) = t(1) = c, where c is a constant. When n > 1, t(n) = t(ceil(n/2)) + t(floor(n/2)) + dn, where d is a constant. To solve the recurrence, assume n is a power of 2 and use repeated substitution. t(n) = O(n log n)
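The repeated substitution can be written out as follows. This is a sketch under the slide's own assumption that n is a power of 2, written here as $n = 2^k$, so both halves have size n/2; c and d are the constants defined above.

```latex
\begin{align*}
t(n) &= 2\,t(n/2) + dn \\
     &= 4\,t(n/4) + 2dn \\
     &= 8\,t(n/8) + 3dn \\
     &\;\;\vdots \\
     &= 2^k\,t(n/2^k) + k\,dn \\
     &= n\,t(1) + dn\log_2 n \\
     &= cn + dn\log_2 n = O(n \log n)
\end{align*}
```

Each substitution step doubles the number of subproblems and adds one more dn term, and there are $k = \log_2 n$ steps before the subproblems reach size 1.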

42 Merge Sort
Downward pass over the recursion tree.
- Divide large instances into small ones.
Upward pass over the recursion tree.
- Merge pairs of sorted lists.
Number of leaf nodes is n.
Number of non-leaf nodes is n-1.

43 Time Complexity
Downward pass.
- O(1) time at each node.
- O(n) total time at all nodes.
Upward pass.
- O(n) time merging at each level that has a non-leaf node.
- Number of levels is O(log n).
- Total time is O(n log n).

44 Nonrecursive Version Eliminate downward pass. Start with sorted lists of size 1 and do pairwise merging of these sorted lists as in the upward pass.

45 Nonrecursive Merge Sort
[8] [3] [13] [6] [2] [14] [5] [9] [10] [1] [7] [12] [4]
[3, 8] [6, 13] [2, 14] [5, 9] [1, 10] [7, 12] [4]
[3, 6, 8, 13] [2, 5, 9, 14] [1, 7, 10, 12] [4]
[2, 3, 5, 6, 8, 9, 13, 14] [1, 4, 7, 10, 12]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14]

46 Non-recursive Merge Sort
[26] [5] [77] [1] [61] [11] [59] [15] [48] [19]
[5, 26] [1, 77] [11, 61] [15, 59] [19, 48]
[1, 5, 26, 77] [11, 15, 59, 61] [19, 48]
[1, 5, 11, 15, 26, 59, 61, 77] [19, 48]
[1, 5, 11, 15, 19, 26, 48, 59, 61, 77]

47 Merge Sort (1/3)
    void merge(element initList[], element mergedList[], int i, int m, int n)
    {
        /* initList[i:m] and initList[m+1:n] are sorted lists;
           they are merged into the sorted list mergedList[i:n] */
        int j, k, t;
        j = m + 1;   /* index into the second sublist */
        k = i;       /* index into the merged list */
        while (i <= m && j <= n) {
            if (initList[i].key <= initList[j].key)
                mergedList[k++] = initList[i++];
            else
                mergedList[k++] = initList[j++];
        }
        if (i > m)
            /* mergedList[k:n] = initList[j:n] */
            for (t = j; t <= n; t++)
                mergedList[t] = initList[t];
        else
            /* mergedList[k:n] = initList[i:m] */
            for (t = i; t <= m; t++)
                mergedList[k + t - i] = initList[t];
    }
[Figure: initList[] segments [i:m] and [m+1:n], scanned by i and j, merged into mergedList[i:n] at index k.]

48 Merge Sort (2/3)
    void mergePass(element initList[], element mergedList[], int n, int s)
    {
        /* adjacent pairs of sublists of length s are merged from
           initList into mergedList; n is the number of records in initList */
        int i, j;
        for (i = 1; i <= n - 2 * s + 1; i += 2 * s)
            merge(initList, mergedList, i, i + s - 1, i + 2 * s - 1);
        if ((i + s - 1) < n)
            /* one full segment and a shorter one remain */
            merge(initList, mergedList, i, i + s - 1, n);
        else
            /* at most one segment remains: copy it */
            for (j = i; j <= n; j++)
                mergedList[j] = initList[j];
    }

    void mergeSort(element a[], int n)
    {
        /* sort a[1:n] */
        int s = 1;   /* current segment size */
        element extra[MAX_SIZE];
        while (s < n) {
            mergePass(a, extra, n, s);
            s *= 2;
            mergePass(extra, a, n, s);
            s *= 2;
        }
    }
[Figure: a[] divided into segments of size s, with i, i+s-1, i+s, and i+2s-1 marked; each pair is merged into one segment of size 2s in extra[].]

49 [Figure: merge passes on the input (26, 5, 77, 1, 61, 11, 59, 15, 48, 19) stored in a[1] to a[10], with the segment start positions for each pass ([1] to [10]; [1], [4], [5], [8], [9], [10]; [1], [8], [9], [10]; [1], [10]), the loop bound n-2*s+1, and the two end cases i+s-1 > n and i+s-1 < n.]

50 Complexity
Sorted segment size is 1, 2, 4, 8, ...
Number of merge passes is $\lceil \log_2 n \rceil$.
Each merge pass takes O(n) time.
Total time is O(n log n).
Need O(n) additional space for the merge.
Merge sort is slower than insertion sort when n <= 15 (approximately). So define a small instance to be an instance with n <= 15.
Sort small instances using insertion sort.
Start with segment size = 15.

51 Recursive Merge Sort
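A recursive version following the earlier k = 2 description (first ceil(n/2) elements, remaining floor(n/2) elements, then merge) might look like this. It is a minimal sketch assuming plain `int` keys, 0-based closed ranges, and a caller-supplied scratch array `tmp` of the same length as `a`; the function names are hypothetical.

```c
#include <string.h>

/* Merge sorted a[left..mid] and a[mid+1..right] through tmp. */
static void merge_halves(int a[], int tmp[], int left, int mid, int right)
{
    int i = left, j = mid + 1, k = left;

    while (i <= mid && j <= right)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)
        tmp[k++] = a[i++];
    while (j <= right)
        tmp[k++] = a[j++];
    memcpy(a + left, tmp + left, (size_t)(right - left + 1) * sizeof a[0]);
}

/* Sort a[left..right]; tmp must have room for the same index range. */
void merge_sort_recursive(int a[], int tmp[], int left, int right)
{
    if (left >= right)
        return;                              /* small instance: n <= 1 */
    int mid = left + (right - left) / 2;     /* left part gets ceil(n/2) */
    merge_sort_recursive(a, tmp, left, mid);
    merge_sort_recursive(a, tmp, mid + 1, right);
    merge_halves(a, tmp, left, mid, right);
}
```

Passing one scratch array down the recursion keeps the extra space at O(n) instead of allocating at every level.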

52 Natural Merge Sort
Initial sorted segments are the naturally occurring sorted segments in the input.
Input = [8, 9, 10, 2, 5, 7, 9, 11, 13, 15, 6, 12, 14].
Initial segments are: [8, 9, 10] [2, 5, 7, 9, 11, 13, 15] [6, 12, 14].
2 (instead of 4) merge passes suffice.
Segment boundaries have a[i] > a[i+1].
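The segment count and the resulting number of passes can be checked with a short sketch. It assumes plain `int` keys; `count_natural_runs` and `merge_passes_needed` are hypothetical helper names, and the pass count assumes each pass merges adjacent segments pairwise.

```c
/* Count the naturally occurring sorted segments in a[0..n-1].
   A new segment starts wherever a[i] > a[i+1]. */
int count_natural_runs(const int a[], int n)
{
    if (n <= 0)
        return 0;
    int runs = 1;
    for (int i = 0; i + 1 < n; i++)
        if (a[i] > a[i + 1])     /* segment boundary */
            runs++;
    return runs;
}

/* Number of pairwise merge passes needed to reduce the given number
   of segments to one. */
int merge_passes_needed(int segments)
{
    int passes = 0;
    while (segments > 1) {
        segments = (segments + 1) / 2;   /* an odd segment is carried over */
        passes++;
    }
    return passes;
}
```

For the 13-element input above, starting from 3 natural segments instead of 13 segments of size 1 is what reduces the pass count.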

53 Natural Merge Sort
[26] [5, 77] [1, 61] [11, 59] [15, 48] [19]
[5, 26, 77] [1, 11, 59, 61] [15, 19, 48]
[1, 5, 11, 26, 59, 61, 77] [15, 19, 48]
[1, 5, 11, 15, 19, 26, 48, 59, 61, 77]

54 Heap Sort
Utilize the max heap structure.
Implement the max heap by using an array.
Time complexity
- average case: $O(n \log_2 n)$
- worst case: $O(n \log_2 n)$
Adjust the binary tree to establish the heap.
- time: O(d), where d is the depth of the tree

55 Heap Sort
Example) heap sorting process
- input list: (26, 5, 77, 1, 61, 11, 59, 15, 48, 19)
- array positions [1] to [10]: 26 5 77 1 61 11 59 15 48 19
[Figure: the array interpreted as a binary tree. Root 26 at [1]; 5 at [2] and 77 at [3]; 1 at [4], 61 at [5], 11 at [6], 59 at [7]; 15 at [8], 48 at [9], 19 at [10].]

56 Heap Sort
Initial max heap construction
[Figure: the max heap built from the input. Root 77; second level 61, 59; third level 48, 19, 11, 26; leaves 15, 1, 5.]

57 Heap Sort
    void adjust(element a[], int root, int n)
    {
        /* sift a[root] down so that the subtree rooted there is a max heap */
        int child, rootkey;
        element temp;
        temp = a[root];
        rootkey = a[root].key;
        child = 2 * root;   /* left child */
        while (child <= n) {
            if ((child < n) && (a[child].key < a[child + 1].key))
                child++;    /* pick the larger child */
            if (rootkey > a[child].key)
                break;
            else {
                a[child / 2] = a[child];   /* move the child up */
                child *= 2;
            }
        }
        a[child / 2] = temp;
    }

    void heapSort(element a[], int n)
    {
        /* sort a[1:n] */
        int i;
        element temp;
        for (i = n / 2; i > 0; i--)   /* initial heap construction */
            adjust(a, i, n);
        for (i = n - 1; i > 0; i--) {
            SWAP(a[1], a[i + 1], temp);
            adjust(a, 1, i);
        }
    }
[Figure: the input array as a binary tree with positions [1] to [10].]

58 Heap Sort
Initial max heap, positions [1] to [10]: 77 61 59 48 19 11 26 15 1 5
Swap 77 with the last element (5) and adjust the remaining 9 elements:
heap: 61 48 59 15 19 11 26 5 1 | sorted: 77

59 Heap Sort
heap: 59 48 26 15 19 11 1 5 | sorted: 61 77
heap: 48 19 26 15 5 11 1 | sorted: 59 61 77

60 Heap Sort
heap: 26 19 11 15 5 1 | sorted: 48 59 61 77

61 Heap Sort
    void adjust(element list[], int root, int n)
    {
        int child, rootkey;
        element temp;
        temp = list[root];
        rootkey = list[root].key;
        child = 2 * root;   /* left child */
        while (child <= n) {
            if ((child < n) && (list[child].key < list[child + 1].key))
                child++;
            if (rootkey > list[child].key)
                break;
            else {
                list[child / 2] = list[child];
                child *= 2;
            }
        }
        list[child / 2] = temp;
    }

    void heapsort(element list[], int n)
    {
        /* perform a heapsort on the array list[1:n] */
        int i;
        element temp;
        for (i = n / 2; i > 0; i--)   /* initial heap construction */
            adjust(list, i, n);
        for (i = n - 1; i > 0; i--) {   /* heap adjust */
            SWAP(list[1], list[i + 1], temp);
            adjust(list, 1, i);
        }
    }

62 Heap Sort
Worst case time complexity: $\log_2 n + \log_2 (n-1) + \cdots + \log_2 2 = O(n \log_2 n)$
[Figure: a heap of height $h = \log_2 n$.]

63 Radix Sort
A kind of distributive sort.
Repeat the following 3 steps:
1) comparison (from the least significant digit to the most significant digit)
2) distribution
3) merging
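One way to realize the three steps is a least-significant-digit radix sort in base 10. The sketch below assumes non-negative `int` keys and uses a counting pass per digit for the distribution and merging steps, which is a substitution for whatever bucket structure the slides had in mind; the name `radix_sort` is hypothetical.

```c
#include <stdlib.h>

/* Sort non-negative integers a[0..n-1] by decimal digits, least
   significant digit first. Returns 0 on success, -1 if the scratch
   array cannot be allocated. */
int radix_sort(int a[], int n)
{
    if (n <= 1)
        return 0;
    int *out = malloc((size_t)n * sizeof *out);
    if (out == NULL)
        return -1;

    int max = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];

    /* one pass per digit; long long keeps exp from overflowing */
    for (long long exp = 1; max / exp > 0; exp *= 10) {
        int count[10] = {0};
        for (int i = 0; i < n; i++)              /* examine current digit */
            count[(a[i] / exp) % 10]++;
        for (int d = 1; d < 10; d++)             /* bucket end positions */
            count[d] += count[d - 1];
        for (int i = n - 1; i >= 0; i--)         /* distribute, stable */
            out[--count[(a[i] / exp) % 10]] = a[i];
        for (int i = 0; i < n; i++)              /* merge back into a */
            a[i] = out[i];
    }
    free(out);
    return 0;
}
```

Scanning from the right during distribution keeps each pass stable, which is what lets the order established by earlier digits survive later passes.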

64 Radix Sort Example) radix sort

65 Summary of Internal Sorting
Insertion sort works well when
- the list is already partially sorted
- n is small
Merge sort
- has the best worst case behavior
- requires more storage than heap sort
- has slightly more overhead than quick sort
Quick sort
- has the best average behavior
- has worst case behavior of $O(n^2)$
In practice, combine insertion sort, quick sort and merge sort so that
- merge sort uses quick sort for sublists of size < 45
- quick sort uses insertion sort for sublists of size < 20

