
E.G.M. Petrakis

Sorting
• Put data in order based on a primary key
• Many methods
• Internal sorting: data in arrays in main memory
• External sorting: data in files on disk

Methods
• Exchange sort
  - Bubble sort
  - Quicksort
• Selection sort
  - Straight selection sort
  - Binary tree sort
• Insertion sort
  - Simple (linear) insertion sort
  - Shell sort
  - Address calculation sort
• Merge sort
• Radix sort

Bubble Sort
• Bubble sort: pass through the array several times
  - compare x[j] with x[j-1]
  - swap x[j] and x[j-1] if they are not in the proper order
• The least efficient algorithm!

    for (i = 0; i < n - 1; i++)
        for (j = n - 1; j > i; j--)
            if (x[j] < x[j-1])
                swap(x[j], x[j-1]);

[Figure: contents of the array x after each of the n-1 passes]

Observations
• After the first iteration the maximum element is in its proper position (last) when the array is scanned left to right; a right-to-left scan, as in the loop above, instead places the minimum element first
• After the i-th iteration the elements in positions n-i+1, …, n (counting from 1) are in their proper positions
  - they need not be examined again
  - if no element changes position during an iteration, the array is already sorted
• At most n-1 iterations sort the entire array

Complexity
• Worst case: n-i comparisons at step i, n(n-1)/2 = O(n^2) in total
• Average case: k < n-1 iterations to sort the array
• Best case: the input is already sorted
  - takes O(n) comparisons
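The O(n) best case relies on stopping as soon as a pass makes no swap. A minimal sketch of that variant, scanning left to right so that the maximum settles last; the function name bubble_sort and its pass-count return value are assumptions made for illustration, not part of the slides:

```c
#include <stdbool.h>
#include <stddef.h>

/* Bubble sort with early exit (illustrative sketch).
   Returns the number of passes actually performed. */
size_t bubble_sort(int x[], size_t n)
{
    size_t passes = 0;
    bool swapped = true;

    for (size_t i = 0; i + 1 < n && swapped; i++) {
        swapped = false;
        /* positions n-i .. n-1 already hold their final values */
        for (size_t j = 1; j < n - i; j++) {
            if (x[j] < x[j - 1]) {
                int t = x[j];
                x[j] = x[j - 1];
                x[j - 1] = t;
                swapped = true;
            }
        }
        passes++;
    }
    return passes;
}
```

On already sorted input the first pass makes no swap, so the function returns after a single pass of n-1 comparisons.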

Quicksort
• Each iteration puts one element (the pivot) in its proper position
• Let a = x[0] be the pivot
• a is in its proper position j if:
  - x[i] <= a for all i in [0, j-1]
  - x[i] > a for all i in [j+1, n-1]
• Example: with pivot a = x[0] = 25 the array becomes (12) 25 (elements larger than 25)

Algorithm
• Repeat the same for the two sub-arrays to the left and to the right of the pivot until the sub-arrays have size 1

    quicksort(int lb, int ub) {
        int j;                      /* final position of the pivot */
        if (lb < ub) {
            partition(lb, ub, &j);
            quicksort(lb, j - 1);
            quicksort(j + 1, ub);
        }
    }

• quicksort(0, n-1) sorts the entire array!

[Figure: quicksort trace. Each step places one pivot in its final position, e.g. 12 25 (48 37 33) 57 (92 86), then 12 25 (37 33) 48 57 (92 86), then 12 25 (33) 37 48 57 (92 86), and so on until the array is sorted.]

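    partition(int lb, int ub, int *j) {
        int a = x[lb];              /* pivot value */
        int up = ub;
        int down = lb;
        do {
            while (down < ub && x[down] <= a) down++;   /* skip elements <= pivot */
            while (x[up] > a) up--;                     /* skip elements > pivot */
            if (up > down) swap(x[down], x[up]);
        } while (up > down);
        swap(x[up], x[lb]);         /* move the pivot to its final position */
        *j = up;
    }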

[Figure: partition trace with pivot a = x[lb] = 25. The down and up pointers move toward each other; while up > down the out-of-place pair is exchanged: swap(12, 57). Once up < down the scan stops and the pivot is moved into place: swap(12, 25).]
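The partition scheme traced above can be made self-contained by passing the array explicitly and returning the pivot position. The following is a sketch under those assumptions; the helper swap_int and the int-index signatures are illustrative choices, not the slides' interface:

```c
/* Exchange two ints (hypothetical helper). */
static void swap_int(int *p, int *q)
{
    int t = *p;
    *p = *q;
    *q = t;
}

/* Partition x[lb..ub] around the pivot x[lb];
   return the pivot's final index. */
static int partition(int x[], int lb, int ub)
{
    int a = x[lb];
    int down = lb, up = ub;

    while (down < up) {
        while (down < ub && x[down] <= a) down++;  /* first element > pivot */
        while (x[up] > a) up--;                    /* last element <= pivot */
        if (down < up) swap_int(&x[down], &x[up]);
    }
    swap_int(&x[lb], &x[up]);
    return up;
}

/* Sort x[lb..ub] in place. */
void quicksort(int x[], int lb, int ub)
{
    if (lb < ub) {
        int j = partition(x, lb, ub);
        quicksort(x, lb, j - 1);
        quicksort(x, j + 1, ub);
    }
}
```

For the array 25 57 48 37 12 92 86 33, the first call to partition performs the two swaps named in the trace, (12, 57) and then (12, 25), and returns index 1.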

Complexity
• Average case: let n = 2^m, so m = log n, and assume that the pivot goes to the middle of the array
  - "partition" splits the array into two arrays of equal size
  - comparisons: n + 2(n/2) + 4(n/4) + … + n(n/n) ≈ n m = n log n
  - complexity O(n log n)
• Worst case: sorted input
  - "partition" splits the array into arrays of size 1 and n-1
  - comparisons: n + (n-1) + … + 2, so complexity O(n^2)
• Best case: none!
• Advantages: very fast, suitable for files
  - locality of reference
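The two comparison counts can be written out as sums. This is a sketch under the slide's assumptions (n = 2^m, about one comparison per element in each call to partition, and m levels of halving counted); the level index i is introduced here for illustration:

```latex
% Balanced splits: level i has 2^i sub-arrays of size n / 2^i
\sum_{i=0}^{m-1} 2^{i} \cdot \frac{n}{2^{i}} \;=\; n\,m \;=\; n \log_2 n
\quad\Longrightarrow\quad O(n \log n)

% Sorted input: each call removes only the pivot
\sum_{i=2}^{n} i \;=\; \frac{n(n+1)}{2} - 1
\quad\Longrightarrow\quad O(n^{2})
```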

Selection Sort
• Iteratively select the max element and put it in its proper position
• Complexity: O(n^2) always!
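A minimal sketch of straight selection sort as described: each round finds the maximum of the unsorted prefix and swaps it into the last unsorted slot. The name selection_sort is an assumption for illustration:

```c
#include <stddef.h>

/* Straight selection sort (illustrative sketch). */
void selection_sort(int x[], size_t n)
{
    for (size_t last = n; last > 1; last--) {
        size_t max = 0;                 /* index of the largest of x[0..last-1] */
        for (size_t j = 1; j < last; j++)
            if (x[j] > x[max])
                max = j;
        int t = x[max];                 /* one exchange per round */
        x[max] = x[last - 1];
        x[last - 1] = t;
    }
}
```

The inner loop always runs to the end of the unsorted part, which is why the number of comparisons is quadratic regardless of the input, while the number of exchanges is at most n-1.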

Heapsort
• Put the elements in a priority queue, then iteratively select the max element and put it in its proper position
• Generalization of selection sort

    set pq;
    /* for i = 1 to n pqinsert(pq, x[i]); */    not needed
    for i = 1 to n
        pqmaxdelete(pq);

• Complexity: build a heap + select n elements from the heap: O(n) + O(n log n)
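One common way to realize this without a separate priority queue is to build a max-heap inside the array itself and then repeatedly move the root to the end. The sketch below assumes that in-place layout (children of index i at 2i+1 and 2i+2); sift_down and heap_sort are illustrative names, not the slides' pqinsert/pqmaxdelete interface:

```c
#include <stddef.h>

/* Restore the max-heap property for the subtree rooted at root,
   considering only x[0..n-1]. */
static void sift_down(int x[], size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && x[child + 1] > x[child]) child++;  /* larger child */
        if (x[root] >= x[child]) return;
        int t = x[root];
        x[root] = x[child];
        x[child] = t;
        root = child;
    }
}

void heap_sort(int x[], size_t n)
{
    /* Build phase: heapify bottom-up, O(n) overall. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(x, i, n);

    /* Selection phase: n-1 delete-max steps, O(log n) each. */
    for (size_t end = n; end > 1; end--) {
        int t = x[0];
        x[0] = x[end - 1];
        x[end - 1] = t;
        sift_down(x, 0, end - 1);
    }
}
```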

Binary Tree Sort
• Put the elements in a binary search tree and traverse it in inorder
• Average case: build + traverse the tree, O(n log n) + O(n)
• Worst case: initially sorted array, O(n^2)
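A minimal sketch of the two phases, assuming an unbalanced binary search tree with heap-allocated nodes; struct node, insert, inorder and tree_sort are illustrative names. The inorder pass writes the keys back into the array and frees each node as it goes:

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *left, *right;
};

/* Insert key into the tree rooted at t; duplicates go right. */
static struct node *insert(struct node *t, int key)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        if (t == NULL) abort();         /* out of memory: give up in this sketch */
        t->key = key;
        t->left = t->right = NULL;
    } else if (key < t->key) {
        t->left = insert(t->left, key);
    } else {
        t->right = insert(t->right, key);
    }
    return t;
}

/* Inorder traversal: append keys to out starting at pos, free the nodes. */
static size_t inorder(struct node *t, int out[], size_t pos)
{
    if (t == NULL) return pos;
    pos = inorder(t->left, out, pos);
    out[pos++] = t->key;
    pos = inorder(t->right, out, pos);
    free(t);
    return pos;
}

void tree_sort(int x[], size_t n)
{
    struct node *root = NULL;
    for (size_t i = 0; i < n; i++)
        root = insert(root, x[i]);
    inorder(root, x, 0);
}
```

With sorted input every insertion walks down a single chain of right children, which is the O(n^2) worst case.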

Insertion Sort
• Insert all elements, one after the other, into a sorted array

[Figure: x[0] sorted, then x[0..1] sorted, then x[0..2] sorted; to make room for each new element, the larger elements move right by one position]

Complexity
• Best case: sorted input, O(n)
  - one comparison per element
• Worst case: input sorted in reverse order, O(n^2)
  - i comparisons and i moves for the i-th element
• Average case: random input, O(n^2)
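A minimal sketch of simple (linear) insertion sort that shows where the best and worst cases come from; the name insertion_sort is an assumption for illustration:

```c
#include <stddef.h>

/* Simple insertion sort (illustrative sketch). */
void insertion_sort(int x[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = x[i];                 /* element to insert into x[0..i-1] */
        size_t j = i;
        /* shift larger elements one position to the right */
        while (j > 0 && x[j - 1] > key) {
            x[j] = x[j - 1];
            j--;
        }
        x[j] = key;
    }
}
```

On sorted input the while condition fails immediately for every element (one comparison each); on reverse-sorted input every earlier element is shifted.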

Shell Sort
• Sort parts of the input separately
  - part: every k-th element (k = 7, 5, 3, 1)
  - k parts: sort the k parts
  - decrease k: get larger parts and sort again
  - repeat until k = 1: the entire array is sorted!
• Each part is more or less sorted
  - sort the parts using an algorithm that performs well on already sorted input (e.g., insertion sort)

[Figure: Shell sort with k = 5, 3, 1: input; 1st step (k = 5); 2nd step (k = 3); 3rd step (k = 1); sorted output]

Observation
• The smaller k is, the larger the parts are
  - 1st part: x[0], x[0+k], x[0+2k], …
  - 2nd part: x[1], x[1+k], x[1+2k], …
  - j-th part: x[j-1], x[j-1+k], x[j-1+2k], …
  - k-th part: x[k-1], x[2k-1], x[3k-1], …
  - i-th element of the j-th part: x[(i-1)k + j - 1]
• Average case O(n^1.5), difficult to analyze
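A sketch of Shell sort that insertion-sorts all k parts in a single sweep for each increment k. The caller supplies the decreasing increment sequence (for example 5, 3, 1), which must end in 1; the function name and parameter list are assumptions for illustration:

```c
#include <stddef.h>

/* Shell sort with a caller-supplied increment sequence (illustrative sketch).
   gaps[] must be decreasing and end with 1. */
void shell_sort(int x[], size_t n, const size_t gaps[], size_t ngaps)
{
    for (size_t g = 0; g < ngaps; g++) {
        size_t k = gaps[g];
        /* insertion sort on each part x[p], x[p+k], x[p+2k], ... */
        for (size_t i = k; i < n; i++) {
            int key = x[i];
            size_t j = i;
            while (j >= k && x[j - k] > key) {
                x[j] = x[j - k];
                j -= k;
            }
            x[j] = key;
        }
    }
}
```

Usage might look like `size_t gaps[] = {5, 3, 1}; shell_sort(x, n, gaps, 3);`. The final pass with k = 1 is an ordinary insertion sort on input that is already nearly sorted.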

Mergesort
• Mergesort: split the input into small parts, sort and merge the parts
  - parts of 1 element are sorted!
  - sorting of the parts is not necessary!
  - split the input into parts of size n/2, n/4, …, 1
  - m = log n levels of merging
• Complexity: always O(n log n)
  - needs O(n) additional space

    [25] [57] [48] [37] [12] [92] [86] [33]
    [25 57] [37 48] [12 92] [33 86]
    [25 37 48 57] [12 33 86 92]
    [12 25 33 37 48 57 86 92]
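The example merges parts of size 1, then 2, then 4, which corresponds to a bottom-up formulation. A minimal sketch under that assumption, using one auxiliary array of n elements; merge and merge_sort are illustrative names, and the int return code is a choice made here to report allocation failure:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs src[lo..mid-1] and src[mid..hi-1] into dst[lo..hi-1]. */
static void merge(const int src[], int dst[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
    while (i < mid) dst[k++] = src[i++];
    while (j < hi)  dst[k++] = src[j++];
}

/* Bottom-up mergesort; returns 0 on success, -1 if memory is unavailable. */
int merge_sort(int x[], size_t n)
{
    if (n < 2) return 0;
    int *aux = malloc(n * sizeof *aux);     /* the O(n) additional space */
    if (aux == NULL) return -1;

    for (size_t width = 1; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = lo + width < n ? lo + width : n;
            size_t hi  = lo + 2 * width < n ? lo + 2 * width : n;
            merge(x, aux, lo, mid, hi);
        }
        memcpy(x, aux, n * sizeof *x);      /* one full pass per level */
    }
    free(aux);
    return 0;
}
```

Each pass touches all n elements and the run width doubles every pass, giving about log n passes.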

Radix Sort
• Radix sort: sorting based on digit values
  - put the elements in 10 queues based on the least significant digit
  - output the elements from the 1st up to the 10th queue
  - reinsert them in the queues based on the next more significant digit
  - repeat until all digits are exhausted
• m: the maximum number of digits must be known
• Complexity: always O(mn)
  - additional space: O(n)

[Figure: the ten queues (front to rear) after each pass. First pass, least significant digit: queue[3]: 33, queue[5]: 25, queue[6]: 86, queue[8]: 48, … Second pass, next digit: queue[1]: 12, queue[2]: 25, queue[4]: 48, queue[5]: 57, queue[8]: 86, queue[9]: 92, … followed by the sorted output.]
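A sketch of the digit-by-digit passes for non-negative decimal keys. Instead of ten linked queues it assumes a single auxiliary array in which each queue occupies a contiguous region (computed by counting); elements still enter and leave each queue in first-in, first-out order, which is what keeps the passes stable. The name radix_sort and its parameters are assumptions for illustration:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Least-significant-digit radix sort for keys with at most m decimal digits.
   Returns 0 on success, -1 if memory is unavailable. */
int radix_sort(unsigned x[], size_t n, unsigned m)
{
    if (n == 0) return 0;
    unsigned *aux = malloc(n * sizeof *aux);    /* O(n) additional space */
    if (aux == NULL) return -1;

    unsigned long long div = 1;                 /* 1, 10, 100, ... */
    for (unsigned d = 0; d < m; d++, div *= 10) {
        size_t start[11] = {0};

        /* count the elements destined for each queue */
        for (size_t i = 0; i < n; i++)
            start[(x[i] / div) % 10 + 1]++;
        /* start[q] becomes the first slot of queue q */
        for (int q = 0; q < 10; q++)
            start[q + 1] += start[q];
        /* distribute in input order, so each queue stays FIFO */
        for (size_t i = 0; i < n; i++)
            aux[start[(x[i] / div) % 10]++] = x[i];
        /* reading aux front to back outputs queue 0, then queue 1, ... */
        memcpy(x, aux, n * sizeof *x);
    }
    free(aux);
    return 0;
}
```

Each of the m passes does a constant amount of work per element, matching the O(mn) bound.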

External sorting
• Sorting of large files
  - the data don't fit in main memory
  - minimize the number of disk accesses
• Standard method: mergesort
  - split the input file into small files
  - sort the small files (e.g., using quicksort)
  - these files must fit in main memory
  - merge the sorted files (on disk)
  - the merged file doesn't fit in memory

    [25] [57] [48] [37] [12] [92] [86] [33]
    [25 57] [37 48] [12 92] [33 86]
    [25 37 48 57] [12 33 86 92]
    [12 25 33 37 48 57 86 92]

Example from main memory. Complexity: O(n log n)

Observations
• Initially m parts of size > 1
• log m + 1 levels of merging
• Complexity: O(n log_2 m)
• k-way merging: the parts are merged in groups of k

    k = 4:
    [25] [57] [48] [37] [12] [92] [86] [33]
    [25 37 48 57] [12 33 86 92]

External K-Way Merging
• Disk accesses:
  - initially m data pages (parts), merged in groups of k
  - log_k m levels of merging (passes over the file)
  - m accesses at every level
  - m log_k m accesses in total
• Comparisons:
  - 2-way merge: n comparisons at each level
  - k-way merge: (k-1) n comparisons at each level (fewer than kn)
  - (k-1) n log_k m = ((k-1) / log_2 k) n log_2 m comparisons in total, which is ((k-1) / log_2 k) n log_2 n when the initial parts hold one element each (m = n)
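The comparison count for a k-way merge can be seen in a small in-memory sketch: every output element is chosen by scanning the current heads of the k runs, which costs at most k-1 key comparisons. The layout (all runs stored back to back in one array, with run r occupying in[start[r]] up to in[start[r+1]-1]) and the name kway_merge are assumptions for illustration; a real external merge would read the runs from disk page by page:

```c
#include <stddef.h>
#include <stdlib.h>

/* Merge k >= 1 sorted runs into out[]. Returns the number of key
   comparisons made, or (size_t)-1 if memory is unavailable. */
size_t kway_merge(const int in[], const size_t start[], size_t k, int out[])
{
    size_t *head = malloc(k * sizeof *head);    /* next unread element per run */
    if (head == NULL) return (size_t)-1;
    for (size_t r = 0; r < k; r++)
        head[r] = start[r];

    size_t total = start[k] - start[0];
    size_t comparisons = 0;

    for (size_t pos = 0; pos < total; pos++) {
        size_t best = k;                        /* k means "no run chosen yet" */
        for (size_t r = 0; r < k; r++) {
            if (head[r] == start[r + 1]) continue;      /* run r is exhausted */
            if (best == k) { best = r; continue; }
            comparisons++;
            if (in[head[r]] < in[head[best]])
                best = r;
        }
        out[pos] = in[head[best]++];
    }
    free(head);
    return comparisons;
}
```

Replacing the linear scan with a small heap over the k heads would bring the cost per output element down to about log_2 k comparisons; the scan is kept here because it matches the per-level count discussed above.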

Sorting algorithms: complexity

    Algorithm         Average        Worst case
    Bubble sort       O(n^2)         O(n^2)
    Insertion sort    O(n^2)         O(n^2)
    Selection sort    O(n^2)         O(n^2)
    Quicksort         O(n log n)     O(n^2)
    Shell sort        O(n^1.5)       ?
    Heapsort          O(n log n)     O(n log n)
    Mergesort         O(n log n)     O(n log n)


Observation
• Selection sort: only O(n) transpositions
• Mergesort: O(n) additional space
• Quicksort: recursive, needs space; eliminate the recursion

Theorem
• Sorting n elements using key comparisons only takes Ω(n log n) time in the worst case
  - n: number of elements
• Proof: the process of sorting is represented by a decision tree
  - internal nodes correspond to key comparisons, with one arc per outcome (yes / no)
  - leaves correspond to orderings of the keys

Decision tree: example with input (k1, k2, k3)

    k1 <= k2 ?  (1,2,3)
      yes: k2 <= k3 ?  (1,2,3)
             yes: stop (1,2,3)
             no:  k1 <= k3 ?  (1,3,2)
                    yes: stop (1,3,2)
                    no:  stop (3,1,2)
      no:  k1 <= k3 ?  (2,1,3)
             yes: stop (2,1,3)
             no:  k2 <= k3 ?  (2,3,1)
                    yes: stop (2,3,1)
                    no:  stop (3,2,1)

Complexity
• The decision tree has at least n! leaves, one for each possible ordering of the input
• d: minimum depth of the decision tree (full tree)
  - d: number of comparisons in the worst case
  - at most 2^d leaves in a full binary tree of depth d
• 2^d >= n!, so d >= log_2 n!
• Given n! >= (n/2)^(n/2): log_2 n! >= (n/2) log_2 (n/2), so d is in Ω(n log n)
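The chain of inequalities can be written out step by step. This is a sketch of the standard argument; the bound on n! keeps only the largest factors, of which there are at least n/2 and each is at least n/2:

```latex
2^{d} \;\ge\; n!
\quad\Longrightarrow\quad
d \;\ge\; \log_2 n!

n! \;=\; \prod_{i=1}^{n} i
\;\ge\; \prod_{i=\lceil n/2 \rceil}^{n} i
\;\ge\; \left(\frac{n}{2}\right)^{n/2}

d \;\ge\; \log_2 n!
\;\ge\; \frac{n}{2}\,\log_2 \frac{n}{2}
\;=\; \Omega(n \log n)
```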