Sorting
15-211 Fundamental Data Structures and Algorithms
Aleks Nanevski, February 17, 2004

Announcements
- Reading: Chapter 8
- Quiz #1 available on Thursday
- The quiz will be much easier to complete if you solved Homework 5

Introduction to Sorting

The Problem
We are given a sequence of items a1, a2, a3, …, an-1, an. We want to rearrange them so that they are in non-decreasing order. More precisely, we need a permutation f such that af(1) ≤ af(2) ≤ af(3) ≤ … ≤ af(n-1) ≤ af(n).

A Constraint: Comparison-Based Sorting
While we are rearranging the items, we will only use queries of the form ai ≤ aj, or variants thereof (<, >, ≥, and so forth).

Comparison-based sorting
The important point here is that the algorithm can only make comparisons such as if (a[i] < a[j]) … We are not allowed to look at pieces of the elements a[i] and a[j]. For example, if these elements are numbers, we are not allowed to compare their most significant digits.

Flips and inversions
Two elements of an array are inverted if A[i] > A[j] for i < j. A flip is an inversion of two consecutive elements.

Example
In the list 3 2 1 6 5 4 9 8 7 there are 9 inversions: (3, 2), (2, 1), (3, 1), (6, 5), (5, 4), (6, 4), (9, 8), (8, 7), (9, 7). Of these, 6 are flips: (3, 2), (2, 1), (6, 5), (5, 4), (9, 8), (8, 7).
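These counts can be checked with a brute-force counter. Here is a minimal Java sketch (class and method names are illustrative, not from the lecture):

```java
public class InversionCount {
    // Count pairs (i, j) with i < j and a[i] > a[j].
    public static int inversions(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    // Count flips: inversions of two consecutive elements.
    public static int flips(int[] a) {
        int count = 0;
        for (int i = 0; i + 1 < a.length; i++)
            if (a[i] > a[i + 1]) count++;
        return count;
    }
}
```

For the list 3 2 1 6 5 4 9 8 7 these return 9 and 6, matching the example.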

Naïve sorting algorithms
- Bubble sort: scan for flips, until all are fixed. What is the running time? O(n²)
- Etc.
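The "scan for flips until all are fixed" idea can be written out directly; a hedged Java sketch (names are my own):

```java
public class BubbleSort {
    // Repeatedly scan for flips (adjacent inversions) and fix them by
    // swapping; stop when a full pass finds no flip. O(n^2) in the worst case.
    public static void sort(int[] a) {
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int i = 0; i + 1 < a.length; i++) {
                if (a[i] > a[i + 1]) {        // found a flip
                    int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                    swapped = true;
                }
            }
        }
    }
}
```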

Insertion sort
for i = 1 to n-1 do
    insert a[i] in the proper place in a[0:i-1]
Invariant: after step i, the subarray a[0:i] is sorted.

Insertion sort
(figure: the sorted subarray grows from the left as each element is inserted)

How fast is insertion sort?
To insert a[i] into a[0:i-1], slide all elements larger than a[i] to the right:

tmp = a[i];
for (j = i; j > 0 && a[j-1] > tmp; j--)
    a[j] = a[j-1];
a[j] = tmp;

This takes O(#inversions) sliding steps, so it is very fast if the array is nearly sorted to begin with.
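Wrapping the slide's inner loop in the outer loop from the pseudocode gives a complete insertion sort; a Java sketch (class name is illustrative):

```java
public class InsertionSort {
    public static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int tmp = a[i];                     // element to insert
            int j = i;
            // Slide larger elements of the sorted prefix a[0:i-1] right.
            for (; j > 0 && a[j - 1] > tmp; j--)
                a[j] = a[j - 1];
            a[j] = tmp;                         // drop it into place
        }
    }
}
```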

How many inversions?
Consider the worst case: the array is in reverse order. Then every pair is inverted, so there are 1 + 2 + … + (n-2) + (n-1) = n(n-1)/2 inversions.

How many inversions?
What about the average case? Consider a permutation p = x1 x2 x3 … xn-1 xn. For any such p, let rev(p) be its reversal. Then each pair (xi, xj) is an inversion in exactly one of p and rev(p). There are n(n-1)/2 pairs (xi, xj), hence the average number of inversions in a permutation is n(n-1)/4, which is O(n²).
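The n(n-1)/4 average can be checked by brute force over all permutations of a small n. A sketch (helper names are hypothetical):

```java
public class AvgInversions {
    static int inversions(int[] a) {
        int c = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) c++;
        return c;
    }

    // Sum inversion counts over all permutations of a[k..n-1].
    static long total(int[] a, int k) {
        if (k == a.length) return inversions(a);
        long sum = 0;
        for (int i = k; i < a.length; i++) {
            int t = a[k]; a[k] = a[i]; a[i] = t;   // choose a[i] for slot k
            sum += total(a, k + 1);
            t = a[k]; a[k] = a[i]; a[i] = t;       // undo the swap
        }
        return sum;
    }

    // Average inversions over all n! permutations; should equal n(n-1)/4.
    public static double average(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        long fact = 1;
        for (int i = 2; i <= n; i++) fact *= i;
        return (double) total(a, 0) / fact;
    }
}
```

For n = 4 the average is 4·3/4 = 3, and for n = 5 it is 5·4/4 = 5.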

How long does it take to sort?
- Can we do better than O(n²)?
- In the worst case?
- In the average case?

Heapsort  Remember heaps:  complete binary tree, each parent smaller than the child  buildHeap has O(n) worst-case running time.  deleteMin has O(log n) worst-case running time.  Heapsort:  Build heap. O(n)  DeleteMin until empty. O(n log n)  Total worst case: O(n log n)

n² vs. n log n
(figure: growth of the two functions compared)

Sorting in O(n log n)
Heapsort establishes the fact that sorting can be accomplished in O(n log n) worst-case running time.

Heapsort in practice
- The average-case analysis for heapsort is somewhat complex.
- In practice, heapsort consistently tends to use nearly n log n comparisons.
- So, while the worst case is better than n², other algorithms sometimes work better in practice.

Shellsort  Shellsort, like insertion sort, is based on swapping inverted pairs.  Achieves O(n 3/2 ) running time (see book for details).  Idea: sort in phases, with insertion sort at the end  Remember: insertion sort is good when the array is nearly sorted

Shellsort  Definition: k-sort == sort by insertion items that are k positions apart.  therefore, 1-sort == simple insertion sort  Now, perform a series of k-sorts for k = m 1 > m 2 >... > 1

Shellsort algorithm
(figure: an example array shown after the 4-sort, after the 2-sort, and after the 1-sort)
Note: many inversions are fixed in one exchange.
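The series of k-sorts can be sketched in Java as follows. One hedge: this uses Shell's original gap sequence n/2, n/4, …, 1 for simplicity; the O(n^(3/2)) bound quoted above holds for particular gap sequences covered in the book.

```java
public class ShellSort {
    public static void sort(int[] a) {
        // Perform a gap-sort (insertion sort on items 'gap' apart) for a
        // decreasing sequence of gaps ending in 1 (a plain insertion sort).
        for (int gap = a.length / 2; gap >= 1; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && a[j - gap] > tmp; j -= gap)
                    a[j] = a[j - gap];   // slide gap-apart elements right
                a[j] = tmp;
            }
        }
    }
}
```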

Recursive Sorting

Recursive sorting  Intuitively, divide the problem into pieces and then recombine the results.  If array is length 1, then done.  If array is length N>1, then split in half and sort each half.  Then combine the results.  An example of a divide-and-conquer algorithm.

Divide-and-conquer

Divide-and-conquer is important
We will see several examples of divide-and-conquer in this course.

Recursive sorting  If array is length 1, then done.  Otherwise, split into two smaller pieces.  Sort each piece.  Combine the sorted pieces.

Analysis of recursive sorting
- Suppose it takes time T(N) to sort N elements.
- Suppose also it takes time N to combine the two sorted arrays.
- Then:
    T(1) = 1
    T(N) = 2T(N/2) + N, for N > 1
- Solving for T gives the running time of the recursive sorting algorithm.

Remember recurrence relations?
Systems of equations such as
    T(1) = 1
    T(N) = 2T(N/2) + N, for N > 1
are called recurrence relations (or sometimes recurrence equations).

A solution  A solution for  T(1) = 1  T(N) = 2T(N/2) + N  is given by  T(N) = N log N + N  which is O(N log N).

Divide-and-Conquer Theorem
Theorem: Let a, b, c ≥ 0. The recurrence relation
    T(1) = b
    T(N) = aT(N/c) + bN, for any N that is a power of c,
has upper-bound solutions
    T(N) = O(N)             if a < c
    T(N) = O(N log N)       if a = c
    T(N) = O(N^(log_c a))   if a > c
For recursive sorting: a = 2, b = 1, c = 2.

Upper-bounds  Corollary:  Dividing a problem into p pieces, each of size N/p, using only a linear amount of work, results in an O(N log N) algorithm.

Upper-bounds  Proof of this theorem left for later.

Mergesort

Mergesort
- Mergesort is the most basic recursive sorting algorithm.
- Divide the array into halves A and B.
- Recursively mergesort each half.
- Combine A and B by repeatedly comparing the first elements of A and B and moving the smaller one to the result array.
- Note: we should be careful to avoid creating lots of intermediate arrays.

Mergesort

But we don't actually want to create all of these arrays!

Mergesort
(figure: the split and merge performed with L and R index pairs)
Use simple indexes to perform the split. Use a single extra array to hold each intermediate result.
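The index-based scheme can be sketched in Java as follows: one scratch array is allocated up front and shared by every merge (names are illustrative):

```java
public class MergeSort {
    public static void sort(int[] a) {
        mergeSort(a, new int[a.length], 0, a.length - 1);
    }

    // Sort a[lo..hi], using tmp as the single extra array.
    private static void mergeSort(int[] a, int[] tmp, int lo, int hi) {
        if (lo >= hi) return;               // length 0 or 1: done
        int mid = (lo + hi) / 2;
        mergeSort(a, tmp, lo, mid);         // sort left half
        mergeSort(a, tmp, mid + 1, hi);     // sort right half
        // Merge: repeatedly move the smaller front element into tmp.
        int i = lo, j = mid + 1, k = lo;
        while (i <= mid && j <= hi)
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i <= mid) tmp[k++] = a[i++];
        while (j <= hi)  tmp[k++] = a[j++];
        for (k = lo; k <= hi; k++) a[k] = tmp[k];   // copy back
    }
}
```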

Analysis of mergesort
- Mergesort generates almost exactly the same recurrence relations shown before:
    T(1) = 1
    T(N) = 2T(N/2) + N - 1, for N > 1
- Thus, mergesort is O(N log N).

Quicksort

Quicksort
- Quicksort was invented in 1960 by Tony Hoare.
- Although it has O(N²) worst-case performance, on average it is O(N log N).
- More importantly, it is the fastest known comparison-based sorting algorithm in practice.

Quicksort idea  Choose a pivot.

Quicksort idea  Choose a pivot.  Rearrange so that pivot is in the “right” spot.

Quicksort idea  Choose a pivot.  Rearrange so that pivot is in the “right” spot.  Recurse on each half and conquer!

Quicksort algorithm  If array A has 1 (or 0) elements, then done.  Choose a pivot element x from A.  Divide A-{x} into two arrays:  B = {yA | yx}  C = {yA | yx}  Quicksort arrays B and C.  Result is B+{x}+C.

Quicksort algorithm

Quicksort algorithm
In practice, insertion sort is used once the subarrays get "small enough".

Rearranging in place
(figure: indexes L and R scan inward from the two ends, swapping out-of-place elements around the pivot)

Quicksort is fast but hard to do
- Quicksort, in the early 1960s, was famous for being incorrectly implemented many times.
- More about invariants next time.
- Quicksort is very fast in practice.
- It is faster than mergesort because quicksort can be done "in place".
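A hedged Java sketch of in-place quicksort with the insertion-sort cutoff mentioned above. One simplification: it partitions around the last element (Lomuto style) rather than the two-index L/R scan pictured earlier; the cutoff value and names are my own.

```java
public class QuickSort {
    private static final int CUTOFF = 10;   // small subarrays: insertion sort

    public static void sort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    private static void quickSort(int[] a, int lo, int hi) {
        if (hi - lo < CUTOFF) { insertionSort(a, lo, hi); return; }
        int p = partition(a, lo, hi);       // pivot lands in its final spot
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // In-place partition around a[hi] as the pivot.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;   // place the pivot
        return i;
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int tmp = a[i], j = i;
            for (; j > lo && a[j - 1] > tmp; j--) a[j] = a[j - 1];
            a[j] = tmp;
        }
    }
}
```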

Informal analysis  If there are duplicate elements, then algorithm does not specify which subarray B or C should get them.  Ideally, split down the middle.  Also, not specified how to choose the pivot.  Ideally, the median value of the array, but this would be expensive to compute.  As a result, it is possible that Quicksort will show O(N 2 ) behavior.

Worst-case behavior
If we always pick the smallest (or largest) possible pivot, then quicksort takes O(n²) steps.

Analysis of quicksort  Assume random pivot.  T(0) = 1  T(1) = 1  T(N) = T(i) + T(N-i-1) + cN, for N>1  where I is the size of the left subarray.

Worst-case analysis  If the pivot is always the smallest element, then:  T(0) = 1  T(1) = 1  T(N) = T(0) + T(N-1) + cN, for N>1   T(N-1) + cN  = O(N 2 )  See the book for details on this solution.

Best-case analysis  In the best case, the pivot is always the median element.  In that case, the splits are always “down the middle”.  Hence, same behavior as mergesort.  That is, O(N log N).

Average-case analysis  Consider the quicksort tree:

Average-case analysis  At each level of the tree, there are less than N nodes  So, time spent at each level is O(N)  On average, how many levels are there?  That is, what is the expected height of the tree?  If on average there are O(log N) levels, then quicksort is O(N log N) on average.

Average-case analysis  We’ll answer this question next time…

Summary of quicksort
- A fast sorting algorithm in practice.
- Can be implemented in-place.
- But it is O(N²) in the worst case.
- Average-case performance?