Sorting in Linear Time Lower bound for comparison-based sorting


Sorting in Linear Time
- Lower bound for comparison-based sorting
- Counting sort
- Radix sort
- Bucket sort

Sorting So Far
Insertion sort:
- Easy to code
- Fast on small inputs (fewer than ~50 elements)
- Fast on nearly-sorted inputs
- O(n²) worst case
- O(n²) average case (over equally likely inputs)
- O(n²) on reverse-sorted input

Sorting So Far
Merge sort:
- Divide-and-conquer: split the array in half, recursively sort the subarrays, merge them in linear time
- O(n lg n) worst case
- Doesn't sort in place

Sorting So Far
Heap sort:
- Uses the very useful heap data structure: a complete binary tree with the heap property (each parent's key ≥ its children's keys)
- O(n lg n) worst case
- Sorts in place
- Fair amount of shuffling memory around

Sorting So Far
Quick sort:
- Divide-and-conquer: partition the array into two subarrays, with all of the first subarray < all of the second, then recursively sort each; no merge step needed!
- O(n lg n) average case; fast in practice
- O(n²) worst case; the naïve implementation hits it on sorted input
- Randomized quicksort addresses the worst case

How Fast Can We Sort?
First, an observation: all of the sorting algorithms so far are comparison sorts.
- The only operation used to gain ordering information about a sequence is the pairwise comparison of two elements
- Comparison sorts must do at least Ω(n) comparisons (why?)
- What do you think is the best comparison-sort running time?

Decision Trees
An abstraction of any comparison sort: a decision tree represents the comparisons made by a specific sorting algorithm on inputs of a given size, abstracting away everything else (control flow and data movement). We count only comparisons.
- Each internal node is a pair of elements being compared
- Each edge is the result of the comparison (< or >=)
- Each leaf is a sorted ordering of the array

Insertion Sort on 4 Elements as a Decision Tree
(The slide's figure, built up over three animation steps, is a decision tree; only its top levels survive in this transcript.) The root compares A[1] and A[2]; each outcome (< or >=) leads to the next comparison, such as A[2] vs. A[3], and so on down to the leaves.

The Number of Leaves in a Decision Tree for Sorting
Lemma: A decision tree for sorting n elements must have at least n! leaves.
Reason: each of the n! input permutations requires a different output ordering, so each must reach its own leaf.

Lower Bound For Comparison Sorting
Thm: Any decision tree that sorts n elements has height Ω(n lg n).
If we know this, then we know that comparison sorts always take Ω(n lg n) time.
- Consider a decision tree on n elements
- It must have at least n! leaves
- The maximum number of leaves of a tree of height h is 2^h

Lower Bound For Comparison Sorting
So we have: n! ≤ 2^h
Taking logarithms: lg(n!) ≤ h
Stirling's approximation tells us: lg(n!) = n lg n − n lg e + O(lg n)
Thus: h ≥ lg(n!) = Ω(n lg n)

Lower Bound For Comparison Sorting
The same bound follows without Stirling:
N! = 1 × 2 × 3 × … × N/2 × … × N
- N factors, each at most N, so N! ≤ N^N
- N/2 factors, each at least N/2, so N! ≥ (N/2)^(N/2)
Therefore (N/2) lg(N/2) ≤ lg(N!) ≤ N lg N, i.e. lg(N!) = Θ(N lg N).
Thus the minimum height of a decision tree is Ω(n lg n).

Lower Bound For Comparison Sorts
Thus the time to comparison sort n elements is Ω(n lg n).
Corollary: Heapsort and Mergesort are asymptotically optimal comparison sorts.
But the name of this lecture is "Sorting in linear time"! How can we do better than Ω(n lg n)?
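
The bound is easy to check numerically: ⌈lg(n!)⌉ is the minimum worst-case number of comparisons, and it tracks n lg n. A small sketch (not from the slides; the sample sizes are mine):

```python
import math

# Minimum worst-case comparisons for any comparison sort is ceil(lg(n!));
# compare it with n*lg(n) for a few sizes.
for n in [4, 16, 64, 256]:
    lower = math.ceil(math.log2(math.factorial(n)))
    print(n, lower, round(n * math.log2(n)))
```

For n = 4 this gives ⌈lg 24⌉ = 5 comparisons, already far below the n² of a naive sort.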

Counting Sort – Sort small numbers
Why it's not a comparison sort: no comparisons are made!
Assumption: the input consists of integers in the range 0..k.
Basic idea:
- For each input element x, determine its rank: the number of elements less than x
- Once we know the rank r of x, we can place it in position r+1

Counting Sort – The Algorithm
Counting-Sort(A)
- Initialize an output array B of size n and a count array C of size k+1, with all entries 0
- Count the occurrences of every value: for i = 1..n do C[A[i]] ← C[A[i]] + 1
- Count the number of elements ≤ each value: for i = 1..k do C[i] ← C[i] + C[i − 1]
- Move every element to its final position: for i = n..1 do B[C[A[i]]] ← A[i]; C[A[i]] ← C[A[i]] − 1

Counting Sort Example
(The original slides step through this example graphically; the arrays are reconstructed here.)
A = [2, 5, 3, 0, 2, 3, 0, 3], k = 5
After counting occurrences: C = [2, 0, 2, 3, 0, 1]
After prefix sums (C[i] = number of elements ≤ i): C = [2, 2, 4, 7, 7, 8]
Scanning A from right to left: A[8] = 3 goes to B[7] and C becomes [2, 2, 4, 6, 7, 8]; then A[7] = 0 goes to B[2] and C becomes [1, 2, 4, 6, 7, 8]; and so on, until B = [0, 0, 2, 2, 3, 3, 3, 5].

Counting Sort

CountingSort(A, B, k)
  for i = 0 to k           // takes time O(k)
    C[i] = 0
  for j = 1 to n           // takes time O(n)
    C[A[j]] += 1
  for i = 1 to k           // takes time O(k)
    C[i] = C[i] + C[i-1]
  for j = n downto 1       // takes time O(n)
    B[C[A[j]]] = A[j]
    C[A[j]] -= 1

What will be the running time?

Counting Sort
Total time: O(n + k)
- Usually k = O(n), so counting sort runs in O(n) time
- But sorting is Ω(n lg n)!
- No contradiction: this is not a comparison sort (in fact, there are no comparisons at all!)
- Notice that this algorithm is stable: elements with the same value keep their original relative order
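
The pseudocode above can be sketched in Python (the function name and the zero-based indexing are mine, not from the slides):

```python
def counting_sort(a, k):
    """Stable counting sort of integers in the range 0..k."""
    c = [0] * (k + 1)
    for x in a:                    # count occurrences of each value
        c[x] += 1
    for i in range(1, k + 1):      # prefix sums: c[i] = number of elements <= i
        c[i] += c[i - 1]
    b = [0] * len(a)
    for x in reversed(a):          # scanning right-to-left keeps the sort stable
        c[x] -= 1
        b[c[x]] = x
    return b

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]
```

Both loops over `a` are O(n) and both loops over `c` are O(k), matching the O(n + k) total.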

Stable Sorting Algorithms
A sorting algorithm is stable if for any two indices i and j with i < j and a_i = a_j, element a_i precedes element a_j in the output sequence.
Input: 2₁ 5₁ 2₂ 6₁ 7₁ 4₁ 4₂ 2₃
Output: 2₁ 2₂ 2₃ 4₁ 4₂ 5₁ 6₁ 7₁
(Subscripts mark the order in which equal keys appear in the input.)
Observation: Counting Sort is stable.
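
Stability is easy to see with tagged keys. A hypothetical demo (Python's built-in sort is itself stable, so it stands in for counting sort here):

```python
# Tag equal keys with their input order, sort by key only, and check
# that equal keys keep that order.
pairs = [(2, 1), (5, 1), (2, 2), (6, 1), (7, 1), (4, 1), (4, 2), (2, 3)]
out = sorted(pairs, key=lambda p: p[0])   # compare keys only, never the tags
print(out)  # the three 2s keep tags 1, 2, 3: input order preserved
```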

Counting Sort
Linear sort! Cool! Why don't we always use counting sort?
- Because its running time depends on the range k of the elements
- Could we use counting sort to sort 32-bit integers? Why or why not?
- Answer: no, k is too large (2³² = 4,294,967,296)

Radix Sort
Why it's not a comparison sort:
- Assumption: the input has d digits, each ranging from 0 to k
- Example: sort a bunch of 4-digit numbers, where each digit is 0–9
Basic idea:
- Sort the elements digit by digit, starting with the least significant
- Use a stable sort (like counting sort) for each stage

The idea behind radix sort is not new

My college class found radix sort very easy to learn.
(Pictured: an IBM 083 punch card sorter.)

Radix Sort – The Algorithm
Radix sort takes two parameters: the array and the number of digits in each array element.
Radix-Sort(A, d)
  for i = 1..d
    sort the numbers in A by their i-th digit from the right, using a stable sorting algorithm

Radix Sort Example
Input:             329 457 657 839 436 720 355
Pass 1 (ones):     720 355 436 457 657 329 839
Pass 2 (tens):     720 329 436 839 355 457 657
Pass 3 (hundreds): 329 355 436 457 657 720 839

Radix Sort – Correctness and Running Time
What is the running time of radix sort?
- Each pass over the d digits takes time O(n + k), so the total time is O(dn + dk)
- When d is constant and k = O(n), radix sort takes O(n) time
- Stable and fast
- Doesn't sort in place (because counting sort is used)
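
The algorithm above (a stable counting sort per digit, least significant first) can be sketched as follows; the function name and the choice of decimal digits are mine:

```python
def radix_sort(a, d):
    """LSD radix sort of non-negative integers with at most d decimal digits."""
    for p in range(d):                 # least significant digit first
        div = 10 ** p
        c = [0] * 10                   # counting sort keyed on digit p
        for x in a:
            c[(x // div) % 10] += 1
        for i in range(1, 10):
            c[i] += c[i - 1]
        b = [0] * len(a)
        for x in reversed(a):          # reversed scan keeps each pass stable
            dig = (x // div) % 10
            c[dig] -= 1
            b[c[dig]] = x
        a = b
    return a

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# [329, 355, 436, 457, 657, 720, 839]
```

Each of the d passes costs O(n + 10), so the whole sort is O(dn) for fixed digit range.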

Bucket Sort
Assumption: the input is n real numbers drawn from [0, 1).
Basic idea:
- Create n linked lists (buckets), dividing the interval [0, 1) into n subintervals of size 1/n
- Add each input element to the appropriate bucket, then sort each bucket with insertion sort
- A uniform input distribution ⇒ O(1) expected bucket size, so the expected total time is O(n)

Bucket Sort

Bucket-Sort(A)
  n ← length(A)
  for i ← 1 to n                       // distribute elements over buckets
    insert A[i] into list B[⌊n · A[i]⌋]
  for i ← 0 to n − 1                   // sort each bucket
    Insertion-Sort(B[i])
  concatenate lists B[0], B[1], …, B[n − 1] in order

Bucket Sort Example
Input: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68
Buckets after distribution:
B[1]: .12 .17
B[2]: .21 .23 .26
B[3]: .39
B[4]: (empty)
B[5]: (empty)
B[6]: .68
B[7]: .72 .78
B[8]: (empty)
B[9]: .94
Output: .12 .17 .21 .23 .26 .39 .68 .72 .78 .94

Bucket Sort – Running Time
- All lines except the call to Insertion-Sort take O(n) in the worst case
- In the worst case, O(n) numbers end up in the same bucket, so insertion sort makes the total O(n²)
- Lemma: if the input is drawn uniformly at random from [0, 1), the expected size of a bucket is O(1)
- So in the average case, only a constant number of elements fall in each bucket, and the expected total time is O(n) (see the proof in the book)
- For non-uniform inputs, use a different indexing scheme (hashing) to distribute the numbers uniformly
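
The distribute-then-sort scheme can be sketched in Python; `sorted` stands in for the per-bucket insertion sort, and the names are mine:

```python
def bucket_sort(a):
    """Bucket sort for reals in [0, 1): n buckets of width 1/n."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:                        # distribute: x lands in bucket floor(n*x)
        buckets[int(n * x)].append(x)
    out = []
    for b in buckets:                  # sort each bucket, then concatenate
        out.extend(sorted(b))          # stand-in for insertion sort
    return out

print(bucket_sort([.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]))
```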

Summary
- Every comparison-based sorting algorithm has to take Ω(n lg n) time
- Merge sort and heap sort are comparison-based and take O(n lg n) worst-case time (quick sort does so in the average case); hence they are asymptotically optimal
- Other sorting algorithms can be faster by exploiting assumptions about the input
- Counting sort and radix sort take linear time for integers in a bounded range
- Bucket sort takes linear average-case time for uniformly distributed real numbers