Algorithms CSCI 235, Fall 2015 Lecture 19 Order Statistics II

Presentation transcript:

1 Algorithms CSCI 235, Fall 2015 Lecture 19 Order Statistics II

2 Finding the Median Last time, we showed that we can find the kth order statistic (i.e. the kth smallest element) in Θ(n) time for a fixed k, by repeatedly finding the minimum and discarding it. How long will it take to find the median using this strategy? Note that the position of the median (n/2) increases as n increases. T(n) = ? Conclusion: This method does not work as well for finding the median. Larger values of k take longer to find (although for any fixed k the order of growth is the same). Can we do better?
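To make the question on this slide concrete, here is a minimal Python sketch of the repeated-minimum strategy that also counts comparisons. The function name `select_by_repeated_min` and the comparison counter are illustrative assumptions, not part of the lecture. With n elements and k passes the count is k(n-1) - k(k-1)/2, so setting k = n/2 makes it grow quadratically in n.

```python
def select_by_repeated_min(values, k):
    """Return (k-th smallest element, number of comparisons made).

    Hypothetical helper: finds the minimum and discards it, k times in a row.
    The element discarded on the k-th pass is the k-th order statistic.
    """
    remaining = list(values)  # work on a copy so the caller's list is untouched
    comparisons = 0
    smallest = None
    for _ in range(k):
        m = 0  # index of the minimum seen so far in this pass
        for idx in range(1, len(remaining)):
            comparisons += 1
            if remaining[idx] < remaining[m]:
                m = idx
        smallest = remaining.pop(m)  # discard the minimum
    return smallest, comparisons


if __name__ == "__main__":
    for n in (100, 200, 400):
        data = list(range(n, 0, -1))
        _, fixed_k = select_by_repeated_min(data, 1)
        _, median_k = select_by_repeated_min(data, n // 2)
        print(n, fixed_k, median_k)
```

Doubling n roughly doubles the count for k = 1 but roughly quadruples it for k = n/2.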

3 Randomized-Select
Randomized-Select(A, lo, hi, i) {Find the ith order statistic between lo and hi}
    if lo = hi
        then return A[lo]
    split ← Randomized-Partition(A, lo, hi)
    length ← (split - lo) + 1
    if i <= length
        then return Randomized-Select(A, lo, split, i)
        else return Randomized-Select(A, split+1, hi, i-length)
Idea: Partition the array as in Quick-sort. Recursively search the appropriate partition for the ith element.
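The pseudocode above can be sketched in Python as follows. The slide does not show Randomized-Partition, so this sketch assumes a Hoare-style partition around a randomly chosen pivot, which returns an index split with lo <= split < hi such that every element of A[lo..split] is no larger than every element of A[split+1..hi]. Array bounds are 0-based here, while the rank i stays 1-based as on the slide.

```python
import random


def randomized_partition(a, lo, hi):
    """Hoare-style partition of a[lo..hi] around a random pivot (assumed variant).

    Returns split with lo <= split < hi; a[lo..split] <= pivot <= a[split+1..hi].
    """
    r = random.randint(lo, hi)
    a[lo], a[r] = a[r], a[lo]  # move the random pivot to the front
    pivot = a[lo]
    i, j = lo - 1, hi + 1
    while True:
        j -= 1
        while a[j] > pivot:
            j -= 1
        i += 1
        while a[i] < pivot:
            i += 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def randomized_select(a, lo, hi, i):
    """Return the i-th smallest (1-based) element of a[lo..hi]; reorders a."""
    if lo == hi:
        return a[lo]
    split = randomized_partition(a, lo, hi)
    length = split - lo + 1  # size of the low side
    if i <= length:
        return randomized_select(a, lo, split, i)
    return randomized_select(a, split + 1, hi, i - length)


if __name__ == "__main__":
    data = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3]  # illustrative input only
    print(randomized_select(list(data), 0, len(data) - 1, 3))  # 3rd smallest: 5
```

Because the pivot is random, the sequence of partitions differs from run to run, but the returned value does not.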

4 Example A Find the 3rd order statistic: Randomized-Select(A, 1, 10, 3)

5 Running time of Randomized-Select Worst case: As with QuickSort, we can get unlucky and partition the array into two pieces of size 1 and n-1, with the ith statistic in the larger side: T(n) = T(n-1) + n = Θ(n^2), where the n term is the cost of the partition. A good case: Partition into two equal parts: T(n) = T(n/2) + n (We will work this one out in class). Average case: Can show that T(n) <= cn, so T(n) = O(n).
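As a quick numerical check of the two recurrences on this slide, the sketch below unrolls them directly. The base case T(1) = 1 and the use of integer halving for n/2 are assumptions made only so the code can run; the function names are illustrative.

```python
def worst_case(n):
    """Unroll T(n) = T(n-1) + n with the assumed base case T(1) = 1."""
    total = 1
    for m in range(2, n + 1):
        total += m
    return total


def good_case(n):
    """Unroll T(n) = T(n/2) + n with T(1) = 1, using integer halving."""
    total = 0
    while n > 1:
        total += n
        n //= 2
    return total + 1  # add the base case T(1) = 1


if __name__ == "__main__":
    for n in (16, 256, 1024):
        print(n, worst_case(n), good_case(n))
```

The worst case equals n(n+1)/2, which is quadratic, while the even-split case is the sum n + n/2 + n/4 + ... and stays below 2n, which is linear.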

6 Selection in worst-case linear time To make a selection in worst-case linear time, we want to use an algorithm that guarantees a good split when we partition. To do this, we use the "median of median of c" algorithm. To start, we pick c, an integer constant >= 1. We write our input array, A, as a 2-D array with c rows and n/c columns. (If n/c is not an integer, we can pad the array with large numbers that won't change the result.) Sort the columns of this new 2-D array.

7 Example A = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3, 7, 15, 0, 63, 51, 73, 6, 30, 62, 10, 24, 26, 25, 28, 29], n = 25. Choose c = 5. Sort each column: B[1..c, 1..n/c] = B[1..5, 1..5]. After sorting, the median row contains the median of each column. Sorting the columns takes Θ(c^2 (n/c)) = Θ(n) time.
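The column step can be sketched in Python as below. It assumes A is laid out column by column, so each run of c consecutive elements forms one column; that layout reproduces the median row B' = [17, 37, 15, 30, 26] used on the next slide. The helper names `sorted_columns` and `median_row` are illustrative, and padding for the case where c does not divide n is left out.

```python
def sorted_columns(a, c):
    """Split a into consecutive groups of c elements (the columns) and sort each.

    Assumes len(a) is a multiple of c; the padding described in the lecture
    is not handled in this sketch.
    """
    return [sorted(a[start:start + c]) for start in range(0, len(a), c)]


def median_row(columns):
    """Return the middle element of every sorted column."""
    c = len(columns[0])
    return [col[c // 2] for col in columns]


A = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3, 7, 15, 0, 63, 51,
     73, 6, 30, 62, 10, 24, 26, 25, 28, 29]

columns = sorted_columns(A, 5)
for col in columns:
    print(col)
print(median_row(columns))  # [17, 37, 15, 30, 26]
```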

8 Median-of-median-of-c continued We now call the Median-of-median-of-c algorithm again, on the single median row of B, with the same value of c as before. Write the median row as B' = [17, 37, 15, 30, 26]. Write B' as a 2-D array, with c = 5 rows and n/c = 1 column, and sort the column. The value at the middle row is mm, the median of medians. We use this as our pivot for the partition.

9 Showing that it gives a good split We can show that at least 1/4 of the elements are less than mm and at least 1/4 of the elements are greater than mm by imagining that the columns of B are sorted by the value of each median. (Note: we only imagine it, we don't actually do it.) In the example, at least 1/4 of the elements are less than 26 and at least 1/4 are greater than 26.
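The claim can be checked on the lecture's example with a few lines of Python. This sketch assumes the column-by-column layout of A and, because there are only five medians, sorts them directly instead of recursing; the function name `median_of_medians_pivot` is illustrative.

```python
def median_of_medians_pivot(a, c):
    """Return the median of the column medians (one level only, no recursion)."""
    medians = [sorted(a[s:s + c])[c // 2] for s in range(0, len(a), c)]
    return sorted(medians)[len(medians) // 2]


A = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3, 7, 15, 0, 63, 51,
     73, 6, 30, 62, 10, 24, 26, 25, 28, 29]

mm = median_of_medians_pivot(A, 5)
smaller = sum(1 for x in A if x < mm)
larger = sum(1 for x in A if x > mm)

print(mm)                           # 26
print(smaller, larger, len(A) / 4)  # 12 12 6.25
```

Both counts exceed n/4 = 6.25, so neither side of the partition can hold more than about 3/4 of the elements.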

10 Partitioning Partition A using mm = 26 as the pivot. Use a partition that keeps mm in the high part of the partition: "low" = 2, 5, 17, 3, 19, 0, 7, 15, 6, 10, 24, 25 (12 items); "high" = 26, 43, 91, 37, 42, 72, 51, 63, 30, 62, 73, 28, 29 (13 items). If the number of items in the low part of the partition is k, and the order statistic we are looking for is given by i, then: if i <= k, repeat the entire procedure on the lower partition; if i > k, repeat it on the higher partition (looking for the (i - k)th element).
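Putting the steps together, a minimal Python sketch of the whole procedure might look like this. It departs from the slides in three labeled ways: it builds new lists instead of partitioning in place, it lets a short final column use its own middle element instead of padding, and it splits three ways (less than, equal to, greater than mm) so that inputs with repeated values still shrink on every call. The name `mom_select` is illustrative.

```python
def mom_select(a, i, c=5):
    """Return the i-th smallest (1-based) element of list a.

    Sketch of median-of-medians-of-c selection; c >= 5 is what the lecture's
    linear-time bound needs, and c >= 2 is required here for termination.
    """
    if c < 2:
        raise ValueError("c must be at least 2")
    if len(a) <= c:
        return sorted(a)[i - 1]  # small input: sort directly

    # Sort each column and collect the column medians.
    columns = [sorted(a[s:s + c]) for s in range(0, len(a), c)]
    medians = [col[len(col) // 2] for col in columns]

    # Recursively find the median of the medians and use it as the pivot.
    mm = mom_select(medians, (len(medians) + 1) // 2, c)

    low = [x for x in a if x < mm]
    high = [x for x in a if x > mm]
    equal = len(a) - len(low) - len(high)  # copies of mm (assumed extension)

    k = len(low)
    if i <= k:
        return mom_select(low, i, c)
    if i <= k + equal:
        return mm
    return mom_select(high, i - k - equal, c)


if __name__ == "__main__":
    A = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3, 7, 15, 0, 63, 51,
         73, 6, 30, 62, 10, 24, 26, 25, 28, 29]
    print(mom_select(A, 13))  # median of the 25 values: 26
```

On the lecture's array the first pivot is 26 with 12 smaller items, so asking for the 13th element returns the pivot immediately.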

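11 Running time T(n) = Θ(n) + T(n/c) + T(3n/4) + Θ(n), where the first Θ(n) is the cost of sorting the columns, T(n/c) is the cost of finding the m-of-m-of-c on the median row of B, T(3n/4) is the cost of the worst-case split, and the last Θ(n) is the cost of the partition. This simplifies to T(n) = T(n/c) + T(3n/4) + Θ(n). We can show that T(n) = Θ(n) for c >= 5.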
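One way to see where the condition c >= 5 comes from is the substitution method. The sketch below assumes the Θ(n) term is at most an for some constant a, guesses T(m) <= dm for all m < n, and ignores floors and padding; the constants a and d are introduced here for illustration only.

```latex
\begin{align*}
T(n) &\le \frac{dn}{c} + \frac{3dn}{4} + an \\
     &= dn - \left( d\left(\tfrac{1}{4} - \tfrac{1}{c}\right) - a \right) n \\
     &\le dn \quad \text{whenever } d \ge \frac{a}{\tfrac{1}{4} - \tfrac{1}{c}} .
\end{align*}
```

Such a d exists only when 1/4 - 1/c > 0, that is, when c > 4. For c = 5 the condition becomes d >= 20a, which gives T(n) = O(n); the Θ(n) term alone gives the matching lower bound.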

12 Benefits of M-of-M-of-c Good order statistic algorithm. Can use this with other algorithms: for example, we can use it with QuickSort to guarantee a good split and an n lg n order of growth. The linear time is not the result of constraining the problem (as we did with counting-sort). It is a comparison-based method!