Comp 122, Spring 2004 Order Statistics. order - 2 Lin / Devi Comp 122 Order Statistic i th order statistic: i th smallest element of a set of n elements.

Slides:



Advertisements
Similar presentations
한양대학교 정보보호 및 알고리즘 연구실 이재준 담당교수님 : 박희진 교수님
Advertisements

©2001 by Charles E. Leiserson Introduction to AlgorithmsDay 9 L6.1 Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 6 Prof. Erik Demaine.
Chapter 9 continued: Quicksort
David Luebke 1 6/1/2014 CS 332: Algorithms Medians and Order Statistics Structures for Dynamic Sets.
Ack: Several slides from Prof. Jim Anderson’s COMP 202 notes.
Introduction to Algorithms Quicksort
Divide and Conquer (Merge Sort)
ITEC200 Week10 Sorting. pdp 2 Learning Objectives – Week10 Sorting (Chapter10) By working through this chapter, students should: Learn.
Foundations of Data Structures Practical Session #11 Sort properties, Quicksort algorithm.
Order Statistics Sorted
David Luebke 1 4/22/2015 CS 332: Algorithms Quicksort.
Algorithms Analysis Lecture 6 Quicksort. Quick Sort Divide and Conquer.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 6.
1 More Sorting; Searching Dan Barrish-Flood. 2 Bucket Sort Put keys into n buckets, then sort each bucket, then concatenate. If keys are uniformly distributed.
Order Statistics(Selection Problem) A more interesting problem is selection:  finding the i th smallest element of a set We will show: –A practical randomized.
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics.
Medians and Order Statistics
1 Selection --Medians and Order Statistics (Chap. 9) The ith order statistic of n elements S={a 1, a 2,…, a n } : ith smallest elements Also called selection.
Introduction to Algorithms
Introduction to Algorithms Jiafen Liu Sept
1 Today’s Material Medians & Order Statistics – Ch. 9.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: CLRS “Intro.
Median Finding, Order Statistics & Quick Sort
Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort
Stephen P. Carl - CS 2421 Recursive Sorting Algorithms Reading: Chapter 5.
Spring 2015 Lecture 5: QuickSort & Selection
Analysis of Algorithms CS 477/677 Randomizing Quicksort Instructor: George Bebis (Appendix C.2, Appendix C.3) (Chapter 5, Chapter 7)
Median/Order Statistics Algorithms
CS38 Introduction to Algorithms Lecture 7 April 22, 2014.
Ch. 7 - QuickSort Quick but not Guaranteed. Ch.7 - QuickSort Another Divide-and-Conquer sorting algorithm… As it turns out, MERGESORT and HEAPSORT, although.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3.
Median, order statistics. Problem Find the i-th smallest of n elements.  i=1: minimum  i=n: maximum  i= or i= : median Sol: sort and index the i-th.
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Introduction to Algorithms Jiafen Liu Sept
Chapter 9: Selection Order Statistics What are an order statistic? min, max median, i th smallest, etc. Selection means finding a particular order statistic.
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
Order Statistics(Selection Problem)
Deterministic and Randomized Quicksort Andreas Klappenecker.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
1 Algorithms CSCI 235, Fall 2015 Lecture 19 Order Statistics II.
COSC 3101A - Design and Analysis of Algorithms 4 Quicksort Medians and Order Statistics Many of these slides are taken from Monica Nicolescu, Univ. of.
CSC317 1 Quicksort on average run time We’ll prove that average run time with random pivots for any input array is O(n log n) Randomness is in choosing.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Chapter 9: Selection of Order Statistics What are an order statistic? min, max median, i th smallest, etc. Selection means finding a particular order statistic.
Order Statistics.
Order Statistics Comp 122, Spring 2004.
Introduction to Algorithms Prof. Charles E. Leiserson
Randomized Algorithms
Order Statistics(Selection Problem)
Randomized Algorithms
Medians and Order Statistics
CS 3343: Analysis of Algorithms
Order Statistics Comp 550, Spring 2015.
Order Statistics Def: Let A be an ordered set containing n elements. The i-th order statistic is the i-th smallest element. Minimum: 1st order statistic.
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Chapter 9: Medians and Order Statistics
Algorithms CSCI 235, Spring 2019 Lecture 20 Order Statistics II
Order Statistics Comp 122, Spring 2004.
Chapter 9: Selection of Order Statistics
CS 583 Analysis of Algorithms
The Selection Problem.
CS200: Algorithm Analysis
Medians and Order Statistics
Presentation transcript:

Comp 122, Spring 2004 Order Statistics

order - 2 Lin / Devi Comp 122 Order Statistic i th order statistic: i th smallest element of a set of n elements. Minimum: first order statistic. Maximum: n th order statistic. Median: half-way point of the set. »Unique, when n is odd – occurs at i = (n+1)/2. »Two medians when n is even. Lower median, at i = n/2. Upper median, at i = n/2+1. For consistency, median will refer to the lower median.

order - 3 Lin / Devi Comp 122 Selection Problem Selection problem: »Input: A set A of n distinct numbers and a number i, with 1 i n. »Output: the element x A that is larger than exactly i – 1 other elements of A. Can be solved in O(n lg n) time. How? We will study faster linear-time algorithms. »For the special cases when i = 1 and i = n. »For the general problem.

order - 4 Lin / Devi Comp 122 Minimum (Maximum) Minimum (A) 1. min A[1] 2. for i 2 to length[A] 3. do if min > A[i] 4. then min A[i] 5. return min Minimum (A) 1. min A[1] 2. for i 2 to length[A] 3. do if min > A[i] 4. then min A[i] 5. return min Maximum can be determined similarly. T(n) = (n). No. of comparisons: n – 1. Can we do better? Why not? Minimum(A) has worst-case optimal # of comparisons.

order - 5 Lin / Devi Comp 122 Problem Average for random input: How many times do we expect line 4 to be executed? »X = RV for # of executions of line 4. »X i = Indicator RV for the event that line 4 is executed on the i th iteration. »X = i=2..n X i »E[X i ] = 1/i. How? »Hence, E[X] = ln(n) – 1 = (lg n). Minimum (A) 1. min A[1] 2. for i 2 to length[A] 3. do if min > A[i] 4. then min A[i] 5. return min Minimum (A) 1. min A[1] 2. for i 2 to length[A] 3. do if min > A[i] 4. then min A[i] 5. return min

order - 6 Lin / Devi Comp 122 Simultaneous Minimum and Maximum Some applications need to determine both the maximum and minimum of a set of elements. »Example: Graphics program trying to fit a set of points onto a rectangular display. Independent determination of maximum and minimum requires 2n – 2 comparisons. Can we reduce this number? »Yes.

order - 7 Lin / Devi Comp 122 Simultaneous Minimum and Maximum Maintain minimum and maximum elements seen so far. Process elements in pairs. »Compare the smaller to the current minimum and the larger to the current maximum. »Update current minimum and maximum based on the outcomes. No. of comparisons per pair = 3. How? No. of pairs n/2. »For odd n: initialize min and max to A[1]. Pair the remaining elements. So, no. of pairs = n/2. »For even n: initialize min to the smaller of the first pair and max to the larger. So, remaining no. of pairs = (n – 2)/2 < n/2.

order - 8 Lin / Devi Comp 122 Simultaneous Minimum and Maximum Total no. of comparisons, C 3 n/2. »For odd n: C = 3 n/2. »For even n: C = 3(n – 2)/2 + 1 (For the initial comparison). = 3n/2 – 2 < 3 n/2.

order - 9 Lin / Devi Comp 122 General Selection Problem Seems more difficult than Minimum or Maximum. »Yet, has solutions with same asymptotic complexity as Minimum and Maximum. We will study 2 algorithms for the general problem. »One with expected linear-time complexity. »A second, whose worst-case complexity is linear.

order - 10 Lin / Devi Comp 122 Selection in Expected Linear Time Modeled after randomized quicksort. Exploits the abilities of Randomized-Partition (RP). »RP returns the index k in the sorted order of a randomly chosen element (pivot). If the order statistic we are interested in, i, equals k, then we are done. Else, reduce the problem size using its other ability. »RP rearranges the other elements around the random pivot. If i < k, selection can be narrowed down to A[1..k – 1]. Else, select the (i – k)th element from A[k+1..n]. (Assuming RP operates on A[1..n]. For A[p..r], change k appropriately.)

order - 11 Lin / Devi Comp 122 Randomized Quicksort: review Quicksort(A, p, r) if p < r then q := Rnd-Partition(A, p, r); Quicksort(A, p, q – 1); Quicksort(A, q + 1, r) fi Quicksort(A, p, r) if p < r then q := Rnd-Partition(A, p, r); Quicksort(A, p, q – 1); Quicksort(A, q + 1, r) fi Rnd-Partition(A, p, r) i := Random(p, r); A[r] A[i]; x, i := A[r], p – 1; for j := p to r – 1 do if A[j] x then i := i + 1; A[i] A[j] fi od; A[i + 1] A[r]; return i + 1 Rnd-Partition(A, p, r) i := Random(p, r); A[r] A[i]; x, i := A[r], p – 1; for j := p to r – 1 do if A[j] x then i := i + 1; A[i] A[j] fi od; A[i + 1] A[r]; return i A[p..r] A[p..q – 1] A[q+1..r] 5 5 Partition 5

order - 12 Lin / Devi Comp 122 Randomized-Select Randomized-Select(A, p, r, i) // select ith order statistic. 1. if p = r 2. then return A[p] 3. q Randomized-Partition(A, p, r) 4. k q – p if i = k 6. then return A[q] 7. elseif i < k 8. then return Randomized-Select(A, p, q – 1, i) 9. else return Randomized-Select(A, q+1, r, i – k) Randomized-Select(A, p, r, i) // select ith order statistic. 1. if p = r 2. then return A[p] 3. q Randomized-Partition(A, p, r) 4. k q – p if i = k 6. then return A[q] 7. elseif i < k 8. then return Randomized-Select(A, p, q – 1, i) 9. else return Randomized-Select(A, q+1, r, i – k)

order - 13 Lin / Devi Comp 122 Analysis Worst-case Complexity: » (n 2 ) – As we could get unlucky and always recurse on a subarray that is only one element smaller than the previous subarray. Average-case Complexity: » (n) – Intuition: Because the pivot is chosen at random, we expect that we get rid of half of the list each time we choose a random pivot q. »Why (n) and not (n lg n)?

order - 14 Lin / Devi Comp 122 Average-case Analysis Define Indicator RVs X k, for 1 k n. »X k = I{subarray A[p…q] has exactly k elements}. »Pr{subarray A[p…q] has exactly k elements} = 1/n for all k = 1..n. »Hence, E[X k ] = 1/n. Let T(n) be the RV for the time required by Randomized-Select (RS) on A[p…q] of n elements. Determine an upper bound on E[T(n)]. (9.1)

order - 15 Lin / Devi Comp 122 Average-case Analysis A call to RS may »Terminate immediately with the correct answer, »Recurse on A[p..q – 1], or »Recurse on A[q+1..r]. To obtain an upper bound, assume that the i th smallest element that we want is always in the larger subarray. RP takes O(n) time on a problem of size n. Hence, recurrence for T(n) is: » For a given call of RS, X k =1 for exactly one value of k, and X k = 0 for all other k.

order - 16 Lin / Devi Comp 122 Average-case Analysis (by linearity of expectation) (by Eq. (C.23)) (by Eq. (9.1))

order - 17 Lin / Devi Comp 122 Average-case Analysis (Contd.) The summation is expanded If n is odd, T(n – 1) thru T( n/2 ) occur twice and T( n/2 ) occurs once. If n is even, T(n – 1) thru T( n/2 ) occur twice.

order - 18 Lin / Devi Comp 122 Average-case Analysis (Contd.) We solve the recurrence by substitution. Guess E[T(n)] = O(n). Thus, if we assume T(n) = O(1) for n < 2c/(c – 4a), we have E[T(n)] = O(n).

order - 19 Lin / Devi Comp 122 Selection in Worst-Case Linear Time Algorithm Select: »Like RandomizedSelect, finds the desired element by recursively partitioning the input array. »Unlike RandomizedSelect, is deterministic. Uses a variant of the deterministic Partition routine. Partition is told which element to use as the pivot. »Achieves linear-time complexity in the worst case by Guaranteeing that the split is always good at each Partition. How can a good split be guaranteed?

order - 20 Lin / Devi Comp 122 Guaranteeing a Good Split We will have a good split if we can ensure that the pivot is the median element or an element close to the median. Hence, determining a reasonable pivot is the first step.

order - 21 Lin / Devi Comp 122 Choosing a Pivot Median-of-Medians: »Divide the n elements into n/5 groups. n/5 groups contain 5 elements each. 1 group contains n mod 5 < 5 elements. Determine the median of each of the groups. –Sort each group using Insertion Sort. Pick the median from the sorted list of group elements. Recursively find the median x of the n/5 medians. Recurrence for running time (of median-of- medians): »T(n) = O(n) + T( n/5 ) + ….

order - 22 Lin / Devi Comp 122 Algorithm Select Determine the median-of-medians x (using the procedure on the previous slide.) Partition the input array around x using the variant of Partition. Let k be the index of x that Partition returns. If k = i, then return x. Else if i < k, then apply Select recursively to A[1..k–1] to find the i th smallest element. Else if i > k, then apply Select recursively to A[k+1..n] to find the (i – k) th smallest element. (Assumption: Select operates on A[1..n]. For subarrays A[p..r], suitably change k. )

order - 23 Lin / Devi Comp 122 Worst-case Split Median-of-medians, x n/5 groups of 5 elements each. n/5 th group of n mod 5 elements. Arrows point from larger to smaller elements. Elements > x Elements < x

order - 24 Lin / Devi Comp 122 Worst-case Split Assumption: Elements are distinct. Why? At least half of the n/5 medians are greater than x. Thus, at least half of the n/5 groups contribute 3 elements that are greater than x. »The last group and the group containing x may contribute fewer than 3 elements. Exclude these groups. Hence, the no. of elements > x is at least Analogously, the no. of elements < x is at least 3n/10–6. Thus, in the worst case, Select is called recursively on at most 7n/10+6 elements.

order - 25 Lin / Devi Comp 122 Recurrence for worst-case running time T(Select) T(Median-of-medians) +T(Partition) +T(recursive call to select) T(n) O(n) + T( n/5 ) + O(n) + T(7n/10+6) = T( n/5 ) + T(7n/10+6) + O(n) Assume T(n) (1), for n 140. T(Median-of-medians)T(Partition)T(recursive call)

order - 26 Lin / Devi Comp 122 Solving the recurrence To show: T(n) = O(n) cn for suitable c and all n > 0. Assume: T(n) cn for suitable c and all n 140. Substituting the inductive hypothesis into the recurrence, »T(n) c n/5 + c(7n/10+6)+an cn/5 + c + 7cn/10 + 6c + an = 9cn/10 + 7c + an = cn +(–cn/10 + 7c + an) cn, if –cn/10 + 7c + an 0. n/(n–70) is a decreasing function of n. Verify. Hence, c can be chosen for any n = n 0 > 70, provided it can be assumed that T(n) = O(1) for n n 0. Thus, Select has linear-time complexity in the worst case. –cn/10 + 7c + an 0 c 10a(n/(n – 70)), when n > 70. For n 140, c 20a.