CS38 Introduction to Algorithms Lecture 7 April 22, 2014.

Slides:



Advertisements
Similar presentations
Comp 122, Spring 2004 Order Statistics. order - 2 Lin / Devi Comp 122 Order Statistic i th order statistic: i th smallest element of a set of n elements.
Advertisements

Algorithms Analysis Lecture 6 Quicksort. Quick Sort Divide and Conquer.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 6.
Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
Order Statistics(Selection Problem) A more interesting problem is selection:  finding the i th smallest element of a set We will show: –A practical randomized.
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics.
Medians and Order Statistics
Introduction to Algorithms
Introduction to Algorithms Jiafen Liu Sept
Lecture 4 Divide and Conquer for Nearest Neighbor Problem
Algorithm Design Paradigms
CS4413 Divide-and-Conquer
Algorithm Design Techniques: Divide and Conquer. Introduction Divide and conquer – algorithm makes at least two recursive calls Subproblems should not.
Chapter 4: Divide and Conquer Master Theorem, Mergesort, Quicksort, Binary Search, Binary Trees The Design and Analysis of Algorithms.
Divide and Conquer. Recall Complexity Analysis – Comparison of algorithm – Big O Simplification From source code – Recursive.
Spring 2015 Lecture 5: QuickSort & Selection
Nattee Niparnan. Recall  Complexity Analysis  Comparison of Two Algos  Big O  Simplification  From source code  Recursive.
Median/Order Statistics Algorithms
Updated QuickSort Problem From a given set of n integers, find the missing integer from 0 to n using O(n) queries of type: “what is bit[j]
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Divide-and-Conquer1 7 2  9 4   2  2 79  4   72  29  94  4.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Chapter 4: Divide and Conquer The Design and Analysis of Algorithms.
5 - 1 § 5 The Divide-and-Conquer Strategy e.g. find the maximum of a set S of n numbers.
Median, order statistics. Problem Find the i-th smallest of n elements.  i=1: minimum  i=n: maximum  i= or i= : median Sol: sort and index the i-th.
Lecture 6 Divide and Conquer for Nearest Neighbor Problem Shang-Hua Teng.
CS2420: Lecture 11 Vladimir Kulyukin Computer Science Department Utah State University.
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Nattee Niparnan. Recall  Complexity Analysis  Comparison of Two Algos  Big O  Simplification  From source code  Recursive.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
CS 361 – Chapters 8-9 Sorting algorithms –Selection, insertion, bubble, “swap” –Merge, quick, stooge –Counting, bucket, radix How to select the n-th largest/smallest.
Divide and Conquer Applications Sanghyun Park Fall 2002 CSE, POSTECH.
Order Statistics(Selection Problem)
CSS106 Introduction to Elementary Algorithms M.Sc Askar Satabaldiyev Lecture 05: MergeSort & QuickSort.
Deterministic and Randomized Quicksort Andreas Klappenecker.
Divide-and-Conquer The most-well known algorithm design strategy: 1. Divide instance of problem into two or more smaller instances 2.Solve smaller instances.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
1 Algorithms CSCI 235, Fall 2015 Lecture 19 Order Statistics II.
COSC 3101A - Design and Analysis of Algorithms 4 Quicksort Medians and Order Statistics Many of these slides are taken from Monica Nicolescu, Univ. of.
Young CS 331 D&A of Algo. Topic: Divide and Conquer1 Divide-and-Conquer General idea: Divide a problem into subprograms of the same kind; solve subprograms.
CSC317 1 Quicksort on average run time We’ll prove that average run time with random pivots for any input array is O(n log n) Randomness is in choosing.
Randomized Quicksort (8.4.2/7.4.2) Randomized Quicksort –i = Random(p, r) –swap A[p]  A[i] –partition A(p, r) Average analysis = Expected runtime –solving.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Analysis of Algorithms CS 477/677
Order Statistics.
Randomized Algorithms
Chapter 4: Divide and Conquer
Randomized Algorithms
Divide-and-Conquer 7 2  9 4   2   4   7
Topic: Divide and Conquer
Richard Anderson Lecture 13 Divide and Conquer
Order Statistics Comp 550, Spring 2015.
CS 3343: Analysis of Algorithms
Order Statistics Def: Let A be an ordered set containing n elements. The i-th order statistic is the i-th smallest element. Minimum: 1st order statistic.
CSE 373 Data Structures and Algorithms
Topic: Divide and Conquer
Algorithms CSCI 235, Spring 2019 Lecture 20 Order Statistics II
Lecture 15, Winter 2019 Closest Pair, Multiplication
Richard Anderson Lecture 14 Divide and Conquer
The Selection Problem.
Richard Anderson Lecture 14, Winter 2019 Divide and Conquer
Quicksort and Randomized Algs
Richard Anderson Lecture 14 Divide and Conquer
Given a list of n  8 integers, what is the runtime bound on the optimal algorithm that sorts the first eight? O(1) O(log n) O(n) O(n log n) O(n2)
Medians and Order Statistics
Presentation transcript:

CS38 Introduction to Algorithms Lecture 7 April 22, 2014

CS38 Lecture 72 Outline Divide and Conquer design paradigm –Mergesort –Quicksort –Selection –Closest pair both with random pivot deterministic selection

Divide and conquer General approach –break problem into subproblems –solve each subproblem recursively –combine solutions typical running time recurrence: April 22, 2014 T(1) = O(1) T(n) · a ¢ T(N/b) + O(n c ) size of subproblems # subproblems cost to split and combine

Solving D&C recurrences key quantity: log b a = D –if c < D then T(n) = O(n D ) –if c = D then T(n) = O(n D ¢ log n) –if c > D then T(n) = O(n c ) can prove easily by induction April 22, 2014CS38 Lecture 74 T(1) = O(1) T(n) · a ¢ T(N/b) + O(n c )

Why log b a? T(1) = O(1) T(n) · a ¢ T(N/b) + f(n)

First example: mergesort Input: n values; sort in increasing order. Mergesort algorithm: –split list into 2 lists of size n/2 each –recursively sort each –merge in linear time (how?) Running time: –T(1) = 1; T(n) = 2¢T(n/2) + O(n) Solution: T(n) = O(n log n) April 22, 2014CS38 Lecture 76

Second example: quicksort Quicksort: effort on split rather than merge Quicksort algorithm: –take first value x as pivot –split list into “ x” lists –recursively sort each Why isn’t this the running time recurrence: T(1) = 1; T(n) = 2¢T(n/2) + O(n) Worst case running time: O(n 2 ) (why?) April 22, 2014CS38 Lecture 77

Quicksort with random pivot Idea: hope that a i splits array into two subarrays of size ¼ n/2 –would lead to T(1) = 1; T(n) = 2¢T(n/2) + O(n) and then T(n) = O(n log n) April 22, 2014CS38 Lecture 78 Random-Pivot-Quicksort(array of n elements: a) 1.if n = 1, return(a) 2.pick i uniformly at random from {1,2,… n} 3.partition array a into “ a i ” arrays 4.Random-Pivot-Quicksort(“< a i ”) 5.Random-Pivot-Quicksort(“> a i ”) 6.return(“ a i ”)

Quicksort with random pivot we will analyze expected running time –suffices to count aggregate # comparisons –rename elements of a: x 1 · x 2 ·  · x n –when is x i compared with x j ? April 22, 2014CS38 Lecture 79 Random-Pivot-Quicksort(array of n elements: a) 1.if n = 1, return(a) 2.pick i uniformly at random from {1,2,… n} 3.partition array a into “ a i ” arrays 4.Random-Pivot-Quicksort(“< a i ”) 5.Random-Pivot-Quicksort(“> a i ”) 6.return(“ a i ”)

Quicksort with random pivot when is x i compared with x j ? (i< j) –consider elements [x i, x i+1, x i+2, …, x j-1, x j ] –if any blue element is chosen first, then no comparison (why?) –if either red element is chosen first, then exactly one comparison (why?) –probability x i and x j compared = 2/(j – i + 1) April 22, 2014CS38 Lecture 710

harmonic series:  i  1/i = O(log n) Quicksort with random pivot –probability x i and x j compared = 2/(j – i + 1) –so expected number of comparisons is  i  j 2/(j – i + 1) =  i   j  i  2/(j – i + 1) =  i   k=1 2/(k+1)  i   k=1 2/k =  i  O(log n) = O(n log n) April 22, 2014CS38 Lecture 711 n-1 n n n-i n n n

Quicksort with random pivot we proved: note: by Markov, prob. running time > 100 ¢ expectation < 1/100 April 22, 2014CS38 Lecture 712 Random-Pivot-Quicksort(array of n elements: a) 1.if n = 1, return(a) 2.pick i uniformly at random from {1,2,… n} 3.partition array a into “ a i ” arrays 4.Random-Pivot-Quicksort(“< a i ”) 5.Random-Pivot-Quicksort(“> a i ”) 6.return(“ a i ) Theorem: Random-Pivot-Quicksort runs in expected time O(n log n).

Selection Input: n values; find k-th smallest –minimum: k = 1 –maximum: k = n –median: k = b(n+1)/2c –running time for min or max? O(n) running time for general k? –using sorting: O(n log n) –using a min-heap: O(n + k log n) April 22, 2014CS38 Lecture 713

Selection running time for general k? –using sorting: O(n log n) –using a min-heap: O(n + k log n) –we will see: O(n) Intuition: like quicksort with recursion on only 1 subproblem April 22, know that 3 are smaller T(n) = T(n/c) + O(n) Solution: T(n) = O(n)

Selection with random pivot Bounding the expected # comparisons: –T(n,k) = expected # for k-th smallest from n –T(n) = max k T(n,k) –Observe: T(n) monotonically increasing April 22, 2014CS38 Lecture 715 Random-Pivot-Select(k; array of n elements: a) 1.pick i uniformly at random from {1,2,… n} 2.partition array a into “ a i ” arrays 3.s = size of “< a_i” array 4.if s = k-1, then return(a_i) 5.else if s a i ”) 6.else if s > k-1, Random-Pivot-Select(k, “<a i ”)

Selection with random pivot Bounding the expected # comparisons: –probability of choosing i-th largest = 1/n –resulting subproblems sizes are n-i, i-1 –upper bound expectation by taking larger April 22, 2014CS38 Lecture 716 Random-Pivot-Select(k; array of n elements: a) 1.pick i uniformly at random from {1,2,… n} 2.partition array a into “ a i ” arrays 3.s = size of “< a_i” array 4.if s = k-1, then return(a_i) 5.else if s a i ”) 6.else if s > k-1, Random-Pivot-Select(k, “<a i ”) ?

Selection with random pivot Claim: T(n) · n + 1/n¢[T(n/2)+T(n/2+1)+…+T(n-1)] + 1/n¢[T(n/2)+T(n/2+1)+…+T(n- 1)] April 22, Random-Pivot-Select(k; array of n elements: a) 1.pick i uniformly at random from {1,2,… n} 2.partition array a into “ a i ” arrays 3.s = size of “< a_i” array 4.if s = k-1, then return(a_i) 5.else if s a i ”) 6.else if s > k-1, Random-Pivot-Select(k, “<a i ”) to splitprobability pivot is i-th largest max (n-i, i-1) as i = 1…n

Selection with random pivot Claim: T(n) · 4n. Proof: induction on n. –assume true for 1… n-1 –T(n) · n + 2/n¢[T(n/2)+T(n/2+1)+…+T(n-1)] –T(n) · n + 2/n¢[4(n/2)+4(n/2+1)+…+4(n-1)] –T(n) · n + 8/n¢[(3/8)n 2 ] < 4n. April 22, 2014CS38 Lecture 718 T(n) · n + 1/n¢[T(n/2)+T(n/2+1)+…+T(n-1)] + 1/n¢[T(n/2)+T(n/2+1)+…+T(n- 1)]

Linear-time selection Clever way to achieve guarantee: –break array up into subsets of 5 elements –recursively compute median of medians of these sets –leads to T(n) = T((1/5)n) + T((7/10)n) + O(n) April 22, 2014CS38 Lecture 719 Select(k; array of n elements: a) 1.pick i from {1,2,… n} and partition array a into “ a i ” arrays *** guarantee that both arrays have size at most (7/10) n 1.s = size of “< a_i” array 2.if s = k-1, then return(a_i) 3.else if s a i ”) 4.else if s > k-1, Select(k, “<a i ”) solution is T(n) = O(n) because 1/5 + 7/10 < 1

Linear-time selection Median of medians example (n = 48): April 22, 2014CS38 Lecture

Linear-time selection find median in each column April 22, 2014CS38 Lecture

Linear-time selection find median in each column April 22, 2014CS38 Lecture

Linear-time selection find median of this subset of dn/5e = 10 April 22, 2014CS38 Lecture

Linear-time selection How many < this one? April 22, 2014CS38 Lecture

Linear-time selection How many · this one? dn/5e/2-1=10/2 -1=4 April 22, 2014CS38 Lecture

Linear-time selection Total · this one? at least (dn/5e/ 2 - 2)¢3 April 22, 2014CS38 Lecture

Linear-time selection Total ¸ this one? at least (dn/5e/ 2 - 2)¢3 April 22, 2014CS38 Lecture

Linear-time selection To find pivot: break array into subsets of 5 –find median of each 5 –find median of medians Claim: at most (7/10)n + 6 elements are smaller than pivot Proof: at least (dn/5e/ 2 - 2)¢3 ¸ 3n/ are larger or equal. April 22, 2014CS38 Lecture 728

Linear-time selection To find pivot: break array into subsets of 5 –find median of each 5 –find median of medians Claim: at most (7/10)n + 6 elements are larger than pivot Proof: at least (dn/5e/ 2 - 1)¢3 ¸ 3n/ are smaller. April 22, 2014CS38 Lecture 729

Linear-time selection Running time: –T(n) = O(1) if n < 140 –T(n) · T(n/5 + 1) + T(7/10 + 6) + cnotherwise –we claim that T(n) · 20cn April 22, 2014CS38 Lecture 730 Select(k; array of n elements: a) 1.pick i from {1,2,… n} using median of medians method 2.partition array a into “ a i ” arrays 3.s = size of “< a_i” array 4.if s = k-1, then return(a_i) 5.else if s a i ”) 6.else if s > k-1, Select(k, “<a i ”)

Linear-time selection Running time: –T(n) = O(1) if n < 140 –T(n) · T(n/5 + 1) + T(7/10 + 6) + cnotherwise Claim: T(n) · 20cn Proof: induction on n; base case easy T(n) · T(n/5 + 1) + T(7/10 + 6) + cn T(n) · 20c(n/5 + 1) + 20c(7/10 + 6) + cn T(n) · 19cn + 140c · 20cn provided n ¸ 140 April 22, 2014CS38 Lecture 731

Closest pair in the plane Given n points in the plane, find the closest pair April 22, 2014CS38 Lecture 732

Closest pair in the plane Given n points in the plane, find the closest pair –O(n 2 ) if compute all pairwise distances –1 dimensional case? April 22, 2014CS38 Lecture 733

Closest pair in the plane Given n points in the plane, find the closest pair –can try sorting by x and y coordinates, but: April 22,

Closest pair in the plane Divide and conquer approach: –split point set in equal sized left and right sets –find closest pair in left, right, + across middle April 22, 2014CS38 Lecture 735

Closest pair in the plane Divide and conquer approach: –split point set in equal sized left and right sets –find closest pair in left, right, + across middle April 22, 2014CS38 Lecture 736

Closest pair in the plane Divide and conquer approach: –split point set in equal sized left and right sets –time to perform split? –sort by x coordinate: O(n log n) –running time recurrence: T(n) = 2T(n/2) + time for middle + O(n log n) April 22, 2014CS38 Lecture 737 Is time for middle as bad as O(n 2 )?

Closest pair in the plane Claim: time for middle only O(n log n) !! key: we know d = min of April 22, distance between closest pair on left distance between closest pair on right à 2d ! Observation: only need to consider points within distance d of the midline

Closest pair in the plane scan left to right to identify, then sort by y coord. –still  (n 2 ) comparisons? –Claim: only need do pairwise comparisons 15 ahead in this sorted list ! April 22, 2014CS38 Lecture 739 Ã 2d !

Closest pair in the plane no 2 points lie in same box (why?) if 2 points are within ¸ 16 positions of each other in list sorted by y coord… … then they must be separated by ¸ 3 rows implies dist. > (3/2)¢ d April 22, 2014CS38 Lecture 740 Ã 2d ! d/2 by d/2 boxes

Closest pair in the plane Running time: T(2) = O(1); T(n) = 2T(n/2) + O(n log n) April 22, 2014CS38 Lecture 741 Closest-Pair(P: set of n points in the plane) 1.sort by x coordinate and split equally into L and R subsets 2.(p,q) = Closest-Pair(L) 3.(r,s) = Closest-Pair(R) 4.d = min(distance(p,q), distance(r,s)) 5.scan P by x coordinate to find M: points within d of midline 6.sort M by y coordinate 7.compute closest pair among all pairs within 15 of each other in M 8.return best among this pair, (p,q), (r,s)

Closest pair in the plane Running time: T(2) = a; T(n) = 2T(n/2) + bn¢log n set c = max(a/2, b) Claim: T(n) · cn¢log 2 n Proof: base case easy… T(n) · 2T(n/2) + bn¢log n · 2cn/2(log n - 1) 2 + bn¢log n < cn(log n)(log n - 1) + bn¢log n · cnlog 2 n April 22, 2014CS38 Lecture 742

Closest pair in the plane we have proved: can be improved to O(n log n) by being more careful about maintaining sorted lists April 22, 2014CS38 Lecture 743 Theorem: There is an O(n log 2 n) time algorithm for finding the closest pair among n points in the plane.