Analysis of Algorithms CS 477/677 Randomizing Quicksort Instructor: George Bebis (Appendix C.2, Appendix C.3) (Chapter 5, Chapter 7)


2 Randomizing Quicksort Randomly permute the elements of the input array before sorting OR... modify the PARTITION procedure –At each step of the algorithm we exchange element A[p] with an element chosen at random from A[p…r] –The pivot element x = A[p] is equally likely to be any one of the r – p + 1 elements of the subarray

3 Randomized Algorithms No input can elicit worst case behavior –Worst case occurs only if we get “unlucky” numbers from the random number generator Worst case becomes less likely –Randomization can NOT eliminate the worst-case but it can make it less likely!

4 Randomized PARTITION
Alg.: RANDOMIZED-PARTITION (A, p, r)
    i ← RANDOM (p, r)
    exchange A[p] ↔ A[i]
    return PARTITION (A, p, r)

5 Randomized Quicksort
Alg.: RANDOMIZED-QUICKSORT (A, p, r)
    if p < r
        then q ← RANDOMIZED-PARTITION (A, p, r)
             RANDOMIZED-QUICKSORT (A, p, q)
             RANDOMIZED-QUICKSORT (A, q + 1, r)
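A minimal Python sketch of slides 4 and 5. The slides do not show the body of PARTITION at this point, so the sketch assumes the textbook Hoare-style partition (pivot x = A[p], returning an index q with p ≤ q < r). The function names and 0-based indexing are illustrative choices, not part of the original slides.

```python
import random


def hoare_partition(a, p, r):
    """Assumed Hoare-style PARTITION: pivot is a[p]; returns q with p <= q < r."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        # Move j left until it finds an element <= x.
        j -= 1
        while a[j] > x:
            j -= 1
        # Move i right until it finds an element >= x.
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def randomized_partition(a, p, r):
    """Swap a random element of a[p..r] into position p, then partition."""
    i = random.randint(p, r)  # plays the role of RANDOM(p, r), inclusive
    a[p], a[i] = a[i], a[p]
    return hoare_partition(a, p, r)


def randomized_quicksort(a, p, r):
    """Sort a[p..r] in place; the pivot side a[p..q] is recursed on as a whole."""
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q)
        randomized_quicksort(a, q + 1, r)
```

Usage: `data = [5, 2, 9, 1, 5, 6]; randomized_quicksort(data, 0, len(data) - 1)` sorts `data` in place.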

6 Formal Worst-Case Analysis of Quicksort
T(n) = worst-case running time:
$$T(n) = \max_{1 \le q \le n-1} \bigl(T(q) + T(n-q)\bigr) + \Theta(n)$$
Use the substitution method to show that the running time of Quicksort is $O(n^2)$.
Guess $T(n) = O(n^2)$
– Induction goal: $T(n) \le cn^2$
– Induction hypothesis: $T(k) \le ck^2$ for any $k < n$

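7 Worst-Case Analysis of Quicksort
Proof of induction goal:
$$T(n) \le \max_{1 \le q \le n-1} \bigl(cq^2 + c(n-q)^2\bigr) + \Theta(n) = c \cdot \max_{1 \le q \le n-1} \bigl(q^2 + (n-q)^2\bigr) + \Theta(n)$$
The expression $q^2 + (n-q)^2$ is convex in $q$, so over the range $1 \le q \le n-1$ it achieves its maximum at one of the endpoints:
$$\max_{1 \le q \le n-1} \bigl(q^2 + (n-q)^2\bigr) = 1^2 + (n-1)^2 = n^2 - 2(n-1)$$
Therefore
$$T(n) \le cn^2 - 2c(n-1) + \Theta(n) \le cn^2,$$
where the last step holds once $c$ is chosen large enough that $2c(n-1)$ dominates the $\Theta(n)$ term.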

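8 Revisit Partitioning
Hoare’s partition
– Select a pivot element x around which to partition
– Grows two regions: A[p…i] ≤ x and x ≤ A[j…r]
(Figure: array with region A[p…i] ≤ x on the left and region x ≤ A[j…r] on the right; indices i and j mark the region boundaries.)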

9 Another Way to PARTITION (Lomuto’s partition – page 146)
Given an array A, partition the array into the following subarrays:
– A pivot element x = A[q]
– Subarray A[p..q-1], such that each element of A[p..q-1] is smaller than or equal to x (the pivot)
– Subarray A[q+1..r], such that each element of A[q+1..r] is strictly greater than x (the pivot)
The pivot element is not included in either of the two subarrays.
(Figure: during partitioning, A[p…i] ≤ x, A[i+1…j-1] > x, A[j…r-1] is still unknown, and A[r] holds the pivot.)

10 Example at the end, swap pivot

11 Another Way to PARTITION (cont’d)
Alg.: PARTITION(A, p, r)
    x ← A[r]
    i ← p - 1
    for j ← p to r - 1
        do if A[j] ≤ x
              then i ← i + 1
                   exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1
– Chooses the last element of the array as a pivot
– Grows a subarray [p..i] of elements ≤ x
– Grows a subarray [i+1..j-1] of elements > x
Running time: Θ(n), where n = r - p + 1

12 Randomized Quicksort (using Lomuto’s partition)
Alg.: RANDOMIZED-QUICKSORT (A, p, r)
    if p < r
        then q ← RANDOMIZED-PARTITION (A, p, r)
             RANDOMIZED-QUICKSORT (A, p, q - 1)
             RANDOMIZED-QUICKSORT (A, q + 1, r)
The pivot is no longer included in any of the subarrays!
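The Lomuto-based variant can be sketched in Python as follows. One assumption is worth flagging: slide 4 swaps the random element into A[p], which suits a partition whose pivot is A[p]; since Lomuto's PARTITION takes its pivot from A[r], this sketch swaps the random element into A[r] instead. Function names and 0-based indexing are illustrative.

```python
import random


def partition(a, p, r):
    """Lomuto PARTITION: pivot is a[r]; returns the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    # Assumption: the random element goes to a[r], because that is
    # where this PARTITION looks for its pivot.
    i = random.randint(p, r)
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p, r):
    """Sort a[p..r] in place; the pivot at index q is excluded from both calls."""
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
```

Because the pivot lands in its final position and is left out of both recursive calls, each element serves as a pivot at most once, which is the fact the analysis on the following slides relies on.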

13 Analysis of Randomized Quicksort
Alg.: RANDOMIZED-QUICKSORT (A, p, r)
    if p < r
        then q ← RANDOMIZED-PARTITION (A, p, r)
             RANDOMIZED-QUICKSORT (A, p, q - 1)
             RANDOMIZED-QUICKSORT (A, q + 1, r)
The running time of Quicksort is dominated by PARTITION!
PARTITION is called at most n times (at each call a pivot is selected, and it is never again included in future calls)

14 PARTITION
Alg.: PARTITION(A, p, r)
    x ← A[r]
    i ← p - 1
    for j ← p to r - 1
        do if A[j] ≤ x
              then i ← i + 1
                   exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1
– Statements outside the for loop: O(1) (constant)
– # of comparisons in the for loop: $X_k$, between the pivot and the other elements
– Amount of work at call k: $c + X_k$

15 Average-Case Analysis of Quicksort
Let X = total number of comparisons performed in all calls to PARTITION: $X = \sum_k X_k$
The total work done over the entire execution of Quicksort is O(nc + X) = O(n + X)
Need to estimate E[X]

16 Review of Probabilities

17 Review of Probabilities (discrete case)

18 Random Variables Def.: (Discrete) random variable X: a function from a sample space S to the real numbers. –It associates a real number with each possible outcome of an experiment.

19 Random Variables E.g.: Toss a coin three times and define X = “number of heads”
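The coin example can be made concrete by enumerating the sample space. This is an illustrative sketch that assumes a fair coin (all eight outcomes equally likely); the names `outcomes`, `number_of_heads`, and `distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all 2^3 sequences of heads (H) and tails (T).
outcomes = list(product("HT", repeat=3))


def number_of_heads(outcome):
    """The random variable X: maps an outcome to a real number."""
    return outcome.count("H")


# Pr{X = k} = (number of outcomes with k heads) / |S|, assuming a fair coin.
counts = Counter(number_of_heads(o) for o in outcomes)
distribution = {k: Fraction(c, len(outcomes)) for k, c in counts.items()}
```

Here `distribution` maps each value k of X to Pr{X = k}; the probabilities sum to 1.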

20 Computing Probabilities Using Random Variables

21 Expectation Expected value (expectation, mean) of a discrete random variable X is: $E[X] = \sum_x x \cdot \Pr\{X = x\}$ –“Average” over all possible values of random variable X

22 Examples Example: X = face of one fair die. E[X] = 1·1/6 + 2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = 3.5
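The die computation follows directly from the definition of expectation. A small sketch, using exact fractions; the helper name `expectation` and the dictionary representation of the probability mass function are assumptions made for illustration.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] = sum over x of x * Pr{X = x}, for a dict mapping x -> Pr{X = x}."""
    return sum(x * p for x, p in pmf.items())


# One fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
expected_face = expectation(die)
```

`expected_face` equals 7/2, matching the 3.5 computed on the slide.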

23 Indicator Random Variables
Given a sample space S and an event A, we define the indicator random variable I{A} associated with A:
– I{A} = 1 if A occurs, 0 if A does not occur
The expected value of an indicator random variable $X_A = I\{A\}$ is: $E[X_A] = \Pr\{A\}$
Proof: $E[X_A] = E[I\{A\}] = 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} = \Pr\{A\}$
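As an illustration of why indicators are convenient, the number of heads in three fair coin tosses can be written as a sum of three indicators, one per toss, and its expectation obtained by adding their probabilities. The event "toss k is heads" and all names below are hypothetical choices for this sketch.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))  # fair coin: outcomes equally likely
prob = Fraction(1, len(outcomes))


def indicator(event, outcome):
    """I{A}: 1 if the event occurs on this outcome, 0 otherwise."""
    return 1 if event(outcome) else 0


def expected_indicator(event):
    """E[I{A}], computed from the definition of expectation."""
    return sum(indicator(event, o) * prob for o in outcomes)


def toss_is_heads(k):
    """Event A_k: the k-th toss (0-based) comes up heads."""
    return lambda outcome: outcome[k] == "H"


# E[number of heads] = E[I{A_0}] + E[I{A_1}] + E[I{A_2}] by linearity.
expected_heads = sum(expected_indicator(toss_is_heads(k)) for k in range(3))
```

Each `expected_indicator(toss_is_heads(k))` equals Pr{A_k} = 1/2, so `expected_heads` is 3/2 without ever computing the distribution of the sum.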

24 Average-Case Analysis of Quicksort Let X = total number of comparisons performed in all calls to PARTITION. The total work done over the entire execution of Quicksort is O(n+X). Need to estimate E[X]

25 Notation
Rename the elements of A as $z_1, z_2, \ldots, z_n$, with $z_i$ being the i-th smallest element.
Define the set $Z_{ij} = \{z_i, z_{i+1}, \ldots, z_j\}$: the set of elements between $z_i$ and $z_j$, inclusive.
(Figure: a 10-element example array, its entries labeled $z_1, z_2, z_9, z_8, z_5, z_3, z_4, z_6, z_{10}, z_7$.)

26 Total Number of Comparisons in PARTITION
Define $X_{ij} = I\{z_i \text{ is compared to } z_j\}$
Total number of comparisons X performed by the algorithm:
$$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$$

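27 Expected Number of Total Comparisons in PARTITION
Compute the expected value of X:
$$E[X] = E\Bigl[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\Bigr] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$$
– The second equality holds by linearity of expectation
– The third holds because $X_{ij}$ is an indicator random variable: the expectation of $X_{ij}$ is equal to the probability of the event “$z_i$ is compared to $z_j$”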

28 Comparisons in PARTITION : Observation 1 Each pair of elements is compared at most once during the entire execution of the algorithm –Elements are compared only to the pivot point! –Pivot point is excluded from future calls to PARTITION

29 Comparisons in PARTITION: Observation 2
Only the pivot is compared with elements in both partitions!
(Figure: the example array with pivot 7, which splits the remaining elements into $Z_{1,6} = \{1, 2, 3, 4, 5, 6\}$ and $Z_{8,10} = \{8, 9, 10\}$.)
Elements in different partitions are never compared with each other!

30 Comparisons in PARTITION
Case 1: a pivot x is chosen such that $z_i < x < z_j$
– $z_i$ and $z_j$ will never be compared
Case 2: $z_i$ or $z_j$ is the pivot
– $z_i$ and $z_j$ will be compared
– this happens only if one of them is chosen as pivot before any other element in the range $z_i$ to $z_j$
(Figure: the example array with pivot 7, $Z_{1,6} = \{1, 2, 3, 4, 5, 6\}$ and $Z_{8,10} = \{8, 9, 10\}$.)

31 See why
$z_2$ will never be compared with $z_6$, since $z_5$ (which belongs to $[z_2, z_6]$) was chosen as a pivot first!
(Figure: elements $z_2, z_4, z_1, z_3, z_5, z_7, z_9, z_6$.)

32 Probability of comparing $z_i$ with $z_j$
$$\Pr\{z_i \text{ is compared to } z_j\} = \Pr\{z_i \text{ is the first pivot chosen from } Z_{ij}\} + \Pr\{z_j \text{ is the first pivot chosen from } Z_{ij}\} = \frac{1}{j-i+1} + \frac{1}{j-i+1} = \frac{2}{j-i+1}$$
The two events are mutually exclusive (“OR”), so their probabilities add.
– There are j – i + 1 elements between $z_i$ and $z_j$, inclusive
– Pivots are chosen randomly and independently
– The probability that any particular element of $Z_{ij}$ is the first one chosen as a pivot is 1/(j – i + 1)

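33 Number of Comparisons in PARTITION
Expected number of comparisons in PARTITION:
$$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} = \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1} \quad (\text{set } k = j-i)$$
$$< \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\lg n) \quad (\text{harmonic series})$$
$$= O(n \lg n)$$
⇒ The expected running time of Quicksort using RANDOMIZED-PARTITION is $O(n \lg n)$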
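The double sum can be checked numerically. The sketch below evaluates the sum of 2/(j − i + 1) exactly and compares it with an empirical average. The simulation is a simplified model rather than the in-place algorithm: it assumes distinct elements, picks a uniformly random pivot, and charges one comparison per non-pivot element in each PARTITION call. All function names, the seed, and the trial count are illustrative.

```python
import random
from fractions import Fraction


def exact_expected_comparisons(n):
    """E[X] = sum over 1 <= i < j <= n of 2 / (j - i + 1), as an exact fraction."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(1, n)
        for j in range(i + 1, n + 1)
    )


def count_comparisons(items, rng):
    """Comparisons made by quicksort with a random pivot on distinct items."""
    if len(items) <= 1:
        return 0
    pivot = rng.choice(items)
    smaller = [v for v in items if v < pivot]
    larger = [v for v in items if v > pivot]
    # One PARTITION call compares the pivot with each of the other elements.
    return (
        len(items) - 1
        + count_comparisons(smaller, rng)
        + count_comparisons(larger, rng)
    )


def average_comparisons(n, trials, seed=0):
    """Empirical mean of the comparison count over repeated runs."""
    rng = random.Random(seed)
    items = list(range(n))
    return sum(count_comparisons(items, rng) for _ in range(trials)) / trials
```

For example, `float(exact_expected_comparisons(50))` and `average_comparisons(50, 2000)` should be close to each other, and both grow roughly like n lg n as n increases.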

34 Alternative Average-Case Analysis of Quicksort See Problem 7-2, page 16 Focus on the expected running time of each individual recursive call to Quicksort, rather than on the number of comparisons performed. Use Hoare partition in our analysis.

35 Alternative Average-Case Analysis of Quicksort (i.e., any element has the same probability to be chosen as pivot)

36 Alternative Average-Case Analysis of Quicksort

37 Alternative Average-Case Analysis of Quicksort

38 Alternative Average-Case Analysis of Quicksort

39 Alternative Average-Case Analysis of Quicksort - 1:n-1 splits have 2/n probability - all other splits have 1/n probability

40 Alternative Average-Case Analysis of Quicksort

41 Alternative Average-Case Analysis of Quicksort recurrence for average case: Q(n) =

42 Problem Consider the problem of determining whether an arbitrary sequence {x 1, x 2,..., x n } of n numbers contains repeated occurrences of some number. Show that this can be done in Θ(nlgn) time. –Sort the numbers Θ(nlgn) –Scan the sorted sequence from left to right, checking whether two successive elements are the same Θ(n) –Total Θ(nlgn)+Θ(n)=Θ(nlgn)
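The sort-then-scan solution can be sketched in a few lines of Python. The function name `has_duplicates` is hypothetical; Python's built-in `sorted` supplies the O(n lg n) sort, and the scan over adjacent pairs is linear.

```python
def has_duplicates(xs):
    """Return True if some value occurs more than once in xs."""
    ordered = sorted(xs)  # O(n lg n) comparison sort
    # After sorting, equal values are adjacent, so one linear scan suffices.
    return any(a == b for a, b in zip(ordered, ordered[1:]))
```

Usage: `has_duplicates([3, 1, 4, 1, 5])` is `True`, while `has_duplicates([3, 1, 4])` is `False`.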

43 Ex (page 37)
Can we use Binary Search to improve InsertionSort (i.e., to find the correct location to insert A[j])?
Alg.: INSERTION-SORT(A)
    for j ← 2 to n
        do key ← A[j]
           // Insert A[j] into the sorted sequence A[1..j-1]
           i ← j - 1
           while i > 0 and A[i] > key
               do A[i + 1] ← A[i]
                  i ← i – 1
           A[i + 1] ← key

44 Ex (page 37)
Can we use binary search to improve InsertionSort (i.e., to find the correct location to insert A[j])?
– This idea reduces the number of comparisons per insertion from O(n) to O(lg n)
– The number of shifts per insertion stays the same, i.e., O(n)
– Overall, the running time stays the same: $\Theta(n^2)$ in the worst case
– Worthwhile idea when comparisons are expensive (e.g., comparing strings)
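A sketch of the idea in Python, using the standard-library `bisect_right` to locate the insertion point with O(lg j) comparisons. The function name `binary_insertion_sort` is hypothetical, and using `bisect_right` (rather than `bisect_left`) is a design choice made here so that equal keys keep their original order.

```python
from bisect import bisect_right


def binary_insertion_sort(a):
    """Insertion sort that finds each insertion point by binary search."""
    for j in range(1, len(a)):
        key = a[j]
        # Binary search in the sorted prefix a[0..j-1]: O(lg j) comparisons.
        pos = bisect_right(a, key, 0, j)
        # Shift a[pos..j-1] one slot to the right: still O(j) moves.
        a[pos + 1 : j + 1] = a[pos:j]
        a[pos] = key
    return a
```

The shifting step is what keeps the overall worst case quadratic, even though the number of key comparisons drops to O(n lg n) in total.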

45 Problem
Analyze the complexity of the following function:
F(i)
    if i = 0
        then return 1
    return (2*F(i-1))
Recurrence: T(n) = T(n-1) + c
Use iteration to solve it: T(n) = Θ(n)
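The iteration can be written out as follows. This is a worked sketch that assumes the base case T(0) costs a constant amount of time; each line unrolls the recurrence once more, until the argument reaches 0 after n steps.

```latex
\begin{align*}
T(n) &= T(n-1) + c \\
     &= T(n-2) + 2c \\
     &= T(n-3) + 3c \\
     &\;\;\vdots \\
     &= T(n-k) + kc \\
     &= T(0) + nc \qquad (\text{taking } k = n) \\
     &= \Theta(n)
\end{align*}
```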

46 Problem
What is the running time of Quicksort when all the elements are the same?
– Using Hoare partition ⇒ best case: split in half every time, $T(n) = 2T(n/2) + n \Rightarrow T(n) = \Theta(n \lg n)$
– Using Lomuto’s partition ⇒ worst case: 1:n-1 splits every time, $T(n) = \Theta(n^2)$
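The difference can be observed directly by running both partition schemes on an array of equal keys and looking at the returned split index. This sketch assumes the textbook forms of both procedures (Hoare with pivot A[p], Lomuto with pivot A[r]); the function names, the array of eight 7s, and the 0-based indices are illustrative.

```python
def lomuto_partition(a, p, r):
    """Lomuto PARTITION: pivot a[r]; returns the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def hoare_partition(a, p, r):
    """Assumed Hoare PARTITION: pivot a[p]; returns q with p <= q < r."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


same = [7] * 8
q_lomuto = lomuto_partition(same[:], 0, 7)  # every key is <= pivot
q_hoare = hoare_partition(same[:], 0, 7)    # i and j meet in the middle
```

With equal keys, Lomuto's scheme places every element on the "≤ x" side, so the pivot ends up at the last index and the recursion shrinks by only one element per call. Hoare's two pointers stop on every element and meet in the middle, giving a balanced split.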