CS 3343: Analysis of Algorithms

1 CS 3343: Analysis of Algorithms
Lecture 9: Review for midterm 1; analysis of quicksort 5/29/2018

2 Exam (midterm 1)
Closed-book exam
One cheat sheet allowed (limited to a single page of letter-size paper, double-sided)
Thursday, Feb 23, class time + 5 minutes
A basic calculator (no graphing) is allowed
Do NOT use phones / tablets as calculators

3 Materials covered
Up to Lecture 8 (Feb 2)
O, Θ, Ω:
- Compare orders of growth
- Prove O, Θ, Ω by definition
Analyzing iterative algorithms:
- Use a loop invariant to prove correctness
- Know how to count the number of basic operations and express the running time as the sum of a series
- Know how to compute the sum of geometric and arithmetic series
Analyzing recursive algorithms:
- Use induction to prove correctness
- Define running time using a recurrence
- Solve recurrences using the recursion tree / iteration method
- Solve recurrences using the master method
- Prove bounds using the substitution method

4 Asymptotic notations
(in terms of growth rate)
O: ≤   o: <   Ω: ≥   ω: >   Θ: =

5 Mathematical definitions
O(g(n)) = { f(n): ∃ positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) ∀ n ≥ n0 }
Ω(g(n)) = { f(n): ∃ positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) ∀ n ≥ n0 }
Θ(g(n)) = { f(n): ∃ positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) ∀ n ≥ n0 }

6 Big-Oh
Claim: f(n) = 3n^2 + 10n + 5 ∈ O(n^2)
Proof by definition:
f(n) = 3n^2 + 10n + 5
     ≤ 3n^2 + 10n^2 + 5,    n > 1
     ≤ 3n^2 + 10n^2 + 5n^2, n > 1
     ≤ 18n^2,               n > 1
If we let c = 18 and n0 = 1, we have f(n) ≤ c·n^2 ∀ n > n0. Therefore, by definition, f(n) = O(n^2).
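The constants in the proof above can be sanity-checked numerically; this is only a spot check over a finite range, not a proof:

```python
# Numeric sanity check of the chosen constants c = 18, n0 = 1
# for the claim f(n) = 3n^2 + 10n + 5 <= 18*n^2 whenever n > 1.

def f(n):
    return 3 * n**2 + 10 * n + 5

# Check the inequality for a range of n.
assert all(f(n) <= 18 * n**2 for n in range(2, 10001))

# The constant 18 is deliberately loose: f(n)/n^2 approaches 3 as n grows.
print(f(10000) / 10000**2)
```

The bound only needs *some* constant to work, which is why the proof can afford to be sloppy (18 instead of, say, 4).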

7 Use limits to compare orders of growth
lim_{n→∞} f(n) / g(n):
- = 0      ⇒ f(n) ∈ o(g(n)) ⊆ O(g(n))
- = c > 0  ⇒ f(n) ∈ Θ(g(n))
- = ∞      ⇒ f(n) ∈ ω(g(n)) ⊆ Ω(g(n))
L'Hôpital's rule: lim_{n→∞} f(n)/g(n) = lim_{n→∞} f'(n)/g'(n), provided both lim f(n) and lim g(n) are ∞ (or both are 0).
Stirling's formula: n! ≈ √(2πn) (n/e)^n
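The limit test above can be explored numerically before attempting a formal proof; the functions chosen here (n log n vs. n^2) are just an illustrative pair:

```python
# Estimate lim f(n)/g(n) by evaluating the ratio at growing n.
# A ratio shrinking toward 0 suggests f(n) in o(g(n)).
import math

def ratio(f, g, n):
    return f(n) / g(n)

f = lambda n: n * math.log2(n)   # f(n) = n lg n
g = lambda n: n ** 2             # g(n) = n^2

# The ratio (lg n)/n decreases toward 0 as n grows.
for n in (10, 10**3, 10**6):
    print(n, ratio(f, g, n))
```

This is only a heuristic: a slowly converging limit can be misleading at small n, so the definition-based proof is still what the exam asks for.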

8 Useful rules for logarithms
For all a > 0, b > 0, c > 0, the following rules hold:
- log_b a = log_c a / log_c b = lg a / lg b.  So: log_10 n = log_2 n / log_2 10
- log_b a^n = n log_b a.  So: log 3^n = n log 3 = Θ(n)
- b^(log_b a) = a.  So: 2^(log_2 n) = n
- log(ab) = log a + log b.  So: log(3n) = log 3 + log n = Θ(log n)
- log(a/b) = log a − log b.  So: log(n/2) = log n − log 2 = Θ(log n)
- log_b a = 1 / log_a b
- log_b 1 = 0

9 Useful rules for exponentials
For all a > 0, b > 0, c > 0, the following rules hold:
- a^0 = 1 (0^0 = ? Answer: does not exist)
- a^1 = a; a^(−1) = 1/a
- (a^m)^n = a^(mn) = (a^n)^m.  So: (3^n)^2 = 3^(2n) = (3^2)^n = 9^n
- a^m a^n = a^(m+n).  So: n^2 · n^3 = n^5; 2^n · 2^2 = 2^(n+2) = 4 · 2^n = Θ(2^n)

10 More advanced dominance ranking

11 Sum of arithmetic series
If a1, a2, …, an is an arithmetic series, then a1 + a2 + … + an = n(a1 + an)/2.

12 Sum of geometric series
Σ_{i=0}^{n} r^i = (r^(n+1) − 1)/(r − 1) for r ≠ 1, and = n + 1 for r = 1.
Asymptotically: Θ(1) if r < 1; Θ(r^n) if r > 1; Θ(n) if r = 1.
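The closed form can be verified against a brute-force sum; the values r = 3, n = 10 are an arbitrary test case:

```python
# Check the geometric-series closed form:
# sum_{i=0}^{n} r**i == (r**(n+1) - 1) / (r - 1) for r != 1.

def geometric_sum(r, n):
    return sum(r**i for i in range(n + 1))

r, n = 3, 10
closed_form = (r**(n + 1) - 1) // (r - 1)   # integer division is exact here
assert geometric_sum(r, n) == closed_form
print(closed_form)  # 88573
```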

13 Sum manipulation rules
Σ_k c·a_k = c Σ_k a_k
Σ_k (a_k + b_k) = Σ_k a_k + Σ_k b_k
Σ_{k=1}^{n} a_k = Σ_{k=1}^{m} a_k + Σ_{k=m+1}^{n} a_k  (splitting the range)
Example: Σ_{k=1}^{n} (2k + 1) = 2 Σ_{k=1}^{n} k + n = n(n+1) + n = n^2 + 2n

14 Analyzing non-recursive algorithms
- Decide the parameter (input size)
- Identify the most-executed line (basic operation)
- Is worst-case = average-case?
- T(n) = Σ_i t_i
- T(n) = Θ(f(n))

15 Analysis of insertion sort
Statement                              cost  times
InsertionSort(A, n) {
  for j = 2 to n {                     c1    n
    key = A[j]                         c2    n−1
    i = j − 1                          c3    n−1
    while (i > 0) and (A[i] > key) {   c4    S
      A[i+1] = A[i]                    c5    S−(n−1)
      i = i − 1                        c6    S−(n−1)
    }
    A[i+1] = key                       c7    n−1
  }
}
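The pseudocode above translates directly to runnable Python (0-based indexing here); the counter tracks inner-loop iterations, i.e. the element shifts that dominate the running time:

```python
# Runnable sketch of the insertion sort analyzed above.
# `shifts` counts executions of the inner-loop body (the basic operation).

def insertion_sort(a):
    shifts = 0
    for j in range(1, len(a)):          # pseudocode's j = 2..n
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:    # shift larger elements right
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts

print(insertion_sort(list(range(8))))        # already sorted: 0 shifts
print(insertion_sort(list(range(7, -1, -1))))  # reverse order: 7+6+...+1 = 28
```

The two calls reproduce the best case (Θ(n), no shifts) and the worst case (n(n−1)/2 shifts, Θ(n^2)) analyzed on the next slides.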

16 Best case
The inner loop stops when A[i] ≤ key, or i = 0.
Array already sorted: t_j = 1 for all j, so S = Σ_{j=1..n} t_j = n, and T(n) = Θ(n).

17 Worst case
The inner loop stops when A[i] ≤ key.
Array originally in reverse order: t_j = j, so S = Σ_{j=1..n} j = 1 + 2 + … + n = n(n+1)/2 = Θ(n^2).

18 Average case
The inner loop stops when A[i] ≤ key.
Array in random order: t_j = j/2 on average, so S = Σ_{j=1..n} j/2 = ½ Σ_{j=1..n} j = n(n+1)/4 = Θ(n^2).

19 Use loop invariants to prove the correctness of insertion sort
Loop invariant (LI): at the start of each iteration of the for loop, the subarray A[1..j−1] consists of the elements originally in A[1..j−1], but in sorted order.
Proof by induction:
- Initialization: the LI is true at the start of the 1st iteration (j = 2), since A[1] is sorted by itself.
- Maintenance: if the LI is true at the start of the jth iteration (i.e., A[1..j−1] has all the elements originally in A[1..j−1], in sorted order), it remains true before the (j+1)th iteration (i.e., A[1..j] has all the elements originally in A[1..j], in sorted order), since the while loop finds the right position in A[1..j−1] to insert A[j].
- Termination: when the loop terminates, j = n+1. By the LI, A[1..n] has all the elements originally in A[1..n], in sorted order. Therefore the algorithm is correct.

20 Analyzing recursive algorithms
- Prove correctness using induction
- Define running time as a recurrence
- Solve the recurrence:
  - Recursion tree (iteration) method
  - Substitution method
  - Master method

21 Correctness of merge sort
MERGE-SORT A[1..n]
1. If n = 1, done.
2. Recursively sort A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].
3. "Merge" the 2 sorted lists.
Proof:
- Base case: if n = 1, the algorithm returns the correct answer because A[1..1] is already sorted.
- Inductive hypothesis: assume that the algorithm correctly sorts the smaller subarrays A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].
- Step: if A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n] are both correctly sorted, the whole array A[1..n] is sorted after merging. Therefore, the algorithm is correct.

22 Analyzing merge sort
MERGE-SORT A[1..n]                                  T(n)
1. If n = 1, done.                                  Θ(1)
2. Recursively sort A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].  2T(n/2)
3. "Merge" the 2 sorted lists.                      Θ(n)
T(n) = 2T(n/2) + Θ(n)
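A minimal runnable version makes the recurrence concrete; the comments map each part of the code to a term of T(n) = 2T(n/2) + Θ(n):

```python
# Merge sort sketch matching the recurrence above.

def merge_sort(a):
    if len(a) <= 1:                  # base case: Theta(1)
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])       # T(n/2)
    right = merge_sort(a[mid:])      # T(n/2)
    merged, i, j = [], 0, 0          # merge step: Theta(n)
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([6, 10, 5, 8, 13, 3, 2, 11]))  # [2, 3, 5, 6, 8, 10, 11, 13]
```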

23 Recursive insertion sort
RecursiveInsertionSort(A[1..n])
1. if (n == 1) do nothing;
2. RecursiveInsertionSort(A[1..n-1]);
3. Find index i in A such that A[i] <= A[n] < A[i+1];
4. Insert A[n] after A[i];

24 Binary search
BinarySearch(A[1..N], value) {
  if (N == 0)
    return -1;              // not found
  mid = (1+N)/2;
  if (A[mid] == value)
    return mid;             // found
  else if (A[mid] > value)
    return BinarySearch(A[1..mid-1], value);
  else
    return BinarySearch(A[mid+1..N], value);
}
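A 0-based runnable sketch of the pseudocode above, with the subarray expressed as `lo..hi` bounds instead of slicing (so no copies are made):

```python
# Recursive binary search over a sorted list; returns an index or -1.

def binary_search(a, value, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo > hi:                      # empty range: not found
        return -1
    mid = (lo + hi) // 2
    if a[mid] == value:
        return mid
    elif a[mid] > value:
        return binary_search(a, value, lo, mid - 1)   # left half
    else:
        return binary_search(a, value, mid + 1, hi)   # right half

data = [2, 3, 5, 6, 8, 10, 11, 13]
print(binary_search(data, 8))    # 4
print(binary_search(data, 7))    # -1
```

Each call discards half the range, giving the recurrence T(N) = T(N/2) + Θ(1) = Θ(log N).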

25 Recursion tree
Solve T(n) = 2T(n/2) + n:
- Level 0 costs n; level 1 costs n/2 + n/2 = n; level 2 costs 4 · (n/4) = n; every level sums to n.
- Height h = log n; #leaves = n, each costing Θ(1), for Θ(n) total at the leaves.
- Total: Θ(n log n).

26 Substitution method
Recurrence: T(n) = 2T(n/2) + n.
Guess: T(n) = O(n log n) (e.g., by the recursion tree method).
To prove it, we have to show T(n) ≤ c n log n for some c > 0 and all n > n0.
Proof by induction: assume it is true for T(n/2); prove that it is also true for T(n). This means:
- Fact: T(n) = 2T(n/2) + n
- Assumption: T(n/2) ≤ c(n/2) log(n/2)
- Need to prove: T(n) ≤ c n log n

27 Proof
To prove T(n) = O(n log n), we need to show that T(n) ≤ c n log n for some positive c and all sufficiently large n. Let's assume this inequality is true for T(n/2), which means T(n/2) ≤ c(n/2) log(n/2). Substituting the r.h.s. of this inequality for T(n/2) in the recurrence, we have
T(n) = 2T(n/2) + n
     ≤ 2 · c(n/2) log(n/2) + n
     = cn (log n − 1) + n
     = cn log n − (cn − n)
     ≤ cn log n, for c ≥ 1 and all n ≥ 1.
Therefore, by definition, T(n) = O(n log n).

28 Master theorem
T(n) = a T(n/b) + f(n). Key: compare f(n) with n^(log_b a).
- CASE 1: f(n) = O(n^(log_b a − ε)) ⇒ T(n) = Θ(n^(log_b a)).
- CASE 2: f(n) = Θ(n^(log_b a)) ⇒ T(n) = Θ(n^(log_b a) log n).
- CASE 3: f(n) = Ω(n^(log_b a + ε)) and a f(n/b) ≤ c f(n) for some c < 1 (regularity condition) ⇒ T(n) = Θ(f(n)).
Optional: extended case 2.
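For recurrences whose driving function is a plain polynomial f(n) = n^k, the case test reduces to comparing k with log_b(a); the helper below is an illustrative sketch for that restricted situation (for f(n) = n^k the case-3 regularity condition holds automatically):

```python
# Classify T(n) = a*T(n/b) + n**k by the master theorem.
# Only polynomial driving functions n**k are handled here.
import math

def master_case(a, b, k):
    crit = math.log(a, b)            # critical exponent log_b(a)
    if math.isclose(k, crit):
        return f"Theta(n^{k} log n)" # case 2: f(n) matches n^log_b(a)
    if k < crit:
        return f"Theta(n^{crit:g})"  # case 1: leaves dominate
    return f"Theta(n^{k})"           # case 3: root dominates

print(master_case(2, 2, 1))  # merge sort T(n)=2T(n/2)+n: Theta(n^1 log n)
print(master_case(4, 2, 1))  # T(n)=4T(n/2)+n: Theta(n^2)
print(master_case(3, 2, 2))  # T(n)=3T(n/2)+n^2: log2(3) < 2, so Theta(n^2)
```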

29 Analysis of Quick Sort

30 Quick sort
- Another divide-and-conquer sorting algorithm, like merge sort
- Anyone remember the basic idea?
- The worst-case and average-case running time?
- Learn some new algorithm-analysis tricks

31 Quick sort
Quicksort an n-element array:
- Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
- Conquer: Recursively sort the two subarrays.
- Combine: Trivial.
[ ≤ x | x | ≥ x ]
Key: a linear-time partitioning subroutine.

32 Partition
All the action takes place in the partition() function:
- Rearranges the subarray A[p..r] in place
- End result: two subarrays, with all values in the first ≤ all values in the second
- Returns the index q of the "pivot" element separating the two subarrays: A[p..q−1] ≤ A[q] ≤ A[q+1..r]

33 Pseudocode for quicksort
QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q−1)
         QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)

34 Idea of partition
If we are allowed to use a second array, it would be easy:
6 10 5 8 13 3 2 11
6 5 3 2 11 13 8 10
2 5 3 6 11 13 8 10

35 Another idea
Keep two iterators: one from the head, one from the tail:
6 10 5 8 13 3 2 11
6 2 5 3 13 8 10 11
3 2 5 6 13 8 10 11

36 In-place partition
(figure: step-by-step in-place partition of the example array)

37 Partition in words
Partition(A, p, r):
- Select an element to act as the "pivot" (which one?)
- Grow two regions, A[p..i] and A[j..r], such that all elements in A[p..i] ≤ pivot and all elements in A[j..r] ≥ pivot
- Increment i until A[i] > pivot
- Decrement j until A[j] < pivot
- Swap A[i] and A[j]
- Repeat until i ≥ j
- Swap A[j] and A[p]
- Return j
Note: different from the book's partition(), which uses two iterators that both move forward.

38 Partition code
Partition(A, p, r)
  x = A[p];      // pivot is the first element
  i = p;
  j = r + 1;
  while (TRUE) {
    repeat i++; until A[i] > x or i >= j;
    repeat j--; until A[j] < x or j < i;
    if (i < j) Swap(A[i], A[j]);
    else break;
  }
  swap(A[p], A[j]);
  return j;
What is the running time of partition()? partition() runs in Θ(n) time.
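The pseudocode above translates to the following 0-based Python sketch (the analysis assumes distinct elements, and so does this translation); the `while not (...)` loops mimic the repeat-until semantics:

```python
# Two-pointer in-place partition with the first element as pivot,
# plus the quicksort driver from slide 33.

def partition(a, p, r):
    x = a[p]                               # pivot is the first element
    i, j = p, r + 1
    while True:
        i += 1
        while not (i >= j or a[i] > x):    # repeat i++ until A[i] > x or i >= j
            i += 1
        j -= 1
        while not (j < i or a[j] < x):     # repeat j-- until A[j] < x or j < i
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[p], a[j] = a[j], a[p]                # place pivot between the regions
    return j

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)

data = [6, 10, 5, 8, 13, 3, 2, 11]
quicksort(data)
print(data)  # [2, 3, 5, 6, 8, 10, 11, 13]
```

Running `partition` alone on the example array reproduces the trace on the next slide: it returns index 3 with the array rearranged to [3, 2, 5, 6, 13, 8, 10, 11].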

39 Partition example
x = 6
6 10 5 8 13 3 2 11   (initial, i at head, j at tail)
6 10 5 8 13 3 2 11   (scan)
6 2 5 8 13 3 10 11   (swap)
6 2 5 8 13 3 10 11   (scan)
6 2 5 3 13 8 10 11   (swap)
6 2 5 3 13 8 10 11   (scan, j crosses i)
3 2 5 6 13 8 10 11   (final swap; pivot lands at q)

40 Quick sort example
6 10 5 8 11 3 2 13
3 2 5 6 11 8 10 13
2 3 5 6 10 8 11 13
2 3 5 6 8 10 11 13
2 3 5 6 8 10 11 13

41 Analysis of quicksort
- Assume all input elements are distinct.
- In practice, there are better partitioning algorithms for when duplicate input elements may exist.
- Let T(n) = worst-case running time on an array of n elements.

42 Worst case of quicksort
- Input sorted or reverse sorted.
- Partition around the min or max element.
- One side of the partition always has no elements.
T(n) = T(0) + T(n−1) + Θ(n) = T(n−1) + Θ(n) = Θ(n^2)  (arithmetic series)

43 Worst-case recursion tree
T(n) = T(0) + T(n−1) + n
Expanding the tree: the root costs n, with children T(0) and T(n−1); each level peels off one Θ(1) node and one linear term, so the level costs are n, (n−1), (n−2), …, and the height is n.
Total: n nodes of cost Θ(1), plus the level costs n + (n−1) + … + 1 = n(n+1)/2, so
T(n) = Θ(n) + Θ(n^2) = Θ(n^2)

51 Best-case analysis (for intuition only!)
If we're lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n) = Θ(n log n) (same as merge sort)
What if the split is always 1/10 : 9/10? What is the solution to this recurrence?

52 Analysis of "almost-best" case
T(n) = T(n/10) + T(9n/10) + Θ(n)
Recursion tree: each level costs at most n; the shallowest leaf is at depth log_10 n, the deepest at depth log_10/9 n, and there are O(n) leaves of cost Θ(1). Therefore
n log_10 n ≤ T(n) ≤ n log_10/9 n + O(n), so T(n) = Θ(n log n).

57 Quicksort runtimes
- Best-case runtime: Tbest(n) ∈ Θ(n log n)
- Worst-case runtime: Tworst(n) ∈ Θ(n^2). Worse than mergesort? Why is it called quicksort then?
- Its average runtime: Tavg(n) ∈ Θ(n log n)
- Even better, the expected runtime of randomized quicksort is Θ(n log n)

58 Randomized quicksort
Randomly choose an element as the pivot:
- Every time we need to do a partition, throw a die to decide which element to use as the pivot
- Each element has probability 1/n of being selected
Rand-Partition(A, p, r) {
  d = random();                      // a random number between 0 and 1
  index = p + floor((r-p+1) * d);    // p <= index <= r
  swap(A[p], A[index]);
  Partition(A, p, r);                // now do partition using A[p] as pivot
}
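Randomization only changes pivot selection, so it can be grafted onto any first-element-pivot partition. The sketch below pairs it with a simple forward (Lomuto-style) partition for brevity; the two-pointer partition from slide 38 would work the same way:

```python
# Randomized quicksort: rand_partition moves a uniformly random element
# to the front, then the usual "pivot = first element" partition runs.
import random

def partition(a, p, r):
    x = a[p]                         # pivot is the first element
    i = p
    for j in range(p + 1, r + 1):    # forward scan (not the slide-38 scheme)
        if a[j] < x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i

def rand_partition(a, p, r):
    index = random.randint(p, r)     # each index chosen with prob. 1/(r-p+1)
    a[p], a[index] = a[index], a[p]  # random pivot moves to the front
    return partition(a, p, r)

def rand_quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = rand_partition(a, p, r)
        rand_quicksort(a, p, q - 1)
        rand_quicksort(a, q + 1, r)

data = [6, 10, 5, 8, 11, 3, 2, 13]
rand_quicksort(data)
print(data)  # [2, 3, 5, 6, 8, 10, 11, 13]
```

With a random pivot, no fixed input (such as an already-sorted array) can force the Θ(n^2) worst case every time; the bad splits become a matter of unlucky coin flips.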

59 Running time of randomized quicksort
T(n) = T(0) + T(n−1) + dn     if the split is 0 : n−1,
     = T(1) + T(n−2) + dn     if the split is 1 : n−2,
     ⋮
     = T(n−1) + T(0) + dn     if the split is n−1 : 0.
The expected running time is an average over all cases:
E[T(n)] = (1/n) Σ_{k=0}^{n−1} (T(k) + T(n−1−k)) + dn = (2/n) Σ_{k=0}^{n−1} T(k) + dn

60 5/29/2018

61 Solving recurrences
- Recursion tree (iteration) method: good for guessing an answer
- Substitution method: generic method, rigid, but may be hard
- Master method: easy to learn, useful in limited cases only; some tricks may help in other cases

62 Substitution method
The most general method to solve a recurrence (prove O and Ω separately):
1. Guess the form of the solution (e.g., using recursion trees or expansion).
2. Verify by induction (inductive step).

63 Expected running time of quicksort
Guess: E[T(n)] = O(n log n).
We need to show that E[T(n)] ≤ c n log n for some c and sufficiently large n.
We write T(n) instead of E[T(n)] for convenience.

64 Need to show: T(n) ≤ c n log n
Fact: T(n) = (2/n) Σ_{k=0}^{n−1} T(k) + dn
Assume: T(k) ≤ c k log k for 0 ≤ k ≤ n−1
Proof: substituting the assumption into the fact, and using the bound on Σ_k k lg k derived on the next slide, we get T(n) ≤ c n log n if c ≥ 4. Therefore, by definition, T(n) = O(n log n).

65 Tightly bounding the key summation
Split the summation at n/2 for a tighter bound:
Σ_{k=1}^{n−1} k lg k = Σ_{k=1}^{⌈n/2⌉−1} k lg k + Σ_{k=⌈n/2⌉}^{n−1} k lg k
Bound the lg k in the second term by lg n, and the lg k in the first term by lg(n/2) = lg n − 1; then move the factors outside the summations:
≤ (lg n − 1) Σ_{k=1}^{⌈n/2⌉−1} k + lg n Σ_{k=⌈n/2⌉}^{n−1} k
Distribute the (lg n − 1); the summations overlap in range, so they combine:
= lg n Σ_{k=1}^{n−1} k − Σ_{k=1}^{⌈n/2⌉−1} k
Apply the Gaussian series n(n−1)/2 to the first summation, and the lower bound Σ_{k=1}^{⌈n/2⌉−1} k ≥ n^2/8 − n/4 to the second; multiply it all out:
≤ ½ n(n−1) lg n − (n^2/8 − n/4)
≤ ½ n^2 lg n − ⅛ n^2   (for n ≥ 2)
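The final bound from the derivation can be spot-checked numerically; this verifies the inequality at sampled values of n, not the derivation itself:

```python
# Numeric check of the summation bound:
# sum_{k=1}^{n-1} k*lg(k) <= (1/2)*n^2*lg(n) - (1/8)*n^2 for n >= 2.
import math

def key_sum(n):
    return sum(k * math.log2(k) for k in range(1, n))

for n in (2, 10, 100, 1000):
    bound = 0.5 * n**2 * math.log2(n) - n**2 / 8
    assert key_sum(n) <= bound
print("bound holds for the sampled n")
```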

