Presentation is loading. Please wait.

Presentation is loading. Please wait.

Probabilistic (Average-Case) Analysis and Randomized Algorithms Two different approaches –Probabilistic analysis of a deterministic algorithm –Randomized.

Similar presentations


Presentation on theme: "Probabilistic (Average-Case) Analysis and Randomized Algorithms Two different approaches –Probabilistic analysis of a deterministic algorithm –Randomized."— Presentation transcript:

1 Probabilistic (Average-Case) Analysis and Randomized Algorithms Two different approaches –Probabilistic analysis of a deterministic algorithm –Randomized algorithm Example Problems/Algorithms –Hiring problem (Chapter 5) –Sorting/Quicksort (Chapter 7) Tools from probability theory –Indicator variables –Linearity of expectation

2 Probabilistic (Average-Case) Analysis Algorithm is deterministic; for a fixed input, it will run the same every time Analysis Technique –Assume a probability distribution for your inputs –Analyze item of interest over probability distribution Caveats –Specific inputs may have much worse performance –If distribution is wrong, analysis may give misleading picture

3 Randomized Algorithm “Randomize” the algorithm; for a fixed input, it will run differently depending on the result of random “coin tosses” Randomization examples/techniques –Randomize the order that candidates arrive –Randomly select a pivot element –Randomly select from a collection of deterministic algorithms Key points –Works well with high probability on every inputs –May fail on every input with low probability

4 Common Tools Indicator variables –Suppose we want to study random variable X that represents a composite of many random events –Define a collection of “indicator” variables X i that focus on individual events; typically X =  X i Linearity of expectations –Let X, Y, and Z be random variables s.t. X = Y + Z –Then E[X] = E[Y+Z] = E[Y] + E[Z] Recurrence Relations

5 Hiring Problem Input –A sequence of n candidates for a position –Each has a distinct quality rating that we can determine in an interview Algorithm –Current = 0; –For k = 1 to n If candidate(k) is better than Current, hire(k) and Current = k; Cost: –Number of hires Worst-case cost is n

6 Analyze Hiring Problem Assume a probability distribution –Each of the n! permutations is equally likely Analyze item of interest over probability distribution –Define random variables Let X = random variable corresponding to # of hires Let X i = “indicator variable” that i th interviewed candidate is hired –Value 0 if not hired, 1 if hired X =  i = 1 to n X i –E[X i ] = ? Explain why: –E[X] = E[  i = 1 to n X i ] =  i = 1 to n E[X i ] = ? Key observation: linearity of expectations

7 Alternative analysis Analyze item of interest over probability distribution –Let X i = indicator random variable that the i th best candidate is hired 0 if not hired, 1 if hired –Questions Relationship of X to X i ? E[X i ] = ?  i = 1 to n E[X i ] = ?

8 Questions What is the probability you will hire n times? What is the probability you will hire exactly twice? Biased Coin –Suppose you want to output 0 with probability ½ and 1 with probability ½ –You have a coin that outputs 1 with probability p and 0 with probability 1-p for some unknown 0 < p < 1 –Can you use this coin to output 0 and 1 fairly? –What is the expected running time to produce the fair output as a function of p?

9 Quicksort Algorithm Overview –Choose a pivot element –Partition elements to be sorted based on partition element –Recursively sort smaller and larger elements

10 Quicksort Walkthrough 17 12 6 23 19 8 5 10 6 8 5 10 17 12 23 19 5 6 8 17 12 19 23 6 8 12 17 23 6 17 5 6 8 10 12 17 19 23

11 Pseudocode Sort(A) { Quicksort(A,1,n); } Quicksort(A, low, high) { if (low < high) { pivotLocation = Partition(A,low,high); Quicksort(A,low, pivotLocation - 1); Quicksort(A, pivotLocation+1, high); }

12 Pseudocode int Partition(A,low,high) { pivot = A[high]; leftwall = low-1; for i = low to high-1 { if (A[i] < pivot) then { leftwall = leftwall+1; swap(A[i],A[leftwall]); } swap(A[high],A[leftwall+1]); } return leftwall+1; }

13 Worst Case for Quicksort

14 Average Case for Quicksort?

15 Intuitive Average Case Analysis 0 n/4 n/2 3n/4 n Anywhere in the middle half is a decent partition (3/4) h n = 1 => n = (4/3) h log(n) = h log(4/3) h = log(n) / log(4/3) < 2 log(n)

16 How many steps? At most 2log(n) decent partitions suffices to sort an array of n elements. But if we just take arbitrary pivot points, how often will they, in fact, be decent? Since any number ranked between n/4 and 3n/4 would make a decent pivot, half the pivots on average are decent. Therefore, on average, we will need 2 x 2log(n) = 4log(n) partitions to guarantee sorting.

17 Formal Average-case Analysis Let X denote the random variable that represents the total number of comparisons performed Let indicator variable X ij = the event that the ith smallest element and jth smallest element are compared –0 if not compared, 1 if compared X =  i=1 to n-1  j=i+1 to n X ij E[X] =  i=1 to n-1  j=i+1 to n E[X ij ]

18 Computing E[X ij ] E[X ij ] = probability that i and j are compared Observation –All comparisons are between a pivot element and another element –If an item k is chosen as pivot where i < k < j, then items i and j will not be compared E[X ij ] = –Items i or j must be chosen as a pivot before any items in interval (i..j)

19 Computing E[X] E[X] =  i=1 to n-1  j=i+1 to n E[X ij ] =  i=1 to n-1  j=i+1 to n 2/(j-i+1) =  i=1 to n-1  k=1 to n-i 2/(k+1) <=  i=1 to n-1 2 H n-i+1 <=  i=1 to n-1 2 H n = 2 (n-1)H n

20 Alternative Average-case Analysis Let T(n) denote the average time required to sort an n-element array –“Average” assuming each permutation equally likely Write a recurrence relation for T(n) –T(n) = Using induction, we can then prove that T(n) = O(n log n) –Requires some simplification and summation manipulation

21 Avoiding the worst-case Understanding quicksort’s worst-case Methods for avoiding it –Pivot strategies –Randomization

22 Understanding the worst case A B D F H J K A B D F H J A B D F H A B D F A B D A B A The worst case occur is a likely case for many applications.

23 Pivot Strategies Use the middle Element of the sub-array as the pivot. Use the median element of (first, middle, last) to make sure to avoid any kind of pre- sorting. What is the worst-case performance for these pivot selection mechanisms?

24 Randomized Quicksort Make chance of worst-case run time equally small for all inputs Methods –Choose pivot element randomly from range [low..high] –Initially permute the array

25 Hat Check Problem N people give their hats to a hat-check person at a restaurant The hat-check person returns the hats to the people randomly What is the expected number of people who get their own hat back?

26 Online Hiring Problem Input –A sequence of n candidates for a position –Each has a distinct quality rating that we can determine in an interview We know total ranking of interviewed candidates, but not with respect to candidates left to interview –We can hire only once Online Algorithm Hire(k,n) –Current = 0; –For j = 1 to k If candidate(j) is better than Current, Current = j; –For j = k+1 to n If candidate(j) is better than Current, hire(j) and return Questions: –What is probability we hire the best qualified candidate given k? –What is best value of k to maximize above probability?

27 Online Hiring Analysis Let S i be the probability that we successfully hire the best qualified candidate AND this candidate was the ith one interviewed Let M(j) = the candidate in 1 through j with highest score What needs to happen for S i to be true? –Best candidate is in position i: B i –No candidate in positions k+1 through i-1 are hired: O i –These two quantities are independent, so we can multiply their probabilities to get S i

28 Computing S B i = 1/n O i = k/(i-1) S i = k/(n(i-1)) S =  i>k S i = k/n  i>k 1/(i-1) is probability of success k/n (H n – H k ): roughly k/n (ln n – ln k) Maximized when k = n/e Leads to probability of success of 1/e


Download ppt "Probabilistic (Average-Case) Analysis and Randomized Algorithms Two different approaches –Probabilistic analysis of a deterministic algorithm –Randomized."

Similar presentations


Ads by Google