
Theory of Computational Complexity M1 Takao Inoshita Iwama & Ito Lab Graduate School of Informatics, Kyoto University.


1 Theory of Computational Complexity M1 Takao Inoshita Iwama & Ito Lab Graduate School of Informatics, Kyoto University

2 Chapter 5: Balls, Bins and Random Graphs. Highlights of this part: ・ Balls-and-Bins Problem ・ Poisson Approximation ・ Some Applications

3 5.1 Example: The Birthday Paradox ・ 30 people are in a room. ・ Do some two people share the same birthday, or does no pair share one? ・ The birthday of each person is a random day from a 365-day year, chosen independently and uniformly at random. ・ It is easier to think about the configurations where people do not share a birthday.

4 One way to calculate this probability is to directly count the configurations where no two people share a birthday. 30 distinct days must be chosen from the 365, and these days can be assigned to the people in any of the 30! possible orders, while the whole sample space has 365^30 equally likely patterns. Hence the probability is C(365, 30) · 30! / 365^30.

5 We can also calculate this probability one person at a time: (1 - 1/365)(1 - 2/365) ··· (1 - 29/365). This product is 0.2937…. In general (in the case of m people and n possible birthdays), the probability that all m birthdays are distinct is ∏_{j=1}^{m-1} (1 - j/n).
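The product above is easy to check numerically; here is a short sketch (the function name `prob_all_distinct` is mine, not from the slides):

```python
def prob_all_distinct(m, n=365):
    """Probability that m people, each with a uniform birthday among
    n days, all have distinct birthdays: prod_{k=0}^{m-1} (n-k)/n."""
    p = 1.0
    for k in range(m):
        p *= (n - k) / n
    return p

print(round(prob_all_distinct(30), 4))  # 0.2937, matching the slide
```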

6 If m is small compared to n, we can use the approximation 1 - j/n ≈ e^{-j/n}. Therefore, ∏_{j=1}^{m-1} (1 - j/n) ≈ exp(-∑_{j=1}^{m-1} j/n) = e^{-m(m-1)/2n} ≈ e^{-m^2/2n}.

7 Hence the value for m at which the probabilities of sharing and not sharing both become 1/2 is approximately given by e^{-m^2/2n} = 1/2, that is, m = √(2n ln 2). For n = 365 this gives m ≈ 22.49.
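The approximation and the crossover point can be sketched in a few lines (variable names are mine):

```python
import math

n = 365
# Approximate probability that m people have all-distinct birthdays,
# using the exponential approximation from the previous slide.
approx = lambda m: math.exp(-m * (m - 1) / (2 * n))

# Setting exp(-m^2 / 2n) = 1/2 gives m = sqrt(2 n ln 2).
m_half = math.sqrt(2 * n * math.log(2))
print(round(m_half, 2))  # 22.49 for n = 365
```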

8 Let E_k be the event that the kth person's birthday does not match any of the birthdays of the first k-1 people. Then the probability that the first k people fail to have distinct birthdays is 1 - Pr(E_1 ∩ ··· ∩ E_k) ≤ ∑_{j=1}^{k} Pr(¬E_j) ≤ ∑_{j=1}^{k} (j-1)/n = k(k-1)/2n.

9 This probability is less than 1/2 when k(k-1) < n. Hence, with ⌊√n⌋ people, the probability is at least 1/2 that all birthdays will be distinct.

10 Assumption: the first ⌈√n⌉ people all have distinct birthdays. Each person after that has probability at least ⌈√n⌉/n ≥ 1/√n of having the same birthday as one of these first ⌈√n⌉ people. Hence the probability that the next ⌈√n⌉ people all have birthdays different from those of the first ⌈√n⌉ is at most (1 - 1/√n)^{⌈√n⌉} ≤ 1/e. Once there are 2⌈√n⌉ people, the probability that all their birthdays are different is at most 1/e.

11 5.2 Balls into Bins. 5.2.1 The Balls-and-Bins Model. The birthday paradox is an example of a balls-and-bins problem. We have m balls that are thrown into n bins, with the location of each ball chosen independently and uniformly at random from the n bins. The birthday paradox asks whether or not there is a bin with two balls.

12 Balls-and-Bins problem: in the case of m balls and n bins, once m = Ω(√n), at least one of the bins is likely to have more than one ball in it.

13 Lemma 5.1: When n balls are thrown independently and uniformly at random into n bins, the probability that the maximum load is more than 3 ln n / ln ln n is at most 1/n for n sufficiently large.

14 Proof: The probability that bin 1 receives at least M balls is at most C(n, M) (1/n)^M. This follows from a union bound: there are C(n, M) distinct sets of M balls, and for any set of M balls the probability that all land in bin 1 is (1/n)^M. We now use the inequalities C(n, M) (1/n)^M ≤ 1/M! ≤ (e/M)^M.

15 Here the second inequality is a consequence of the following general bound on factorials: k! > (k/e)^k, since e^k = ∑_{i≥0} k^i/i! > k^k/k!.

16 Applying a union bound again allows us to find that, for M = 3 ln n / ln ln n, the probability that any bin receives at least M balls is bounded above by n (e/M)^M = n (e ln ln n / (3 ln n))^{3 ln n / ln ln n} ≤ 1/n for n sufficiently large.

17 5.2.2 Application: Bucket Sort. Bucket Sort is an example of a sorting algorithm that breaks the Ω(n log n) lower bound for standard comparison-based sorting and runs in expected linear time. Assumption: the input is restricted to a set of n = 2^j integers chosen independently and uniformly at random from the range [0, 2^k), where k ≥ j.

18 00 0110 11 001 110 111 101 First step of Bucket Sort Buckets : linked lists

19 The second step of Bucket Sort sorts each bucket with any standard quadratic-time algorithm. Concatenating the sorted lists from each bucket in order gives us the sorted order for the elements.
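The two steps can be sketched as follows — a minimal version assuming n = 2^j elements drawn from [0, 2^k), with the bucket index taken from the top j bits (the function and variable names are mine):

```python
def bucket_sort(a, k):
    """Sort n = 2**j integers from [0, 2**k) using n buckets.
    Step 1: place each element in the bucket given by its top j bits.
    Step 2: sort each bucket and concatenate."""
    n = len(a)
    j = n.bit_length() - 1            # n is assumed to be a power of two
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[x >> (k - j)].append(x)   # step 1: distribute in O(1) each
    out = []
    for b in buckets:
        out.extend(sorted(b))             # step 2: any quadratic sort works
    return out
```

For example, `bucket_sort([513, 2, 1023, 700, 255, 256, 0, 999], k=10)` returns the eight elements in sorted order.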

20 Analysis of Bucket Sort. Assuming that each element can be placed in the appropriate bucket in O(1) time, the first step requires only O(n) time. Since the input is uniform at random, Bucket Sort falls naturally into the balls-and-bins model: an element is a ball, a bucket is a bin, and each placement takes O(1) time. The number of elements that land in a specific bucket therefore follows a binomial distribution B(n, 1/n).

21 Analysis of Bucket Sort. Let X_j be the number of elements that land in the jth bucket. We can sort the jth bucket in at most c X_j^2 time for some constant c. The expected time spent sorting in the second stage is at most E[∑_{j=1}^{n} c X_j^2] = cn E[X_1^2]. Since X_1 is a binomial random variable B(n, 1/n), E[X_1^2] = n(1/n)(1 - 1/n) + (n · 1/n)^2 = 2 - 1/n < 2 (from Section 3.2.1).

22 Analysis of Bucket Sort. Hence the total expected time spent in the second stage is at most 2cn, and Bucket Sort runs in expected linear time.

23 5.3 The Poisson Distribution. We want to determine the expected fraction of bins with r balls, for any r. First, we consider the particular case r = 0. The probability that the first bin remains empty after m balls are thrown is (1 - 1/n)^m ≈ e^{-m/n}.

24 By symmetry this probability is the same for all bins. Let X be a random variable that represents the number of empty bins, and X_j a random variable that is 1 when the jth bin is empty and 0 otherwise. Then E[X] = ∑_{j=1}^{n} E[X_j] = n (1 - 1/n)^m ≈ n e^{-m/n}. Thus, the expected fraction of empty bins is approximately e^{-m/n}.
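With m = n balls this fraction is approximately e^{-1} ≈ 0.3679, which a quick simulation confirms (a sketch, with names of my choosing):

```python
import math, random

def empty_fraction(n, trials=200):
    """Throw n balls into n bins uniformly at random; return the
    average fraction of empty bins over the given number of trials."""
    total = 0.0
    for _ in range(trials):
        hit = [False] * n
        for _ in range(n):
            hit[random.randrange(n)] = True
        total += hit.count(False) / n
    return total / trials

# The simulated fraction should be close to 1/e.
print(empty_fraction(1000), 1 / math.e)
```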

25 In the general case, the probability that a given bin has r balls is C(m, r) (1/n)^r (1 - 1/n)^{m-r}. When m and n are large compared to r, this probability p_r is approximately p_r ≈ e^{-m/n} (m/n)^r / r!.

26 Definition 5.1: A discrete Poisson random variable X with parameter μ is given by the following probability distribution on j = 0, 1, 2, …: Pr(X = j) = e^{-μ} μ^j / j!. The probabilities in this distribution sum to 1: ∑_{j≥0} e^{-μ} μ^j / j! = e^{-μ} e^{μ} = 1.

27 The expectation of this random variable is E[X] = ∑_{j≥0} j e^{-μ} μ^j / j! = μ e^{-μ} ∑_{j≥1} μ^{j-1} / (j-1)! = μ.
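Both facts — total probability 1 and expectation μ — can be verified numerically (a sketch; the helper name `poisson_pmf` is mine):

```python
import math

def poisson_pmf(j, mu):
    """Pr(X = j) = e^{-mu} * mu**j / j! for a Poisson(mu) variable."""
    return math.exp(-mu) * mu**j / math.factorial(j)

mu = 3.0
terms = [poisson_pmf(j, mu) for j in range(60)]  # tail beyond 60 is negligible
total = sum(terms)
mean = sum(j * t for j, t in enumerate(terms))
print(round(total, 6), round(mean, 6))  # 1.0 and 3.0
```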

28 Lemma 5.2: The sum of a finite number of independent Poisson random variables is a Poisson random variable.

29 Proof: Consider two independent Poisson random variables X and Y with means μ_1 and μ_2. Then Pr(X + Y = j) = ∑_{k=0}^{j} Pr(X = k) Pr(Y = j-k) = ∑_{k=0}^{j} e^{-μ_1} μ_1^k / k! · e^{-μ_2} μ_2^{j-k} / (j-k)! = e^{-(μ_1+μ_2)} (μ_1 + μ_2)^j / j! by the binomial theorem. The case of more random variables than two is simply handled by induction.

30 Lemma 5.3: The moment generating function of a Poisson random variable with parameter μ is M_X(t) = e^{μ(e^t - 1)}. Proof: For any t, E[e^{tX}] = ∑_{k≥0} e^{tk} e^{-μ} μ^k / k! = e^{-μ} ∑_{k≥0} (μ e^t)^k / k! = e^{μ(e^t - 1)}.

31 Lemma 5.3 → Lemma 5.2. Given two independent Poisson random variables X and Y with means μ_1 and μ_2, we apply Lemma 5.3: M_X(t) M_Y(t) = e^{μ_1(e^t - 1)} e^{μ_2(e^t - 1)} = e^{(μ_1 + μ_2)(e^t - 1)}, which is the moment generating function of a Poisson random variable with mean μ_1 + μ_2. By Theorem 4.2, the moment generating function uniquely defines the distribution, and hence the sum X + Y is a Poisson random variable with mean μ_1 + μ_2.

32 Theorem 5.4: Let X be a Poisson random variable with parameter μ. 1. If x > μ, then Pr(X ≥ x) ≤ e^{-μ} (eμ)^x / x^x. 2. If x < μ, then Pr(X ≤ x) ≤ e^{-μ} (eμ)^x / x^x.

33 Proof of Theorem 5.4-1. For any t > 0 and x > μ, Pr(X ≥ x) = Pr(e^{tX} ≥ e^{tx}) ≤ E[e^{tX}] / e^{tx}. Plugging in the expression for the moment generating function of the Poisson distribution, we have Pr(X ≥ x) ≤ e^{μ(e^t - 1) - tx}. Choosing t = ln(x/μ) > 0 gives Pr(X ≥ x) ≤ e^{x - μ - x ln(x/μ)} = e^{-μ} (eμ)^x / x^x.

34 Proof of Theorem 5.4-2. For any t < 0 and x < μ, Pr(X ≤ x) = Pr(e^{tX} ≥ e^{tx}) ≤ E[e^{tX}] / e^{tx}. Hence, Pr(X ≤ x) ≤ e^{μ(e^t - 1) - tx}. Choosing t = ln(x/μ) < 0 gives Pr(X ≤ x) ≤ e^{x - μ - x ln(x/μ)} = e^{-μ} (eμ)^x / x^x.

35 5.3.1 Limit of the Binomial Distribution. When throwing m balls randomly into b bins, the probability that a bin has r balls is approximately the Poisson distribution with mean m/b. Binomial distribution: Pr(X = k) = C(n, k) p^k (1 - p)^{n-k}. Poisson distribution: Pr(X = k) = e^{-λ} λ^k / k!. When n is large and p is small, the binomial distribution B(n, p) converges in the limit to the Poisson distribution with mean λ = np.

36 Theorem 5.5: Let X_n be a binomial random variable with parameters n and p, where p is a function of n and λ = lim_{n→∞} np is a constant that is independent of n. Then, for any fixed k, lim_{n→∞} Pr(X_n = k) = e^{-λ} λ^k / k!.

37 Consider the balls-and-bins problem in which there are m balls and n bins, where m is a function of n and lim_{n→∞} m/n = λ. Let X_m be the number of balls in a specific bin. Then X_m is a binomial random variable with parameters m and 1/n. Applying Theorem 5.5 to this balls-and-bins problem gives lim_{n→∞} Pr(X_m = r) = e^{-λ} λ^r / r!, matching the earlier approximation.
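The convergence in Theorem 5.5 is easy to observe numerically by holding λ = np fixed while n grows (a sketch; helper names are mine):

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability C(n, k) p^k (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability e^{-lam} lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 2.0, 3
for n in (10, 100, 1000, 10000):
    print(n, binom_pmf(k, n, lam / n))   # approaches the Poisson value
print("limit:", poisson_pmf(k, lam))
```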

38 Applications of Theorem 5.5: ・ counting the number of spelling or grammatical mistakes (each word is misspelled with a small probability p, over a large number n of words); ・ the number of chocolate chips inside a chocolate chip cookie; ・ continuous settings (in Chapter 8).

39 Proof of Theorem 5.5. We can write Pr(X_n = k) = C(n, k) p^k (1 - p)^{n-k}. Then, since C(n, k) = n(n-1)···(n-k+1)/k!, we have ((n-k+1)p)^k / k! ≤ C(n, k) p^k ≤ (np)^k / k!.

40 Proof of Theorem 5.5. Combining, we have ((n-k+1)p)^k (1 - p)^{n-k} / k! ≤ Pr(X_n = k) ≤ (np)^k (1 - p)^{n-k} / k!.

41 Proof of Theorem 5.5. In the limit, as n approaches infinity, p approaches zero because the limiting value of pn is the constant λ. It follows that lim_{n→∞} (np)^k (1 - p)^{n-k} / k! = e^{-λ} λ^k / k! and lim_{n→∞} ((n-k+1)p)^k (1 - p)^{n-k} / k! = e^{-λ} λ^k / k!. Since Pr(X_n = k) lies between these two values, the theorem follows.

42 5.4 The Poisson Approximation. In balls-and-bins problems, the load of one bin depends on the loads of the other bins, since the loads must sum to m. We want to treat bin loads as independent Poisson random variables for easier analysis. Computing the probability of an event using this Poisson approximation for all bins and multiplying it by e√m gives an upper bound for the probability of the event when m balls are thrown into n bins.

43 Theorem 5.6: Let X_i^{(m)} be the number of balls in the ith bin when m balls are thrown, and let Y_1^{(m)}, …, Y_n^{(m)} be independent Poisson random variables with mean m/n. The distribution of (Y_1^{(m)}, …, Y_n^{(m)}) conditioned on ∑_i Y_i^{(m)} = k is the same as (X_1^{(k)}, …, X_n^{(k)}), regardless of the value of m.

44 [Figure: difference between the exact balls-and-bins model and the approximation — in the exact model the bin loads sum to exactly m, with E[balls per bin] = m/n; in the Poisson approximation each bin independently receives Poisson(m/n) balls, so the total number of balls is random with average m.]

45 Proof: When throwing k balls into n bins, the probability that (X_1^{(k)}, …, X_n^{(k)}) = (k_1, …, k_n) for any k_1, …, k_n satisfying ∑_i k_i = k is given by Pr((X_1^{(k)}, …, X_n^{(k)}) = (k_1, …, k_n)) = k! / (k_1! ··· k_n! n^k).

46 Now, for any k_1, …, k_n with ∑_i k_i = k, consider the probability that (Y_1^{(m)}, …, Y_n^{(m)}) = (k_1, …, k_n) conditioned on ∑_i Y_i^{(m)} = k: Pr((Y_1, …, Y_n) = (k_1, …, k_n) | ∑_i Y_i = k) = Pr((Y_1, …, Y_n) = (k_1, …, k_n)) / Pr(∑_i Y_i = k).

47 The probability that Y_i = k_i for i = 1, …, n is ∏_{i=1}^{n} e^{-m/n} (m/n)^{k_i} / k_i!, since the Y_i are independent Poisson random variables with mean m/n. Also, by Lemma 5.2, the sum of the Y_i is itself a Poisson random variable with mean m. Hence Pr((Y_1, …, Y_n) = (k_1, …, k_n) | ∑_i Y_i = k) = [∏_i e^{-m/n} (m/n)^{k_i} / k_i!] / [e^{-m} m^k / k!] = k! / (k_1! ··· k_n! n^k), which proves the theorem.

48 Theorem 5.7: Let f(x_1, …, x_n) be a nonnegative function. Then E[f(X_1^{(m)}, …, X_n^{(m)})] ≤ e√m · E[f(Y_1^{(m)}, …, Y_n^{(m)})]. Proof: E[f(Y_1^{(m)}, …, Y_n^{(m)})] = ∑_{k≥0} E[f(Y_1^{(m)}, …, Y_n^{(m)}) | ∑_i Y_i = k] Pr(∑_i Y_i = k) ≥ E[f(Y_1^{(m)}, …, Y_n^{(m)}) | ∑_i Y_i = m] Pr(∑_i Y_i = m) = E[f(X_1^{(m)}, …, X_n^{(m)})] Pr(∑_i Y_i = m), where the last equality follows from Theorem 5.6.

49 Since ∑_i Y_i is Poisson distributed with mean m, we now have E[f(Y_1^{(m)}, …, Y_n^{(m)})] ≥ E[f(X_1^{(m)}, …, X_n^{(m)})] · e^{-m} m^m / m!. We use the following loose bound on m!, which we prove as Lemma 5.8: m! < e√m (m/e)^m. This yields E[f(Y_1^{(m)}, …, Y_n^{(m)})] ≥ E[f(X_1^{(m)}, …, X_n^{(m)})] / (e√m), proving the theorem.

50 Lemma 5.8: n! < e√n (n/e)^n. Proof: For i ≥ 2, the concavity of the logarithm gives (ln(i-1) + ln i)/2 ≤ ∫_{i-1}^{i} ln x dx. Therefore ln(n!) - (ln n)/2 = ∑_{i=2}^{n} (ln(i-1) + ln i)/2 ≤ ∫_{1}^{n} ln x dx = n ln n - n + 1. The result now follows simply by exponentiating.

51 If the function f is the indicator function that is 1 if some event occurs and 0 otherwise, then Theorem 5.7 gives bounds on the probability of events. The Poisson case: in the balls-and-bins problem, the numbers of balls in the bins are taken to be independent Poisson random variables with mean m/n. The exact case: m balls are thrown into n bins independently and uniformly at random.

52 Corollary 5.9: Any event that takes place with probability p in the Poisson case takes place with probability at most p e√m in the exact case. Proof: Let f be the indicator function of the event. In this case, E[f] is just the probability that the event occurs, and the result follows immediately from Theorem 5.7.

53 Corollary 5.9, in other words: any event that is rare in the Poisson case is also rare in the exact case.

54 Theorem 5.10: Let f(x_1, …, x_n) be a nonnegative function such that E[f(X_1^{(m)}, …, X_n^{(m)})] is either monotonically increasing or monotonically decreasing in m. Then E[f(X_1^{(m)}, …, X_n^{(m)})] ≤ 2 E[f(Y_1^{(m)}, …, Y_n^{(m)})].

55 Corollary 5.11: Let ε be an event whose probability is either monotonically increasing or monotonically decreasing in the number of balls. If ε has probability p in the Poisson case, then ε has probability at most 2p in the exact case.

56 Lemma 5.12: When n balls are thrown independently and uniformly at random into n bins, the maximum load is at least ln n / ln ln n with probability at least 1 - 1/n for n sufficiently large. Proof: In the Poisson case, the probability that bin 1 has load at least M = ⌊ln n / ln ln n⌋ is at least 1/(e M!), which is the probability that it has load exactly M.

57 In the Poisson case, all bins are independent, so the probability that no bin has load at least M is at most (1 - 1/(e M!))^n ≤ e^{-n/(e M!)}. If e^{-n/(e M!)} ≤ n^{-2}, then the probability that the maximum load is not at least M in the exact case is at most e√n · n^{-2} ≤ 1/n for n sufficiently large.

58 It therefore suffices to show that e^{-n/(e M!)} ≤ n^{-2}, or equivalently that M! ≤ n / (2e ln n). From Lemma 5.8, it follows that M! ≤ e√M (M/e)^M. Hence, for n suitably large, a short calculation with M = ⌊ln n / ln ln n⌋ shows e√M (M/e)^M ≤ n / (2e ln n), which completes the proof.
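Lemmas 5.1 and 5.12 together pin the maximum load between ln n / ln ln n and 3 ln n / ln ln n with high probability, which a simulation illustrates (a sketch; names are mine):

```python
import math, random
from collections import Counter

def max_load(n):
    """Throw n balls into n bins uniformly at random; return the max load."""
    counts = Counter(random.randrange(n) for _ in range(n))
    return max(counts.values())

n = 100_000
lower = math.log(n) / math.log(math.log(n))      # Lemma 5.12 lower bound
upper = 3 * math.log(n) / math.log(math.log(n))  # Lemma 5.1 upper bound
print(max_load(n), round(lower, 1), round(upper, 1))
```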

59 5.4.1 Example: Coupon Collector's Problem, Revisited. The coupon collector's problem can be thought of as a balls-and-bins problem: coupon types correspond to bins, and cereal boxes to balls. Question: if balls are thrown at random into bins, how many balls are thrown until all bins have at least one ball?

60 Applying earlier results to this balls-and-bins problem: from Section 2.4.1, the expected number of balls that must be thrown before each bin has at least one ball is nH(n). From Section 3.3.1, when n ln n + cn balls are thrown, the probability that not all bins have at least one ball is at most e^{-c}.

61 Theorem 5.13: Let X be the number of coupons observed before obtaining one of each of n types of coupons. Then, for any constant c, lim_{n→∞} Pr(X > n ln n + cn) = 1 - e^{-e^{-c}}.
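The limit in Theorem 5.13 can be checked by simulation; for c = 0 it predicts about 1 - e^{-1} ≈ 0.632 (a sketch, with names and parameters of my choosing, so the empirical fraction is only approximate at finite n):

```python
import math, random

def coupons_needed(n):
    """Draw uniform random coupons until all n types have been seen."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        draws += 1
    return draws

n, c, trials = 300, 0.0, 400
threshold = n * math.log(n) + c * n
frac = sum(coupons_needed(n) > threshold for _ in range(trials)) / trials
# Theorem 5.13 predicts the limit 1 - e^{-e^{-c}} as n grows.
print(frac, 1 - math.exp(-math.exp(-c)))
```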

62 Proof of Theorem 5.13: Outline (throughout, m = n ln n + cn, and ε is the event that no bin is empty). 1. We prove that X is concentrated in the Poisson case: Pr(|X - m| ≥ √(2m ln m)) = o(1). 2. We prove that Pr(ε | X = k) varies by only o(1) over the range |k - m| < √(2m ln m). 3. We prove that, given 1 and 2, the Poisson approximation is accurate: |Pr(ε) - Pr(ε | X = m)| = o(1). 4. Given 3, Theorem 5.13 follows from the Poisson computation.

63 Proof of Theorem 5.13: 4. We look at the problem as a balls-and-bins problem. For the Poisson approximation, we suppose that the number of balls in each bin is a Poisson random variable with mean ln n + c. The probability that a specific bin is empty is then e^{-(ln n + c)} = e^{-c}/n. Since all bins are independent under the Poisson approximation, the probability that no bin is empty is (1 - e^{-c}/n)^n → e^{-e^{-c}} (for sufficiently large n).

64 Proof of Theorem 5.13: 3. Let ε be the event that no bin is empty, and let X be the number of balls thrown. We compute Pr(ε) by splitting it as follows: Pr(ε) = ∑_k Pr(ε | X = k) Pr(X = k). (5.7)

65 Proof of Theorem 5.13: If Pr(|X - m| ≥ √(2m ln m)) = o(1) and |Pr(ε | X = k) - Pr(ε | X = m)| = o(1) for all k with |k - m| < √(2m ln m), then from Eqn (5.7), |Pr(ε) - Pr(ε | X = m)| = o(1), and hence the Poisson approximation is accurate.

66 Proof of Theorem 5.13: 1. We prove Pr(|X - m| ≥ √(2m ln m)) = o(1). Consider that X is a Poisson random variable with mean m, since it is a sum of independent Poisson random variables. From Theorem 5.4, for x > m, Pr(X ≥ x) ≤ e^{-m} (em)^x / x^x = e^{x - m - x ln(x/m)}.

67 Proof of Theorem 5.13: For x = m + √(2m ln m), we use that ln(1 + z) ≥ z - z^2/2 for z ≥ 0 to show Pr(X ≥ m + √(2m ln m)) = o(1). A similar argument holds if x < m, so Pr(|X - m| ≥ √(2m ln m)) = o(1).

68 Proof of Theorem 5.13: 2. We prove that the conditional probabilities vary by only o(1) over the range |k - m| < √(2m ln m). Since Pr(ε | X = k) is increasing in k, it suffices to bound the difference at the endpoints of the range.

69 Hence we have the bound Pr(ε | X = m + √(2m ln m)) - Pr(ε | X = m - √(2m ln m)). This difference is the probability of the following experiment: we throw m - √(2m ln m) balls and there is still at least one empty bin, but after throwing an additional 2√(2m ln m) balls, all bins are nonempty.

70 For this experiment to succeed, one of the additional 2√(2m ln m) balls must fall into the last empty bin; by a union bound over these balls, this has probability at most 2√(2m ln m)/n = o(1). Hence this difference is o(1) as well.

71 Combining steps 1, 2, 3, and 4 proves the theorem.


