The normal approximation for probability histograms.


1 The normal approximation for probability histograms

2 Introduction According to the law of averages, when a coin is tossed a large number of times, the percentage of heads will be close to 50%. Around 1700, the Swiss mathematician James Bernoulli put this on a rigorous mathematical footing. Twenty years later, Abraham de Moivre made a substantial improvement on Bernoulli’s work, by showing how to compute approximately the probability that the percentage of heads will fall in any given interval around 50%.

3 Introduction

4

5 Of course, de Moivre did not have a modern calculator, so he needed a way to estimate all the binomial coefficients. He found one by using the normal curve. For example, he found that the probability of getting exactly 50 heads in 100 tosses of a coin is about equal to the area under the normal curve between -0.1 and +0.1. In fact, he proved that the whole probability histogram for the number of heads is close to the normal curve when the number of tosses is large. Here we extend his idea to the sum of draws made at random from any box of tickets.

6 Probability histograms A probability histogram is a graph that represents probability / chance, not data. When a chance process generates a number, the expected value and the standard error are a guide to where that number will be. But the probability histogram gives a complete picture.

7 Example Gamblers playing craps bet on the total number of spots shown by a pair of dice. The totals range from 2 to 12, and the odds depend on the chance of rolling each possible total. To estimate these chances, we simulated rolling a pair of dice on a computer and repeated the process 10,000 times.
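The simulation described on this slide can be sketched in a few lines of Python. This is only an illustration, not the program used for the slides: the function name `simulate_dice_totals` and the fixed seed are assumptions, made so that the run is repeatable.

```python
import random
from collections import Counter

def simulate_dice_totals(repetitions, seed=0):
    """Roll a pair of fair dice `repetitions` times and count each total."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    totals = Counter()
    for _ in range(repetitions):
        totals[rng.randint(1, 6) + rng.randint(1, 6)] += 1
    return totals

REPETITIONS = 10_000
counts = simulate_dice_totals(REPETITIONS)

# Each line is the area of one rectangle of the empirical histogram.
for total in range(2, 13):
    percent = 100 * counts[total] / REPETITIONS
    print(f"total {total:2d}: {percent:5.2f}%")
```

Changing `REPETITIONS` to 100 or 1,000 gives the counts behind the smaller empirical histograms.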

8 Table for 100 throws of 2 dice

9 Example We draw a histogram for the data in the previous table: The total of 7 comes up 20 times, so the rectangle over 7 has an area of 20%. Note that 7 is the peak.

10 Example Next we draw a histogram for the first 1,000 repetitions: Then the histogram for all 10,000 repetitions:

11 Example All the previous histograms are empirical histograms: they are based on observations. There is also an ideal histogram, the probability histogram:

12 Remarks The probability histogram is made up of rectangles. It represents probability / chance by area: the base of each rectangle is centered at a possible value for the sum, and the area of the rectangle equals the probability of getting that value. The total area of the histogram is 100%. Compare with the empirical histogram: the area of a block in the probability histogram represents a probability, while the area of a block in the empirical histogram represents a percentage of the data.
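For the pair of dice, the areas of the rectangles in the probability histogram can be computed exactly by listing the 36 equally likely outcomes. The sketch below does that; the function name is illustrative, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction
from itertools import product

def sum_of_two_dice_probabilities():
    """Exact chance of each total when a pair of fair dice is rolled."""
    probs = {}
    for first, second in product(range(1, 7), repeat=2):
        total = first + second
        probs[total] = probs.get(total, Fraction(0)) + Fraction(1, 36)
    return probs

probabilities = sum_of_two_dice_probabilities()

# Area of each rectangle, as a fraction and as a percentage.
for total in sorted(probabilities):
    chance = probabilities[total]
    print(f"total {total:2d}: {chance} = {float(100 * chance):.2f}%")

# The areas add up to 100%.
print("total area:", sum(probabilities.values()))
```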

13 Remarks

14 Another example Let’s look at the product of the numbers on a pair of dice, instead of the sum. The chance process is: roll a pair of dice and take the product of the numbers. Again, we use a computer to program the process, and make 10,000 repetitions.

15 Empirical histograms for the data

16

17

18 The probability histogram The empirical histogram for 10,000 repetitions looks almost exactly like the probability histogram.

19 Remark The graphs in this example are very different from the graphs in the previous example: the new histograms have gaps. The smallest value is 1, when both dice show 1. The biggest value is 36, when both dice show 6. But there is no way to get a product of 7, 11, 13, 14,…. So there is no rectangle over 7, because the probability is zero, and the other gaps arise for the same reason.
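The gaps can be checked by listing every product a pair of dice can show. The short sketch below enumerates the products and reports the values from 1 to 36 that never occur; the variable names are illustrative.

```python
from itertools import product

# Every product that a pair of dice can show.
possible = sorted({first * second for first, second in product(range(1, 7), repeat=2)})

# Values between 1 and 36 with probability zero: the gaps in the histogram.
gaps = [value for value in range(1, 37) if value not in possible]

print("possible products:", possible)
print("gaps:", gaps)
```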

20 Compare probability histograms with normal curve We saw that empirical histograms converge to the probability histogram as the number of repetitions goes up. Now we compare the probability histogram with the normal curve to justify the normal approximation.

21 Example Suppose a coin is tossed 100 times. Then the probability histogram for the number of heads is the following:

22 Example There are two scales in the picture. The probability histogram is drawn relative to the inside scale. The normal curve is drawn relative to the outside scale (standard units). The expected number of heads is 100 × 1/2 = 50, and the SE is √100 × 1/2 = 5, so we can convert the number of heads into standard units and put both graphs in the same picture. Again, the point is to keep the areas fixed when we convert the scales.
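The conversion to standard units can be sketched as follows. The code assumes the usual box model for a coin (one ticket marked 0 and one marked 1, so the SD of the box is 0.5); the function names are illustrative. Because the base of each rectangle shrinks from 1 to 1/SE, its height is multiplied by the SE so that the area stays fixed.

```python
from math import comb, exp, pi, sqrt

TOSSES = 100
expected = TOSSES * 0.5   # expected number of heads
se = sqrt(TOSSES) * 0.5   # SE = sqrt(number of tosses) * SD of the 0-1 box

def chance_of_heads(k, n=TOSSES):
    """Exact chance of getting k heads in n tosses of a fair coin."""
    return comb(n, k) / 2**n

def standard_units(k):
    return (k - expected) / se

def height_in_standard_units(k):
    # The base becomes 1/se wide, so the height is scaled by se to keep the area.
    return chance_of_heads(k) * se

def normal_curve(z):
    return exp(-z * z / 2) / sqrt(2 * pi)

for k in (40, 45, 50, 55, 60):
    z = standard_units(k)
    print(f"{k} heads: z = {z:+.1f}, histogram height = "
          f"{height_in_standard_units(k):.4f}, normal curve = {normal_curve(z):.4f}")
```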

23 Example Next, we compare the probability histogram for the number of heads in 400 tosses of a coin with the normal curve:

24 Example With 900 tosses, the histogram is practically the same as the curve: In the early eighteenth century, de Moivre proved this convergence had to take place, by pure mathematical reasoning.

25 Sum of draws After the discussion about the process of tossing a coin, we now turn to the more general process: drawing tickets from a box. Keep in mind: the more the histogram of the numbers in the box differs from the normal curve, the more draws are needed before the approximation takes hold.

26 Example 1 Suppose we have a box with tickets: nine 0’s and one 1. That is 0,0,0,0,0,0,0,0,0,1. (A lopsided box.) Then the probability histogram for the box is the following: It is also the histogram for only one draw. Indeed, the histogram for the sum will be lopsided too, until the number of draws gets fairly large.

27 Example 1 With 25 draws, the histogram is a lot higher than the normal curve on the left, lower on the right. (Because the box is lopsided.) The normal approximation does not apply.

28 Example 1 With 100 draws, the histogram follows the normal curve much better. But still, the histogram is higher on the left and lower on the right.

29 Example 1 At 400 draws, you have to look closely to see the difference, although the histogram is still a bit higher on the left and a bit lower on the right.
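The "higher on the left, lower on the right" pattern can be checked numerically. The sketch below is an illustration under one assumption: it compares the histogram (in standard units) with the normal curve only at the two sums that lie one SE below and one SE above the expected value. Since the box holds only 0's and 1's, the sum of the draws is the number of 1's drawn, so its chances follow the binomial formula.

```python
from math import comb, exp, pi, sqrt

P_ONE = 0.1  # chance of drawing the ticket marked 1 from the box 0,0,0,0,0,0,0,0,0,1

def chance_of_sum(k, draws):
    """Exact chance that the sum of the draws (the number of 1's) equals k."""
    return comb(draws, k) * P_ONE**k * (1 - P_ONE)**(draws - k)

def normal_curve(z):
    return exp(-z * z / 2) / sqrt(2 * pi)

def excess_over_curve(draws):
    """Histogram height minus normal curve, one SE left and one SE right of center."""
    ev = draws * P_ONE
    se = sqrt(draws * P_ONE * (1 - P_ONE))
    differences = []
    for k in (round(ev - se), round(ev + se)):
        z = (k - ev) / se
        differences.append(chance_of_sum(k, draws) * se - normal_curve(z))
    return tuple(differences)

for draws in (25, 100, 400):
    left, right = excess_over_curve(draws)
    print(f"{draws:3d} draws: left {left:+.4f}, right {right:+.4f}")
```

A positive number means the histogram is above the curve at that point, and a negative number means it is below.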

30 Example 2 Let’s look at another example. A box with tickets: 1,2,3. It is a much more common box than the previous one. The probability histogram for the sum of 25 draws is already close enough to the normal curve:

31 Example 2 With 50 draws, the histogram follows the curve very closely indeed: The normal approximation applies very well to this box.

32 Example 3 Our last example is the box with tickets: 1,2,9. Then the histogram for the box looks nothing like the normal curve: Again, it is the histogram for only one draw. Similarly, the histogram for the sum will be quite abnormal, until the number of draws gets fairly large.

33 Example 3 With 25 draws, the probability histogram for the sum is still quite different from the normal curve: it shows waves.

34 Example 3 With 50 draws, the waves are still there, but much smaller.

35 Example 3 By 100 draws, the probability histogram is indistinguishable from the normal curve.
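For a box that is not just 0's and 1's, the probability histogram for the sum can be built one draw at a time. The sketch below does this for the box 1,2,9 and then reports the largest vertical gap between the histogram (in standard units) and the normal curve. The gap measure and the function names are assumptions chosen for illustration; the slides compare the pictures by eye.

```python
from math import exp, pi, sqrt

def sum_distribution(box, draws):
    """Chance of each possible sum of `draws` draws made with replacement from `box`."""
    dist = {0: 1.0}
    chance = 1 / len(box)
    for _ in range(draws):
        updated = {}
        for total, p in dist.items():
            for ticket in box:
                updated[total + ticket] = updated.get(total + ticket, 0.0) + p * chance
        dist = updated
    return dist

def largest_gap_from_normal(box, draws):
    """Largest difference between histogram height and normal curve, in standard units."""
    average = sum(box) / len(box)
    sd = sqrt(sum((ticket - average) ** 2 for ticket in box) / len(box))
    ev = draws * average
    se = sqrt(draws) * sd
    dist = sum_distribution(box, draws)
    # Compared at every sum that can actually occur.
    return max(
        abs(p * se - exp(-(((s - ev) / se) ** 2) / 2) / sqrt(2 * pi))
        for s, p in dist.items()
    )

BOX = [1, 2, 9]
gap_by_draws = {draws: largest_gap_from_normal(BOX, draws) for draws in (25, 50, 100)}
for draws, gap in gap_by_draws.items():
    print(f"{draws:3d} draws: largest gap = {gap:.4f}")
```

Replacing `BOX` with `[1, 2, 3]` gives the corresponding numbers for Example 2.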

36 Remark As we have seen, for any chance process, the empirical histogram converges to the ideal probability histogram as the number of repetitions goes up. For some processes, such as counting heads in coin tosses or taking the sum of draws, the probability histogram in turn converges to the normal curve as the number of tosses or draws goes up. This explains why, in such cases, the empirical histogram follows the normal curve when the number of repetitions and the number of draws are both large enough.

37 Product of draws The normal curve is tied to sums. But the probability histogram for a product will usually be quite different from the normal curve. Let’s look at an example about rolling a die.

38 Example Suppose a die is rolled 10 times. We draw the probability histogram for the product of the 10 rolls. It is nothing like the normal curve:

39 Example

40 Making the number of rolls larger does not make the histogram more normal. With 25 rolls, the probability histogram is even worse:
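One way to see how lopsided the histogram for a product is, without drawing it, is to compare its expected value with its median. The sketch below computes the exact distribution of the product of 10 rolls; using the median as a yardstick is an assumption made for illustration and does not come from the slides.

```python
from fractions import Fraction

def product_distribution(rolls):
    """Exact chance of each possible product of `rolls` rolls of a fair die."""
    dist = {1: Fraction(1)}
    for _ in range(rolls):
        updated = {}
        for value, p in dist.items():
            for face in range(1, 7):
                updated[value * face] = updated.get(value * face, Fraction(0)) + p / 6
        dist = updated
    return dist

def mean_and_median(dist):
    """Expected value, and the smallest value with at least a 50% chance at or below it."""
    mean = sum(value * p for value, p in dist.items())
    running = Fraction(0)
    for value in sorted(dist):
        running += dist[value]
        if running >= Fraction(1, 2):
            return mean, value

distribution = product_distribution(10)
mean, median = mean_and_median(distribution)
print(f"expected value: {float(mean):,.0f}")
print(f"median:         {median:,}")
print(f"largest value:  {max(distribution):,}")
```

For a histogram that follows the normal curve, the expected value and the median would nearly coincide, so a large difference between them is a sign of a long right-hand tail.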

41 Example

42 Conclusion We saw from the examples that with enough draws, the probability histogram for the sum will be close to the normal curve. Mathematicians have a name for this fact: the central limit theorem, which plays a central role in statistical theory.

43 The Central Limit Theorem When drawing at random with replacement from a box, the probability histogram for the sum will follow the normal curve, even if the contents of the box do not. The histogram must be put into standard units, and the number of draws must be reasonably large.

44 Remarks The theorem applies to sums but not to other operations like products. The theorem is the basis for many of the statistical procedures discussed in the rest of the course. In many cases, the probability histogram for the sum of 100 draws will be close enough to the normal curve, but this depends on the contents of the box, as the previous examples show.

45 Remarks When the probability histogram follows the normal curve, it can be summarized by the expected value and standard error. The expected value pins the center of the probability histogram to the horizontal axis, and the standard error fixes its spread.

46 Normal approximation We already know how to apply the normal approximation. Here we present an example to show the logic, together with a technique for taking care of endpoints when the number of tosses is small or extra accuracy is wanted.

47 Example A coin will be tossed 100 times. Estimate the probability of getting: (a) exactly 50 heads. (b) between 45 and 55 heads inclusive. (c) between 45 and 55 heads exclusive.

48 Solutions Part (a). By a direct computation (or the previous example), the expected number of heads is 50 and the standard error is 5. If we look at the graph: The probability of getting exactly 50 heads equals the area of the rectangle over 50. The base of this rectangle goes from 49.5 to 50.5 on the number-of-heads scale.

49 Solutions In fact, the exact probability is 7.96%, to two decimals. So the approximation 7.97% is excellent.

50 Solutions Part (b). The probability of getting between 45 and 55 heads inclusive equals the area of the eleven rectangles over the values 45 through 55: That is the area under the histogram between 44.5 and 55.5 on the number-of-heads scale.

51 Solutions The exact probability is 72.87%, to two decimals. Excellent again!

52 Solutions Part (c). The probability of getting 45 to 55 heads exclusive equals the total area of the nine rectangles over the values 46 through 54. That is the area under the histogram between 45.5 and 54.5 on the number-of-heads scale, which corresponds to -0.9 and 0.9 on the standard-units scale. By the normal approximation: The exact probability is 63.18%, to two decimals. Excellent again!
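The three parts can be checked side by side with a short script. This is a sketch, not the method used to prepare the slides: it takes the exact probabilities from the binomial formula and the areas under the normal curve from `math.erf` instead of a normal table. The endpoints follow the solutions above, with each interval widened by half a unit on both sides.

```python
from math import comb, erf, sqrt

TOSSES = 100
EXPECTED = 50.0  # expected number of heads
SE = 5.0         # standard error for the number of heads

def exact_chance(low, high):
    """Exact chance that the number of heads is between low and high, inclusive."""
    return sum(comb(TOSSES, k) for k in range(low, high + 1)) / 2**TOSSES

def normal_area(a, b):
    """Area under the normal curve between a and b on the number-of-heads scale."""
    def cdf(x):
        return 0.5 * (1 + erf((x - EXPECTED) / (SE * sqrt(2))))
    return cdf(b) - cdf(a)

parts = {
    "(a) exactly 50": (50, 50),
    "(b) 45 to 55 inclusive": (45, 55),
    "(c) 45 to 55 exclusive": (46, 54),
}

for label, (low, high) in parts.items():
    exact = exact_chance(low, high)
    approx = normal_area(low - 0.5, high + 0.5)  # edges of the end rectangles
    print(f"{label}: exact {100 * exact:.2f}%, normal approximation {100 * approx:.2f}%")
```

The exact values printed by the script can be compared with the 7.96%, 72.87% and 63.18% quoted on the slides.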

53 Remarks Often, the problem may only ask for the probability that the number of heads is between 45 and 55, without specifying whether endpoints are included or excluded. Then, we may use the compromise procedure:

54 Remarks Note that we have used this approximation before. It splits the two end rectangles in half, and does not give quite as much precision as the method used in the example. Keeping track of the endpoints is called the “continuity correction”. This correction is worthwhile if the rectangles are big, or if a lot of precision is needed. Probability histograms are sometimes hard to work out, while areas under the normal curve are easy to look up in the normal table. If the probability histogram follows the normal curve, the approximation provides a good way to estimate the probability in the given interval.
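The effect of splitting the end rectangles in half can be sketched numerically. The code below assumes that the compromise procedure takes the area under the normal curve between 45 and 55 exactly, which is what the description of splitting the end rectangles suggests, and compares it with the two continuity-corrected areas. The function name is illustrative.

```python
from math import erf, sqrt

def normal_area_between(low, high, expected, se):
    """Area under the normal curve between low and high on the original scale."""
    def cdf(x):
        return 0.5 * (1 + erf((x - expected) / (se * sqrt(2))))
    return cdf(high) - cdf(low)

EXPECTED, SE = 50, 5  # 100 tosses of a coin

compromise = normal_area_between(45, 55, EXPECTED, SE)      # end rectangles split in half
inclusive = normal_area_between(44.5, 55.5, EXPECTED, SE)   # end rectangles included
exclusive = normal_area_between(45.5, 54.5, EXPECTED, SE)   # end rectangles excluded

print(f"exclusive:  {100 * exclusive:.2f}%")
print(f"compromise: {100 * compromise:.2f}%")
print(f"inclusive:  {100 * inclusive:.2f}%")
```

The compromise falls between the two corrected answers, which is why it is less precise when the end rectangles are big.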

55 Summary A probability histogram represents probability by area. The first convergence: if the chance process for getting a sum is repeated many times, the empirical histogram for the observed values converges to the probability histogram. The second convergence (the central limit theorem): with a reasonably large number of draws, the probability histogram for the sum will follow the normal curve in standard units. Combining the two convergences: when both the number of repetitions and the number of draws are reasonably large, the empirical histogram for the sum will be close to the normal curve in some cases.

56 Summary Probability histograms which follow the normal curve can be summarized quite well by the expected value and SE. The expected value locates the center, and the SE measures the spread. The normal approximation consists in replacing the actual probability histogram by the normal curve, before computing the area. Often, the accuracy of the approximation can be improved by keeping track of the edges of the rectangles: this is the “continuity correction”.

