# 5 Normal Probability Distributions

## Presentation on theme: "5 Normal Probability Distributions"— Presentation transcript:

5 Normal Probability Distributions
Elementary Statistics Larson Farber

Section 5.1 Introduction to Normal Distributions

Properties of a Normal Distribution
The mean, median, and mode are equal. The curve is bell shaped and symmetric about the mean. The total area that lies under the curve is 1, or 100%.
Tell students that there are other bell-shaped curves. The normal distributions are graphed with specific mathematical functions.

Properties of a Normal Distribution
As the curve extends farther and farther away from the mean, it gets closer and closer to the x-axis but never touches it. The points at which the curvature changes are called inflection points. The graph curves downward between the inflection points and curves upward past the inflection points to the left and to the right.
By identifying the points of inflection, which lie one standard deviation on either side of the mean, students can roughly determine the standard deviation.

Means and Standard Deviations
Top graph (x-axis from 10 to 20): curves with different means, same standard deviation. Bottom graph (x-axis from 9 to 22): curves with different means, different standard deviations.
Have students find the means of 11, 15.5 and 21 for the top 3 curves. The standard deviation for each is one-half. For the lower 3 curves the means are 10, 15.5 and 21. The curve with the largest standard deviation is in the center. The one with the smallest is on the right. The middle curve on top has the same mean as the middle curve on the bottom but a different standard deviation.

Empirical Rule
About 68% of the area lies within 1 standard deviation of the mean. About 95% of the area lies within 2 standard deviations of the mean. About 99.7% of the area lies within 3 standard deviations of the mean.
This rule has been discussed earlier. Emphasize that there is still 0.3% of the distribution falling outside the 3 standard deviation limits.

Determining Intervals
An instruction manual claims that the assembly time for a product is normally distributed with a mean of 4.2 hours and a standard deviation of 0.3 hour. Determine the interval in which 95% of the assembly times fall.
95% of the data will fall within 2 standard deviations of the mean: 4.2 – 2(0.3) = 3.6 and 4.2 + 2(0.3) = 4.8. So 95% of the assembly times will be between 3.6 and 4.8 hours. (The x-axis of the graph is marked at 3.3, 3.6, 3.9, 4.2, 4.5, 4.8 and 5.1, one standard deviation apart.)
A good chance to review probabilities: find the probability that an assembly time will be between 3.6 and … hours, less than 4.5 hours, or greater than 3.3 hours.
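The interval calculation above can be sketched in a few lines of Python. This is only an illustration: the helper name `empirical_interval` is made up for this example, and the use of `statistics.NormalDist` from the standard library is an assumption about tooling, not something the slides prescribe.

```python
from statistics import NormalDist

def empirical_interval(mean, sd, k=2):
    """Return the interval that lies within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

# Assembly times from the slide: mean 4.2 hours, standard deviation 0.3 hour.
low, high = empirical_interval(4.2, 0.3, k=2)

# Exact area inside the interval, to compare with the "about 95%" rule.
assembly = NormalDist(mu=4.2, sigma=0.3)
coverage = assembly.cdf(high) - assembly.cdf(low)

print(f"interval: {low:.1f} to {high:.1f} hours, exact area: {coverage:.4f}")
```

The exact area within two standard deviations is slightly more than 95%, which is why the empirical rule says "about".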

Section 5.2 The Standard Normal Distribution

The Standard Score
The standard score, or z-score, represents the number of standard deviations a random variable x falls from the mean: $z = \frac{x - \mu}{\sigma}$. The test scores for a civil service exam are normally distributed with a mean of 152 and a standard deviation of 7. Find the standard z-score for a person with a score of: (a) (b) (c) 152
For (c), $z = \frac{152 - 152}{7} = 0$.
This concept was introduced in Chapter 2. The z-score is a measure of position.
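As a minimal sketch of the z-score formula, the function below standardizes a raw score. The name `z_score` is illustrative, and the score of 166 is a hypothetical value chosen for this example (it is not one of the scores from the slide); only the mean 152, the standard deviation 7, and the score 152 come from the source.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

MEAN, SD = 152, 7  # civil service exam parameters from the slide

z_at_mean = z_score(152, MEAN, SD)       # a score equal to the mean
z_hypothetical = z_score(166, MEAN, SD)  # hypothetical score, 14 points above the mean

print(z_at_mean, z_hypothetical)
```

A positive z-score marks a value above the mean and a negative z-score a value below it.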

The Standard Normal Distribution
The standard normal distribution has a mean of 0 and a standard deviation of 1. Using z-scores, any normal distribution can be transformed into the standard normal distribution: when each value of a normal distribution is standardized, the standard normal distribution is produced.
If students are using tables, they must standardize all values to find probabilities. If students are using a technology tool, this will not be necessary.

Cumulative Areas
The total area under the curve is one. As the value of z increases, the cumulative area increases toward one. The cumulative area is close to 0 for z-scores close to –3.49. The cumulative area for z = 0 is 0.5000. The cumulative area is close to 1 for z-scores close to 3.49.

Cumulative Areas
Find the cumulative area for a z-score of –1.25. Read down the z column on the left to –1.2 and across to the column under .05. The value in the cell is 0.1056, the cumulative area. The probability that z is at most –1.25 is 0.1056.
Tell students it is a good idea to sketch the curve and indicate the area to be found.

Finding Probabilities
To find the probability that z is less than a given value, read the cumulative area in the table corresponding to that z-score. Find P(z < –1.45). Read down the z-column to –1.4 and across to .05. The cumulative area is 0.0735, so P(z < –1.45) = 0.0735.
This is a “less than” example.

Finding Probabilities
To find the probability that z is greater than a given value, subtract the cumulative area in the table from 1. Find P(z > –1.24). The cumulative area (area to the left) is 0.1075. So the area to the right is 1 – 0.1075 = 0.8925, and P(z > –1.24) = 0.8925.
This is a “greater than” example. Students must compute the complementary area.

Finding Probabilities
To find the probability that z is between two given values, find the cumulative areas for each and subtract the smaller area from the larger. Find P(–1.25 < z < 1.17). 1. P(z < 1.17) = 0.8790 2. P(z < –1.25) = 0.1056 3. P(–1.25 < z < 1.17) = 0.8790 – 0.1056 = 0.7734
This is a “between” example. Tell students to be sure to subtract the smaller area from the larger area since areas (and probabilities) cannot be negative.

Summary
To find the probability that z is less than a given value, read the corresponding cumulative area. To find the probability that z is greater than a given value, subtract the cumulative area in the table from 1. To find the probability that z is between two given values, find the cumulative areas for each and subtract the smaller area from the larger.
Using the cumulative distribution function, the calculation of probabilities is greatly simplified to three possibilities. If you are using a 0-to-z approach, skip these slides. With technology, use the CDF command to calculate cumulative probabilities.
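The three cases in the summary map directly onto a cumulative distribution function. The sketch below uses Python's standard-library `statistics.NormalDist` in place of the table; the function names `p_less`, `p_greater` and `p_between` are invented for this illustration.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # standard normal: mean 0, standard deviation 1

def p_less(z):
    """P(Z < z): the cumulative area to the left of z."""
    return STANDARD.cdf(z)

def p_greater(z):
    """P(Z > z): one minus the cumulative area."""
    return 1 - STANDARD.cdf(z)

def p_between(a, b):
    """P(a < Z < b): larger cumulative area minus the smaller one."""
    low, high = sorted((a, b))
    return STANDARD.cdf(high) - STANDARD.cdf(low)

print(f"{p_less(-1.45):.4f}")
print(f"{p_greater(-1.24):.4f}")
print(f"{p_between(-1.25, 1.17):.4f}")
```

Software results can differ from table results in the last decimal place, because the table route rounds each cumulative area to four places before subtracting.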

Section 5.3 Normal Distributions: Finding Probabilities

Probabilities and Normal Distributions
If a random variable x is normally distributed, the probability that x will fall within an interval is equal to the area under the curve in the interval. IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. Find the probability that a person selected at random will have an IQ score less than 115. To find the area in this interval, first find the standard score equivalent to x = 115: $z = \frac{115 - 100}{15} = 1$.
Recall that in a discrete probability distribution, we could use the area of the bar in the probability histogram to obtain the probability of the event. Here we can only find the probability that x will lie in a given interval.

Probabilities and Normal Distributions
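Find P(x < 115). In the normal distribution with mean 100, this is the area to the left of x = 115. In the standard normal distribution, it is the area to the left of z = 1. The area is the same. Find P(z < 1). P(z < 1) = 0.8413, so P(x < 115) = 0.8413.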
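The claim that the two areas are the same can be checked numerically. This is a small sketch using `statistics.NormalDist`; the variable names are illustrative, and the slides themselves rely on the table rather than on software.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)   # IQ scores from the slide
standard = NormalDist()             # standard normal distribution

# Standardize x = 115, then compare the area on each curve.
z = (115 - iq.mean) / iq.stdev
p_from_x = iq.cdf(115)       # area to the left of x = 115
p_from_z = standard.cdf(z)   # area to the left of the matching z-score

print(z, round(p_from_x, 4), round(p_from_z, 4))
```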

Application
Monthly utility bills in a certain city are normally distributed with a mean of \$100 and a standard deviation of \$12. A utility bill is randomly selected. Find the probability it is between \$80 and \$115. P(80 < x < 115) = P(–1.67 < z < 1.25) = 0.8944 – 0.0475 = 0.8469. The probability a utility bill is between \$80 and \$115 is 0.8469.
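The sketch below reworks the utility bill example two ways: once following the table route (z-scores rounded to two decimals, areas to four) and once without rounding. It assumes Python's `statistics.NormalDist`; the variable names are made up for this example.

```python
from statistics import NormalDist

MEAN, SD = 100, 12          # monthly utility bills from the slide
standard = NormalDist()

# Table route: round each z-score to two decimals and each area to four.
z_low = round((80 - MEAN) / SD, 2)
z_high = round((115 - MEAN) / SD, 2)
p_table = round(standard.cdf(z_high), 4) - round(standard.cdf(z_low), 4)

# Direct route: no intermediate rounding.
bills = NormalDist(mu=MEAN, sigma=SD)
p_direct = bills.cdf(115) - bills.cdf(80)

print(z_low, z_high, round(p_table, 4), round(p_direct, 4))
```

The two answers agree to about three decimal places; the small gap comes entirely from rounding the lower z-score to –1.67.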

Section 5.4 Normal Distributions: Finding Values

From Areas to z-Scores
Find the z-score corresponding to a cumulative area of 0.9803. Locate 0.9803 in the area portion of the table. Read the values at the beginning of the corresponding row and at the top of the column. The z-score is 2.06. z = 2.06 corresponds roughly to the 98th percentile.
Be sure to emphasize that here, the area is given. Tell students to choose the z-score closest to the given area. The only exception: if the area falls exactly midway between two table areas, use the z-score halfway between the two corresponding z-scores.

Finding z-Scores from Areas
Find the z-score corresponding to the 90th percentile. The cumulative area is .90. The closest table area is 0.8997. The row heading is 1.2 and the column heading is .08. This corresponds to z = 1.28. A z-score of 1.28 corresponds to the 90th percentile.

Finding z-Scores from Areas
Find the z-score with an area of .60 falling to its right. With .60 to the right, the cumulative area is .40. The closest table area is 0.4013. The row heading is –0.2 and the column heading is .05. The z-score is –0.25. A z-score of –0.25 has an area of .60 to its right. It also corresponds to the 40th percentile.

Finding z-Scores from Areas
Find the z-score such that 45% of the area under the curve falls between –z and z. The area remaining in the tails is .55. Half this area is in each tail, so .55/2 = .275 is the cumulative area for the negative z value and .275 + .45 = .725 is the cumulative area for the positive z. The closest table area to .275 is 0.2743 and the z-score is –0.60. The positive z-score is 0.60.
Because the normal distribution is symmetric, the z-scores will have the same absolute value. As a result, you can find one z-score and use its opposite for the other.
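Going from an area back to a z-score is the inverse of the cumulative area lookup. The following sketch repeats the four examples from this section with `NormalDist.inv_cdf` from Python's standard library, which is an assumed substitute for reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()

# Cumulative area given directly.
z_area = standard.inv_cdf(0.9803)

# 90th percentile: cumulative area 0.90.
z_p90 = standard.inv_cdf(0.90)

# Area 0.60 to the right means cumulative area 1 - 0.60 = 0.40.
z_right_60 = standard.inv_cdf(1 - 0.60)

# 45% in the middle: cumulative area for the positive z is 0.275 + 0.45.
z_middle_45 = standard.inv_cdf(0.275 + 0.45)

for z in (z_area, z_p90, z_right_60, z_middle_45):
    print(round(z, 2))
```

Because the software inverts the cumulative area exactly, the unrounded values differ slightly from the table's "closest area" answers, but they agree once rounded to two decimals.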

From z-Scores to Raw Scores
To find the data value x when given a standard score z, use $x = \mu + z\sigma$. The test scores for a civil service exam are normally distributed with a mean of 152 and a standard deviation of 7. Find the test score for a person with a standard score of: (a) 2.33 (b) –1.75 (c) 0
(a) x = 152 + (2.33)(7) = 168.31 (b) x = 152 + (–1.75)(7) = 139.75 (c) x = 152 + (0)(7) = 152
Show students that the formula given is equivalent to the z-score formula. Some students prefer to use only one formula and others like to use both. Have students work these through before displaying the answers. Emphasize the meaning of z-scores. A z-score of 2.33 is 2.33 standard deviations above the mean.

Finding Percentiles or Cut-off Values
Monthly utility bills in a certain city are normally distributed with a mean of \$100 and a standard deviation of \$12. What is the smallest utility bill that can be in the top 10% of the bills? The top 10% begins at the 90th percentile. Find the cumulative area in the table that is closest to 0.9000 (the 90th percentile). The area 0.8997 corresponds to a z-score of 1.28. To find the corresponding x-value, use $x = \mu + z\sigma$: x = 100 + 1.28(12) = 115.36. \$115.36 is the smallest value for the top 10%.
Students find these “cut-off” problems easier if they think in terms of percentiles, which in turn are interpreted as cumulative areas.
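Both steps of this section, converting a z-score to a raw score and finding a cut-off value from a percentile, can be sketched as follows. The helper name `raw_score` is hypothetical, and `NormalDist.inv_cdf` stands in for the table lookup.

```python
from statistics import NormalDist

def raw_score(z, mean, sd):
    """Convert a standard score back to a data value: x = mean + z * sd."""
    return mean + z * sd

# Civil service exam: mean 152, standard deviation 7.
exam_scores = [raw_score(z, 152, 7) for z in (2.33, -1.75, 0)]

# Utility bills: the top 10% starts at the 90th percentile.
bills = NormalDist(mu=100, sigma=12)
cutoff_direct = bills.inv_cdf(0.90)       # no rounding of the z-score
cutoff_table = raw_score(1.28, 100, 12)   # table z-score rounded to 1.28

print([round(x, 2) for x in exam_scores])
print(round(cutoff_direct, 2), round(cutoff_table, 2))
```

The direct cut-off comes out a couple of cents higher than the table answer because the exact 90th-percentile z-score is slightly larger than 1.28.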

Section 5.5 The Central Limit Theorem

Sampling Distributions
A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means. Each sample has the same size n. The sampling distribution consists of the values of the sample means, $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$
Emphasize that sample means will vary from one sample to another but are not expected to be too far from the population mean. Other statistics such as the sample variance have their own sampling distributions that will be studied later.

The Central Limit Theorem
If samples of size n ≥ 30 are taken from a population with any type of distribution that has mean $\mu$ and standard deviation $\sigma$, the sample means will have an approximately normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
This theorem is the foundation for inferential statistics. As long as the sample has at least 30 values, the sampling distribution of the mean will be approximately normal. The center of the sampling distribution is the same as the center of the distribution of individual values. The variation is smaller. The larger the sample size, the smaller the variation will be.

The Central Limit Theorem
If samples of any size n are taken from a population that is normally distributed with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means will be normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
When the original population is normally distributed, the sample can be any size for a normal sampling distribution.

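Application
The mean height of American men (ages 20-29) is $\mu = 69.2$ inches. Random samples of 60 such men are selected. Find the mean and standard deviation (standard error) of the sampling distribution. The distribution of means of samples of size 60 will be normal, with mean $\mu_{\bar{x}} = 69.2$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{60}}$. Taking $\sigma = 2.9$ inches, the value this example assumes, $\sigma_{\bar{x}} = \frac{2.9}{\sqrt{60}} \approx 0.3744$.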

Interpreting the Central Limit Theorem
The mean height of American men (ages 20-29) is $\mu$ = 69.2”. If a random sample of 60 men in this age group is selected, what is the probability the mean height for the sample is greater than 70”? Assume the standard deviation is 2.9”. Since n > 30, the sampling distribution of $\bar{x}$ will be normal with mean $\mu_{\bar{x}} = 69.2$ and standard deviation $\sigma_{\bar{x}} = \frac{2.9}{\sqrt{60}} \approx 0.3744$. Find the z-score for a sample mean of 70: $z = \frac{70 - 69.2}{0.3744} \approx 2.14$.

Interpreting the Central Limit Theorem
Although the probability that one man might be more than 70 inches tall is P(z > 0.28) = 1 – 0.6103 = 0.3897, the probability that the mean of a sample of 60 men will be greater than 70 inches is P(z > 2.14) = 1 – 0.9838 = 0.0162. There is a 0.0162 probability that a sample of 60 men will have a mean height greater than 70”.
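The contrast between one man and the mean of 60 men comes down to which standard deviation is used. A minimal sketch, assuming Python's `statistics.NormalDist` and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 69.2, 2.9, 60   # heights in inches, sample size from the slide

# Standard error: the standard deviation of the sampling distribution of the mean.
standard_error = SIGMA / sqrt(N)

individuals = NormalDist(mu=MU, sigma=SIGMA)
sample_means = NormalDist(mu=MU, sigma=standard_error)

p_one_man_over_70 = 1 - individuals.cdf(70)
p_mean_over_70 = 1 - sample_means.cdf(70)

print(round(standard_error, 4))
print(round(p_one_man_over_70, 4), round(p_mean_over_70, 4))
```

The results differ from the slide's values in the third decimal place because the slide rounds each z-score to two decimals before using the table.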

Application Central Limit Theorem
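During a certain week the mean price of gasoline in California was \$1.164 per gallon. What is the probability that the mean price for a sample of 38 gas stations in California is between \$1.169 and \$1.179? Assume the standard deviation is \$0.049. Since n > 30, the sampling distribution of $\bar{x}$ will be normal with mean $\mu_{\bar{x}} = 1.164$ and standard deviation $\sigma_{\bar{x}} = \frac{0.049}{\sqrt{38}} \approx 0.0079$. Calculate the standard z-score for sample values of \$1.169 and \$1.179: $z = \frac{1.169 - 1.164}{0.0079} \approx 0.63$ and $z = \frac{1.179 - 1.164}{0.0079} \approx 1.90$.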

Application Central Limit Theorem
P(0.63 < z < 1.90) = 0.9713 – 0.7357 = 0.2356. The probability is 0.2356 that the mean for the sample is between \$1.169 and \$1.179.
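The gasoline example can be wrapped in a small reusable function. The name `prob_sample_mean_between` is invented for this sketch, and the calculation assumes the sampling distribution is normal, as the slide does for n > 30.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low < sample mean < high), treating the sample mean as normal."""
    standard_error = sigma / sqrt(n)
    sample_means = NormalDist(mu=mu, sigma=standard_error)
    return sample_means.cdf(high) - sample_means.cdf(low)

# Gasoline prices: mean 1.164, standard deviation 0.049, 38 stations.
standard_error = 0.049 / sqrt(38)
p_between = prob_sample_mean_between(1.169, 1.179, mu=1.164, sigma=0.049, n=38)

print(round(standard_error, 4), round(p_between, 4))
```

Without intermediate rounding the probability is slightly below 0.2356. The slide's figure reflects rounding the standard error to 0.0079 and the z-scores to two decimals.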

Section 5.6 Normal Approximation to the Binomial

Binomial Distribution Characteristics
• There are a fixed number of independent trials (n).
• Each trial has 2 outcomes, success or failure.
• The probability of success on a single trial is p and the probability of failure is q, where p + q = 1.
• We can find the probability of exactly x successes out of n trials, where x = 0 or 1 or 2 … n.
• x is a discrete random variable representing a count of the number of successes in n trials.
The table for calculating probabilities is limited to specific values of p and values of n that do not exceed 20. This application will show how to calculate binomial probabilities when the table cannot be used and the binomial probability formula becomes too tedious. Even technology tools such as Minitab have limitations in calculating these probabilities.

Application
34% of Americans have type A+ blood. If 500 Americans are sampled at random, what is the probability at least 300 have type A+ blood? Using techniques of Chapter 4 you could calculate the probability that exactly 300, exactly 301 … exactly 500 Americans have A+ blood type and add the probabilities. Or you could use the normal curve probabilities to approximate the binomial probabilities. If np ≥ 5 and nq ≥ 5, the binomial random variable x is approximately normally distributed with mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$.
Review the formulas for calculating the mean and standard deviation of a binomial distribution. These must be found in order to specify the normal distribution.

Why Do We Require np ≥ 5 and nq ≥ 5?
The slide shows binomial probability histograms for p = 0.25, q = 0.75 and three sample sizes: n = 5 (np = 1.25, nq = 3.75), n = 20 (np = 5, nq = 15), and n = 50 (np = 12.5, nq = 37.5).
We have to ensure a large enough sample size. Whether the sample is large enough depends on p as well as on n. When p is closer to .5, the distribution is more symmetric and a smaller sample is enough for the normal approximation.
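The rule of thumb is easy to express as a check. The function name `normal_approx_ok` and its `threshold` parameter are hypothetical, introduced only for this sketch; the sample sizes 5, 20 and 50 are the ones whose np and nq values appear on the slide.

```python
def normal_approx_ok(n, p, threshold=5):
    """Check the rule of thumb: both np and nq must be at least the threshold."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

# The three cases with p = 0.25.
for n in (5, 20, 50):
    print(n, n * 0.25, n * 0.75, normal_approx_ok(n, 0.25))
```

With p = 0.25, the condition first holds at n = 20, where np reaches exactly 5.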

Binomial Probabilities
The binomial distribution is discrete, and its graph is a probability histogram. The probability that a specific value of x will occur is equal to the area of the rectangle with midpoint at x. If n = 50 and p = 0.25, find P(14 ≤ x ≤ 16). Add the areas of the rectangles with midpoints at x = 14, x = 15, x = 16: 0.111 + 0.089 + 0.065 = 0.265.
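The three rectangle areas can be reproduced from the binomial probability formula. This sketch uses `math.comb` from the Python standard library; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# n = 50 trials, p = 0.25: areas of the rectangles centered at 14, 15 and 16.
areas = {x: binomial_pmf(x, 50, 0.25) for x in (14, 15, 16)}
total = sum(areas.values())

for x, area in areas.items():
    print(x, round(area, 3))
print(round(total, 3))
```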

Correction for Continuity
Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16). Values for the binomial random variable x are 14, 15 and 16. The continuous interval from 13.5 to 16.5 has approximately the same area as the rectangles whose centers are 14, 15 and 16, so the interval of values under the normal curve is 13.5 ≤ x ≤ 16.5. To ensure the boundaries of each rectangle are included in the interval, subtract 0.5 from a left-hand boundary and add 0.5 to a right-hand boundary.

Normal Approximation to the Binomial
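Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16) when n = 50 and p = 0.25. Find the mean and standard deviation using binomial distribution formulas: $\mu = np = 50(0.25) = 12.5$ and $\sigma = \sqrt{npq} = \sqrt{50(0.25)(0.75)} \approx 3.0619$. Adjust the endpoints to correct for continuity: P(13.5 ≤ x ≤ 16.5). Convert each endpoint to a standard score: $z = \frac{13.5 - 12.5}{3.0619} \approx 0.33$ and $z = \frac{16.5 - 12.5}{3.0619} \approx 1.31$. The normal probability P(0.33 ≤ z ≤ 1.31) = 0.9049 – 0.6293 = 0.2756 can be used to approximate the discrete binomial probability.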
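The steps of the normal approximation (mean and standard deviation, continuity correction, area under the normal curve) can be sketched as one function. The name `normal_approx_binomial` is made up for this example, and `statistics.NormalDist` replaces the z-table.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_binomial(first, last, n, p):
    """Approximate P(first <= x <= last) for a binomial x with a normal curve."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("np and nq should both be at least 5")
    curve = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    # Correction for continuity: widen the interval by 0.5 on each side.
    return curve.cdf(last + 0.5) - curve.cdf(first - 0.5)

approx = normal_approx_binomial(14, 16, n=50, p=0.25)
print(round(approx, 4))
```

The approximation lands near 0.276, close to but not identical to the exact binomial sum of about 0.265, which is the kind of gap one expects when np is only 12.5.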

Application
A survey of Internet users found that 75% favored government regulations of “junk” e-mail. If 200 Internet users are randomly selected, find the probability that fewer than 140 are in favor of government regulation. Since np = 150 ≥ 5 and nq = 50 ≥ 5, use the normal approximation to the binomial, with $\mu = np = 150$ and $\sigma = \sqrt{npq} = \sqrt{200(0.75)(0.25)} \approx 6.1237$. The binomial phrase “fewer than 140” means 0, 1, 2, 3 … 139. Use the correction for continuity to translate to the continuous variable in the interval x < 139.5, and find P(x < 139.5). The standard score is $z = \frac{139.5 - 150}{6.1237} \approx -1.71$, and P(z < –1.71) = 0.0436. The probability that fewer than 140 are in favor of government regulation is approximately 0.0436.
Students will agree that it is extremely impractical to use the binomial formulas for calculating the probability of exactly 0, exactly 1, exactly 2 … exactly 139 successes. It is often helpful to have students list the possible values (for example 0, 1, 2 … 139). This helps determine the interval. The reason for not having to adjust the left-hand limit is that the area is almost 0 at the extremes of the curve. On the TI-83, binomcdf(200, .75, 139) gives the binomial probability directly. If n = 2000, however, the TI-83 gives a domain error message, so students will need to use the normal approximation to the binomial to calculate the probability.
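As a rough check on this example, the sketch below computes the normal approximation and, for comparison, the exact binomial sum over 0 through 139. It is written in Python with `math.comb` and `statistics.NormalDist` rather than on a TI-83, so it illustrates the idea and does not reproduce the calculator's output; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

N, P = 200, 0.75
Q = 1 - P

# Normal approximation with the correction for continuity: P(x < 139.5).
mu = N * P
sigma = sqrt(N * P * Q)
approx = NormalDist(mu=mu, sigma=sigma).cdf(139.5)

# Exact binomial probability of 0, 1, ..., 139 successes.
exact = sum(comb(N, k) * P**k * Q ** (N - k) for k in range(140))

print(round(mu, 1), round(sigma, 4))
print(round(approx, 4), round(exact, 4))
```

The two probabilities are close but not equal: the binomial distribution with p = 0.75 is slightly skewed, which the symmetric normal curve cannot capture.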