The Normal Distribution

Presentation on theme: "The Normal Distribution"— Presentation transcript:

1 The Normal Distribution
Chapter 5: The Normal Distribution. The normal distribution is the most important probability distribution in the study of statistics. Elementary Statistics, Larson/Farber.

2 Properties of a Normal Distribution
The mean, median, and mode are equal. The curve is bell shaped and symmetric about the mean. The total area under the curve is one (1), or 100%. The curve approaches but never touches the x-axis as it extends farther and farther away from the mean in both directions. The points at which the curvature changes are called inflection points. Teaching note: Tell students that there are other bell-shaped curves; the normal distributions are graphed with specific mathematical functions. By identifying the points of inflection, students can roughly determine the standard deviation, since the inflection points lie one standard deviation on either side of the mean.

3 Means and Standard Deviations
[Figure: top panel, three curves with different means and the same standard deviation; bottom panel, three curves with different means and different standard deviations. The x-axes run from about 9 to 22.] Teaching note: Have students find the means of 11, 15.5, and 21 for the top three curves; the standard deviation of each is one-half. For the lower three curves the means are 10, 15.5, and 21. The curve with the largest standard deviation is in the center, and the one with the smallest is on the right. The middle curve on top has the same mean as the middle curve on the bottom but a different standard deviation.

4 Empirical Rule
About 68% of the area lies within 1 standard deviation of the mean. About 95% of the area lies within 2 standard deviations of the mean. About 99.7% of the area lies within 3 standard deviations of the mean. Teaching note: This rule has been discussed earlier. Emphasize that about 0.3% of the distribution still falls outside the 3-standard-deviation limits.
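The three percentages in the empirical rule can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas should round to about 68%, 95%, and 99.7%, which is why the rule is stated with the word "about".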

5 Determining Intervals
An instruction manual claims that the assembly time for a product is normally distributed with a mean of 4.2 hours and standard deviation 0.3 hours. Determine the interval in which 95% of the assembly times fall. [Figure: normal curve over x = 3.3, 3.6, 3.9, 4.2, 4.5, 4.8, 5.1, marking 1, 2, and 3 standard deviations from the mean.] 95% of the data will fall within 2 standard deviations of the mean: 4.2 - 2(0.3) = 3.6 and 4.2 + 2(0.3) = 4.8. So 95% of the assembly times will be between 3.6 and 4.8 hours. Teaching note: This is a good chance to review probabilities. Ask for the probability that an assembly time falls in an interval starting at 3.6 hours, is less than 4.5 hours, or is greater than 3.3 hours.
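The interval calculation is simple arithmetic, mean plus or minus k standard deviations. A minimal sketch follows, using the slide's values (mean 4.2, standard deviation 0.3, k = 2); the function name `interval_within` is hypothetical.

```python
def interval_within(mean: float, sd: float, k: float) -> tuple[float, float]:
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)

# Assembly times: mean 4.2 hours, standard deviation 0.3 hours.
# By the empirical rule, about 95% of times fall within k = 2 standard deviations.
low, high = interval_within(4.2, 0.3, 2)
print(f"About 95% of assembly times fall between {low:.1f} and {high:.1f} hours")
```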

6 The Standard Score
The standard score, or z-score, represents the number of standard deviations a random variable x falls from the mean: z = (x - μ)/σ. The test scores for a civil service exam are normally distributed with a mean of 152 and standard deviation of 7. Find the standard z-score for a person with a score of: (a) (b) (c) 152 Teaching note: Have students work these through before displaying the answers. (a) (b) (c) z = (152 - 152)/7 = 0 Larson/Farber Ch 5

7 From z-Scores to Raw Scores
To find the data value x when given a standard score z, use x = μ + zσ. The test scores for a civil service exam are normally distributed with a mean of 152 and standard deviation of 7. Find the test score for a person with a standard score of (a) 2.33 (b) -1.75 (c) 0. Answers: (a) x = 152 + (2.33)(7) = 168.31 (b) x = 152 + (-1.75)(7) = 139.75 (c) x = 152 + (0)(7) = 152. Teaching note: Show students that the formula given is equivalent to the z-score formula. Some students prefer to use only one formula and others like to use both. Have students work these through before displaying the answers. Emphasize the meaning of z-scores: a z-score of 2.33 is 2.33 standard deviations above the mean.
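The two conversions are inverses of each other, which a short sketch can make concrete. The function names `z_score` and `raw_score` are illustrative choices, and the inputs are the civil service exam values from the slide (mean 152, standard deviation 7).

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def raw_score(z: float, mean: float, sd: float) -> float:
    """Data value that corresponds to the standard score z."""
    return mean + z * sd

MEAN, SD = 152, 7  # civil service exam scores

for z in (2.33, -1.75, 0):
    x = raw_score(z, MEAN, SD)
    # Converting back recovers the original z-score (up to rounding error).
    print(f"z = {z:5.2f}  ->  x = {x:.2f}  ->  z = {z_score(x, MEAN, SD):.2f}")
```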

8 The Standard Normal Distribution
The standard normal distribution has a mean of 0 and a standard deviation of 1. Using z-scores, any normal distribution can be transformed into the standard normal distribution. Teaching note: When each value of a normal distribution is standardized, the standard normal distribution is produced. If students are using tables, they must standardize all values to find probabilities. If students are using a technology tool, this will not be necessary.

9 Cumulative Areas
The total area under the curve is one. As the value of z increases, the cumulative area increases to one. The cumulative area is close to 0 for z-scores close to -3.49. The cumulative area for z = 0 is 0.5000. The cumulative area is close to 1 for z-scores close to 3.49.

10 Cumulative Areas
Find the cumulative area for a z-score of -1.25. Read down the z column on the left to z = -1.2 and across to the column under .05. The value in the cell, 0.1056, is the cumulative area. The probability that z is at most -1.25 is P(z ≤ -1.25) = 0.1056. Teaching note: Tell students it is a good idea to sketch the curve and indicate the area to be found.

11 From Areas to z-scores
Find the z-score corresponding to a cumulative area of 0.9803. Locate 0.9803 in the area portion of the table. Read the values at the beginning of the corresponding row and at the top of the column. The z-score is 2.06, so z = 2.06 is roughly the 98th percentile. Teaching note: Be sure to emphasize that here the area is given. Tell students to choose the z-score whose area is closest to the given area. The only exception is when the area falls exactly at the midpoint between two z-scores; in that case, use the midpoint of the z-scores.
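Both directions of the table lookup (z-score to cumulative area, and cumulative area back to z-score) have direct software equivalents. The sketch below assumes Python's `statistics.NormalDist`; its results are not rounded to table precision, so they can differ from table values in the last digit.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# z-score -> cumulative area (what reading the table row and column gives).
area = std_normal.cdf(-1.25)
print(f"P(z <= -1.25) is about {area:.4f}")

# Cumulative area -> z-score (reading the table "inside out").
z = std_normal.inv_cdf(0.9803)
print(f"cumulative area 0.9803 corresponds to z of about {z:.2f}")
```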

12 Finding Probabilities
To find the probability that z is less than a given value, read the cumulative area in the table corresponding to that z-score. Find P(z < -1.24). Read down the z-column to -1.2 and across to .04. The cumulative area is 0.1075, so P(z < -1.24) = 0.1075.

13 Finding Probabilities
To find the probability that z is greater than a given value, subtract the cumulative area in the table from 1. Find P(z > -1.24). The cumulative area (area to the left) is 0.1075, so the required area to the right is 1 - 0.1075 = 0.8925. P(z > -1.24) = 0.8925. Teaching note: Students must compute the complementary area.

14 Finding Probabilities
To find the probability that z is between two given values, find the cumulative area for each and subtract the smaller area from the larger. Find P(-1.25 < z < 1.17). 1. P(z < 1.17) = 0.8790 2. P(z < -1.25) = 0.1056 3. P(-1.25 < z < 1.17) = 0.8790 - 0.1056 = 0.7734. Teaching note: Tell students to be sure to subtract the smaller area from the larger area, since areas (and probabilities) cannot be negative.
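The three cases (less than, greater than, between) all reduce to the cumulative area. A minimal sketch, assuming `statistics.NormalDist` stands in for the table and using hypothetical helper names:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def p_less(z: float) -> float:
    """P(Z < z): the cumulative area itself."""
    return std_normal.cdf(z)

def p_greater(z: float) -> float:
    """P(Z > z): one minus the cumulative area."""
    return 1.0 - std_normal.cdf(z)

def p_between(z_low: float, z_high: float) -> float:
    """P(z_low < Z < z_high): larger cumulative area minus the smaller one."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"P(z < -1.24)        is about {p_less(-1.24):.4f}")
print(f"P(z > -1.24)        is about {p_greater(-1.24):.4f}")
print(f"P(-1.25 < z < 1.17) is about {p_between(-1.25, 1.17):.4f}")
```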

15 Summary
To find the probability that z is less than a given value, read the corresponding cumulative area. To find the probability that z is greater than a given value, subtract the cumulative area in the table from 1. To find the probability that z is between two given values, find the cumulative area for each and subtract the smaller area from the larger. Teaching note: Using the cumulative distribution function, the calculation of probabilities is greatly simplified to three possibilities. If you are using a 0-to-z approach, skip these slides. With technology, use the CDF command to calculate cumulative areas.

16 Probabilities and Normal Distributions
If a random variable x is normally distributed, the probability that x will fall within an interval is equal to the area under the curve in the interval. IQ scores are normally distributed with a mean of 100 and standard deviation of 15. Find the probability that a person selected at random will have an IQ score less than 115. To find the area in this interval, first find the standard score equivalent to x = 115: z = (115 - 100)/15 = 1.

17 Probabilities and Normal Distributions
Find P(x < 115) on the normal distribution (mean 100, x = 115). This is the same area as P(z < 1) on the standard normal distribution (mean 0, z = 1). P(z < 1) = 0.8413, so P(x < 115) = 0.8413.

18 Application
Monthly utility bills in a certain city are normally distributed with a mean of $100 and a standard deviation of $12. A utility bill is randomly selected. Find the probability it is between $80 and $115. Standardize the endpoints: z = (80 - 100)/12 = -1.67 and z = (115 - 100)/12 = 1.25. P(80 < x < 115) = P(-1.67 < z < 1.25) = 0.8944 - 0.0475 = 0.8469. The probability a utility bill is between $80 and $115 is 0.8469.
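With software, the standardizing step can be skipped by working on the original scale. The sketch below assumes `statistics.NormalDist` with the slide's parameters (mean 100, standard deviation 12). Because it does not round z to two decimals, its answer can differ from a table-based answer in the third or fourth decimal place.

```python
from statistics import NormalDist

bills = NormalDist(mu=100, sigma=12)  # monthly utility bills, in dollars

# Area between $80 and $115 on the original (unstandardized) scale.
p_direct = bills.cdf(115) - bills.cdf(80)

# Same area via z-scores, to show that the two routes agree.
z_low = (80 - 100) / 12
z_high = (115 - 100) / 12
std_normal = NormalDist()
p_via_z = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z-scores: {z_low:.2f} and {z_high:.2f}")
print(f"P(80 < x < 115) is about {p_direct:.4f}")
```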

19 Finding Percentiles
Monthly utility bills in a certain city are normally distributed with a mean of $100 and a standard deviation of $12. What is the smallest utility bill that can be in the top 10% of the bills? The top 10% lies above the 90th percentile. Find the cumulative area in the table that is closest to 0.9000 (the 90th percentile). The area corresponds to a z-score of 1.28. To find the corresponding x-value, use x = μ + zσ: x = 100 + 1.28(12) = 115.36. $115.36 is the smallest value for the top 10%. Teaching note: Students find these "cut-off" problems easier if they think in terms of percentiles, which in turn are interpreted as cumulative areas.
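A cut-off problem like this one is an inverse-CDF lookup. The following sketch assumes `statistics.NormalDist.inv_cdf`; since it uses the unrounded z-score rather than the table value 1.28, its cut-off differs from the table-based figure by a couple of cents.

```python
from statistics import NormalDist

bills = NormalDist(mu=100, sigma=12)  # monthly utility bills, in dollars

# The smallest bill in the top 10% is the 90th percentile.
z_90 = NormalDist().inv_cdf(0.90)      # z-score with cumulative area 0.90
cutoff = bills.inv_cdf(0.90)           # same percentile on the dollar scale

print(f"z for the 90th percentile is about {z_90:.2f}")
print(f"smallest bill in the top 10% is about ${cutoff:.2f}")
```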

20 Sampling Distributions
A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means. Each sample has the same n. The sampling distribution consists of the values of the sample means. Teaching note: Emphasize that sample means will vary from one sample to another but are not expected to be too far from the population mean. Other statistics, such as the sample variance, have their own sampling distributions that will be studied later.

21 The Central Limit Theorem
If samples of size n ≥ 30 are taken from a population with any type of distribution that has mean μ and standard deviation σ, the sample means will have an approximately normal distribution with mean μ_x̄ = μ and standard deviation σ_x̄ = σ/√n. Teaching note: This theorem is the foundation for inferential statistics. As long as the sample has at least 30 values, the sampling distribution of the mean will be approximately normal. The center of the sampling distribution is the same as the center of the distribution of individual values. The variation is smaller: the larger the sample size, the smaller the variation will be.
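A small simulation can make the theorem tangible. The sketch below is an illustration under stated assumptions, not something from the slides: the population is exponential with mean 1 and standard deviation 1 (chosen because it is strongly skewed), the sample size is 30, and 5,000 samples are drawn with a fixed seed.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

N = 30              # size of each sample
NUM_SAMPLES = 5000  # number of samples drawn

# Skewed population: exponential with rate 1 (mean 1, standard deviation 1).
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = 1.0 / N ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) is {predicted_sd:.3f})")
```

The simulated mean and standard deviation of the sample means should land close to μ and σ/√n even though the population itself is far from normal.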

22 The Central Limit Theorem
If samples of any size n are taken from a population that is normally distributed with mean μ and standard deviation σ, the distribution of sample means will be normal with mean μ_x̄ = μ and standard deviation σ_x̄ = σ/√n. Teaching note: When the original population is normally distributed, the sample can be any size for a normal sampling distribution.

23 Application
The mean height of American men (ages 20-29) is 69.2 inches. Random samples of 60 such men are selected. Find the mean and standard deviation (standard error) of the sampling distribution. The mean is μ_x̄ = μ = 69.2 inches, and the standard deviation is σ_x̄ = σ/√60, where σ is the population standard deviation. The distribution of means of samples of size 60 will be approximately normal.

24 Interpreting the Central Limit Theorem
The mean height of American men (ages 20-29) is μ = 69.2”. If a random sample of 60 men in this age group is selected, what is the probability the mean height for the sample is greater than 70”? Assume the standard deviation is 2.9”. Since n > 30, the sampling distribution of x̄ will be approximately normal, with mean μ_x̄ = 69.2 and standard deviation σ_x̄ = 2.9/√60 ≈ 0.3744. Find the z-score for a sample mean of 70: z = (70 - 69.2)/0.3744 ≈ 2.14.

25 Interpreting the Central Limit Theorem
P(x̄ > 70) = P(z > 2.14) = 1 - 0.9838 = 0.0162. There is a 0.0162 probability that a sample of 60 men will have a mean height greater than 70”.
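The height example can be reproduced with the standard error σ/√n in place of σ. This sketch assumes `statistics.NormalDist` and uses the stated values μ = 69.2, σ = 2.9, n = 60; it does not round z, so the probability can differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 69.2, 2.9, 60  # population mean, population sd, sample size

standard_error = SIGMA / sqrt(N)           # sd of the sample means
z = (70 - MU) / standard_error             # z-score of a sample mean of 70

# Sampling distribution of the mean: approximately normal by the CLT.
sampling_dist = NormalDist(mu=MU, sigma=standard_error)
p_greater_than_70 = 1.0 - sampling_dist.cdf(70)

print(f"standard error is about {standard_error:.4f}")
print(f"z is about {z:.2f}")
print(f"P(sample mean > 70) is about {p_greater_than_70:.4f}")
```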

26 Application Central Limit Theorem
During a certain week the mean price of gasoline in California was $1.164 per gallon. What is the probability that the mean price for a sample of 38 gas stations in California is between $1.169 and $1.179? Assume the standard deviation is $0.049. Since n > 30, the sampling distribution of x̄ will be approximately normal, with mean μ_x̄ = 1.164 and standard deviation σ_x̄ = 0.049/√38 ≈ 0.0079. Calculate the standard z-scores for sample means of $1.169 and $1.179: z = (1.169 - 1.164)/0.0079 ≈ 0.63 and z = (1.179 - 1.164)/0.0079 ≈ 1.90.

27 Application Central Limit Theorem
P(0.63 < z < 1.90) = 0.9713 - 0.7357 = 0.2356. The probability is 0.2356 that the mean for the sample is between $1.169 and $1.179.
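The gasoline example follows the same pattern, with an interval instead of a tail. The sketch below assumes `statistics.NormalDist` and the stated values (mean $1.164, standard deviation $0.049, n = 38). It keeps the standard error unrounded, so the z-scores and the probability can differ in the last digits from hand calculations that round intermediate results.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 1.164, 0.049, 38  # mean price, sd, number of stations sampled

standard_error = SIGMA / sqrt(N)
sampling_dist = NormalDist(mu=MU, sigma=standard_error)

z_low = (1.169 - MU) / standard_error
z_high = (1.179 - MU) / standard_error
p_between = sampling_dist.cdf(1.179) - sampling_dist.cdf(1.169)

print(f"standard error is about {standard_error:.4f}")
print(f"z-scores are about {z_low:.2f} and {z_high:.2f}")
print(f"P(1.169 < sample mean < 1.179) is about {p_between:.4f}")
```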

28 Normal Approximations to the Binomial
Characteristics of a binomial experiment: There is a fixed number of trials (n). The n trials are independent and repeated under identical conditions. Each trial has 2 outcomes, success or failure. The probability of success on a single trial is p and the probability of failure is q, with p + q = 1. The central problem is to find the probability of x successes out of n trials, where x = 0, 1, 2, ..., n; x is a count of the number of successes in n trials. Teaching note: The table for calculating probabilities is limited to specific values of p and to values of n that do not exceed 20. This application will show how to calculate binomial probabilities when the table cannot be used and the binomial probability formula becomes too tedious. Even technology tools such as Minitab have limitations in calculating these probabilities.

29 Application
34% of Americans have type A+ blood. If 500 Americans are sampled at random, what is the probability at least 300 have type A+ blood? Using the techniques of Chapter 4, you could calculate the probability that exactly 300, exactly 301, ..., exactly 500 Americans have A+ blood type and add the probabilities. Or you could use the normal curve probabilities to approximate the binomial probabilities. If np ≥ 5 and nq ≥ 5, the binomial random variable x is approximately normally distributed with mean μ = np and standard deviation σ = √(npq). Teaching note: Review the formulas for calculating the mean and standard deviation of a binomial distribution. These must be found in order to specify the normal distribution.

30 Why do we require np ≥ 5 and nq ≥ 5?
[Figure: three binomial probability histograms with p = 0.25 and q = 0.75. For n = 5: np = 1.25, nq = 3.75. For n = 20: np = 5, nq = 15. For n = 50: np = 12.5, nq = 37.5.]

31 Binomial Probabilities
The binomial distribution is discrete, and its graph is a probability histogram. The probability that a specific value of x will occur is equal to the area of the rectangle with midpoint at x. If n = 50 and p = 0.25, find P(14 ≤ x ≤ 16). Add the areas of the rectangles with midpoints at x = 14, x = 15, and x = 16: 0.111 + 0.089 + 0.065 = 0.265. P(14 ≤ x ≤ 16) = 0.265.

32 Correction for Continuity
Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16) if n = 50 and p = 0.25. Check that np = 12.5 ≥ 5 and nq = 37.5 ≥ 5. The interval of values under the normal curve is 13.5 ≤ x ≤ 16.5. To ensure the boundaries of each rectangle are included in the interval, subtract 0.5 from a left-hand boundary and add 0.5 to a right-hand boundary.

33 Normal Approximation to the Binomial
Use the normal approximation to the binomial to find P(14 ≤ x ≤ 16) if n = 50 and p = 0.25. Find the mean and standard deviation using the binomial distribution formulas: μ = np = 50(0.25) = 12.5 and σ = √(npq) = √(50(0.25)(0.75)) ≈ 3.06. Adjust the endpoints to correct for continuity: P(13.5 ≤ x ≤ 16.5). Convert each endpoint to a standard score: z = (13.5 - 12.5)/3.06 ≈ 0.33 and z = (16.5 - 12.5)/3.06 ≈ 1.31. P(0.33 ≤ z ≤ 1.31) = 0.9049 - 0.6293 = 0.2756.
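It can help to see the exact binomial probability next to the continuity-corrected normal approximation. The sketch below is illustrative only: it computes the exact value from the binomial formula with `math.comb` and the approximation with `statistics.NormalDist`, using n = 50 and p = 0.25. The helper name `binom_pmf` is made up for this example.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.25
q = 1 - p

def binom_pmf(k: int) -> float:
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * q**(n - k)

# Exact: add the three histogram rectangles at x = 14, 15, 16.
exact = sum(binom_pmf(k) for k in range(14, 17))

# Normal approximation with the same mean and standard deviation.
mu = n * p
sigma = sqrt(n * p * q)
approx_dist = NormalDist(mu=mu, sigma=sigma)

# Continuity correction: widen the interval by 0.5 on each side.
approx = approx_dist.cdf(16.5) - approx_dist.cdf(13.5)

print(f"exact binomial P(14 <= x <= 16):    {exact:.4f}")
print(f"normal approximation (13.5 to 16.5): {approx:.4f}")
```

The two values should agree to roughly two decimal places, which is the level of accuracy the np ≥ 5 and nq ≥ 5 rule of thumb is meant to give.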

34 Application
A survey of Internet users found that 75% favored government regulations on “junk”. If 200 Internet users are randomly selected, find the probability that fewer than 140 are in favor of government regulation. Since np = 150 ≥ 5 and nq = 50 ≥ 5, use the normal approximation to the binomial, with μ = np = 150 and σ = √(npq) = √(200(0.75)(0.25)) ≈ 6.12. Use the correction for continuity: P(x < 139.5). The standard score is z = (139.5 - 150)/6.12 ≈ -1.71, and P(z < -1.71) = 0.0436. The probability that fewer than 140 are in favor of government regulation is 0.0436. Teaching note: Students will agree that it is extremely impractical to use the binomial formulas for calculating the probability of exactly 0, exactly 1, exactly 2, ..., exactly 139 successes. It is often helpful to have students list the possible values (for example 0, 1, 2, ..., 139). This helps determine the interval. The reason for not having to adjust the left-hand limit is that the area is almost 0 at the extremes of the curve. On the TI-83, binomcdf(200, .75, 139) gives the exact binomial probability. If n = 2000, however, the TI-83 gives a domain error message, so to calculate the probability students will need to use the normal approximation to the binomial.
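For readers without a TI-83, the comparison between the exact binomial sum and the normal approximation can be sketched in Python. This is an illustration under assumptions, not the slide's method: the exact sum uses `math.comb` directly, which is fine for n = 200 but is not meant as a general-purpose replacement for a calculator's `binomcdf`.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.75
q = 1 - p

# Exact: P(x <= 139) is the sum of binomial probabilities for 0, 1, ..., 139.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(0, 140))

# Normal approximation with continuity correction: P(x < 139.5).
mu = n * p
sigma = sqrt(n * p * q)
z = (139.5 - mu) / sigma
approx = NormalDist().cdf(z)

print(f"z is about {z:.2f}")
print(f"exact binomial P(x < 140): {exact:.4f}")
print(f"normal approximation:      {approx:.4f}")
```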

