Published by Thomasina Garrison. Modified over 9 years ago.
1
The Standard Deviation as a Ruler and the Normal Model Chapter 6
2
Performance Scales:
1. Use the mean and standard deviation of a data set to fit it to a normal distribution.
2. Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
3. Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve. (MAFS.912.S-ID.1.4)
4. Adapt and use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages in different and more complex ways.
3
Learning Goals
1. Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread of a variable.
2. Understand that standardizing uses the standard deviation as a ruler.
3. Know how to calculate the z-score of an observation and what it means.
4. Know how to compare values of two different variables using their z-scores.
5. Recognize when a Normal model is appropriate.
4
Learning Goals
6. Recognize when standardization can be used to compare values.
7. Be able to use Normal models and the 68-95-99.7 Rule to estimate the percentage of observations falling within 1, 2, or 3 standard deviations of the mean.
8. Know how to find the percentage of observations falling below any value in a Normal model using a Normal table or appropriate technology.
9. Know how to check whether a variable satisfies the Nearly Normal Condition by making a Normal probability plot or histogram.
5
Learning Goal 1 Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread of a variable.
6
Learning Goal 1: Linear Transformation of Data
Linear transformation: shifting (moving left or right) the data or rescaling (making the size larger or smaller) the data.
– Changes the original variable x into the new variable x_new given by x_new = a + bx.
– Adding the constant a shifts all values of x upward (right) or downward (left) by the same amount.
– Multiplying by the positive constant b changes the size of the values, that is, it rescales the data.
7
Learning Goal 1: Shifting Data Shifting data: – Adding (or subtracting) a constant amount to each value just adds (or subtracts) the same constant to (from) the mean. This is true for the median and other measures of position too. – In general, adding a constant to every data value adds the same constant to measures of center and percentiles, but leaves measures of spread unchanged.
8
Learning Goal 1: Shifting Data Example: Adding a Constant Given the data: 2, 4, 6, 8, 10 – Center: mean = 6, median = 6 – Spread: s = 3.2, IQR = 6 Add a constant 5 to each value, new data 7, 9, 11, 13, 15 – New center: mean = 11, median = 11 – New spread: s = 3.2, IQR = 6 Effects of adding a constant to each data value – Center increases by the constant 5 – Spread does not change – Shape of the distribution does not change
9
Learning Goal 1: Example - Subtracting a Constant The following histograms show a shift from men’s actual weights to kilograms above recommended weight. 1. No change in shape. 2. No change in spread. 3. Shift down by 74.
10
Learning Goal 1: Rescaling Data Rescaling data: – When we divide or multiply all the data values by any constant value, all measures of position (such as the mean, median, and percentiles) and measures of spread (such as the range, IQR, and standard deviation) are divided or multiplied by that same constant value.
11
Learning Goal 1: Rescaling Example: Multiplying by a Constant Given the data: 2, 4, 6, 8, 10 – Center: mean = 6, median = 6 – Spread: s = 3.2, IQR = 6 Multiply each value by the constant 3, new data: 6, 12, 18, 24, 30 – New center: mean = 18, median = 18 – New spread: s = 9.5 (3 × 3.16, before rounding), IQR = 18 Effects of multiplying each value by a constant – Center increases by a factor of the constant (times 3) – Spread increases by a factor of the constant (times 3) – Shape of the distribution does not change
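The shifting and rescaling examples (add 5, multiply by 3) can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `summarize` is made up for illustration, and `stdev` is the sample standard deviation (dividing by n − 1), which is assumed to be the "s" on the slides.

```python
from statistics import mean, median, stdev

data = [2, 4, 6, 8, 10]
shifted = [x + 5 for x in data]   # add the constant 5 to every value
scaled = [x * 3 for x in data]    # multiply every value by the constant 3

def summarize(values):
    """Return (mean, median, sample standard deviation) of a list of numbers."""
    return mean(values), median(values), stdev(values)

for label, values in [("original", data), ("shifted +5", shifted), ("scaled x3", scaled)]:
    m, med, s = summarize(values)
    print(f"{label:>11}: mean={m:.1f}  median={med:.1f}  s={s:.2f}")
```

Shifting moves the mean and median by 5 and leaves s alone; rescaling multiplies all three by 3.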
12
Learning Goal 1: Example - Rescaling Data The men’s weight data set measured weights in kilograms. If we want to think about these weights in pounds, we would rescale the data: 1. No change in shape. 2. Increase in spread. 3. Shift to the right. [Figure: the size (spread) increases and the center shifts up.]
13
Learning Goal 1: Summary of Linear Transformations Multiplying each observation by a positive number b multiplies both measures of center (mean and median) and measures of spread (IQR and standard deviation) by b. Adding the same number a (either positive or negative) to each observation adds a to measures of center and to quartiles, but does not change measures of spread. Linear transformations do not change the shape of a distribution.
14
Learning Goal 1: Summary of Linear Transformations Linear transformations do not affect the shape of the distribution of the data. – For example, if the original data is right-skewed, the transformed data is right-skewed. Example: changing the units of data from minutes to seconds (multiplying by 60 sec/min). The shape remains the same.
15
Learning Goal 1: Example (a) Suppose that each member of the team receives a $100,000 bonus for winning the NBA championship. How will this affect the center and spread? $100,000 = $0.1 mil [Table: Los Angeles Lakers’ salaries (2000)]
16
Learning Goal 1: Example - Solution
17
Learning Goal 1: Example (b) Each player is offered a 10% increase in base salary. How will this affect the center and spread?
18
Learning Goal 1: Example - Solution
19
Learning Goal 1: Your Turn Maria measures the lengths of 5 cockroaches that she finds at school. Here are her results (in inches): 1.4, 2.2, 1.1, 1.6, 1.2
a. Find the mean and standard deviation of Maria’s measurements (use calc).
b. Maria’s science teacher is furious to discover that she has measured the cockroach lengths in inches rather than centimeters (there are 2.54 cm in 1 inch). She gives Maria two minutes to report the mean and standard deviation of the 5 cockroaches in centimeters. Find the mean and standard deviation in centimeters.
20
Learning Goal 1: Solution
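The solution slide's content did not survive extraction, so here is a small sketch of how the two parts could be computed. It assumes the five measurements are 1.4, 2.2, 1.1, 1.6, and 1.2 inches (one reading of the run-together digits on the problem slide) and uses the sample standard deviation.

```python
from statistics import mean, stdev

CM_PER_INCH = 2.54
lengths_in = [1.4, 2.2, 1.1, 1.6, 1.2]   # assumed reading of Maria's data, in inches

# Part (a): summary statistics in inches.
mean_in = mean(lengths_in)
sd_in = stdev(lengths_in)                # sample standard deviation

# Part (b): rescaling multiplies both the mean and the SD by 2.54.
mean_cm = mean_in * CM_PER_INCH
sd_cm = sd_in * CM_PER_INCH

print(f"inches: mean={mean_in:.2f}, sd={sd_in:.3f}")
print(f"cm:     mean={mean_cm:.2f}, sd={sd_cm:.3f}")
```

Converting every measurement first and then summarizing gives the same answer, which is the point of the rescaling rule.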
21
Learning Goal 1: Class Problem
We have a company with employees with the following salaries: 1200, 900, 1400, 2100, 1800, 1000, 1300, 700, 1700, 2300, 1200
1. What is the mean and standard deviation of the company salaries? mean = 1418.18, st dev = 503.62
2. Suppose we give everyone a $500 raise. What is the new mean and standard deviation? mean = 1918.18, st dev = 503.62
3. Suppose we have to cut everyone’s pay by $500 due to the economy. What is the mean and standard deviation now? mean = 918.18, st dev = 503.62
22
Learning Goal 1: Class Problem (continued)
1. What was the mean and standard deviation of the company salaries? mean = 1418.18, st dev = 503.62
4. Suppose we give everyone a 30% raise. What is the new mean and standard deviation? mean = 1843.64, st dev = 654.71
5. Suppose we cut everyone’s pay by 7%. What is the new mean and standard deviation? mean = 1318.91, st dev = 468.37
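The class problem answers can be reproduced with a short script. This is a sketch, not the presenter's method: `transformed_summary` is a hypothetical helper that applies x_new = a + b·x to every salary and reports the mean and sample standard deviation.

```python
from statistics import mean, stdev

salaries = [1200, 900, 1400, 2100, 1800, 1000, 1300, 700, 1700, 2300, 1200]

def transformed_summary(values, a=0.0, b=1.0):
    """Mean and sample SD of a + b*x for every x in values."""
    new_values = [a + b * x for x in values]
    return mean(new_values), stdev(new_values)

scenarios = {
    "original":    dict(a=0, b=1),
    "$500 raise":  dict(a=500, b=1),
    "$500 cut":    dict(a=-500, b=1),
    "30% raise":   dict(a=0, b=1.30),
    "7% pay cut":  dict(a=0, b=0.93),
}

for name, params in scenarios.items():
    m, s = transformed_summary(salaries, **params)
    print(f"{name:>11}: mean={m:.2f}  st dev={s:.2f}")
```

Only the percentage changes (b ≠ 1) alter the standard deviation; the flat $500 changes move the mean alone.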
23
Learning Goal 2 Understand that standardizing uses the standard deviation as a ruler.
24
Learning Goal 2: Comparing Apples to Oranges In order to compare data values with different units (apples and oranges), we need to make sure we are using the same scale. The trick is to look at how the values deviate from the mean. Look at whether the data point is above or below the mean, and by how much. Standard deviation measures that, the deviation of the data values from the mean.
25
Learning Goal 2: The Standard Deviation as a Ruler As the most common measure of variation, the standard deviation plays a crucial role in how we look at data. The trick in comparing very different-looking values is to use standard deviations as our ruler. The standard deviation tells us how the whole collection of values varies, so it’s a natural ruler for comparing an individual to a group.
26
Learning Goal 2: The Standard Deviation as a Ruler We compare individual data values to their mean, relative to their standard deviation, using the following formula: $z = \frac{x - \bar{x}}{s}$, where $\bar{x}$ is the mean and $s$ is the standard deviation of the data. We call the resulting values standardized values, denoted as z. They can also be called z-scores.
27
Learning Goal 3 Know how to calculate the z-score of an observation and what it means.
28
Learning Goal 3: Standardizing with z-scores When x is larger than the mean, z is positive. When x is smaller than the mean, z is negative.
29
Learning Goal 3: Standardizing with z-scores A z-score puts values on a common scale. Standardized values have no units. z-scores measure the distance of each data value from the mean in standard deviations. z-scores farther from 0 are more extreme. z-scores beyond -2 or 2 are considered unusual. A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.
30
Learning Goal 3: Standardizing with z-scores Gives a common scale. – We can compare two different distributions with different means and standard deviations. The z-score tells us how many standard deviations the observation falls away from the mean. Observations greater than the mean are positive when standardized and observations less than the mean are negative. [Figure: a z-score of -2.15 marks a value 2.15 standard deviations below the mean.]
31
Learning Goal 3: Standardizing with z-scores - Example
32
Learning Goal 3: Standardizing with z-scores – Your Turn Bob is 64 inches tall. The heights of men are unimodal and symmetric with a mean of 69 inches and a standard deviation of 2.5 inches. How does Bob’s height compare to other men?
33
Learning Goal 3: Standardizing with z-scores – Solution
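The solution slide is blank in this transcript. One way to work the problem is to standardize Bob's height directly; the function name `z_score` is just an illustrative choice.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Bob: 64 inches; men's heights: mean 69 inches, SD 2.5 inches.
bob_z = z_score(64, mean=69, sd=2.5)
print(f"Bob's z-score: {bob_z:.1f}")
```

Bob is 2 standard deviations below the mean height, which sits right at the slides' "-2 or 2" threshold for an unusual value.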
34
Learning Goal 3: Problem Which one of the following is a FALSE statement about a standardized value (z-score)? a)It represents how many standard deviations an observation lies from the mean. b)It represents in which direction an observation lies from the mean. c)It is measured in the same units as the variable.
35
Learning Goal 3: Problem (answer) Which one of the following is a FALSE statement about a standardized value (z-score)? a) It represents how many standard deviations an observation lies from the mean. b) It represents in which direction an observation lies from the mean. c) It is measured in the same units as the variable. Answer: c). Standardized values have no units.
36
Learning Goal 3: Problem
37
Learning Goal 3: Problem (answer)
38
Learning Goal 4 Know how to compare values of two different variables using their z-scores.
39
Learning Goal 4: Benefits of Standardizing Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean (z-score). Thus, we can compare values that are measured on different scales, with different units, or from different populations.
40
Learning Goal 4: Standardizing – Comparing Distributions The men’s combined skiing event in the Winter Olympics consists of two races: a downhill and a slalom. In the 2006 Winter Olympics, the mean slalom time was 94.2714 seconds with a standard deviation of 5.2844 seconds. The mean downhill time was 101.807 seconds with a standard deviation of 1.8356 seconds. Ted Ligety of the U.S., who won the gold medal with a combined time of 189.35 seconds, skied the slalom in 87.93 seconds and the downhill in 101.42 seconds. On which race did he do better compared with the competition?
41
Learning Goal 4: Standardizing – Comparing Distributions
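The worked solution for this slide was an image and is missing here. A sketch of the comparison, using only the means, standard deviations, and times quoted in the problem, might look like this:

```python
def z_score(x, mean, sd):
    """Standardized value: how many SDs x lies from the mean."""
    return (x - mean) / sd

# Ted Ligety's times against the field, 2006 men's combined.
slalom_z = z_score(87.93, mean=94.2714, sd=5.2844)
downhill_z = z_score(101.42, mean=101.807, sd=1.8356)

# In a race, lower times are better, so the more negative z-score
# is the stronger performance relative to the competition.
better_race = "slalom" if slalom_z < downhill_z else "downhill"

print(f"slalom z   = {slalom_z:.2f}")
print(f"downhill z = {downhill_z:.2f}")
print(f"better relative performance: {better_race}")
```

His slalom time is about 1.2 standard deviations faster than the mean, while his downhill is only about 0.2 faster, so the slalom was the stronger race.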
42
Learning Goal 4: Standardizing – Your Turn Timmy gets a 680 on the math of the SAT. The SAT score distribution is Unimodal symmetric with a mean of 500 and a standard deviation of 100. Little Jimmy scores a 27 on the math of the ACT. The ACT score distribution is unimodal symmetric with a mean of 18 and a standard deviation of 6. Who does better? (Hint: standardize both scores then compare z-scores)
43
Learning Goal 4: Standardizing – Solution
Timmy: z = (680 − 500)/100 = 1.8
Little Jimmy: z = (27 − 18)/6 = 1.5
Both scores are above average. Timmy’s score is 1.8 SDs above the mean and Little Jimmy’s is 1.5 SDs above the mean, so Timmy does better by 0.3 SD.
44
Learning Goal 4: Standardizing – Your Turn
45
Learning Goal 4: Standardizing – Class Problem A town’s January high temp averages 36°F with a standard deviation of 10, while in July, the mean high temp is 74°F with a standard deviation of 8. In which month is it more unusual to have a day with a high temp of 55°F? Solution: January z = (55 − 36)/10 = 1.9; July z = (55 − 74)/8 ≈ −2.38. It is more unusual to have a high temp of 55°F in July, because July’s z-score is farther from 0.
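The comparison above comes down to which z-score has the larger absolute value. A minimal sketch (the dictionary layout and variable names are illustrative, not from the slides):

```python
def z_score(x, mean, sd):
    """Standardized value: how many SDs x lies from the mean."""
    return (x - mean) / sd

temp = 55  # degrees Fahrenheit
months = {
    "January": {"mean": 36, "sd": 10},
    "July":    {"mean": 74, "sd": 8},
}

z_by_month = {name: z_score(temp, p["mean"], p["sd"]) for name, p in months.items()}
jan_z, jul_z = z_by_month["January"], z_by_month["July"]

# "More unusual" means farther from 0 in either direction.
more_unusual = max(z_by_month, key=lambda name: abs(z_by_month[name]))

for name, z in z_by_month.items():
    print(f"{name}: z = {z:+.3f}")
print("More unusual in:", more_unusual)
```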
46
Learning Goal 4: Standardizing – Combining z-scores Because z-scores are standardized values, measure the distance of each data value from the mean in standard deviations and have no units, we can also combine z-scores of different variables.
47
Learning Goal 4: Standardizing – Example: Combining z-scores In the 2006 Winter Olympics men’s combined event, Ted Ligety of the U.S. won the gold medal with a combined time of 189.35 seconds. Ivica Kostelic of Croatia skied the slalom in 89.44 seconds and the downhill in 100.44 seconds, for a combined time of 189.88 seconds. Considered in terms of combined z-scores, who should have won the gold medal?
48
Learning Goal 4: Solution Ted Ligety: Combined z-score: -1.41 Ivica Kostelic: Combined z-score: -1.65 Lower times are better, so the more negative combined z-score is the better result. Using standardized scores, overall Kostelic did better and should have won the gold.
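The combined z-scores can be recomputed from the race statistics given earlier in the deck (slalom: mean 94.2714 s, SD 5.2844 s; downhill: mean 101.807 s, SD 1.8356 s). This is a sketch; the `RACES` table and `combined_z` helper are names invented for the example. The unrounded sum for Kostelic comes out slightly below the -1.65 on the slide, which is consistent with the slide rounding each z-score before adding.

```python
RACES = {
    "slalom":   {"mean": 94.2714, "sd": 5.2844},
    "downhill": {"mean": 101.807, "sd": 1.8356},
}

def combined_z(times):
    """Sum of z-scores across races; times maps race name -> seconds."""
    return sum((t - RACES[race]["mean"]) / RACES[race]["sd"]
               for race, t in times.items())

ligety = combined_z({"slalom": 87.93, "downhill": 101.42})
kostelic = combined_z({"slalom": 89.44, "downhill": 100.44})

# Faster (lower) times give negative z-scores, so the smaller sum wins.
winner = "Kostelic" if kostelic < ligety else "Ligety"

print(f"Ligety:   {ligety:.2f}")
print(f"Kostelic: {kostelic:.2f}")
print("Better combined z-score:", winner)
```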
49
Learning Goal 4: Combining z-scores - Your Turn The distribution of SAT scores has a mean of 500 and a standard deviation of 100. The distribution of ACT scores has a mean of 18 and a standard deviation of 6. Jill scored a 680 on the math part of the SAT and a 30 on the ACT math test. Jack scored a 740 on the math SAT and a 27 on the math ACT. Who had the better combined SAT/ACT math score?
50
Learning Goal 4: Combining z-scores - Solution Jill Combined math score: 3.8 Jack Combined math score: 3.9 Jack did better with a combined math score of 3.9, to Jill’s combined math score of 3.8.
51
Learning Goal 5 Recognize when a Normal model is appropriate.
52
Learning Goal 5: Smooth Curve Sometimes the overall pattern of a histogram is so regular that it can be described by a Smooth Curve. This can help describe the location of individual observations within the distribution.
53
Learning Goal 5: Smooth Curve The shape of a histogram depends on the choice of classes, while a smooth curve does not. The smooth curve is a mathematical model of the distribution. – How? The smooth curve describes what proportion of the observations fall in each range of values, not the frequency of observations as a histogram does. The area under the curve represents the proportion of observations in an interval. The total area under the curve is 1.
54
Learning Goal 5: Mathematical Model Histogram of a sample with the smoothed, density curve describing theoretically the population.
55
Learning Goal 5: The Normal Model There is no universal standard for z-scores, but there is a model that shows up over and over in Statistics. This model is called the Normal model (you may have heard of “bell-shaped curves”). Normal models are appropriate for distributions whose shapes are unimodal and roughly symmetric. These distributions provide a measure of how extreme a z-score is.
56
Learning Goal 5: The Normal Model Normal Model: One Particular class of distributions or model. 1.Symmetric 2.Single Peaked 3.Bell Shaped All have the same overall shape.
57
Learning Goal 5: The Normal Model or Normal Distribution The normal distribution is considered the most important distribution in all of statistics. It is used to describe the distribution of many natural phenomena, such as the height of a person, IQ scores, weight, blood pressure, etc.
58
Learning Goal 5: The Normal Distribution The mathematical equation for the normal distribution is given below: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ where e ≈ 2.718, π ≈ 3.14, μ = population mean, and σ = population standard deviation. Not required to know.
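For readers who want to see the curve's equation in action, here is a sketch that codes the standard Normal density, f(x) = exp(−(x − μ)² / (2σ²)) / (σ√(2π)), and checks numerically that the total area under it is 1. The function names and the midpoint-sum approach are illustrative choices, not part of the slides.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the Normal density curve N(mu, sigma) at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

def area_under_curve(mu, sigma, lo, hi, steps=10_000):
    """Midpoint Riemann-sum approximation of the area under the density on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma)
                       for i in range(steps))

# Example curve: mean 15, standard deviation 3, integrated over mean +/- 8 SDs.
total_area = area_under_curve(15, 3, 15 - 8 * 3, 15 + 8 * 3)
print(f"total area ~ {total_area:.6f}")
print(f"peak height at the mean: {normal_pdf(15, 15, 3):.4f}")
```

The curve is highest at the mean and symmetric about it, so `normal_pdf(mu - d, mu, sigma)` equals `normal_pdf(mu + d, mu, sigma)` for any distance d.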
59
Learning Goal 5: Properties of the Normal Distribution When this equation is graphed for a given μ and σ, a continuous, bell-shaped, symmetric graph will result. Thus, we can display an infinite number of graphs for this equation, depending on the values of μ and σ. In such a case, we say we have a family of normal curves. Some representations of the normal curve are displayed in the following slides.
60
Learning Goal 5: Properties of the Normal Dist. Here, means are different (μ = 10, 15, and 20) while standard deviations are the same (σ = 3). Here, means are the same (μ = 15) while standard deviations are different (σ = 2, 4, and 6).
61
Learning Goal 5: Properties of the Normal Distribution [Figure: Normal distributions with the same mean but with different standard deviations.]
62
[Figure: Normal distributions with different means but with the same standard deviation.]
63
Learning Goal 5: Properties of the Normal Distribution [Figure: Normal distributions with different means and different standard deviations.]
64
Learning Goal 5: Describing a Normal Dist. μ located at the center of the symmetrical curve σ controls the spread The exact curve for a particular normal distribution is described by its Mean (μ) and Standard Deviation (σ). Normal Distribution Notation: N(μ,σ)
65
Learning Goal 5: Properties of the Normal Distribution These normal curves have similar shapes, but are located at different points along the x-axis. Also, the larger the standard deviation, the more spread out the distribution, and the curves are symmetrical about the mean. A normal distribution is a continuous, symmetrical, bell-shaped distribution.
66
Learning Goal 5: Properties of the Normal Distribution
Summary of the properties of the normal distribution:
– The curve is continuous.
– The curve is bell-shaped.
– The curve is symmetrical about the mean.
– The mean, median, and mode are located at the center of the distribution and are equal to each other.
– The curve is unimodal (single mode).
– The curve never touches the x-axis.
– The total area under the normal curve is equal to 1.
67
Learning Goal 5: Not Normal Curves Why a)Normal curve gets closer and closer to the horizontal axis, but never touches it. b)Normal curve is symmetrical. c)Normal curve has a single peak. d)Normal curve tails do not curve away from the horizontal axis.
68
Learning Goal 5: More Normal Distribution For a normal distribution the Mean (μ) is located at the center of the single peak and controls location of the curve on the horizontal axis. The standard deviation (σ) is located at the inflection points of the curve and controls the spread of the curve.
69
Learning Goal 5: Inflection Points The point on the curve where the curve changes from falling more steeply to falling less steeply (change in curvature – concave down to concave up). Located one standard deviation (σ) from the mean (μ). Allows us to visualize on any normal curve the width of one standard deviation. Inflection point
70
Learning Goal 5: More Normal Model There is a Normal model for every possible combination of mean and standard deviation. – We write N(μ,σ) to represent a Normal model with a mean of μ and a standard deviation of σ. We use Greek letters because this mean and standard deviation are not numerical summaries of the data. They are part of the model and a model is more like a population. They don’t come from the data. They are numbers that we choose to help specify the model. Such numbers are called parameters of the model.
71
Learning Goal 5: More Normal Model Summaries of data, like the sample mean and standard deviation, are written with Latin letters. Such summaries of data are called statistics. When we standardize Normal data, we still call the standardized value a z-score, and we write $z = \frac{x - \mu}{\sigma}$.
72
Learning Goal 5: Standardizing the Normal Distribution All normal distributions are the same general shape and share many common properties. Normal distribution notation: N(μ,σ). We can make all normal distributions the same by measuring them in units of standard deviation (σ) about the mean (μ), z-scores. This is called standardizing and gives us the Standard Normal Curve.
73
Learning Goal 5: Standardizing the Normal Distribution How do linear transformations apply to z-scores? When we convert a data value to a z-score, we are shifting it by the mean (to set the scale at 0) and then rescaling by the standard deviation (to reset the standard deviation to 1). – Standardizing into z-scores does not change the shape of the distribution. – Standardizing into z-scores changes the center by making the mean 0. – Standardizing into z-scores changes the spread by making the standard deviation 1.
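The three effects of standardizing can be seen on a tiny data set. This sketch reuses the values 2, 4, 6, 8, 10 from the earlier shifting example; `standardize` is a hypothetical helper that shifts by the mean and rescales by the sample standard deviation.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert each value to a z-score: shift by the mean, rescale by the SD."""
    m = mean(values)
    s = stdev(values)          # sample standard deviation
    return [(x - m) / s for x in values]

data = [2, 4, 6, 8, 10]
z = standardize(data)

print("z-scores:", [round(v, 3) for v in z])
print(f"mean of z-scores: {mean(z):.3f}")   # center becomes 0
print(f"SD of z-scores:   {stdev(z):.3f}")  # spread becomes 1
```

The z-scores keep the same order and the same relative spacing as the original values, which is why the shape of the distribution is unchanged.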
74
Learning Goal 5: Standardizing the Normal Dist. We can standardize a variable that has a normal distribution to a new variable that has the standard normal distribution using the z-score formula: $z = \frac{x - \mu}{\sigma}$. Substitute your variable as x. Subtract the mean from your variable. Then divide by your standard deviation. BAM! Out pops your z-score.
75
Learning Goal 5: Standardizing the Normal Dist.
76
Standardizing Data into z-scores The Standard Normal Distribution
77
Learning Goal 5: Standardizing the Normal Dist.
78
Results in a Standardized Normal Distribution (curve) One Distribution → One set of areas under the curve → One Table
79
Learning Goal 5: Standardizing the Normal Distribution Subtracting μ from each value x just moves the curve around, so values are centered on 0 instead of on μ. Once the curve is centered, dividing each value by σ > 1 moves all values toward 0, smushing the curve.
80
Learning Goal 5: Standard Normal Dist. - Example
82
The area under the original curve and the standard normal curve are the same. Learning Goal 5: Standard Normal Dist. - Example
83
Learning Goal 5: The Standard Normal Curve Let x be a normally distributed variable with mean μ and standard deviation σ, and let a and b be real numbers with a < b. The percentage of all possible observations of x that lie between a and b is the same as the percentage of all possible observations of z that lie between (a −μ)/σ and (b−μ)/σ. This latter percentage equals the area under the standard normal curve between (a −μ)/σ and (b−μ)/σ.
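The equality of the two percentages can be checked numerically. The sketch below uses Python's `statistics.NormalDist`; the particular numbers (μ = 100, σ = 15, a = 85, b = 130) are illustrative values chosen for the demonstration, not data from this slide.

```python
from statistics import NormalDist

def percent_between(a, b, mu, sigma):
    """Return the share of N(mu, sigma) between a and b, computed two ways."""
    # Route 1: work directly with the original Normal model.
    model = NormalDist(mu, sigma)
    direct = model.cdf(b) - model.cdf(a)

    # Route 2: standardize the endpoints, then use the standard Normal model.
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    standard = NormalDist()            # N(0, 1)
    via_z = standard.cdf(z_b) - standard.cdf(z_a)
    return direct, via_z

direct, via_z = percent_between(85, 130, mu=100, sigma=15)
print(f"direct: {direct:.4f}   via z-scores: {via_z:.4f}")
```

Both routes give the same area, which is why a single standard Normal table is enough for every Normal model.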
84
Learning Goal 5: Standard Normal Curve and z-scores Same as with any Normal Distribution. A z-score gives us an indication of how unusual a value is because it tells us how far it is from the mean. A data value that sits right at the mean, has a z-score equal to 0. A z-score of 1 means the data value is 1 standard deviation above the mean. A z-score of –1 means the data value is 1 standard deviation below the mean.
85
Learning Goal 5: Standard Normal Curve and z-scores How far from 0 does a z-score have to be to be interesting or unusual? z-scores beyond -2 or 2 are considered unusual. Remember that a negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.
86
Learning Goal 5: The Standard Normal Model Once we have standardized, we need only one model: – The N(0,1) model is called the Standard Normal model (or the Standard Normal distribution). Be careful—don’t use a Normal model for just any data set, since standardizing does not change the shape of the distribution.
87
Learning Goal 5: Properties of the Standard Normal Dist. Shape – normal curve Mean (μ) = 0 Standard Deviation (σ) = 1 Horizontal axis scale – Z score No vertical axis Notation: N(0, 1)
88
Learning Goal 5: Standard Normal Dist. Problem Which one of the following is a FALSE statement about the standard normal distribution? a)The mean is greater than the median. b)It is symmetric. c)It is bell-shaped. d)It has one peak.
89
Learning Goal 5: Standard Normal Dist. Problem (answer) Which one of the following is a FALSE statement about the standard normal distribution? a) The mean is greater than the median. b) It is symmetric. c) It is bell-shaped. d) It has one peak. Answer: a). In a Normal distribution the mean and median are equal.
90
Learning Goal 5: Standard Normal Dist. Problem If you knew that μ = 0 and σ = 3, which normal curve would match the data? a) Dataset 1 b) Dataset 2
91
Learning Goal 5: Standard Normal Dist. Problem (answer) If you knew that μ = 0 and σ = 3, which normal curve would match the data? a) Dataset 1 b) Dataset 2
92
Learning Goal 5: Standard Normal Dist. Problem Which one of the following is a FALSE statement about the standard normal curve? a)Its standard deviation can vary with different datasets. b)It is bell-shaped. c)It is symmetric around 0. d)Its mean = 0.
93
Learning Goal 5: Standard Normal Dist. Problem (answer) Which one of the following is a FALSE statement about the standard normal curve? a) Its standard deviation can vary with different datasets. b) It is bell-shaped. c) It is symmetric around 0. d) Its mean = 0. Answer: a). The standard normal curve always has standard deviation 1.
94
Learning Goal 5: Standard Normal Dist. Problem Suppose the lengths of sport-utility vehicles (SUV) are normally distributed with mean = 190 inches and standard deviation = 5 inches. Marshall just bought a brand-new SUV that is 194.5 inches long and he is interested in knowing what percentage of SUVs is longer than his. Using his statistical knowledge, he drew a normal curve and labeled the appropriate area of interest. Which picture best represents what Marshall drew? a)Plot A b)Plot B
95
Learning Goal 5: Standard Normal Dist. Problem (answer) Suppose the lengths of sport-utility vehicles (SUV) are normally distributed with mean = 190 inches and standard deviation = 5 inches. Marshall just bought a brand-new SUV that is 194.5 inches long and he is interested in knowing what percentage of SUVs is longer than his. Using his statistical knowledge, he drew a normal curve and labeled the appropriate area of interest. Which picture best represents what Marshall drew? a)Plot A b)Plot B
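The two plots are not reproduced in this transcript, but the quantity Marshall is after is the area to the right of 194.5 under N(190, 5). A sketch of that calculation, assuming the Normal model stated in the problem:

```python
from statistics import NormalDist

suv_lengths = NormalDist(mu=190, sigma=5)   # model from the problem statement
marshall = 194.5

z = (marshall - suv_lengths.mean) / suv_lengths.stdev
share_longer = 1 - suv_lengths.cdf(marshall)   # right-tail area above 194.5

print(f"z = {z:.2f}")
print(f"share of SUVs longer than Marshall's: {share_longer:.1%}")
```

Because his SUV is above the mean, the shaded right tail should cover less than half of the curve.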
96
Learning Goal 6 Recognize when standardization can be used to compare values.
97
Learning Goal 6: Why We Standardize Standardizing allows us to compare distributions by giving them a common scale. If the distribution is Normal, then it can be standardized. ALWAYS check to make sure the Normal Model is appropriate before standardizing data or using z-scores.
98
Learning Goal 6: Nearly Normal Condition When we use the Normal model, we are assuming the distribution is Normal. We cannot check this assumption in practice, so we check the following condition: – Nearly Normal Condition: The shape of the data’s distribution is unimodal and symmetric. – This condition can be checked with a histogram or a Normal probability plot (to be explained later).
99
Learning Goal 6: Nearly Normal Condition Standardized values (z-scores) should be interpreted with a Normal model only when the Nearly Normal Condition is met.
100
Learning Goal 7 Be able to use Normal models and the 68-95-99.7 Rule to estimate the percentage of observations falling within 1, 2, or 3 standard deviations of the mean.
101
Learning Goal 7: The 68-95-99.7 Rule Normal models give us an idea of how extreme a value is by telling us how likely it is to find one that far from the mean. We can find these numbers precisely, but until then we will use a simple rule that tells us a lot about the Normal model… The 68-95-99.7 Rule or Empirical Rule
102
Learning Goal 7: The 68-95-99.7 Rule A very important property of any normal distribution is that within a fixed number of standard deviations from the mean, all normal distributions have the same fraction of their probabilities. We will illustrate for 1σ, 2σ, and 3σ from the mean μ.
103
Learning Goal 7: The 68-95-99.7 Rule One-sigma rule: Approximately 68% of the data values should lie within one standard deviation of the mean. That is, regardless of the shape of the normal distribution, the probability that a normal random variable will be within one standard deviation of the mean is approximately equal to 0.68. The next slide illustrates this.
104
Learning Goal 7: The 68-95-99.7 Rule [Figure: one-sigma rule.]
105
Learning Goal 7: The 68-95-99.7 Rule Two-sigma rule: Approximately 95% of the data values should lie within two standard deviations of the mean. That is, regardless of the shape of the normal distribution, the probability that a normal random variable will be within two standard deviations of the mean is approximately equal to 0.95. The next slide illustrates this.
106
Learning Goal 7: The 68-95-99.7 Rule [Figure: two-sigma rule.]
107
Learning Goal 7: The 68-95-99.7 Rule Three-sigma rule: Approximately 99.7% of the data values should lie within three standard deviations of the mean. That is, regardless of the shape of the normal distribution, the probability that a normal random variable will be within three standard deviations of the mean is approximately equal to 0.997. The next slide illustrates this.
108
Learning Goal 7: The 68-95-99.7 Rule [Figure: three-sigma rule.]
109
Learning Goal 7: The 68-95-99.7 Rule The following shows what the 68-95-99.7 Rule tells us:
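The 68, 95, and 99.7 figures are rounded values of areas under the standard Normal curve. A short sketch using `statistics.NormalDist` (the helper name `within_k_sd` is made up for the example) shows the more precise areas behind the rule:

```python
from statistics import NormalDist

standard_normal = NormalDist()   # N(0, 1)

def within_k_sd(k):
    """Area under the standard Normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {within_k_sd(k):.4f}")
```

Because every Normal model standardizes to N(0, 1), the same three areas apply to any mean and standard deviation.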
110
Learning Goal 7: The 68-95-99.7 Rule Because all Normal distributions share the same properties, we can standardize our data to transform any Normal curve N(μ, σ) into the standard Normal curve N(0,1), and then use the 68-95-99.7 Rule to find areas under the curve. [Figure: N(64.5, 2.5) alongside N(0,1); axis label: standardized height (no units).]
111
Learning Goal 7: More 68-95-99.7% Rule You can further divide the area under the normal curve into the following parts.
112
Using the 68-95-99.7 Rule SOUTH AMERICAN RAINFALL The distribution of rainfall in South American countries is approximately normal with a (mean) µ = 64.5 cm and (standard deviation) σ = 2.5 cm. The next slide will demonstrate the empirical rule of this application.
113
N(64.5, 2.5)
– 68% of the countries receive rainfall between 64.5 (μ) – 2.5 (σ) = 62 cm and 64.5 (μ) + 2.5 (σ) = 67 cm. 68% = 62 to 67
– 95% of the countries receive rainfall between 64.5 (μ) – 5 (2σ) = 59.5 cm and 64.5 (μ) + 5 (2σ) = 69.5 cm. 95% = 59.5 to 69.5
– 99.7% of the countries receive rainfall between 64.5 (μ) – 7.5 (3σ) = 57 cm and 64.5 (μ) + 7.5 (3σ) = 72 cm. 99.7% = 57 to 72
114
The middle 68% of the countries (µ ± σ) have rainfall between 62 – 67 cm The middle 95% of the countries (µ ± 2σ) have rainfall between 59.5 – 69.5 cm Almost all of the data (99.7%) is within 57 – 72 cm (µ ± 3σ)
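The three rainfall intervals follow mechanically from μ and σ. As a sketch, a small helper (the name `empirical_rule_intervals` is hypothetical) can produce them for any Normal model:

```python
def empirical_rule_intervals(mu, sigma):
    """Intervals mu +/- k*sigma for k = 1, 2, 3, keyed by the rule's percentage."""
    labels = {1: "68%", 2: "95%", 3: "99.7%"}
    return {labels[k]: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

# Rainfall model from the slides: N(64.5, 2.5), in centimeters.
intervals = empirical_rule_intervals(64.5, 2.5)
for pct, (low, high) in intervals.items():
    print(f"middle {pct:>5}: {low} cm to {high} cm")
```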
115
Example: IQ Test The scores of a referenced population on the IQ Test are normally distributed with μ =100 and σ =15. 1)Approximately what percent of scores fall in the range from 70 to 130? 2)A score in what range would represent the top 16% of the scores?
116
Example: IQ Test (μ = 100, σ = 15) 1) 70 to 130 is μ ± 2σ, so the range contains approximately 95% of the scores. 2) About 16% of the scores lie more than one σ above μ, so the top 16% are the scores above 115.
117
Your Turn: Runner’s World reports that the times of the finishers in the New York City 10-km run are normally distributed with a mean of 61 minutes and a standard deviation of 9 minutes. 1) Find the percent of runners who take more than 70 minutes to finish. Answer: 16% 2) Find the percent of runners who finish in less than 43 minutes. Answer: 2.5%
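The 16% and 2.5% answers come from the 68-95-99.7 Rule (70 minutes is one SD above the mean, 43 minutes is two SDs below). As a rough check, the sketch below compares them with the areas from the Normal model itself, using `statistics.NormalDist`:

```python
from statistics import NormalDist

finish_times = NormalDist(mu=61, sigma=9)   # minutes, from the problem statement

over_70 = 1 - finish_times.cdf(70)   # z = +1: right-tail area
under_43 = finish_times.cdf(43)      # z = -2: left-tail area

print(f"more than 70 min: {over_70:.1%}  (rule of thumb: 16%)")
print(f"less than 43 min: {under_43:.1%}  (rule of thumb: 2.5%)")
```

The rule-of-thumb values are close to the model's areas, and the small gaps come from rounding 68.27% to 68% and 95.45% to 95%.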
118
The First Three Rules for Working with Normal Models Make a picture. And, when we have data, make a histogram to check the Nearly Normal Condition to make sure we can use the Normal model to model the distribution.
119
Finding Normal Percentiles by Hand When a data value doesn’t fall exactly 1, 2, or 3 standard deviations from the mean, we can look it up in a table of Normal percentiles. Table Z in Appendix D provides us with normal percentiles, but many calculators and statistics computer packages provide these as well.
120
Finding Normal Percentiles by Hand (cont.) Table Z is the standard Normal table. We have to convert our data to z-scores before using the table. The figure shows us how to find the area to the left when we have a z-score of 1.80:
121
Standard Normal Distribution Table Gives the area under the curve to the left of a positive z-score. Z-scores are read from the 1st column and the 1st row: the 1st column gives the whole number and first decimal place; the 1st row gives the second decimal place.
122
Table Z The table entry for each value z is the area under the curve to the LEFT of z.
123
USING THE Z TABLE You found your z-score to be 1.40 and you want to find the area to the left of 1.40. 1. Find 1.4 in the left-hand column of the table. 2. Find the remaining digit 0 as .00 in the top row. 3. The entry opposite 1.4 and under .00 is 0.9192. This is the area we seek: 0.9192
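For readers without a printed table, the same lookup can be sketched in Python. The slides use Table Z or a TI-83/84; the standard-library class `statistics.NormalDist` is used here as a substitute, and the variable names are illustrative.

```python
from statistics import NormalDist

# Standard Normal model: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the left of z = 1.40, the quantity a left-tail table reports
area_left = standard_normal.cdf(1.40)
print(round(area_left, 4))
```

Rounded to four decimal places, this matches the table entry 0.9192.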
124
Other Types of Tables
125
Using Left-Tail Style Table 1. For areas to the left of a specified z value, use the table entry directly. 2. For areas to the right of a specified z value, look up the table entry for z and subtract the area from 1 (you can also use the symmetry of the normal curve and look up the table entry for –z). 3. For areas between two z values, z₁ and z₂ (where z₂ > z₁), subtract the table area for z₁ from the table area for z₂.
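The three rules above can be written as three small functions. This is a sketch only: `area_left`, `area_right`, and `area_between` are hypothetical helper names, and `NormalDist().cdf` stands in for the printed left-tail table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model; Z.cdf(z) plays the role of the table entry

def area_left(z):
    """Rule 1: use the left-tail entry directly."""
    return Z.cdf(z)

def area_right(z):
    """Rule 2: subtract the left-tail entry from 1."""
    return 1.0 - Z.cdf(z)

def area_between(z1, z2):
    """Rule 3: subtract the entry for z1 from the entry for z2 (z2 > z1)."""
    return Z.cdf(z2) - Z.cdf(z1)

# Symmetry check from rule 2: area right of z equals area left of -z
print(area_right(1.0), area_left(-1.0))
```

For example, `area_right(-2.15)` gives the area greater than z = -2.15, and `area_between(-1, 1)` gives the central area within one standard deviation of the mean.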
126
More using Table Z (left tailed table) Use table directly
127
Example: Find Area Greater Than a Given Z-Score Find the area from the standard normal distribution that is greater than -2.15
128
THE ANSWER IS 0.9842 Find the Table Z entry for the z-score -2.15. The table entry is 0.0158. However, this is the area to the left of -2.15. The total area under the curve is 1, so subtract the table entry from 1: 1 – 0.0158 = 0.9842. The next slide illustrates these areas.
130
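Practice using Table Z to find areas under the Standard Normal Curve
Questions: 1. z < 1.58 2. z < -.93 3. z > -1.23 4. z > 2.48 5. .5 < z < 1.89 6. -1.43 < z < 1.43
Answers:
1. .9429 (directly from table)
2. .1762 (directly from table)
3. .8907 (1 – .1093, where .1093 is the area for z < -1.23; or use symmetry and look up z < 1.23)
4. .0066 (1 – .9934, where .9934 is the area for z < 2.48; or use symmetry and look up z < -2.48)
5. .2791 (area for z < 1.89 is .9706; area for z < .5 is .6915; .9706 – .6915)
6. .8472 (area for z < 1.43 is .9236; area for z < -1.43 is .0764; .9236 – .0764)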
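The six practice answers can be checked numerically. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for the table; the list name `checks` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model

# (question, computed area, answer printed on the slide)
checks = [
    ("z < 1.58", Z.cdf(1.58), 0.9429),
    ("z < -0.93", Z.cdf(-0.93), 0.1762),
    ("z > -1.23", 1 - Z.cdf(-1.23), 0.8907),
    ("z > 2.48", 1 - Z.cdf(2.48), 0.0066),
    ("0.5 < z < 1.89", Z.cdf(1.89) - Z.cdf(0.5), 0.2791),
    ("-1.43 < z < 1.43", Z.cdf(1.43) - Z.cdf(-1.43), 0.8472),
]

for label, computed, slide_answer in checks:
    print(f"{label}: computed {computed:.4f}, table answer {slide_answer}")
```

Answers found by subtracting two rounded table entries can differ from the computed value by about .0001.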
131
Using the TI-83/84 to Find the Area Under the Standard Normal Curve Under the DISTR menu, the 2nd entry is “normalcdf”. It calculates the area under the Standard Normal Curve between two z-scores (for example, -1.43 < z < .96). Syntax: normalcdf(lower bound, upper bound), where the bounds are z-scores. To find the area above a single z-score, use a large positive value (e.g., 100) for the upper bound; to find the area below a single z-score, use a large negative value (e.g., -100) for the lower bound.
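A rough Python analogue of the calculator command is sketched below. The function name `normalcdf` is borrowed from the TI-83/84 for readability; it is a hypothetical helper defined here, not a built-in Python function.

```python
from statistics import NormalDist

def normalcdf(lower, upper):
    """Area under the standard Normal curve between two z-scores."""
    standard_normal = NormalDist()
    return standard_normal.cdf(upper) - standard_normal.cdf(lower)

# Area between two z-scores
between = normalcdf(-1.43, 0.96)

# One-sided areas, using the same +/-100 trick as on the calculator
below = normalcdf(-100, 1.63)   # area for z < 1.63
above = normalcdf(0.85, 100)    # area for z > 0.85

print(round(between, 4), round(below, 4), round(above, 4))
```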
132
Practice using the TI-83/84 to find areas under the standard normal curve
Questions: 1. -2.35 < z < 1.52 2. .85 < z < 1.56 3. -3.5 < z < 3.5 4. 0 < z < 1 5. z < 1.63 6. z > .85 7. z > 2.86 8. z < -3.12 9. z > 1.5 10. z < -.92
Answers: 1. .9264 2. .1383 3. .9995 4. .3413 5. .9484 6. .1977 7. .0021 8. .0009 9. .0668 10. .1788
133
Using TI-83/84 to Find Areas Under the Normal Curve Without Z-Scores The TI-83/84 can find areas under a normal curve without first changing the observation x to a z-score. Syntax: normalcdf(lower bound, upper bound, mean, standard deviation). To find the area below or above a single value, use a very large negative or very large positive observation value for the lower or upper bound, respectively. Example: N(136, 18), 100 < x < 150. Answer: .7589 Example: N(2.5, .42), x > 3.21. Answer: .0455
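The two examples above can be reproduced in Python, and the first one also shows that passing the mean and standard deviation gives the same area as standardizing the bounds by hand. This is a sketch: `normalcdf` is a hypothetical helper named after the calculator command, and 1e6 is an arbitrary "very large" upper bound.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the N(mean, sd) curve between lower and upper."""
    model = NormalDist(mean, sd)
    return model.cdf(upper) - model.cdf(lower)

# Example from the slide: N(136, 18), 100 < x < 150
direct = normalcdf(100, 150, 136, 18)

# Same area after converting the bounds to z-scores by hand
z_low = (100 - 136) / 18
z_high = (150 - 136) / 18
standardized = normalcdf(z_low, z_high)

# Example from the slide: N(2.5, 0.42), x > 3.21, with a very large upper bound
upper_tail = normalcdf(3.21, 1e6, 2.5, 0.42)

print(round(direct, 4), round(standardized, 4), round(upper_tail, 4))
```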
134
Procedure for Finding Normal Percentiles 1. State the problem in terms of the observed variable y. – Example: y > 24.8 2. Standardize y to restate the problem in terms of a z-score. – Example: z > (24.8 - μ)/σ, therefore z > ? 3. Draw a picture to show the area under the standard normal curve to be calculated. 4. Find the required area using Table Z or the TI-83/84 calculator.
135
Example 1: The heights of men are approximately normally distributed with a mean of 70 inches and a standard deviation of 3 inches. What proportion of men are more than 6 feet tall?
136
Answer: 1.State the problem in terms of y. (6’=72”) 2.Standardize and state in terms of z. 3.Draw a picture of the area under the curve to be calculated. 4.Calculate the area under the curve.
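The worked values for these four steps did not survive on this slide, so the sketch below fills them in numerically. It assumes heights are measured in inches (6 feet = 72 inches) and uses Python's `statistics.NormalDist` in place of Table Z; the variable names are illustrative.

```python
from statistics import NormalDist

mean_height, sd_height = 70, 3   # inches (units assumed from 6' = 72")

# Step 1: state the problem in terms of y: y > 72
y = 72

# Step 2: standardize: z > (72 - 70) / 3
z = (y - mean_height) / sd_height

# Step 4: area to the right of z under the standard Normal curve
proportion_taller = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, proportion taller than 6 feet = {proportion_taller:.4f}")
```

With Table Z, rounding z to 0.67 gives 1 – .7486 = .2514, so roughly 25% of men are taller than 6 feet under this model.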
137
Example 2: Suppose family incomes in a town are normally distributed with a mean of $1,200 and a standard deviation of $600 per month. What percentage of families have incomes between $1,400 and $2,250 per month?
138
Answer: 1.State the problem in terms of y. 2.Standardize and state in terms of z. 3.Draw a picture. 4.Calculate the area.
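The numbers for this answer are also missing from the slide. A minimal sketch of the calculation, assuming `statistics.NormalDist` as the area calculator and illustrative variable names:

```python
from statistics import NormalDist

mean_income, sd_income = 1200, 600   # dollars per month

# Step 1: 1400 < y < 2250
# Step 2: standardize both bounds
z_low = (1400 - mean_income) / sd_income    # about 0.33
z_high = (2250 - mean_income) / sd_income   # 1.75

# Step 4: area between the two z-scores
standard_normal = NormalDist()
proportion = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(f"{z_low:.2f} < z < {z_high:.2f}, proportion = {proportion:.4f}")
```

Under this model, roughly 33% of families have incomes between $1,400 and $2,250 per month.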
139
Your Turn: The Chapin Social Insight (CSI) Test evaluates how accurately the subject appraises other people. In the reference population used to develop the test, scores are approximately normally distributed with mean 25 and standard deviation 5. The range of possible scores is 0 to 41. 1.What percent of subjects score above a 32 on the CSI Test? 2.What percent of subjects score at or below a 13 on the CSI Test? 3.What percent of subjects score between 16 and 34 on the CSI Test?
140
Solution: 1) What percent of subjects score above a 32 on the CSI Test? 1. y > 32 2. z > (32 – 25)/5 = 1.40 3. Picture 4. 8.1%
141
Solution: 2) What percent of subjects score at or below a 13 on the CSI Test? 1) y ≤ 13 2) z ≤ (13 – 25)/5 = -2.40 3) Picture 4) .82%
142
Solution: 3) What percent of subjects score between 16 and 34 on the CSI Test? 1) 16 < y < 34 2) (16 – 25)/5 < z < (34 – 25)/5, so -1.80 < z < 1.80 3) Picture 4) 92.8%
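All three CSI answers can be checked at once by working directly with the N(25, 5) model, without standardizing first. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

csi = NormalDist(mu=25, sigma=5)   # CSI reference population model

above_32 = 1 - csi.cdf(32)                 # question 1: y > 32
at_or_below_13 = csi.cdf(13)               # question 2: y <= 13
between_16_and_34 = csi.cdf(34) - csi.cdf(16)   # question 3: 16 < y < 34

print(f"above 32: {100 * above_32:.1f}%")
print(f"at or below 13: {100 * at_or_below_13:.2f}%")
print(f"between 16 and 34: {100 * between_16_and_34:.1f}%")
```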
143
From Percentiles to Scores: z in Reverse Sometimes we start with areas and need to find the corresponding z-score or even the original data value. Example: What z-score represents the first quartile in a Normal model?
144
z in Reverse Given a Normal proportion (area under the standard normal curve), find the corresponding observation value. Table Z: find the area in the table nearest the given proportion and read off the corresponding z-score. TI-83/84 Calculator: use the DISTR menu, 3rd entry, invNorm. Syntax: invNorm(area, [μ, σ]), where area is the area to the left (left-tail area) of the z-score (or observation y) wanted.
145
From Percentiles to Scores: z in Reverse (cont.) Look in Table Z for an area of 0.2500. The exact area is not there, but 0.2514 is pretty close. This figure is associated with z = –0.67, so the first quartile is 0.67 standard deviations below the mean.
146
Inverse Normal Practice Find the z-score for each proportion (area under the curve, left tail).
1. Area .3409: Table Z gives z = -.41; TI-83/84 gives z = -.4100
2. Area .7835: Table Z gives z = .78; TI-83/84 gives z = .7841
3. Area .9268: Table Z gives z = 1.45; TI-83/84 gives z = 1.4524
4. Area .0552: Table Z gives z = -1.60; TI-83/84 gives z = -1.5964
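A Python analogue of invNorm can be sketched with the `inv_cdf` method of `statistics.NormalDist`. The wrapper name `inv_norm` is hypothetical and chosen only to echo the calculator command.

```python
from statistics import NormalDist

def inv_norm(area, mean=0.0, sd=1.0):
    """Value with the given left-tail area under the N(mean, sd) curve."""
    return NormalDist(mean, sd).inv_cdf(area)

# The four practice proportions from the slide
for area in (0.3409, 0.7835, 0.9268, 0.0552):
    print(f"area {area}: z = {inv_norm(area):.4f}")

# The first-quartile example: left-tail area 0.25
first_quartile_z = inv_norm(0.25)
print(f"first quartile: z = {first_quartile_z:.4f}")
```

The computed z-scores agree with the TI-83/84 column to about three decimal places, and the first-quartile value rounds to the z = –0.67 found in Table Z.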
147
Procedure for Inverse Normal Proportions 1.Draw a picture showing the given proportion (area under the curve). 2.Find the z-score corresponding to the given area under the curve. 3.Unstandardize the z-score. 4.Solve for the observational value y and answer the question.
148
Example 1: SAT VERBAL SCORES SAT Verbal scores are approximately normal with a mean of 505 and a standard deviation of 110. How high must a student score in order to place in the top 10% of all students taking the verbal section of the SAT?
149
Analyze the Problem and Picture It. The problem wants to know the SAT score y with the area 0.10 to its right under the normal curve with a mean of 505 and a standard deviation of 110. Well, isn't that the same as finding the SAT score y with the area 0.9 to its left? Let's draw the distribution to get a better look at it.
150
1. Draw a picture showing the given proportion (area under the curve). [Figure: Normal curve with the mean y = 505 and the unknown score y = ? marked]
151
2.Find Your Z-Score 1.Using Table Z - Find the entry closest to 0.90. It is 0.8997. This is the entry corresponding to z = 1.28. So z = 1.28 is the standardized value with area 0.90 to its left. 2.Using TI-83/84 – DISTR/invNorm(.9). It is 1.2816.
152
3. Unstandardize Now, you will need to unstandardize to transform the solution from the z scale back to the original y scale. We know that the standardized value of the unknown y is z = 1.28. So y itself satisfies: (y – 505)/110 = 1.28
153
4. Solve for y and Summarize Solve the equation for y: y = 505 + (1.28)(110) = 645.8. The equation finds the y that lies 1.28 standard deviations above the mean on this particular normal curve. That is the "unstandardized" meaning of z = 1.28. Answer: A student must score at least 646 to place in the highest 10%.
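The SAT example can be traced step by step in Python. This sketch uses `statistics.NormalDist` in place of Table Z or invNorm; the variable names are illustrative.

```python
from statistics import NormalDist

mean_sat, sd_sat = 505, 110

# Top 10% means an area of 0.90 to the left of the cutoff
left_area = 1 - 0.10

# Step 2: z-score with area 0.90 to its left
z = NormalDist().inv_cdf(left_area)

# Steps 3 and 4: unstandardize, y = mean + z * sd
cutoff = mean_sat + z * sd_sat

# Same result in one step, working on the original scale
direct = NormalDist(mean_sat, sd_sat).inv_cdf(left_area)

print(f"z = {z:.4f}, cutoff score = {cutoff:.1f}")
```

Using the unrounded z (about 1.2816 rather than 1.28) shifts the cutoff only slightly, and it still rounds to 646.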
154
Example 2: A four-year college will accept any student ranked in the top 60 percent on a national examination. If the test score is normally distributed with a mean of 500 and a standard deviation of 100, what is the cutoff score for acceptance?
155
Answer: 1. Draw picture of given proportion. 2. Find the z-score. The top 60 percent leaves an area of .40 to the left, and from the TI-83/84, invNorm(.4) gives z = -.25. 3. Unstandardize: (y – 500)/100 = -.25. 4. Solve for y and answer the question. y = 500 + (-.25)(100) = 475, therefore the minimum score the college will accept is 475.
156
Your Turn: Intelligence Quotients are normally distributed with a mean of 100 and a standard deviation of 16. Find the 90th percentile for IQs.
157
Answer: 1. Draw picture of given proportion. 2. Find the z-score. From TI-83/84, invNorm(.9) is z = 1.28. 3. Unstandardize: (y – 100)/16 = 1.28. 4. Solve for y and answer the question. y = 100 + (1.28)(16) = 120.48. This means the 90th percentile for IQs is 120.48. In other words, 90% of people have IQs below 120.48 and 10% have IQs above 120.48.
158
Are You Normal? How Can You Tell? When you actually have your own data, you must check to see whether a Normal model is reasonable. Looking at a histogram of the data is a good way to check that the underlying distribution is roughly unimodal and symmetric.
159
Are You Normal? How Can You Tell? (cont.) A more specialized graphical display that can help you decide whether a Normal model is appropriate is the Normal probability plot. If the distribution of the data is roughly Normal, the Normal probability plot approximates a diagonal straight line. Deviations from a straight line indicate that the distribution is not Normal.
160
The Normal Probability Plot A normal probability plot for data from a normal distribution will be approximately linear. [Figure: normal probability plot of X against Z]
161
The Normal Probability Plot Nonlinear plots indicate a deviation from normality. [Figure: normal probability plots of X against Z for left-skewed, right-skewed, and rectangular data]
162
Are You Normal? How Can You Tell? (cont.) Nearly Normal data have a histogram and a Normal probability plot that look somewhat like this example:
163
Are You Normal? How Can You Tell? (cont.) A skewed distribution might have a histogram and Normal probability plot like this:
164
Summary Assessing Normality (Is the Distribution Approximately Normal?) 1. Construct a histogram or stemplot. See if the shape of the graph is approximately normal. 2. Construct a Normal Probability Plot (TI-83/84). Data from a normal distribution will plot as approximately a straight line. Conversely, non-normal data will show a nonlinear trend.
165
Assess the Normality of the Following Data 9.7, 93.1, 33.0, 21.2, 81.4, 51.1, 43.5, 10.6, 12.8, 7.8, 18.1, 12.7 Histogram – skewed right Normal Probability Plot – clearly not linear
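Without a calculator, the ingredients of a Normal probability plot can be computed in Python. This is a rough sketch under stated assumptions: the plotting positions (i – 0.5)/n are one common convention and may differ from what the TI-83/84 uses, and the correlation and the mean-versus-median comparison are added here as simple numerical summaries that do not appear on the slides.

```python
from statistics import NormalDist, mean, median

data = [9.7, 93.1, 33.0, 21.2, 81.4, 51.1, 43.5, 10.6, 12.8, 7.8, 18.1, 12.7]
n = len(data)

# Sorted observations go on one axis of the Normal probability plot
ordered = sorted(data)

# Normal scores: z-values at plotting positions (i - 0.5) / n (assumed convention)
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation, written out to keep the sketch self-contained."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# A value well below 1 suggests the plotted points bend away from a straight line
r = correlation(normal_scores, ordered)

# A mean well above the median is a quick hint of right skew
right_skew_hint = mean(data) > median(data)

for z, x in zip(normal_scores, ordered):
    print(f"{z:6.2f}  {x:5.1f}")
print(f"correlation = {r:.3f}, mean > median: {right_skew_hint}")
```

Plotting the printed pairs by hand or in a spreadsheet gives the Normal probability plot itself; the curve in those points is what the slide describes as clearly not linear.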
166
Normal Distribution Problem – Your Turn: Suppose a normal model describes the fuel efficiency of cars currently registered in your state. The mean is 24 mpg, with a standard deviation of 6 mpg. Sketch the normal model, illustrating the 68-95-99.7 rule.
167
Normal Distribution Problem – Your Turn: What percent of all cars get less than 15 mpg?
168
Normal Distribution Problem – Your Turn: What percent of all cars get between 20 and 30 mpg?
169
Normal Distribution Problem – Your Turn: What percent of cars get more than 40 mpg?
170
Normal Distribution Problem – Your Turn: Describe the fuel efficiency of the worst 20% of all cars.
171
Normal Distribution Problem – Your Turn: What gas mileage represents the third quartile?
172
Normal Distribution Problem – Your Turn: Describe the gas mileage of the most efficient 5% of all cars.
173
Normal Distribution Problem – Your Turn: What gas mileage would you consider unusual? Why?
174
Normal Distribution Problem – Your Turn: What percent of cars get under 20 mpg?
175
Normal Distribution Problem – Your Turn: An ecology group is lobbying for a national goal calling for no more than 10% of all cars to be under 20 mpg. If the standard deviation does not change, what average fuel efficiency must be attained?
176
Normal Distribution Problem – Your Turn: Car manufacturers argue that they cannot raise the average that much – they believe they can only get to 26 mpg. What standard deviation would allow them to meet the “only 10% under 20 mpg” goal?
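One possible approach to the last two problems is to rearrange z = (y – μ)/σ, first for μ and then for σ. The slides give no worked solution, so the sketch below is only one way to set it up; it uses `statistics.NormalDist` for the z-score and illustrative variable names.

```python
from statistics import NormalDist

threshold = 20        # mpg
left_area = 0.10      # at most 10% of cars below the threshold

# z-score with 10% of the area to its left (negative, since it is below the mean)
z = NormalDist().inv_cdf(left_area)

# Ecology group's goal: keep sd = 6 and solve z = (20 - mean) / 6 for the mean
sd_unchanged = 6
required_mean = threshold - z * sd_unchanged

# Manufacturers' proposal: fix mean = 26 and solve z = (20 - 26) / sd for the sd
mean_proposed = 26
required_sd = (threshold - mean_proposed) / z

print(f"z = {z:.4f}")
print(f"required mean with sd 6: {required_mean:.2f} mpg")
print(f"required sd with mean 26: {required_sd:.2f} mpg")
```

Either result can be checked by confirming that the corresponding Normal model puts 10% of its area below 20 mpg.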
177
Normal Distribution Problem – Your Turn: What change in the fuel economy of cars would achieving that standard deviation bring about? What are the advantages and disadvantages?
178
Assignment Chapter 6 Notes Worksheet Chapter 6, Exercises pg. 129 – 133: #3‐17 odd, 23, 35‐45 odd. Read Ch-7, pg. 146 - 163