Presentation is loading. Please wait.

Presentation is loading. Please wait.

The Standard Deviation as a Ruler and the Normal Model

Similar presentations


Presentation on theme: "The Standard Deviation as a Ruler and the Normal Model"— Presentation transcript:

1

2 The Standard Deviation as a Ruler and the Normal Model
Chapter 6 The Standard Deviation as a Ruler and the Normal Model Copyright © 2009 Pearson Education, Inc.

3 NOTE on slides / What we can and cannot do
The following notice accompanies these slides, which have been downloaded from the publisher’s Web site: “This work is protected by United States copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the World Wide Web) will destroy the integrity of the work and is not permitted. The work and materials from this site should never be made available to students except by instructors using the accompanying text in their classes. All recipients of this work are expected to abide by these restrictions and to honor the intended pedagogical purposes and the needs of other instructors who rely on these materials.” We can use these slides because we are using the text for this course. Please help us stay legal. Do not distribute these slides any further. The original slides are done in orange / brown and black. My additions are in red and blue. Topics in green are optional.

4 Topics in this chapter Shifting and Rescaling Data
Standardized values (z-scores) Using the standard deviation as a ruler The Normal Model The Rule Finding normal percents and the reverse Normal Probability Plots The Normality Assumption

5 Division of Mathematics, HCC Course Objectives for Chapter 6
After studying this chapter, the student will be able to: Compare values from two different distributions using their z-scores. Use Normal models (when appropriate) and the Rule to estimate the percentage of observations falling within one, two, or three standard deviations of the mean. Determine the percentages of observations that satisfy certain conditions by using the Normal model and determine “extraordinary” values, and the reverse. Determine whether a variable satisfies the Nearly Normal condition by making a normal probability plot or histogram. Note: It is essential that this chapter be mastered. Almost everything in Unit 3 depends on it.

6 Let’s review our summary statistics
Workers in a particular office have the following annual salaries (in thousands) 62, 62, 58, 54, 50, 46, 44 What are the summary statistics (rounded)? Mean: 53.7 Median: 54 Range: 18 Standard Deviation: 7.34 The boss wants to reward everyone for a job well-done. He can give them a one-time bonus or an extra raise.

7 Option 1 - $5K Bonus The data become 67, 67, 63, 59, 55, 51, 49
The summary statistics become Mean: (was 53.7) Median: 59 (was 54) Range: 18 (was 18) Standard Deviation: (was 7.34) What has changed? By what? What has stayed the same? This is called Shifting

8 Option 2 – 5% raise The data become
65.1, 65.1, 60.9, 56.7, 52.5, 48.3, 46.2 The summary statistics become Mean: (was 53.7) Median: (was 54) Range: (was 18) Standard Deviation: (was 7.34) What has changed? By what? What has stayed the same? This is called Rescaling

9 Shifting Data Shifting data:
Adding (or subtracting) a constant to every data value adds (or subtracts) the same constant to measures of position. Adding (or subtracting) a constant to each value will increase (or decrease) measures of position: center, percentiles, max or min by the same constant. Its shape and spread - range, IQR, standard deviation - remain unchanged. When we gave the employees a $5K bonus, we shifted their salaries.

10 Another example – shifting data
80 men of a particular height and body frame were weighed. The average weight in kilograms is The NIH recommends that the average be 74 kg. Let’s shift by 74 kg. The new mean is 8.36 kg. Note also If I weigh 80 kg, then I am +6 kg with respect to normal weight. If I weigh 70 kg, then I am -4 kg with respect to normal weight.

11 Shifting Data (cont.) NIH example: The following histograms show a shift from men’s actual weights to kilograms above (or if negative, below) recommended weight:

12 Rescaling Data Rescaling data:
When we multiply (or divide) all the data values by any constant, all measures of position (such as the mean, median, and percentiles) and measures of spread (such as the range, the IQR, and the standard deviation) are multiplied (or divided) by that same constant. When we gave the employees a 5% raise, we rescaled their salaries.

13 Rescaling Data (cont.) NIH Example: The men’s weight data set measured weights in kilograms. If we want to think about these weights in pounds, we would rescale the data:

14 Summary of effects 5K bonus 5% raise Shifted Rescaled Mean
5K bonus 5% raise Shifted Rescaled Mean Up amt of shift Up same percent Median Range No change Standard Deviation

15 Summary of Effects Shifting: Adding (or subtracting) a constant to every data value adds (or subtracts) the same constant to measures of center, but leaves the measure of spread unchanged. Rescaling: When we multiply (or divide) every data value by a constant, all measures of center and spread are multiplied (or divided) by that same constant.

16 Let’s do one more thing We have our office salaries
62, 62, 58, 54, 50, 46, 44 Recall: Mean = 53.7, St. Dev. = 7.34 Let’s shift then so that the average is 0. We get 8.3, 8.3, 4.3, 0.3, -3.7, Mean is (approximately) 0. Now divide them by the standard deviation. We get 1.13, 1.13, .59, -.04, -.5, -1.05, -1.32 Mean is still 0, Standard Deviation is 1. We have “standardized” the salaries.

17 Benefits of Standardizing
Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean. Thus, we can compare values that are measured on different scales, with different units, or from different populations. Compare: 62, , 58, 54, 50, 46, 44 1.13, 1.13, .59, -.04, -.5, -1.05, -1.32

18 The Standard Deviation as a Ruler
The trick in comparing very different-looking values is to use standard deviations as our rulers. The standard deviation tells us how the whole collection of values varies, so it’s a natural ruler for comparing an individual to a group. As the most common measure of variation, the standard deviation plays a crucial role in how we look at data.

19 Standardizing with z-scores
We compare individual data values to their mean, relative to their standard deviation using the following formula: We call the resulting values standardized values, denoted as z. They can also be called z-scores.

20 Standardizing with z-scores (cont.)
Standardized values have no units. z-scores measure the distance of each data value from the mean in standard deviations. That is, a z-score measures how many standard deviations we are from the mean. A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.

21 Standardizing with z-scores (cont.)
Standardizing data into z-scores shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation. Standardizing into z-scores does not change the shape of the distribution. Standardizing into z-scores changes the center by making the mean 0. Standardizing into z-scores changes the spread by making the standard deviation 1.

22 When Is a z-score BIG? A z-score gives us an indication of how unusual a value is because it tells us how far it is from the mean. In particular, the z-score of a data point measures the number of standard deviations the data point is from the mean. Remember that a negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean. The larger a z-score is (negative or positive), the more unusual it is.

23 When Is a z-score Big? (cont.)
There is no universal standard for z-scores, but there is a model that shows up over and over in Statistics. This model is called the Normal model (You may have heard of “bell-shaped curves.”). Normal models are appropriate for distributions whose shapes are unimodal and roughly symmetric. These distributions provide a measure of how extreme a z-score is.

24 When Is a z-score Big? (cont.)
There is a Normal model for every possible combination of mean and standard deviation. We write N(μ,σ) to represent a Normal model with a mean of μ and a standard deviation of σ. We use Greek letters because this mean and standard deviation do not come from data—they are numbers (called parameters) that specify the model. Nothing is ever perfectly normal (or perfectly much of any “nice” distribution.) However, the normal model is useful for a wide variety of situations.

25 Example Suppose we asked 30 people the question: At what age did you get your first real job? We could construct a histogram and see if any pattern emerges. Source (next ten slides including this one): Marc Boyer and Martine Ferguson, slides for “Basic Statistics Course”, given in Fall 2008 at the FDA Center for Food Safety and Applied Nutrition.

26

27 Now suppose that we asked 300 people the same question.
Observe as the number of people we ask increases, the graph begins to take the shape of what the population would look like. The histogram now begins to take the shape of a normal or Gaussian distribution because the underlying distribution is normal.

28

29 Now suppose that we asked 3000 people
the same question. As the number of people increases the histogram appears more smooth. The histogram now looks like a Normal probability distribution.

30

31 Next we will see what the population looks like (plot of the distribution of values in the entire population). The previous histograms had a vertical scale that showed the percentage of observations in each category. Now the vertical axis doesn’t show the percent of observations since we have an infinite population. We must start thinking in terms of area under the curve.

32 The distribution of all values in the
population is no longer a histogram. The area under the entire curve represents the entire population, and the proportion of that area that falls between two values is the probability of observing a value in that interval.

33

34 The mean is the center of the normal distribution.
The standard deviation gives an expression of the spread. A special case of the normal distribution is called the standard normal distribution. The standard normal distribution has mean zero and standard deviation 1.

35 Why is a normal distribution like a lion?
They both have a mean µ!

36 When Is a z-score Big? (cont.)
Summaries of data, like the sample mean and standard deviation, are written with Latin letters. Such summaries of data are called statistics. When we standardize Normal data, we still call the standardized value a z-score, and we write

37 When Is a z-score Big? (cont.)
Once we have standardized, we need only one model: The N(0,1) model is called the standard Normal model (or the standard Normal distribution). Be careful—don’t use a Normal model for just any data set, since standardizing does not change the shape of the distribution.

38 When Is a z-score Big? (cont.)
When we use the Normal model, we are assuming the distribution is Normal. We cannot check this assumption in practice, so we check the following condition: Nearly Normal Condition: The shape of the data’s distribution is unimodal and symmetric. This condition can be checked by making a histogram or a Normal probability plot (to be explained later).

39 A little history – Normal Model
First published in 1718 by Abraham de Moivre (“Doctrine of Chances”). He had no idea how to apply it to experimental observations. Context of estimating binomial (coin-toss, etc.) probabilities for large n. The paper remained unknown until another statistician, Karl Pearson, discovered it in 1924!

40 A little history – Normal Model
Pierre-Simon, Marquis Laplace - Analytical Theory of Probabilities (1812) – first used the normal distribution in 1778 for the analysis of errors of experiments. Karl Friedrich Gauss, who claimed to have used the method since 1794, justified it rigorously in 1809 (independent of LaPlace). Sometimes the Normal distribution is referred to as the Gaussian distribution. The name "bell curve" goes back to Jouffret who first used the term "bell surface" in 1872 for a “multivariate normal” distribution, i.e. an extension to three dimensions. The name "normal distribution" was coined independently by Charles S. Peirce, Francis Galton and Wilhelm Lexis around 1875. The independent discoveries show how naturally the Normal Model arises.

41 The Rule Normal models give us an idea of how extreme a value is by telling us how likely it is to find one that far from the mean. We can find these numbers precisely, but until then we will use a simple rule that tells us a lot about the Normal model…

42 The 68-95-99.7 Rule (cont.) It turns out that in a Normal model:
about 68% of the values fall within one standard deviation of the mean; about 95% of the values fall within two standard deviations of the mean; and, about 99.7% (almost all!) of the values fall within three standard deviations of the mean.

43 The Rule (cont.) The following shows what the Rule tells us:

44 More on the rule If the population is normally distributed then: Approximately 68% of the observations are within 1 standard deviation of the population mean. Approximately 95% of the observations are within 2 standard deviations of the population mean. Approximately 99.7% of the observations are within 3 standard deviations of the population mean. Source (this and the next 5 slides): Marc Boyer and Martine Ferguson, slides for “Basic Statistics Course”, given in Fall 2008 at FDA Center for Food Safety and Applied Nutrition.

45 Approximately 68% of the observations fall within 1
standard deviation of the mean

46 More on the rule Note that the range "within one standard deviation of the mean" is highlighted in green. The area under the curve over this range is the relative frequency of observations in the range. That is, 0.68 = 68% of the observations fall within one standard deviation of the mean (µ ± σ). Below the axis, in red, is another set of numbers. These numbers are simply measures of standard deviations from the mean. In working with the variable X we will often find it necessary to convert into units of standard deviations from the mean. When the variable is measured this way, the letter Z is commonly used.

47 Approximately 95% of the observations fall within 2
standard deviations of the mean

48 Approximately 99.7% of the observations fall within 3
standard deviations of the mean

49

50 The First Three Rules for Working with Normal Models
Make a picture. And, when we have data, make a histogram to check the Nearly Normal Condition to make sure we can use the Normal model to model the distribution.

51 Example: rule. Example: For men aged 18 to 24, serum cholesterol levels have a mean of 178 mg/100mL with a standard deviation of 40.7 mg/mL. Pete’s cholesterol reading is 231. Where is Pete with respect to the mean cholesterol level?

52 Watching Pete’s Cholesterol level
(231 – 178) = 53. The standard deviation was 40.7 53 / 40.7 is 1.3 standard deviations above the mean. Pete’s z-score is 1.3 We can say that between 68% and 95% of the stated population has a cholesterol level more extreme than Pete’s. Between 2.5% and 16% have a cholesterol level higher than Pete’s. We can say more using technology.

53 Importance of the z-score
The z-score is a ruler for comparing populations, even those which do not have the same mean and standard deviation. One study showed the mean cholesterol of American women as 188 mg/100mL and a standard deviation of 24 mg/100 mL. By coincidence, Susan has a cholesterol reading of 231 mg/100mL. Who’s is really higher – Pete’s or Susan’s?

54 Pete and Susan Susan is above her mean by 231-188, or 43 mg/100mL.
43.24 = standard deviations. Susan’s z-score is Pete’s z-score is 1.3. Susan’s is higher. Medically, Susan may have a bigger problem than Pete.

55 Pete and Susan Susan’s reading is closer to the mean (43 mg/100ml vs. Pete’s of 53 mg/100ml). But Susan’s population has smaller variability than Pete’s. This made Susan’s cholesterol more extreme than Pete’s. It’s about variability!

56 *Finding Normal Percentiles by Hand (This is both slow and inaccurate)
When a data value doesn’t fall exactly 1, 2, or 3 standard deviations from the mean, we can look it up in a table of Normal percentiles. Table Z in Appendix D provides us with normal percentiles, but many calculators and statistics computer packages provide these as well. Let’s use the technology. As for the tables – let’s not and say we did!

57 *Finding Normal Percentiles by Hand (cont.)
Table Z is the standard Normal table. We have to convert our data to z-scores before using the table. The figure shows us how to find the area to the left when we have a z-score of 1.80:

58 Finding Normal Percentiles Using Technology Much preferred method
Many calculators and statistics programs have the ability to find normal percentiles for us. Both the TI and StatCrunch will easily do it. The ActivStats Multimedia Assistant offers two methods for finding normal percentiles: The “Normal Model Tool” makes it easy to see how areas under parts of the Normal model correspond to particular cut points. There is also a Normal table in which the picture of the normal model is interactive.

59 Finding Normal Percentiles Using Technology (cont.)
The following was produced with the “Normal Model Tool” in ActivStats:

60 Finding Normal Percentiles Using the TI
To find the percentile between z = -0.5 and z = 1. Press 2nd VARS, which will get you “DISTR” Press 2 – normalcdf(, then “ENTER When normalcdf( appears, type (-.5,1) (Not – as in subtract). Your answer will display. To find the percentile less than 1 Enter normalcdf (-999,1) as above. Similarly, we can enter normalcdf(-.5,999) to find the percent of values bigger than -0.5.

61 Normalcdf with the 2.55 operating system
See the screen captures at the right for the percentile between z = -0.5 and z = 1. Enter -.5 and 1 in the menu; then select Paste. Normalcdf appears on the next screen (you must scroll to see it all.).

62 (Make a picture)3 with ShadeNorm
Select [DISTR] as before, but this time, select [DRAW], then ShadeNorm. Enter -0.5 and 1. The result is at the lower right. You might have to select an appropriate window.

63

64 Finding Normal Percentiles Using StatCrunch
Under Stat, select “Calculators”, then “Normal”. First, select <= and list – Note the answer as

65 Finding Normal Percentiles Using StatCrunch
Now select => and type 1. Note the answer as Then calculate 1 – – =0.5328 I recommend the TI – it is faster and more direct.

66 Another way with StatCrunch
Under Data, select “Compute Expression” Type the expression as shown. The answer will be added to the first nonempty column ( ). I still recommend the TI.

67 Let’s verify the 68-95-99.7 rule with the TI!
It turns out that in a Normal model: about 68% of the values fall within one standard deviation of the mean; Normalcdf((-)1,1)= about 95% of the values fall within two standard deviations of the mean; Normalcdf((-)2,2)= about 99.7% (almost all!) of the values fall within three standard deviations of the mean. Normalcdf((-3),3)=

68 From Percentiles to Scores: z in Reverse
Sometimes we start with areas and need to find the corresponding z-score or even the original data value. Example: What z-score represents the first quartile in a Normal model?

69 Z in reverse with the TI. SAT Math scores are normally distributed with a mean score of 500 and a standard deviation of 100. Great Eastern Technical University brags that they will consider for admission only students in the 90th percentile of the population as measured by the SAT Math scores. What is the cutoff for admission consideration at GETU? There is a TI function, InvNorm “Inverse Normal”. This function goes the other way from normalcdf.

70 SAT score with InvNorm The arguments are InvNorm(Pct,Mean,StDev)
Because InvNorm takes only Percentile, it gives the score corresponding to the area from the percentile to the extreme left side. InvNorm(0.90,500,100) = Since SAT scores are typically reported in units of 10, a 630 is required for admission consideration at GWTU.

71 Using InvNorm Enter 2nd, then DIST, then InvNorm Old Operating System:
Enter “.9,500,100” (this includes entering the commas) Close Parentheses You will then nave “invNorm(.9,500,100)” entered. New Operating System: Fill in as shown below:

72 Using InvNorm Notice that the default mean is 0, standard deviation is 1. Leaving it this way will get the z-score corresponding to the 90th percentile, i.e. InvNorm(0.9) = This will be very useful when we get to Unit 3.

73 Z in reverse with the TI. SAT Math scores are normally distributed with a mean score of 500 and a standard deviation of 100. Great Western Technical University brags that they will consider for admission only students in the top 5% of the population as measured by the SAT Math scores. What is the cutoff for admission consideration at GWTU? Note that the top 5% is the bottom 95%. InvNorm(0.95,500,100) = Again, if SAT reports in units of 10, you would need an SAT Math score of 670 for consideration at GWTU.

74 What z-scores correspond to the middle 95%?

75 Middle 95% The z-score cutoffs for the middle 95% are +z and –z. How to find z? Issue: InvNorm goes only from a cutoff to the extreme left. We need to “fudge” to accommodate InvNorm! There is 0.95 in the middle, plus on the extreme left. InvNorm(0.975) is or 1.96. This is extremely important for Unit 3. You need to understand this and keep it in mind.

76 z in Reverse with StatCrunch
Start out the same way as before This time, use the right-hand side and look for the answer on the left ( )

77 An Application to Test Scores
QUESTION 1 The SAT Verbal has a mean of 500 and a standard deviation of 100. Pat got 650 on the SAT Verbal. How well did Pat do?

78 An Application to Test Scores
ANSWER – actually, two answers! If we standardize Pat’s score, we get 650 – 500 100 Or a z-score of +1.5. That is, Pat’s score is 1.5 standard deviations above the mean SAT score of 500. Only people who have had statistics think in terms of z-scores, so let’s figure a percentile. We use normalcdf(?,1.5). ?? On the TI, use a low lower bound such as -999. Normalcdf((-)999,1.5)=93.32%, a respectable job.

79 An Application to Test Scores
Tell: If Pat got a 650 on the SAT Verbal, his score was in the percentile.

80 An Application to Test Scores
QUESTION 2 One college that Pat is considering requires the ACT. Pat took it as well and got a 27. The ACT has a mean of 21 and a standard deviation of 4.7 How well did Pat do? As well as the SAT? ANSWER Standardizing Pat’s score, we get 27 – 21 4.7 Or a z-score of Not quite as good. As before, on the TI, use a low lower bound such as For a percentile, Normalcdf((-)999,1.28)=89.97%, still a good showing.

81 An Application to Test Scores
Tell: If Pat got a 27 on the ACT, which is N(21,4.7), then Pat’s score was in the percentile. This score was not as good as his SAT score, which was in the percentile.

82 An Application to Test Scores
QUESTION 3 How well would Pat have to do on the ACT to match his percentile (93.32) and equivalent z-score (1.5) on the SAT? ANSWER Remembering our standardization, we have (X is Pat’s ACT score) 1.5 = X – 21 4.7 We manipulate to get X (1.5)*(4.7)+21 = 27.05! Even though Pat just missed it with the 27, 28 is needed since ACT scores are reported in whole numbers!

83 An Application to Test Scores
Tell: In order to do as well on the ACT as on the SAT, Pat would need an ACT score of 28.

84 Shortcut with the TI If we have, for example, N(500,100) and Pat’s 650, we do not have to compute the z-score. Use normalcdf with two more parameters as shown. (-999,1.5) vs (-999,650,500,100)

85 Going the other way Question 4: Chris scored in the 60th percentile.
(a) What is Chris’s z-score? (b) What is Chris’s SAT score? We can use the TI function InvNorm. InvNorm(0.6) = Then = (x – 500)/100 Then x = 525 Shortcut: InvNorm(0.6,500,100) =

86

87 Are You Normal? Normal Probability Plots
When you actually have your own data, you must check to see whether a Normal model is reasonable. Looking at a histogram of the data is a good way to check that the underlying distribution is roughly unimodal and symmetric.

88 Are You Normal? Normal Probability Plots (cont)
A more specialized graphical display that can help you decide whether a Normal model is appropriate is the Normal probability plot. If the distribution of the data is roughly Normal, the Normal probability plot approximates a diagonal straight line. Deviations from a straight line indicate that the distribution is not Normal.

89 Are You Normal? Normal Probability Plots (cont)
Nearly Normal data have a histogram and a Normal probability plot that look somewhat like this example:

90 Are You Normal? Normal Probability Plots (cont)
A skewed distribution might have a histogram and Normal probability plot like this:

91 Normal probability plots with the TI
Assume that your data are in L1 (the data from Chapter 5 are there.) Data: 62, 63, 65, 66, 68, 70, 71, 73, 75 Choose X:[60,80],Y:[-4,4] Press 2nd, Y= to brig up “STAT PLOT” Pick one of the plots 1 through 3; say 1 and turn it on – make sure the others are off. The TYPE is the one at the lower right (the squiggly one) Be sure that L1 is after “Data List” Select ZOOM, then type 9

92 Normal Probability Plot – StatCrunch (called a QQ plot)
Assume that your data are in the first column Select Graph, then QQ Plot Select the column where your data are. Continue on as in all of the other StatCrunch graphs. The graph comes up, but with the normal scale on the x-axis and the data on the y-axis. This is the opposite of how most books do it! Again, I would recommend the TI.

93

94

95 Publisher’s Instructions: Normal Probability (QQ) Plot
QQ Plot Displays the sample quantiles of a variable versus the quantiles of a standard normal distribution. Select the column(s) to be displayed in the plot(s). Enter an optional Where clause to specify the data rows to be included in the computation. Select an optional Group by column to generate a separate QQ plot for each distinct value of the Group by column. Click the Next button to specify graph layout options. Click the Create Graph! button to create the plot(s).

96 What Can Go Wrong? Don’t use a Normal model when the distribution is not unimodal and symmetric.

97 An example – incorrect z-score
Below : µ = 0.5; σ = 0.288 The point 0.99 is actually at the 99th percentile. If you assume N(.5,.288), the z-score would be 1.701; percentile would be 95.56! Slide 1- 97 97

98 An example – incorrect z-score
Below : µ = 0.5; σ = 0.5 The point 5.98 would be at the 95th percentile. If you assume N(.5,.5), the z-score of 5.98 would be off the charts!

99 What Can Go Wrong? (cont.)
Don’t use the mean and standard deviation when outliers are present—the mean and standard deviation can both be distorted by outliers. Don’t round your results in the middle of a calculation. Don’t worry about minor differences in results.

100 What have we learned? The story data can tell may be easier to understand after shifting or rescaling the data. Shifting data by adding or subtracting the same amount from each value affects measures of center and position but not measures of spread. Rescaling data by multiplying or dividing every value by a constant changes all the summary statistics—center, position, and spread.

101 What have we learned? (cont.)
We’ve learned the power of standardizing data. Standardizing uses the SD as a ruler to measure distance from the mean (z-scores). With z-scores, we can compare values from different distributions or values based on different units. z-scores can identify unusual or surprising values among data.

102 What have we learned? (cont.)
We’ve learned that the Rule can be a useful rule of thumb for understanding distributions: For data that are unimodal and symmetric, about 68% fall within 1 SD of the mean, 95% fall within 2 SDs of the mean, and 99.7% fall within 3 SDs of the mean.

103 What have we learned? (cont.)
We see the importance of Thinking about whether a method will work: Normality Assumption: We sometimes work with Normal tables (Table Z). These tables are based on the Normal model. But the TI is faster and more accurate. Data can’t be exactly Normal, so we check the Nearly Normal Condition by making a histogram (is it unimodal, symmetric and free of outliers?) or a normal probability plot (is it straight enough?).

104 Example from the Video The situation: A company fills cereal boxes to an average of 16 oz. There are minor variations. The standard deviation is 0.2 oz. If the company sets its standard at 16.0 oz, half of the boxes will be underweight. Suppose the machine is set at 16.3 oz. What percent of the boxes will be underweight?

105 Example from the Video Easy way: With the TI,
normalcdf(16.3,999,16,0.2) = It can also be worked by hand. About 6.7%, or about 1 in 15 boxes will be underweight. Tell: If the machine is set at 16.3 oz with a st. dev. of 0.2 oz, about 1 in 15 boxes will be underweight.

106 Example from the Video The company decides that 1 in 15 is too high. They now want 1 in 25, or 4%. How high should the machine be set? With the TI, invNorm(0.04) = The machine must be set 1.75 standard deviations above the mean, or .35 above the mean. Tell: The machine must be set for a mean of to have 4% underweight boxes..

107 Example from the Video The company president wants less free cereal!
The mean must be set at 16.2 and no more than 4% underweight. What to do? Change the standard deviation. We need N(16.2,σ) and we have to find σ. We know that the z-score that corresponds to 4% underweight is “-1.75”. We can then solve = (16.2 – 16)/ σ for σ. We get σ =

108 Example from the Video Tell: The company must get the machine to box cereal with a standard deviation of ounces in order to meet the stated objective of a mean of 16.2 oz. and no more than 4% underweight boxes.

109 Topics in this chapter Shifting and Rescaling Data
Standardized values (z-scores) Using the standard deviation as a ruler The Normal Model The Rule Finding normal percents and the reverse Normal Probability Plots The Normality Assumption

110 Division of Mathematics, HCC Course Objectives for Chapter 6
After studying this chapter, the student will be able to: Compare values from two different distributions using their z-scores. Use Normal models (when appropriate) and the Rule to estimate the percentage of observations falling within one, two, or three standard deviations of the mean. Determine the percentages of observations that satisfy certain conditions by using the Normal model and determine “extraordinary” values, and the reverse. Determine whether a variable satisfies the Nearly Normal condition by making a normal probability plot or histogram. Note: It is essential that this chapter be mastered. Almost everything in Unit 3 depends on it.


Download ppt "The Standard Deviation as a Ruler and the Normal Model"

Similar presentations


Ads by Google