Chapter 6: The Standard Deviation as a Ruler and the Normal Model
The Standard Deviation as a Ruler Use standard deviation when comparing unlike measures. Standard deviation is the most common measure of spread. Remember standard deviation is the square root of the variance.
Standardizing We standardize to eliminate units. A standardized value is found by subtracting the mean from the value and dividing by the standard deviation; the result has no units. A z-score measures the distance of each data value from the mean in standard deviations. Negative z-score: data value below the mean. Positive z-score: data value above the mean.
Benefits of Standardizing Standardized values (z-scores) are expressed in the standard statistical unit of standard deviations from the mean. Values that are measured on different scales or in different units can now be compared.
Example Which performance is better? Bacher ran the 800-m in 129 seconds, which was 8 seconds faster than the mean of 137 seconds. How many standard deviations better than the mean is that? The standard deviation of all the qualifying runners was 5 seconds, so her time was (129 - 137)/5 = -1.6, or 1.6 standard deviations better than the mean. Prokhorova's winning jump was 60 cm longer than the mean jump of 6 m. The standard deviation was 30 cm, so the winning jump was 60/30 = 2 standard deviations better than the mean. The long jump was the better performance because it was a greater improvement over its mean than the winning 800-m time, as measured in standard deviations.
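The comparison above can be checked with a few lines of Python. This is a minimal sketch: the helper name `z_score` is illustrative and not from the slides, and the sign of the running z-score is flipped because a lower time is the better result.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# 800-m run, in seconds (lower is better)
run_z = z_score(129, 137, 5)      # (129 - 137) / 5 = -1.6

# Long jump, in centimetres (higher is better); the 6-m mean is 600 cm
jump_z = z_score(660, 600, 30)    # 60 / 30 = 2.0

# Express both as "standard deviations better than the mean"
run_better = -run_z
jump_better = jump_z

print("800-m z-score:", run_z)
print("long jump z-score:", jump_z)
print("better performance:", "long jump" if jump_better > run_better else "800-m run")
```

Converting both results to standard deviations is what makes seconds and centimetres comparable.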
Shifting Data Adding or subtracting a constant amount to each value just adds or subtracts the same constant to: the mean and median; the maximum, minimum, and quartiles. The spread does not change because the distribution is simply shifting: the range, IQR, and standard deviation remain the same. Recap: Adding a constant to every data value adds the same constant to measures of center and percentiles, but leaves measures of spread unchanged.
Rescaling Data Rescaling data is multiplying or dividing all values by the same number. This changes the measurement units. Ex. Feet to inches (multiply by 12). When we divide or multiply all the data values by any constant value, both measures of location (mean and median) and measures of spread (range, IQR, and standard deviation) are divided or multiplied by that same value.
Example Suppose the class took a 40-point quiz. The results show a mean score of 30, median of 32, IQR 8, SD 6, min 12, and Q1 27. (Suppose YOU got a 35.) What happens to each statistic if: I decide to weight the quiz as 50 points and add 10 points to each score? Your score is now a 45. I decide to weight the quiz as 80 points and double each score? Your score is now a 70. I decide to count the quiz as 100 points, so I'll double each score and add 20 points? Your score is now a 90.
Table
Statistic     Original x    x + 10    2x    2x + 20
Mean          30            40        60    80
Median        32            42        64    84
IQR           8             8         16    16
SD            6             6         12    12
Minimum       12            22        24    44
Q1            27            37        54    74
Your score    35            45        70    90
What happened Measures of center and position are affected by both addition and multiplication. Measures of spread are affected only by multiplication.
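The shifting and rescaling rules can be verified on any data set. The sketch below uses a small hypothetical list of quiz scores (chosen to have mean 30 and median 32; it is not the class data from the example) and a hypothetical helper `summarize`.

```python
import statistics

def summarize(data):
    """Return (mean, median, standard deviation, range) for a list of numbers."""
    return (
        statistics.mean(data),
        statistics.median(data),
        statistics.stdev(data),
        max(data) - min(data),
    )

scores = [12, 27, 30, 32, 35, 36, 38]     # hypothetical quiz scores
shifted = [x + 10 for x in scores]        # add 10 points
doubled = [2 * x for x in scores]         # double each score
both = [2 * x + 20 for x in scores]       # double, then add 20

for label, data in [("x", scores), ("x + 10", shifted),
                    ("2x", doubled), ("2x + 20", both)]:
    mean, median, sd, value_range = summarize(data)
    print(f"{label:8} mean={mean} median={median} sd={sd:.2f} range={value_range}")
```

Adding a constant moves the mean and median but leaves the standard deviation and range alone; multiplying changes all four.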
Back to z-scores Standardizing data into z-scores shifts the values by the mean and rescales them by the standard deviation. Standardizing: does not change the shape of the distribution of a variable; changes the center by making the mean 0; changes the spread by making the standard deviation 1.
When is a z-score BIG? Normal models: appropriate for distributions whose shapes are unimodal and roughly symmetric. Parameter: a numerically valued attribute of a model. Ex. The values of μ (mean) and σ (standard deviation) in an N(μ,σ) model are parameters. Summaries of data are called statistics. Standard Normal model (standard Normal distribution): the Normal model with mean μ=0 and standard deviation σ=1.
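A short sketch of standardizing, assuming a hypothetical data list and a hypothetical helper `standardize`. It uses the sample standard deviation (`statistics.stdev`) for both the rescaling and the check.

```python
import statistics

def standardize(data):
    """Shift by the mean, then rescale by the standard deviation."""
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    return [(x - mu) / sigma for x in data]

data = [12, 27, 30, 32, 35, 36, 38]   # hypothetical values
z = standardize(data)

# After standardizing, the mean is 0 and the standard deviation is 1
# (up to floating-point rounding); the ordering of the values is unchanged.
print(round(statistics.mean(z), 10), round(statistics.stdev(z), 10))
```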
The 68-95-99.7 Rule In a Normal model: about 68% of the data fall within one standard deviation of the mean; about 95% of the data fall within two standard deviations of the mean; about 99.7% of the data fall within three standard deviations of the mean.
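The three percentages in the rule are rounded areas under the standard Normal curve. A minimal check using Python's standard-library `statistics.NormalDist`; the helper name `within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Fraction of a Normal model within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * within(k):.2f}%")
```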
The First Three Rules for Working with Normal Models Make a picture.
Working with the 68-95-99.7 Rule Step by Step SAT scores are designed to have an overall mean of 500 and standard deviation of 100. Where do you stand among other students if you earned a 600? (Use the 68-95-99.7 Rule.) Make a picture. Model the score with N(500,100).
Continued (page 110 and 111) A score of 600 is one standard deviation above the mean. About 32% (100% - 68%) of those who took the test were more than one standard deviation away from the mean, but only half of those were on the high side. So about 16% of the test scores were better than 600.
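The 16% figure is the rule-of-thumb answer. As a rough cross-check, the sketch below compares it with the area computed from the N(500,100) model using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

# Rule-of-thumb estimate: half of the 32% outside one standard deviation
rule_estimate = (1 - 0.68) / 2

# Area above 600 under the N(500, 100) model
exact = 1 - sat.cdf(600)

print(f"rule estimate: {rule_estimate:.2%}, model: {exact:.2%}")
```

The two answers agree to the nearest percent, which is all the rule promises.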
Finding Normal Percentiles by Hand The Normal percentile corresponding to a z-score gives the percentage of values in a standard Normal distribution found at that z-score or below. Table of Normal percentiles: used when a value doesn't fall exactly 1, 2, or 3 standard deviations from the mean. Convert data into z-scores before using the table. Look down the left column of the table for the first two digits of z. Look across the top row for the third digit. The entry where that row and column meet is your percentile.
Normal Percentiles Using Technology normalcdf: finds the area between two z-scores. 2nd DISTR, normalcdf(zLeft, zRight). Example: find the area between z = -5 and z = 10: 2nd DISTR, normalcdf(-5, 10). When you want infinity as your cut point, use -99 or 99. Ex. What percentage of a Normal model lies more than 1.8 standard deviations above the mean? 2nd DISTR, normalcdf(1.8, 99) = .0359
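For readers without the calculator, the same areas can be sketched in Python. The function below is a hypothetical stand-in that mimics the calculator's `normalcdf(zLeft, zRight)` for the standard Normal model; it is built on `statistics.NormalDist` and is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(z_left, z_right):
    """Area under the standard Normal curve between two z-scores."""
    standard_normal = NormalDist()          # mean 0, standard deviation 1
    return standard_normal.cdf(z_right) - standard_normal.cdf(z_left)

# Area between z = -5 and z = 10 (nearly the whole curve)
print(normalcdf(-5, 10))

# Area more than 1.8 standard deviations above the mean,
# using 99 in place of infinity as on the calculator
print(round(normalcdf(1.8, 99), 4))
```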
From Percentiles to Scores: z in Reverse What z-score represents the first quartile in a Normal model (the 25th percentile)? Go to 2nd DISTR, invNorm. Specify the desired percentile: invNorm(.25) and ENTER. The cut point for the lowest 25% is z = -0.674. What z-score cuts off the highest 10% of a Normal model? Since we want the cut point for the highest 10%, we know that 90% must be below the z-score: invNorm(.90) = 1.28. So 10% of the area in a Normal model is more than 1.28 standard deviations above the mean.
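The reverse lookup can be sketched the same way. The function `inv_norm` below is a hypothetical stand-in for the calculator's `invNorm`, built on `NormalDist.inv_cdf`; like the calculator function, it takes the area below the cut point.

```python
from statistics import NormalDist

def inv_norm(area_below):
    """z-score with the given fraction of the standard Normal model below it."""
    return NormalDist().inv_cdf(area_below)

first_quartile = inv_norm(0.25)   # cut point for the lowest 25%
top_ten = inv_norm(0.90)          # cut point for the highest 10% (90% below)

print(round(first_quartile, 3), round(top_ten, 2))
```

For "highest p%" questions, pass 1 - p as the area below, as the slide does for the highest 10%.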
Are You Normal? How Can You Tell? Draw a histogram: if the histogram is unimodal and symmetric, the Normal model is appropriate to use. This is usually the easiest way to tell whether the distribution is Normal. Normal probability plot: a display to assess whether a distribution of data is Normal. The Normal model is appropriate if the plot is nearly straight; deviations from a straight line indicate that the distribution is not Normal.
What Can Go Wrong? Don't use Normal models when the distribution is not unimodal and symmetric. Don't use the mean and standard deviation when outliers are present. Both the mean and the standard deviation can be distorted by outliers.
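The idea behind a Normal probability plot can be sketched without any plotting library: pair the sorted data with the z-scores a Normal model would predict for those positions, and measure how straight the pairing is. Everything here is an assumption for illustration: the helper names, the two data sets, and the (i - 0.5)/n plotting positions, which are one common choice among several. Statistics packages may use slightly different positions.

```python
from statistics import NormalDist, mean

def normal_scores(n):
    """Theoretical z-scores for n ordered values, using (i - 0.5)/n positions."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def straightness(data):
    """Correlation between the sorted data and their Normal scores.

    A value near 1 means the Normal probability plot is nearly straight.
    """
    xs = normal_scores(len(data))
    ys = sorted(data)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: one set shaped exactly like N(500, 100), one strongly right-skewed
symmetric = [500 + 100 * z for z in normal_scores(25)]
skewed = [2 ** k for k in range(25)]

print(round(straightness(symmetric), 3), round(straightness(skewed), 3))
```

The correlation is only a numeric summary; looking at the plot itself is still the recommended check.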
Let's Try One! Page 102, #19 a-c What percent of a standard Normal model is found in each region? a) z > 1.5: normalcdf(1.5, 99) = 6.68% b) z < 2.25: normalcdf(-99, 2.25) = 98.78% c) -1
Let's Try Another! Page 102, #21 a-c In a Normal model, what values of z cut off the region described? a) the highest 20%: invNorm(.8) = .842 b) the highest 75%: invNorm(.25) = -.674 c) the lowest 3%: invNorm(.03) = -1.881