Measures of Dispersion

Presentation on theme: "Measures of Dispersion"— Presentation transcript:

1 Measures of Dispersion
Chapter 3 Section 2 Measures of Dispersion

2 Chapter 3 – Section 2 Comparing two sets of data
The measures of central tendency (mean, median, mode) compare the “average” or “typical” values of two sets of data.
The measures of dispersion in this section compare how far “spread out” the data values are.

3 Chapter 3 – Section 2
The range of a variable is the largest data value minus the smallest data value.
Compute the range of 6, 1, 2, 6, 11, 7, 3, 3:
The largest value is 11.
The smallest value is 1.
Subtracting the two, 11 – 1 = 10, so the range is 10.

4 Chapter 3 – Section 2
The range uses only two values in the data set: the largest value and the smallest value.
The range is not resistant: a single extreme value can change it drastically.
If we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2, the range, which should be 6 – 1 = 5, is now 6000 – 1 = 5999.
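The range calculation and its sensitivity to a single recording error can be sketched in a few lines of Python. The helper name `data_range` is an illustrative choice and does not come from the presentation.

```python
def data_range(values):
    """Range = largest data value minus smallest data value."""
    return max(values) - min(values)


data = [6, 1, 2, 6, 11, 7, 3, 3]
print(data_range(data))          # 11 - 1 = 10

# One mistyped value is enough to change the range drastically,
# which is what "not resistant" means.
print(data_range([6, 1, 2]))     # 6 - 1 = 5
print(data_range([6000, 1, 2]))  # 6000 - 1 = 5999
```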

5 Chapter 3 – Section 2
The variance is based on the deviations from the mean:
$(x_i - \mu)$ for populations
$(x_i - \bar{x})$ for samples
So that positive and negative differences are treated alike and do not cancel, we square the deviations:
$(x_i - \mu)^2$ for populations
$(x_i - \bar{x})^2$ for samples

6 Chapter 3 – Section 2
The population variance of a variable is the sum of these squared deviations divided by the number in the population, $N$.
The population variance is represented by $\sigma^2$:
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Note: For accuracy, use as many decimal places as allowed by your calculator.

7 Chapter 3 – Section 2 Compute the population variance of 6, 1, 2, 11
Compute the population mean first: $\mu = (6 + 1 + 2 + 11) / 4 = 5$
Now compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$
Average the squared deviations: $(16 + 9 + 1 + 36) / 4 = 15.5$
The population variance $\sigma^2$ is 15.5.
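As a cross-check of this hand calculation, here is a minimal Python sketch that follows the same three steps and then compares the result with `statistics.pvariance` from the standard library, which divides by N. The variable names are illustrative.

```python
import statistics

population = [6, 1, 2, 11]
N = len(population)

# Step 1: population mean
mu = sum(population) / N                                   # 5.0

# Step 2: squared deviations, in the order the data are listed
squared_deviations = [(x - mu) ** 2 for x in population]   # [1.0, 16.0, 9.0, 36.0]

# Step 3: average the squared deviations (divide by N)
sigma2 = sum(squared_deviations) / N                       # 15.5

# The standard-library population variance should agree.
assert sigma2 == statistics.pvariance(population)
print(mu, sigma2)
```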

8 Chapter 3 – Section 2
The sample variance of a variable is the sum of these squared deviations divided by one less than the number in the sample, $n - 1$.
The sample variance is represented by $s^2$:
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
We say that this statistic has $n - 1$ degrees of freedom.

9 Chapter 3 – Section 2 Compute the sample variance of 6, 1, 2, 11
Compute the sample mean first: $\bar{x} = (6 + 1 + 2 + 11) / 4 = 5$
Now compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$
Divide the sum of the squared deviations by $n - 1 = 3$: $(16 + 9 + 1 + 36) / 3 \approx 20.7$
The sample variance $s^2$ is 20.7 (rounded to one decimal place).
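The same data treated as a sample can be checked with a short Python sketch. The only change from the population case is the denominator, n – 1 instead of N. `statistics.variance` is the standard-library function that uses n – 1; the other names are illustrative.

```python
import statistics

sample = [6, 1, 2, 11]
n = len(sample)

x_bar = sum(sample) / n                                   # 5.0
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)    # 62.0

# Divide by n - 1 (the degrees of freedom), not by n.
s2 = sum_of_squares / (n - 1)                             # 62 / 3 = 20.666...

print(round(s2, 1))                                       # 20.7
print(round(statistics.variance(sample), 1))              # 20.7
```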

10 Chapter 3 – Section 2
Why are the population variance (15.5) and the sample variance (20.7) different for the same set of numbers?
In the first case, { 6, 1, 2, 11 } was the entire population (divide by $N$).
In the second case, { 6, 1, 2, 11 } was just a sample from the population (divide by $n - 1$).
These are two different situations.

11 Chapter 3 – Section 2 Why do we use different formulas?
The reason is that using the sample mean is not quite as accurate as using the population mean: the data values tend to lie closer to their own sample mean than to the population mean.
If we used “n” in the denominator for the sample variance calculation, we would get a “biased” result.
Bias here means that we would tend to underestimate the true variance.
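One way to see the bias concretely is to list every possible sample and average the results. The sketch below is an illustration under stated assumptions, not part of the presentation: it treats { 6, 1, 2, 11 } as the whole population and enumerates all 16 equally likely samples of size 2 drawn with replacement. The helper name `spread` is hypothetical.

```python
from itertools import product

population = [6, 1, 2, 11]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N   # true variance: 15.5


def spread(sample, denominator):
    """Sum of squared deviations from the sample mean, over the given denominator."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample) / denominator


n = 2
samples = list(product(population, repeat=n))   # 16 samples, drawn with replacement

# Average each version of the "sample variance" over all possible samples.
avg_div_n = sum(spread(s, n) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(spread(s, n - 1) for s in samples) / len(samples)

print(sigma2)              # 15.5
print(avg_div_n)           # 7.75, underestimates the true variance
print(avg_div_n_minus_1)   # 15.5, matches the true variance on average
```

Dividing by n gives an average that is too small, while dividing by n – 1 recovers the true variance on average in this setup.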

12 Chapter 3 – Section 2
The standard deviation is the square root of the variance.
The population standard deviation is the square root of the population variance ($\sigma^2$) and is represented by $\sigma$.
The sample standard deviation is the square root of the sample variance ($s^2$) and is represented by $s$.

13 Chapter 3 – Section 2
If the population is { 6, 1, 2, 11 }:
The population variance $\sigma^2 = 15.5$
The population standard deviation $\sigma = \sqrt{15.5} \approx 3.94$
If the sample is { 6, 1, 2, 11 }:
The sample variance $s^2 = 20.7$
The sample standard deviation $s = \sqrt{20.7} \approx 4.55$
The population standard deviation and the sample standard deviation apply in different situations.
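Both standard deviations for { 6, 1, 2, 11 } can be computed with Python's standard library. This is a minimal sketch; `statistics.pstdev` takes the square root of the population variance and `statistics.stdev` takes the square root of the sample variance.

```python
import math
import statistics

data = [6, 1, 2, 11]

sigma = statistics.pstdev(data)   # population: sqrt(62 / 4) = sqrt(15.5)
s = statistics.stdev(data)        # sample:     sqrt(62 / 3)

print(round(sigma, 2))            # 3.94
print(round(s, 2))                # 4.55

# Each standard deviation is the square root of the matching variance.
assert math.isclose(sigma, math.sqrt(statistics.pvariance(data)))
assert math.isclose(s, math.sqrt(statistics.variance(data)))
```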

14 Chapter 3 – Section 2 The standard deviation is very useful for estimating probabilities

15 Chapter 3 – Section 2 The empirical rule
If the distribution is roughly bell shaped, then
Approximately 68% of the data will lie within 1 standard deviation of the mean
Approximately 95% of the data will lie within 2 standard deviations of the mean
Approximately 99.7% of the data (i.e. almost all) will lie within 3 standard deviations of the mean
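The three percentages match what an exactly normal (bell-shaped) distribution gives. As a rough check, and assuming Python 3.8 or later for `statistics.NormalDist`, the share of a standard normal distribution within k standard deviations of the mean can be computed directly. The dictionary name `shares` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {100 * share:.1f}%")
```

Real data that is only roughly bell shaped will match these figures only approximately, which is why the rule is stated with "approximately".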

16 Chapter 3 – Section 2
For a variable with mean 17 and standard deviation 3.4:
Approximately 68% of the values will lie between $(17 - 3.4)$ and $(17 + 3.4)$, i.e. 13.6 and 20.4
Approximately 95% of the values will lie between $(17 - 2 \times 3.4)$ and $(17 + 2 \times 3.4)$, i.e. 10.2 and 23.8
Approximately 99.7% of the values will lie between $(17 - 3 \times 3.4)$ and $(17 + 3 \times 3.4)$, i.e. 6.8 and 27.2
A value of 2.1 and a value of 33.2 would both be very unusual, since both fall outside the 3 standard deviation interval.
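The interval arithmetic for mean 17 and standard deviation 3.4 can be sketched as follows. The helper names `interval` and `is_unusual` are illustrative, and "unusual" is taken here to mean "outside 3 standard deviations of the mean", which is an assumption made for this example.

```python
def interval(mean, sd, k):
    """Endpoints of the interval within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd


def is_unusual(value, mean, sd, k=3):
    """True if the value lies outside k standard deviations of the mean."""
    low, high = interval(mean, sd, k)
    return value < low or value > high


mean, sd = 17, 3.4
for k, percent in ((1, 68), (2, 95), (3, 99.7)):
    low, high = interval(mean, sd, k)
    print(f"about {percent}% of values between {low:.1f} and {high:.1f}")

print(is_unusual(2.1, mean, sd))    # True
print(is_unusual(33.2, mean, sd))   # True
print(is_unusual(20.0, mean, sd))   # False
```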

17 Chapter 3 – Section 2
Chebyshev’s inequality gives a lower bound on the percentage of observations that lie within k standard deviations of the mean (where k > 1).
This lower bound is a conservative estimate: the actual percentage for any variable cannot be lower than this number.
Therefore the actual percentage must be this value or higher.

18 Chapter 3 – Section 2 Chebyshev’s inequality
For any data set, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\%$ of the observations will lie within k standard deviations of the mean, where k is any number greater than 1.

19 Chapter 3 – Section 2
How much of the data lies within 1.5 standard deviations of the mean?
From Chebyshev’s inequality with $k = 1.5$: $\left(1 - \frac{1}{1.5^2}\right) \cdot 100\% \approx 55.6\%$, so that at least 55.6% of the data will lie within 1.5 standard deviations of the mean.

20 Chapter 3 – Section 2
If the mean is equal to 20 and the standard deviation is equal to 4, how much of the data lies between 14 and 26?
14 and 26 are each 1.5 standard deviations from 20, since $(26 - 20) / 4 = 1.5$, so at least 55.6% of the data will lie between 14 and 26.
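Both Chebyshev examples can be reproduced with a small Python sketch. The function name `chebyshev_lower_bound` is an illustrative choice. It returns the bound as a fraction, 1 – 1/k², and rejects k ≤ 1, where the inequality says nothing useful.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


# Within 1.5 standard deviations of the mean
print(round(100 * chebyshev_lower_bound(1.5), 1))   # 55.6

# Mean 20, standard deviation 4: how much of the data lies between 14 and 26?
mean, sd = 20, 4
low, high = 14, 26
k = min(mean - low, high - mean) / sd               # 6 / 4 = 1.5
print(round(100 * chebyshev_lower_bound(k), 1))     # 55.6
```

Unlike the empirical rule, this bound makes no assumption about the shape of the distribution, which is why it is much lower than 68% to 95%.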

21 Summary: Chapter 3 – Section 2
Range
  The maximum minus the minimum
  Not a resistant measurement
Variance and standard deviation
  Measure deviations from the mean
Empirical rule
  About 68% of the data is within 1 standard deviation
  About 95% of the data is within 2 standard deviations

