Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 5: Measures of Dispersion. Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents.

Similar presentations


Presentation on theme: "Chapter 5: Measures of Dispersion. Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents."— Presentation transcript:

1 Chapter 5: Measures of Dispersion

2 Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents differ from each other. (spread is another interchangeable term) Dispersion helps us to measure the extent to which the values or responses deviate from each other. The tools used to measure the amount of variability or spread in the categories or values of a variable are called the measures of dispersion or measures of variability. Example: Which one of these data sets has more variability? A: 22, 24, 25, 28, 30B: 1, 5, 12, 20, 32 Data set B is clearly more spread out.

3 Measures of Dispersion for Numerical Variables Dispersion in the case of a numerical variable is the degree to which the numbers in a distribution vary from the typical, or average value. Three typical methods to measure dispersion for numerical variables are: – Range – Interquartile range – Standard deviation

4 Range The range is the difference between the highest value and lowest value in a data set. It is the simplest way to obtain information about dispersion. Daily temperature is an example of range as it is usually given in the form of the highest and lowest values.

5 Example: Compute the range for the following set of data: 12, 3, 17, 21, 11, 18, 20, 19, 6, 15 Range: 21 – 3 = 18 There are a few disadvantages to using range: 1)The range does not give any information about the distribution of the observations within its two endpoint values. 2)The range can be skewed by outliers, just as the mean can be. For example, add the value 200 to the above data set. Now the range is 197. That is not representative of the data.

6 Interquartile Range Interquartile range is the range of the middle 50% of a distribution after the bottom 25% and top 25% have been eliminated. Interquartile range is more resistant to outliers than the range. In order to calculate the IQR, we first have to find the quartiles of the distribution, which are the values that divide the distribution into four groups of roughly the same size.

7 Steps to calculate quartiles 1) Put the data in numerical order and find the median (this also happens to be the second quartile) 2) Find the median of the values whose position in the ordered list is to the left of the median. This value is the first quartile. 3) Find the median of the values whose position in the ordered list is to the right of the median. This value is the third quartile. Note: When the number of observations is odd, don’t include the median value in the calculations in steps 2 and 3.

8 If the position of a quartile is between two numbers, average those numbers as you would to find a median. Subtract the first quartile from the third quartile to get the IQR. When dealing with frequency distributions, there are useful formulas to help us figure out the position of the quartiles. If the position is a whole number, then the quartile lies on that exact one observation. If the position is a decimal, then the quartile is the average.

9 Example: Find the interquartile range. 5, 6, 5, 10, 11, 16, 15, 13 Order the data:5, 5, 6, 10, 11, 13, 15, 16 The median is 10.5 Q1 is 5.5. Q3 is 14 IQR=14-5.5=8.5

10 Example: Find the IQR for the following data.

11 Standard Deviation Standard deviation measures the variability in a distribution using the squared deviations from the mean. A deviation is the difference between the score and the mean of the data set. Deviations can be positive, negative, or 0.

12 Example: Find the deviations and the sum of the deviations for the following data set: 25, 26, 30. Since the sum of the deviations from the mean is always zero, their average will always be zero. We need a different way to measure the average deviations from the mean. To counteract that problem, we will square each deviation and then find the average of the squared deviations.

13 The Variance The variance is the average of the squared deviations from the mean. Standard deviation is the square root of the variance. The variance (and as a result standard deviation) is calculated differently for the population and the sample.

14 Notation

15

16 Steps to find the variance. 1)Find the sum of your data set. 2)Find the mean. 3)Calculate the deviations. 4)Square the deviations. 5)Sum the squared deviations. 6)Divide the sum of the squared deviations by n if working with the population or n-1 if working with the sample. To find the standard deviation, take the square root of step 6.

17 Example: Find the variance and the standard deviation for the sample data: 25, 26, 30.

18 You try. Find the variance and standard deviation for the population data: 2, 4, 8, 14.

19 Short-cut Formula for Variance and SD There are two quantities in the short-cut formula that we need to clarify.

20 Note: For the population variance and standard deviation, divide by n stead of n-1

21 Let’s retry the previous example of the sample of 25, 26, and 30. We know the variance should come out to be 7 and the sd 2.65.

22 Now try the example with 2, 4, 8, and 14. Remember we are dealing with a population!


Download ppt "Chapter 5: Measures of Dispersion. Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents."

Similar presentations


Ads by Google