12.3 – Measures of Dispersion

2 12.3 – Measures of Dispersion
Dispersion is another analytical method to study data. A main use of dispersion is to compare the amounts of spread in two (or more) data sets. A common technique in inferential statistics is to draw comparisons between populations by analyzing samples that come from those populations. Two of the most common measures of dispersion are the range and the standard deviation.
Range: For any set of data, the range of the set is given by the following formula: Range = (greatest value in set) – (least value in set).

3 12.3 – Measures of Dispersion
Range Example: The two sets below have the same mean and median (7). Find the range of each set.
Set A: 1, 2, 7, 12, 13
Set B: 5, 6, 8, 9
Range of Set A: 13 – 1 = 12
Range of Set B: 9 – 5 = 4

4 12.3 – Measures of Dispersion
Standard Deviation: One of the most useful measures of dispersion is the standard deviation. It is based on deviations from the mean of the data. Example: Find the deviations from the mean for all data values of the sample 1, 2, 8, 11, 13. The mean is 7. To find each deviation, subtract the mean from each data value.
Data value: 1, 2, 8, 11, 13
Deviation: –6, –5, 1, 4, 6
The sum of the deviations is always equal to zero.

5 12.3 – Measures of Dispersion
Standard Deviation: Calculating the Sample Standard Deviation. The sample standard deviation is the square root of the sample variance. The variance is found by summing the squares of the deviations and dividing that sum by n – 1 (since it is a sample instead of a population). The sample standard deviation is denoted by the letter s. The standard deviation of a population is denoted by σ.

6 12.3 – Measures of Dispersion
Standard Deviation: Calculating the Sample Standard Deviation
1. Calculate the mean of the numbers.
2. Find the deviations from the mean.
3. Square each deviation.
4. Sum the squared deviations.
5. Divide the sum in Step 4 by n – 1.
6. Take the square root of the quotient in Step 5.

7 12.3 – Measures of Dispersion
Standard Deviation: Calculating the Sample Standard Deviation. Example: Find the standard deviation of the sample set {1, 2, 8, 11, 13}. Mean = 7.
Data value: 1, 2, 8, 11, 13
Deviation: –6, –5, 1, 4, 6
(Deviation)^2: 36, 25, 1, 16, 36
Sum of the (Deviations)^2 = 36 + 25 + 1 + 16 + 36 = 114

8 12.3 – Measures of Dispersion
Standard Deviation: Calculating the Sample Standard Deviation. Sum of the (Deviations)^2 = 36 + 25 + 1 + 16 + 36 = 114. Divide 114 by n – 1 with n = 5: 114 / (5 – 1) = 28.5. Take the square root of 28.5: √28.5 ≈ 5.34. The sample standard deviation of the data is approximately 5.34.
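The six steps for the sample standard deviation can be sketched in Python as follows. This is a minimal illustration rather than part of the original slides; the function name `sample_std` is a hypothetical helper, and the standard library's `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation, following the six steps in order."""
    n = len(values)
    mean = sum(values) / n                     # Step 1: mean
    deviations = [x - mean for x in values]    # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]     # Step 3: square each deviation
    total = sum(squared)                       # Step 4: sum the squares
    variance = total / (n - 1)                 # Step 5: divide by n - 1
    return math.sqrt(variance)                 # Step 6: square root


data = [1, 2, 8, 11, 13]
s = sample_std(data)
print(round(s, 2))                   # 5.34
print(round(statistics.stdev(data), 2))  # same value from the standard library
```

Dividing by n - 1 rather than n is what distinguishes the sample formula from the population formula (`statistics.pstdev`).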

9 Example: Interpreting Measures
12.3 – Measures of Dispersion Standard Deviation Example: Interpreting Measures Two companies, A and B, sell small packs of sugar for coffee. The mean and standard deviation for samples from each company are given below. Which company consistently provides more sugar in their packs? Which company fills its packs more consistently? Company A Company B

10 Example: Interpreting Measures
12.3 – Measures of Dispersion Standard Deviation Example: Interpreting Measures Company A Company B Which company consistently provides more sugar in their packs? The sample mean for Company A is greater than the sample mean of Company B. The inference can be made that Company A provides more sugar in their packs.

11 Example: Interpreting Measures
12.3 – Measures of Dispersion Standard Deviation Example: Interpreting Measures Company A Company B Which company fills its packs more consistently? The standard deviation for Company B is less than the standard deviation for Company A. The inference can be made that Company B fills its packs closer to its mean than Company A does.

12 12.3 – Measures of Dispersion
Chebyshev’s Theorem: For any set of numbers, regardless of how they are distributed, the fraction of them that lie within k standard deviations of their mean (where k > 1) is at least 1 – 1/k^2.
Example: What is the minimum percentage of the items in a data set that lie within 2 and within 3 standard deviations of the mean?
k = 2: 1 – 1/2^2 = 3/4 = 75%
k = 3: 1 – 1/3^2 = 8/9 ≈ 88.9%
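As a quick check of the two percentages, the bound 1 – 1/k^2 can be evaluated directly. The function name `chebyshev_lower_bound` below is a hypothetical helper, not something from the slides.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    # Format the fraction as a percentage with one decimal place.
    print(k, f"{chebyshev_lower_bound(k):.1%}")
```

The bound holds for any distribution, so it is usually much lower than the percentages the Empirical Rule gives for normal data.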

13 Coefficient of Variation
12.3 – Measures of Dispersion Coefficient of Variation The coefficient of variation expresses the standard deviation as a percentage of the mean. It is not strictly a measure of dispersion, as it combines central tendency and dispersion. For any set of data, the coefficient of variation is given by V = (s / x̄) · 100% for a sample or V = (σ / μ) · 100% for a population.

14 Example: Comparing Samples
12.3 – Measures of Dispersion Coefficient of Variation Example: Comparing Samples Compare the dispersions in the two samples A and B.
A: 12, 13, 16, 18, 18, 20
B: 125, 131, 144, 158, 168, 193
Sample A: mean ≈ 16.17, s ≈ 3.13, V ≈ 19.3%
Sample B: mean ≈ 153.17, s ≈ 25.29, V ≈ 16.5%
Sample B has a larger dispersion (standard deviation) than sample A, but sample A has the larger relative dispersion (coefficient of variation).
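The comparison of samples A and B can be reproduced with the standard library. This is a sketch under the assumption that both data sets are treated as samples (so `statistics.stdev`, which divides by n – 1, applies); `coefficient_of_variation` is a hypothetical helper name.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


sample_a = [12, 13, 16, 18, 18, 20]
sample_b = [125, 131, 144, 158, 168, 193]

for name, sample in (("A", sample_a), ("B", sample_b)):
    s = statistics.stdev(sample)
    v = coefficient_of_variation(sample)
    print(f"Sample {name}: s = {s:.2f}, V = {v:.1f}%")
```

Because the coefficient of variation is unitless, it allows the spread of data sets with very different magnitudes to be compared on the same footing.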

15 12.4 – Measures of Position In some cases, the analysis of certain individual items in the data set is of more interest than the entire set. It is necessary at times to measure how an item fits into the data, how it compares to other items of the data, or even how it compares to an item in another data set. Measures of position are several common ways of making such comparisons.

16 12.4 – Measures of Position The z-Score
The z-score measures how many standard deviations a single data item is from the mean. For a data item x in a sample with mean x̄ and standard deviation s, z = (x – x̄) / s.

17 Example: Comparing with z-Scores
12.4 – Measures of Position Example: Comparing with z-Scores Two students, who take different history classes, had exams on the same day. Jen’s score was 83 while Joy’s score was 78. Which student did relatively better, given the class data shown below?
                            Jen   Joy
Class mean                   78    70
Class standard deviation      4     5

18 Example: Comparing with z-Scores
12.4 – Measures of Position Example: Comparing with z-Scores
                            Jen   Joy
Score                        83    78
Class mean                   78    70
Class standard deviation      4     5
Jen’s z-score: (83 – 78) / 4 = 1.25
Joy’s z-score: (78 – 70) / 5 = 1.6
Joy’s z-score is higher, so she was positioned relatively higher within her class than Jen was within hers.
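The z-score comparison for Jen and Joy amounts to two small calculations. A minimal sketch, where `z_score` is a hypothetical helper name:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


jen = z_score(83, mean=78, std_dev=4)
joy = z_score(78, mean=70, std_dev=5)

print(jen, joy)  # 1.25 1.6
print("Joy" if joy > jen else "Jen", "did relatively better")
```

Although Jen's raw score is higher, Joy's score sits further above her own class mean when measured in standard deviations.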

19 12.4 – Measures of Position Percentiles
A percentile measures the position of a single data item based on the percentage of data items below that item. Standardized tests taken by large numbers of students convert raw scores to percentile scores. If approximately n percent of the items in a distribution are less than the number x, then x is the nth percentile of the distribution, denoted Pn.

20 12.4 – Measures of Position Percentiles Example:
The following are test scores (out of 100) for a particular math class. Find the fortieth percentile. 40% = 0.4, and 0.4(30) = 12. The average of the 12th and 13th items represents the 40th percentile (P40). 40% of the scores were below 74.5.

21 Other Percentiles: Deciles and Quartiles
12.4 – Measures of Position Other Percentiles: Deciles and Quartiles Deciles are the nine values (denoted D1, D2,…, D9) along the scale that divide a data set into ten (approximately) equal parts. 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, and 90% Quartiles are the three values (Q1, Q2, Q3) that divide the data set into four (approximately) equal parts. 25%, 50%, and 75%

22 12.4 – Measures of Position Other Percentiles: Deciles and Quartiles
Example: Deciles The following are test scores (out of 100) for a particular math class. Find the sixth decile. Sixth decile = 60% = 0.6, and 0.6(30) = 18. The average of the 18th and 19th items represents the 6th decile (D6). 60% of the scores were at or below 82.

23 Other Percentiles: Deciles and Quartiles
12.4 – Measures of Position Other Percentiles: Deciles and Quartiles Quartiles For any set of data (ranked in order from least to greatest): The second quartile, Q2 (50%) is the median. The first quartile, Q1 (25%) is the median of items below Q2. The third quartile, Q3 (75%) is the median of items above Q2.

24 12.4 – Measures of Position Other Percentiles: Deciles and Quartiles
Example: Quartiles The following are test scores (out of 100) for a particular math class. Find the three quartiles. Q1 = 25% = 0.25, and 0.25(30) = 7.5, which rounds up to 8. The 8th item represents the 1st quartile (Q1). 25% of the scores were below 72.

25 12.4 – Measures of Position Other Percentiles: Deciles and Quartiles
Example: Quartiles The following are test scores (out of 100) for a particular math class. Find the three quartiles. Q2 = 50% = 0.5 = median, and 0.5(30) = 15. The average of the 15th and 16th items represents the 2nd quartile (Q2), or the median. 50% of the scores were below 78.5.

26 12.4 – Measures of Position Other Percentiles: Deciles and Quartiles
Example: Quartiles The following are test scores (out of 100) for a particular math class. Find the three quartiles. Q3 = 75% = 0.75, and 0.75(30) = 22.5, which rounds up to 23. The 23rd item represents the 3rd quartile (Q3). 75% of the scores were below 88.
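The position rule used in the percentile, decile, and quartile examples (multiply the percentage by the number of items; if the result is a whole number k, average the kth and (k+1)th items, otherwise round up and take that item) can be sketched as below. The class's 30 test scores are not reproduced in this transcript, so the data here are placeholders: the values 1 through 30 stand in for the ranked scores, which makes each result show which ranked item or items were selected. The function name `percentile_by_position` is hypothetical.

```python
import math


def percentile_by_position(sorted_values, percent):
    """Percentile of ranked data using the position rule from the examples.

    `percent` is a whole-number percentage strictly between 0 and 100.
    """
    if not 0 < percent < 100:
        raise ValueError("percent must be strictly between 0 and 100")
    n = len(sorted_values)
    position = percent * n / 100
    if position == int(position):
        k = int(position)
        # Whole-number position: average the kth and (k+1)th items (1-based).
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    # Fractional position: round up and take that item (1-based).
    k = math.ceil(position)
    return sorted_values[k - 1]


# Placeholder data (an assumption): ranks 1..30 in place of the real scores.
ranks = list(range(1, 31))

print(percentile_by_position(ranks, 40))  # 12.5 -> average of 12th and 13th items
print(percentile_by_position(ranks, 60))  # 18.5 -> average of 18th and 19th items
print(percentile_by_position(ranks, 25))  # 8    -> the 8th item
print(percentile_by_position(ranks, 75))  # 23   -> the 23rd item
```

Passing the percentage as a whole number avoids floating-point surprises such as 0.4 * 30 not comparing equal to 12. Other textbooks and libraries use different percentile conventions, so results on real data may differ slightly from this rule.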

27 Box Plots
12.4 – Measures of Position Box Plots A box plot, or box-and-whisker plot, is a visual display of five statistical measures: the lowest value, the first quartile, the median, the third quartile, and the largest value.

28 12.4 – Measures of Position Box Plots Example:
The following are test scores (out of 100) for a particular math class. Lowest = 44, Q1 (25%) = 72, Q2 (50%) = median = 78.5, Q3 (75%) = 88, Largest = 100.

29 Discrete and Continuous Random Variables
12.5 – The Normal Distribution Discrete and Continuous Random Variables
Discrete random variable: A random variable that can take on only certain fixed values. Examples: The number of even values of a single die. The number of heads in three tosses of a fair coin.
Continuous random variable: A random variable that can take on any value within an interval. Examples: The diameter of a growing tree. The height of third graders.

30 Definition and Properties of a Normal Curve
12.5 – The Normal Distribution Definition and Properties of a Normal Curve A normal curve is a symmetric, bell-shaped curve. Any continuous random variable whose graph has this characteristic shape is said to have a normal distribution. On a normal curve, the horizontal axis is labeled with the mean and with the data values that lie whole numbers of standard deviations from the mean. If the horizontal axis is instead labeled with the number of standard deviations from the mean, rather than the specific data values, then the curve is the standard normal curve.

31 12.5 – The Normal Distribution
[Figure: a normal curve for sample statistics, with the horizontal axis labeled in data values, beside the standard normal curve, with the horizontal axis labeled –2, –1, 0, 1, 2.]

32 12.5 – The Normal Distribution
Normal Curves [Figure: four normal curves labeled S, A, B, and C.]
S is standard, with mean = 0, standard deviation = 1.
A has mean < 0, standard deviation = 1.
B has mean = 0, standard deviation < 1.
C has mean > 0, standard deviation > 1.

33 Properties of Normal Curves
12.5 – The Normal Distribution Properties of Normal Curves The graph of a normal curve is bell-shaped and symmetric about a vertical line through its center. The mean, median, and mode of a normal curve are all equal and occur at the center of the distribution. Empirical Rule: the approximate percentage of all data lying within 1, 2, and 3 standard deviations of the mean.
within 1 standard deviation: 68%
within 2 standard deviations: 95%
within 3 standard deviations: 99.7%

34 12.5 – The Normal Distribution
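[Figure: Empirical Rule. A normal curve marking 68%, 95%, and 99.7% of the area within 1, 2, and 3 standard deviations of the mean.]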

35 Example: Applying the Empirical Rule
12.5 – The Normal Distribution Example: Applying the Empirical Rule A sociology class of 280 students takes an exam. The distribution of their scores can be treated as normal. Find the number of scores falling within 2 standard deviations of the mean. A total of 95% of all scores lie within 2 standard deviations of the mean. (.95)(280) = 266 scores
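The Empirical Rule percentages are rounded values of areas under the standard normal curve. As a rough check, and not part of the original slides, they can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8+); `fraction_within` is a hypothetical helper name.

```python
from statistics import NormalDist


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, f"{fraction_within(k):.2%}")

# The slide's answer uses the rounded 95% figure for 2 standard deviations.
print(round(0.95 * 280))  # 266
```

The unrounded area within 2 standard deviations is slightly above 95%, so an answer based on it would be about one score higher than the 266 obtained from the rule of thumb.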

36 12.5 – The Normal Distribution
Normal Curve Areas In a normal curve and a standard normal curve, the total area under the curve is equal to 1. The area under the curve is presented as one of the following: Percentage (of total items that lie in an interval), Probability (of a randomly chosen item lying in an interval), Area (under the normal curve along an interval).

37 A Table of Standard Normal Curve Areas
12.5 – The Normal Distribution A Table of Standard Normal Curve Areas To answer questions that involve regions other than 1, 2, or 3 standard deviations, a Table of Standard Normal Curve Areas is necessary. The table shows the area under the curve for all values in a normal distribution that lie between the mean and z standard deviations from the mean. The percentage of values within a certain range of z-scores, or the probability of a value occurring within that range are the more common uses of the table. Because of the symmetry of the normal curve, the table can be used for values above the mean or below the mean.

38 Example: Applying the Normal Curve Table
12.5 – The Normal Distribution Example: Applying the Normal Curve Table Use the table to find the percent of all scores that lie between the mean and 1.5 standard deviations above the mean. z = 1.50. Find 1.50 in the z column. The table entry is .4332. Therefore, 43.32% of all values lie between the mean and 1.5 standard deviations above the mean. Equivalently, there is a 0.4332 probability that a randomly selected value will lie between the mean and 1.5 standard deviations above the mean.

39 Example: Applying the Normal Curve Table
12.5 – The Normal Distribution Example: Applying the Normal Curve Table Use the table to find the percent of all scores that lie between the mean and 2.62 standard deviations below the mean. z = –2.62. Find 2.62 in the z column. The table entry is .4956. Therefore, 49.56% of all values lie between the mean and 2.62 standard deviations below the mean. Equivalently, there is a 0.4956 probability that a randomly selected value will lie between the mean and 2.62 standard deviations below the mean.

40 Example: Applying the Normal Curve Table
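12.5 – The Normal Distribution Example: Applying the Normal Curve Table Find the percent of all scores that lie between the given z-scores: z = –1.7 and z = 2.55. For z = –1.7, the table entry is .4554. For z = 2.55, the table entry is .4946. Because the two z-scores are on opposite sides of the mean, add the areas: 0.4554 + 0.4946 = 0.95. Therefore, 95% of all values lie between 1.7 standard deviations below the mean and 2.55 standard deviations above the mean.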

41 Example: Applying the Normal Curve Table
12.5 – The Normal Distribution Example: Applying the Normal Curve Table Find the probability that a randomly selected value will lie between the given z-scores: z = 0.61 and z = 2.63. For z = 0.61, the table entry is .2291. For z = 2.63, the table entry is .4957. Because both z-scores are on the same side of the mean, subtract the areas: 0.4957 – 0.2291 = 0.2666. There is a 0.2666 probability that a randomly selected value will lie between 0.61 and 2.63 standard deviations above the mean.

42 Example: Applying the Normal Curve Table
12.5 – The Normal Distribution Example: Applying the Normal Curve Table Find the probability that a randomly selected value will lie above the given z-score: z = 2.14. The table entry is .4838. Half of the area under the curve is 0.5000, so 0.5000 – 0.4838 = 0.0162. There is a 0.0162 probability that a randomly selected value will lie more than 2.14 standard deviations above the mean.
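The table lookups in these examples can be imitated in code. The sketch below assumes the table lists the area between the mean and z rounded to four decimal places, and uses `statistics.NormalDist` from the Python standard library in place of the printed table; `table_area` is a hypothetical helper name.

```python
from statistics import NormalDist


def table_area(z):
    """Area between the mean and z under the standard normal curve.

    Rounded to four decimals to mimic a printed table. By symmetry,
    the sign of z does not matter.
    """
    return round(NormalDist().cdf(abs(z)) - 0.5, 4)


# Between the mean and a single z-score.
print(table_area(1.5))
print(table_area(-2.62))

# z-scores on opposite sides of the mean: add the areas.
print(round(table_area(-1.7) + table_area(2.55), 4))

# z-scores on the same side of the mean: subtract the areas.
print(round(table_area(2.63) - table_area(0.61), 4))

# Area above a positive z-score: subtract from half the total area.
print(round(0.5 - table_area(2.14), 4))
```

Rounding each entry before combining matches what a table user does by hand; working from unrounded areas can change the last digit of the result.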

43 Example: Applying the Normal Curve Table
12.5 – The Normal Distribution Example: Applying the Normal Curve Table The volumes of soda in bottles from a small company are distributed normally with a mean of 12 ounces and a standard deviation of 0.15 ounces. If 1 bottle is randomly selected, what is the probability that it will have more than 12.33 ounces? z = (12.33 – 12) / 0.15 = 2.2. The table entry is .4861. Half of the area under the curve is 0.5000, so 0.5000 – 0.4861 = 0.0139. There is a 0.0139 probability that a randomly selected bottle will contain more than 12.33 ounces.
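The soda-bottle calculation can be checked without a table by modeling the volumes directly. This is a sketch using `statistics.NormalDist` from the Python standard library; the threshold of 12.33 ounces is taken from the z-score of 2.2 in the example, and the variable names are illustrative.

```python
from statistics import NormalDist

# Bottle volumes: normal with mean 12 oz and standard deviation 0.15 oz.
volume = NormalDist(mu=12, sigma=0.15)

threshold = 12.33
z = (threshold - volume.mean) / volume.stdev  # standardize the threshold
p_more = 1 - volume.cdf(threshold)            # area to the right of the threshold

print(round(z, 2))       # 2.2
print(round(p_more, 4))  # 0.0139
```

Working with the distribution of volumes directly gives the same probability as converting to a z-score and using the standard normal table.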

44 Example: Finding z-scores for Given Areas
12.5 – The Normal Distribution Example: Finding z-scores for Given Areas Assuming a normal distribution, find the z-score meeting the condition that 39% of the area is to the right of z. 50% of the area lies to the right of the mean. The areas from the Normal Curve Table are based on the area between the mean and the z-score, so: area between the mean and the z-score = 0.50 – 0.39 = 0.11. From the table, find the area of 0.1100 or the closest value and read the z-score. z-score = 0.28

45 Example: Finding z-scores for Given Areas
12.5 – The Normal Distribution Example: Finding z-scores for Given Areas Assuming a normal distribution, find the z-score meeting the condition that 76% of the area is to the left of z. 50% of the area lies to the left of the mean. The areas from the Normal Curve Table are based on the area between the mean and the z-score, so: area between the mean and the z-score = 0.76 – 0.50 = 0.26. From the table, find the area of 0.2600 or the closest value and read the z-score. z-score = 0.71
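Reading a z-score back from an area is the inverse of the table lookup. As a hedged illustration, `NormalDist.inv_cdf` from the Python standard library takes the total area to the left of z, so an area given to the right must first be subtracted from 1. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# 39% of the area to the right of z means 61% lies to the left.
z_right_39 = std_normal.inv_cdf(1 - 0.39)

# 76% of the area to the left of z.
z_left_76 = std_normal.inv_cdf(0.76)

print(round(z_right_39, 2))  # 0.28
print(round(z_left_76, 2))   # 0.71
```

Rounding to two decimal places matches the precision of the z column in a typical table, which is why both results agree with the values read from the table.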

