Published by Rylee Tallon. Modified about 1 year ago.

1
Describing Distributions With Numbers
Section 1.3 cont. (five-number summary, boxplots, variance, standard deviation)
Target Goal: I can calculate a five-number summary and construct a boxplot. I can describe spread using the standard deviation of a distribution.
HW: pg 71: 92, 93, 95, 96, 97, 103, 105, 107 - 110

2
Five-Number Summary: the smallest observation, first quartile, median, third quartile, and largest observation of a data set, written in order: Min Q1 M Q3 Max. It gives us a quick summary of both center and spread.
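As a rough illustration, the five-number summary can be computed in Python. This sketch assumes the convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd); calculators and software may use slightly different quartile rules. The function name five_number_summary is hypothetical, and the data are the Roger Maris home run counts from slide 19.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two observations.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # observations below the median position
    upper = xs[half + (n % 2):]   # observations above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# Roger Maris home run counts (slide 19)
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
summary = five_number_summary(maris)
print(summary)
```

For these ten values the two halves each contain five observations, so each quartile is simply the middle value of its half.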

3
Bonds: Min = 16, Q1 = 25, M = 34, Q3 = 41, Max = 73

4
Box (and whiskers) Plot: a graph of the five-number summary of a distribution; best for side-by-side comparisons, since boxplots show less detail than histograms or stemplots; drawn either horizontally or vertically.

5
Modified Boxplot: Because the regular boxplot conceals outliers, we will use the modified boxplot.
Plots outliers as isolated points.
Extends the “whiskers” out to the largest and/or smallest data points that are not outliers.
Remember: label axis, title graph, scale axis.

6
Figure: Regular (a) and modified (b) boxplots comparing Barry Bonds and Hank Aaron home runs. Labeled points: Min, Q1, M, Q3, Max, Outlier.

7
Activity: Acing the First Test. Enter the scores of Mrs. Liao’s students on their first statistics test (page 71, ex. 92) into L1. Sort the data (ascending). TI-Nspire: place the cursor on the column title, then select Menu, 1: Actions, 6: Sort, and sort by (a). TI-Nspire: see Appendix A6.

8
a. Find the five-number summary and verify your expectation from a.
Calculator activity: Enter the scores into L1 from page 71. Calculator: 1 VAR STAT(L1).
Five-number summary: 43, 82, 87.75, 93, 98
Mean ($\bar{x}$) = 2544/30 = 84.8
The median is greater than the mean.

9
b. What is the range of the middle half of the scores of the statistics students? Between Q1 and Q3: between 82 and 93.

10
Acing the First Test Cont. c. Construct by hand a modified boxplot of the stats students’ scores. First find potential outliers.
IQR = ______
Q1 - 1.5 x IQR = ______
Q3 + 1.5 x IQR = ______
Outliers: ______
Graph: Mark a small x for the outlier(s), then mark the next-lowest value (the new minimum), Q1, M, Q3, and max. Draw the box and whisker plot.
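The 1.5 x IQR check above can be sketched in Python. This example uses only the five-number summary reported on slide 8 (43, 82, 87.75, 93, 98), because the full list of scores is not reproduced in these slides; it can therefore test only the minimum and maximum, not every score. The function name outlier_fences is hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Quartiles from the five-number summary on slide 8
q1, q3 = 82, 93
lower, upper = outlier_fences(q1, q3)

minimum, maximum = 43, 98
min_is_outlier = minimum < lower   # is the lowest score below the lower cutoff?
max_is_outlier = maximum > upper   # is the highest score above the upper cutoff?

print("IQR:", q3 - q1)
print("cutoffs:", lower, upper)
print("min is outlier:", min_is_outlier, "| max is outlier:", max_is_outlier)
```

Any other low scores would need to be compared against the lower cutoff individually using the full data set.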

11
Acing the First Test Cont. d. On your calculator: First define Plot1 to be a modified boxplot using the list. Graph, trace and compare. Is there an outlier? If so, was it the same as in part a? Based on the boxplot, conjecture the shape of the corresponding histogram.
Histogram shape:______________________

12
Acing the First Test Cont. Next, define Plot2 to be a histogram using the same list. Trace and compare. Did you guess correctly? Roughly draw the histogram below.

13
Important Note: If a distribution contains outliers, use the median and the IQR to describe the distribution.

14
The most common numerical description of the spread of a distribution is the standard deviation (s): it measures spread by looking at how far the observations are from their mean. The standard deviation (s) is the square root of the variance ($s^2$).

15
Variance ($s^2$) of a set of observations is the average of the squares of the deviations of the observations from their mean, where the average is taken by dividing by n - 1 rather than n. Note: Most of the time we will use the calculator (STAT:CALC:1VAR STAT).
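Written out as a formula, with the n - 1 divisor used in the Roger Maris example on slide 20, the definition reads as follows. The notation ($x_i$ for the observations, $\bar{x}$ for their mean, n for the number of observations) is the standard one and is assumed here rather than taken from the slide.

```latex
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
```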

16
Why square the deviations? It makes them all nonnegative, so that observations far from the mean in either direction will have large positive squared deviations.

17
Properties of the Standard Deviation
The sum of the deviations of the observations from their mean will always be zero.
Choose s only when the mean is chosen as the measure of center.
s = 0 only when there is no spread (all observations have the same value).
s, like the mean, is not resistant. Strong skewness or a few outliers can make s very large.
Rule of thumb: a value more than 2 standard deviations from the mean may be considered an outlier.
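Two of these properties are easy to check numerically. The sketch below uses the Roger Maris home run counts from slide 19 for the first property; the list named constant is a hypothetical data set in which every observation has the same value.

```python
from statistics import mean, stdev

# Roger Maris home run counts (slide 19)
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]

xbar = mean(maris)
deviations = [x - xbar for x in maris]
total_deviation = sum(deviations)   # zero, up to floating-point rounding

# Hypothetical data with no spread at all
constant = [7, 7, 7, 7]
s_constant = stdev(constant)        # sample standard deviation (n - 1 divisor)

print("sum of deviations:", total_deviation)
print("s for constant data:", s_constant)
```

The sum of deviations may print as a tiny nonzero number because 26.1 cannot be stored exactly in binary floating point.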

18
Why divide by (n - 1)? Degrees of freedom: Since $\bar{x}$ is the exact balancing point of the data, the observations will, on average, almost always be closer to $\bar{x}$ than they are to μ. So the sum of the squared deviations from $\bar{x}$ will underestimate the sum of the squared deviations from μ. To correct for this, we divide by n - 1 instead of n.
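The key fact behind this argument is that the sum of squared deviations is smallest when measured from the sample mean. The sketch below checks this for the Roger Maris data from slide 19. The candidate values 20, 25, and 30 for μ are hypothetical, chosen only for illustration, and the helper name sum_sq is also hypothetical. This illustrates why using the sample mean in place of an unknown μ tends to give too small a total; it is not a proof that n - 1 is exactly the right correction.

```python
def sum_sq(data, center):
    """Sum of squared deviations of the data from a chosen center."""
    return sum((x - center) ** 2 for x in data)

# Roger Maris home run counts (slide 19)
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
xbar = sum(maris) / len(maris)

ss_about_mean = sum_sq(maris, xbar)
print("about the sample mean:", round(ss_about_mean, 1))

# Hypothetical candidate values for the population mean
candidates = (20, 25, 30)
ss_about_candidates = {mu: sum_sq(maris, mu) for mu in candidates}
for mu, ss in ss_about_candidates.items():
    print("about", mu, ":", ss)
```

Every candidate center gives a total at least as large as the total about the sample mean.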

19
Example: Roger Maris. New York Yankee Roger Maris held the single-season home run record from 1961 until 1998. Here are Maris’s home run counts for his 10 years in the American League:
14 28 16 39 61 33 23 26 8 13

20
a. Maris’s mean number of home runs is $\bar{x} = 26.1$. Find the standard deviation s from its definition (by hand).
$\sum (x_i - \bar{x})^2 = (14 - 26.1)^2 + (28 - 26.1)^2 + \dots = 2192.9$
$s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$
$s^2 = 2192.9 / 9$
$s^2 = 243.66$
$s = 15.609$
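The by-hand calculation can be mirrored step by step in Python. This is a minimal sketch that follows the definition directly and then compares the result with the standard library's statistics.stdev, which also uses the n - 1 divisor. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

# Roger Maris home run counts (slide 19)
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]

n = len(maris)
xbar = sum(maris) / n                               # mean
sum_sq_dev = sum((x - xbar) ** 2 for x in maris)    # sum of squared deviations
variance = sum_sq_dev / (n - 1)                     # s^2, dividing by n - 1
s = sqrt(variance)                                  # standard deviation

print("mean:", round(xbar, 1))
print("sum of squared deviations:", round(sum_sq_dev, 1))
print("variance:", round(variance, 2))
print("s:", round(s, 3))
print("statistics.stdev:", round(stdev(maris), 3))
```

The printed values should agree with the figures on the slide, up to rounding.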

21
14 28 16 39 61 33 23 26 8 13
b. Use your calculator to verify your results (STAT:CALC:1 var stat:L1). Then use your calculator to find the mean and s for the 9 observations that remain when you leave out any outlier(s). Recall the 1.5 x IQR rule. Note: they treat 61 as an outlier even though the upper bound from the 1.5 x IQR rule is 61.5, so 61 falls just inside it.
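The note about 61 and the upper bound of 61.5 can be checked with a short sketch. It assumes the median-of-halves convention for quartiles; with ten observations the sorted data split cleanly into two halves of five. Other quartile conventions can give slightly different cutoffs.

```python
from statistics import median

# Roger Maris home run counts, sorted
maris = sorted([14, 28, 16, 39, 61, 33, 23, 26, 8, 13])

half = len(maris) // 2
q1 = median(maris[:half])    # median of the lower five values
q3 = median(maris[half:])    # median of the upper five values
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
flagged = [x for x in maris if x < lower or x > upper]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("cutoffs:", lower, upper)
print("flagged by the 1.5 x IQR rule:", flagged)
```

Under this convention no value is flagged, which matches the note that 61 sits just below the upper bound.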

22
Mean = 22.2
Sx = 10.244
How does leaving out the “outlier” affect the values of the mean and s? It caused the values of both measures to decrease.
Is s a resistant measure of spread? Clearly, s is not a resistant measure of spread.
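The comparison can be reproduced in Python as a quick check on the calculator results. This sketch uses statistics.mean and statistics.stdev (sample standard deviation, n - 1 divisor) on the Roger Maris data from slide 19, with and without the 61-home-run season.

```python
from statistics import mean, stdev

# Roger Maris home run counts (slide 19)
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
without_61 = [x for x in maris if x != 61]   # drop the 61-home-run season

mean_all, s_all = mean(maris), stdev(maris)
mean_9, s_9 = mean(without_61), stdev(without_61)

print("all 10 seasons: mean =", round(mean_all, 1), " s =", round(s_all, 3))
print("without 61:     mean =", round(mean_9, 1), " s =", round(s_9, 3))
```

Removing a single season lowers the mean by several home runs and shrinks s by roughly a third, which is what non-resistance looks like in practice.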

23
Key Points of Chapter
