4 Definitions
Types of variables:
- Categorical, e.g., gender, type of degree
- Quantitative, e.g., time, mass, force, dollars
The distribution of a variable tells us what values it takes and how often it takes these values.
16 Displaying distributions graphically
The distribution of a variable tells us what values it takes and how often it takes these values.
Ways to display distributions for quantitative variables: dotplots, histograms, and stemplots. See example on pp. 221-222.
19 Histograms
Most common graph of the distribution of a quantitative variable.
How to make a histogram: Example 4.9, p. 224
- Range: 5.7 to 17.6
- Shoot for 6-15 classes (bars)
- Read paragraph on p. 226
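Choosing classes for the range above can be sketched in a few lines of Python. This is only an illustration: the function name `class_edges` and the candidate class widths (1 and 2) are assumptions made here, not taken from Example 4.9. The classes are treated as left-closed, so a value on a boundary goes into the class that starts there.

```python
import math

def class_edges(low, high, width):
    """Return equal-width class boundaries that cover [low, high].

    Hypothetical helper: the first boundary is rounded down to a
    multiple of the class width so the edges are easy to read.
    """
    start = math.floor(low / width) * width
    n_classes = math.floor((high - start) / width) + 1
    return [start + i * width for i in range(n_classes + 1)]

# Range taken from the slide: 5.7 to 17.6.
edges_width_1 = class_edges(5.7, 17.6, 1)
edges_width_2 = class_edges(5.7, 17.6, 2)

# Number of classes is one less than the number of boundaries.
print(len(edges_width_1) - 1, edges_width_1[0], edges_width_1[-1])
print(len(edges_width_2) - 1, edges_width_2[0], edges_width_2[-1])
```

With a width of 1 this gives 13 classes running from 5 to 18, and with a width of 2 it gives 7 classes running from 4 to 18. Both fall inside the 6-15 guideline, so either would be a reasonable starting point.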
27 Stemplots Usually reserved for smaller data sets. Advantage: Actual (or rounded) data are provided. Possible drawback: Many people are not used to this type of plot, so the presenter/writer has to describe it.
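To show what a stemplot looks like, here is a minimal Python sketch. It assumes two-digit whole-number data, with the tens digit as the stem and the ones digit as the leaf. The function name `stemplot` is made up for this illustration, and the data are Mike's 15 values from Practice Problem 4.52 in this section.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a stemplot (stem = tens digit, leaf = ones digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include every stem between the smallest and largest, even if empty,
    # so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# Mike's values from Practice Problem 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]
for line in stemplot(mike):
    print(line)
```

Every original value can be read back from the plot, which is the advantage noted above.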
29 More problems Exercises: 4.24 and 4.25, p. 233 4.26, p. 233
30 Practice Exercises 4.30, p. 239 and 4.32, p. 240 4.28, p. 238
31 Wrapping up Section 4.2 … 4.28, p. 238 4.33, p. 242 4.36 4.37
32 4.3 Describing Distributions with Numbers
Until now, we’ve been satisfied with using words to describe the center and spread of distributions. Now, we will use numbers to describe these characteristics of a distribution.
The 5-number summary:
- Center: Median (p. 248)
- Spread: Find the quartiles, Q1 and Q3 (p. 250)
- Spread: Min and Max
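The 5-number summary can be sketched in Python as follows. One assumption to flag: quartiles are computed here as the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of observations is odd. Calculators and software sometimes use slightly different quartile rules, so check this against the method on p. 250. The function names are hypothetical, and the data are Mike's values from Practice Problem 4.52 in this section.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle observation when n is odd (assumed convention).
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Mike's values from Practice Problem 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]
summary = five_number_summary(mike)
print(summary)
```

For these 15 values the summary is Min = 50, Q1 = 55, Median = 67, Q3 = 69, Max = 75.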
33 Boxplots We can use this information to construct a boxplot:
34 Practice 4.46, p. 254 Enter data in the Stat Edit menu in your calculator, and order them.
35 Boxplot vs. Modified Boxplot The modified boxplot shows outliers … they are marked with a *. The lines extending from the quartiles go to the last number which is not an outlier. If there are no outliers, the modified boxplot and the regular boxplot are identical. Below are a boxplot (on the left) and modified boxplot (on the right) for Problem 4.39, p. 245.
37 Practice Exercises: 4.50, p. 256 4.49, p. 256
38 Testing for Outliers
1. Find the interquartile range: IQR = Q3 - Q1
2. Multiply: 1.5*IQR
3. Low-side cutoff: Q1 - 1.5*IQR
4. High-side cutoff: Q3 + 1.5*IQR
Are there any numbers outside of these values (below the low-side cutoff or above the high-side cutoff)? If so, they are outliers, and are marked on boxplots with an asterisk. The tail is drawn to the highest (or lowest) value which is not an outlier.
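The outlier test above can be written as a short Python sketch. It assumes the quartiles are the medians of the lower and upper halves of the ordered data, which may differ slightly from what a calculator reports. The function names are made up for illustration, and the data are Mike's values from Practice Problem 4.52 in this section.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves (assumed rule)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # observations below the median position
    upper = data[-half:]   # observations above the median position
    return median(lower), median(upper)

def find_outliers(values):
    """Apply the 1.5*IQR rule and return (low cutoff, high cutoff, outliers)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_cutoff = q1 - 1.5 * iqr
    high_cutoff = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_cutoff or v > high_cutoff]
    return low_cutoff, high_cutoff, outliers

# Mike's values from Practice Problem 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]
low_cutoff, high_cutoff, outliers = find_outliers(mike)
print(low_cutoff, high_cutoff, outliers)
```

Here Q1 = 55 and Q3 = 69, so IQR = 14 and 1.5*IQR = 21. The cutoffs are 34 and 90, and none of Mike's values falls outside them, so the regular and modified boxplots would look the same for these data.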
39 Measures of Center and Spread
- Median and IQR
- Mean and Standard Deviation
Mean is the arithmetic average. Standard deviation measures roughly the typical distance of the observations from their mean. Variance is simply the squared standard deviation. All of these statistics can be calculated by hand, but today we use technology to do this … We use 1-sample stats on our calculators, or a stats program.
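As one example of a stats program, here is a sketch using Python's standard `statistics` module. It also works the standard deviation out step by step, assuming the usual sample formula that divides the sum of squared deviations by n - 1. The data are Mike's values from Practice Problem 4.52 in this section.

```python
import math
from statistics import mean, stdev, variance

# Mike's values from Practice Problem 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]

x_bar = mean(mike)            # arithmetic average
s = stdev(mike)               # sample standard deviation
s_squared = variance(mike)    # sample variance

# The same standard deviation "by hand": square each deviation from the
# mean, add them up, divide by n - 1, then take the square root.
sum_sq_dev = sum((x - x_bar) ** 2 for x in mike)
s_by_hand = math.sqrt(sum_sq_dev / (len(mike) - 1))

print(round(x_bar, 1), round(s, 2), round(s_squared, 2))
```

The by-hand value should match `stdev`, and the variance should equal the standard deviation squared, up to rounding.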
40 Properties of standard deviation (p. 259) Use s as a measure of spread when you use the mean. If s=0, there is no spread. The larger the value for s, the larger the spread of the distribution.
41 Practice Problem 4.52, p. 263 Mike: 59,69,71,52,65,55,72,50,75,67,51,69,68,62,69
44 Choosing a summary
The book has a section on which summary to use (mean and std. dev., or median with the quartiles). I like to report all of them. However, when writing about a distribution, or comparing distributions, we should think about which summary works best. See p. 266.
- Skewed, outliers … median and quartiles
- Symmetrical, no (or few) outliers … mean and std. dev.
Mean and standard deviation are most common. One reason is that they allow for more sophisticated calculations in more advanced statistics.
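A quick Python sketch shows why the median is preferred when there are outliers. It starts with Mike's values from Practice Problem 4.52 and then adds one extreme value of 150. That extra value is purely hypothetical, made up here to show the effect, and is not part of the textbook data.

```python
from statistics import mean, median

# Mike's values from Practice Problem 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]

# Hypothetical: the same data with one extreme value added for illustration.
with_outlier = mike + [150]

mean_before, median_before = mean(mike), median(mike)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("mean:  ", round(mean_before, 1), "->", round(mean_after, 1))
print("median:", median_before, "->", median_after)
```

The single extreme value pulls the mean from 63.6 up to 69.0, while the median barely moves, from 67 to 67.5.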