Chapter 6: Interpreting the Measures of Variability.


So far we have discussed three measures of central tendency (mean, median, and mode) and three measures of variability (range, IQR, and standard deviation). To summarize a set of data adequately we need both a measure of center and a measure of variability. Remember that the mean and range may not be the best measures to use when a distribution is skewed or contains outliers. Better options would be the median and IQR.

Five-Number Summary
The five-number summary of a distribution for numerical data consists of:
– The smallest observation (minimum)
– The first quartile
– The median
– The third quartile
– The largest observation (maximum)
These five values are used to draw a boxplot (aka box-and-whisker plot).

Boxplots
Boxplots are a graphical tool for examining the shape of a distribution, its range and IQR, and the side with the greatest concentration of observations. They are also useful for summarizing and comparing numerical data from two or more samples: where the medians are located, how spread out the samples are, and whether each is symmetric, positively skewed (skewed right), or negatively skewed (skewed left). Boxplots also identify outliers of a distribution.

The whiskers of the boxplot extend to the lowest and highest values in the data set that are not outliers. Outliers are marked separately with an asterisk or circle.

Positively Skewed Data
When the median is closer to the bottom of the box and the whisker on the lower end of the box is the shorter one, the distribution is positively skewed (skewed right).

Negatively Skewed Data
When the median is closer to the top of the box and the whisker on the upper end of the box is the shorter one, the distribution is negatively skewed (skewed left).

Symmetrical Data
When the median is in the middle of the box and the whiskers are about the same length on both sides of the box, the distribution is symmetric.

Outliers
An outlier is any value that falls outside the pattern of the rest of the data (an unusually high or unusually low value in a distribution). The rule of thumb is that an observation is an outlier if it lies more than 1.5 × IQR below the first quartile or more than 1.5 × IQR above the third quartile.
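The rule of thumb above can be sketched in a few lines of Python. The function names and the sample quartiles and data are hypothetical, chosen only to illustrate the arithmetic; they do not come from the slides.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, q1, q3):
    """List the observations that fall outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical quartiles and data, used only to illustrate the rule.
print(outlier_fences(10, 20))                      # (-5.0, 35.0)
print(find_outliers([2, 11, 15, 19, 40], 10, 20))  # [40]
```

With Q1 = 10 and Q3 = 20 the IQR is 10, so anything below 10 − 15 = −5 or above 20 + 15 = 35 would be flagged.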

Example: On September 20, 2009, the Tennessee Titans played the Houston Texans. Here are the rushing yards Titans running back Chris Johnson had for each of his 16 rushing attempts. Determine if there are any outliers.

Steps to Make a Boxplot
1) Draw a central box (rectangle) from the first quartile to the third quartile
2) Draw a vertical line to mark the median
3) Draw horizontal lines (whiskers) that extend from the box out to the smallest and largest observations that are not outliers
4) If there are any outliers, mark them separately

Example: Draw a boxplot for the Chris Johnson data.

From the boxplot, we can see that our distribution contains no outliers. In addition, due to the location of the median and the shorter left whisker, we can also state that our distribution is slightly skewed right (positively skewed).

Example: Construct a boxplot for the following set of data: 1, 6, 5, 4, 10, 16, 8, 3, 18, 13
Order: 1, 3, 4, 5, 6, 8, 10, 13, 16, 18
Median: 7
Q1: 4
Q3: 13
IQR: 9
There are no outliers.
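The values in this example can be checked with a short Python sketch. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data, which is the convention that reproduces Q1 = 4 and Q3 = 13 here; other textbooks and software use slightly different quartile rules. The function name is illustrative.

```python
from statistics import median


def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    lower = xs[: n // 2]        # values below the median position
    upper = xs[(n + 1) // 2:]   # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


data = [1, 6, 5, 4, 10, 16, 8, 3, 18, 13]
minimum, q1, med, q3, maximum = five_number_summary(data)

iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(minimum, q1, med, q3, maximum)  # 1 4 7.0 13 18
print(iqr, low_fence, high_fence)     # 9 -9.5 26.5
print(outliers)                       # []
```

Since every observation lies between −9.5 and 26.5, there are no outliers and the whiskers run all the way to the minimum (1) and the maximum (18).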

The Empirical Rule
In symmetric, bell-shaped distributions, the values are clustered most densely around the mean and become rarer as the distance from the mean increases. Such a distribution is called a Normal distribution, and its graph is called the Normal curve. The empirical rule states the approximate proportion of observations that fall within one, two, and three standard deviations of the mean when the distribution is Normal.

The empirical rule is also known as the 68-95-99.7 rule, for an obvious reason. In general, in a Normal distribution:
– Approximately 68% of the observations will be within 1 standard deviation of the mean.
– Approximately 95% of the observations will be within 2 standard deviations of the mean.
– Approximately 99.7% of the observations will be within 3 standard deviations of the mean.
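As a quick check of where these percentages come from, the sketch below uses `NormalDist` from Python's standard `statistics` module to compute the exact proportions for a standard Normal distribution. The helper name `within` is an illustrative choice.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1


def within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

The rule's 68%, 95%, and 99.7% are rounded versions of these values, which is why each statement in the rule is qualified with "approximately".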

A Visual Summary of the Empirical Rule

Example: Suppose a sample of scores yields a mean of 100 and a standard deviation of 15. Assume that the distribution is Normal. What percent of scores should fall between 85 and 115? (Hint: Draw a diagram first!)
85 and 115 are each one standard deviation from the mean, so approximately 68% of scores fall between 85 and 115.
Let’s try some more with the same distribution…

What percent of scores should fall:
a) Between 70 and 130? 95%
b) Between 55 and 145? 99.7%
c) Between 70 and 115? 13.5% + 68% = 81.5%
d) Greater than 115? 13.5% + 2.5% = 16%
e) Less than 70? 2.5%
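The answers above can be reproduced with the empirical-rule percentages alone. The following Python sketch is one way to organize that arithmetic; the function names are hypothetical, and it only handles scores that sit a whole number of standard deviations (0 to 3) from the mean, as in these questions.

```python
MEAN, SD = 100, 15

# Percent of observations within 1, 2, and 3 SDs of the mean (empirical rule).
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}


def percent_below(score):
    """Approximate percent of scores below a cutoff that is 0 to 3 whole SDs from the mean."""
    z = (score - MEAN) / SD
    k = abs(int(z))
    if z != int(z) or k > 3:
        raise ValueError("cutoff must be 0, 1, 2, or 3 whole SDs from the mean")
    if k == 0:
        return 50.0
    tail = (100 - WITHIN[k]) / 2  # symmetry: half of what lies outside k SDs
    return tail if z < 0 else 100 - tail


def percent_between(lo, hi):
    return percent_below(hi) - percent_below(lo)


print(round(percent_between(70, 130), 1))  # 95.0
print(round(percent_between(55, 145), 1))  # 99.7
print(round(percent_between(70, 115), 1))  # 81.5
print(round(100 - percent_below(115), 1))  # 16.0
print(round(percent_below(70), 1))         # 2.5
```

Each tail is found by symmetry: the 5% of scores outside two standard deviations is split evenly, leaving 2.5% below 70, and the 32% outside one standard deviation leaves 16% above 115.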