Chapter 5: Understanding and Comparing Distributions

Presentation transcript:

Chapter 5: Understanding and Comparing Distributions AP Statistics

Understanding and Comparing Distributions In Chapter 3 we compared the associations between two categorical variables through contingency tables and displays. Now we will look at the different ways of examining the relationships between two variables (one quantitative and one categorical).

The Big Picture We can answer much more interesting questions about variables when we compare distributions for different groups. Below is a histogram of the Average Wind Speed for every day in 1989.

The Big Picture (cont.) The distribution is unimodal and skewed to the right. The high value may be an outlier. Median daily wind speed is about 1.90 mph and the IQR is reported to be 1.78 mph. Can we say more?

Review: The Five-Number Summary The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). Example: The five-number summary for the daily wind speed is:
Max 8.67
Q3 2.93
Median 1.90
Q1 1.15
Min 0.20
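As a minimal sketch of how a five-number summary can be computed, the function below takes the quartiles as the medians of the lower and upper halves of the sorted data. That quartile rule is an assumption (textbooks and software use several conventions that can give slightly different values), and the sample speeds are made up for illustration; they are not the 1989 wind data.

```python
from statistics import median


def five_number_summary(values):
    """Return min, Q1, median, Q3, max using the 'median of each half' rule."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]          # lower half (middle value left out when n is odd)
    upper = data[half + n % 2:]  # upper half
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


# Hypothetical daily wind speeds in mph (illustrative values only)
speeds = [0.5, 1.1, 1.4, 1.9, 2.2, 3.0, 4.8]
summary = five_number_summary(speeds)
print(summary)
```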

Making Boxplots A boxplot is a graphical display of the five-number summary. Boxplots are useful when comparing groups. Boxplots are particularly good at pointing out outliers.

Constructing Boxplots Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box.

Constructing Boxplots (cont.) Draw “fences” around the main part of the data. The upper fence is 1.5*(IQR) above the upper quartile. The lower fence is 1.5*(IQR) below the lower quartile. Note: the fences only help with constructing the boxplot and should not appear in the final display.

Constructing Boxplots (cont.) Use the fences to grow “whiskers.” Draw lines from the ends of the box up and down to the most extreme data values found within the fences. If a data value falls outside one of the fences, we do not connect it with a whisker.

Constructing Boxplots (cont.) Add the outliers by displaying any data values beyond the fences with special symbols. We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.
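The fence rules above can be sketched in a few lines. The function and key names here are hypothetical, chosen only for illustration; the quartiles and extremes are the wind speed values from the five-number summary. With Q1 = 1.15 and Q3 = 2.93, the IQR is 1.78, so the upper fence sits at about 5.60 mph and the maximum of 8.67 mph lies more than 3 IQRs above Q3.

```python
def boxplot_fences(q1, q3):
    """Compute the IQR plus the 1.5*IQR fences and the 3*IQR 'far outlier' limits."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
        "far_lower": q1 - 3 * iqr,
        "far_upper": q3 + 3 * iqr,
    }


def classify(value, fences):
    """Label a value as inside the fences, an outlier, or a far outlier."""
    if value < fences["far_lower"] or value > fences["far_upper"]:
        return "far outlier"
    if value < fences["lower_fence"] or value > fences["upper_fence"]:
        return "outlier"
    return "inside fences"


# Quartiles from the daily wind speed five-number summary
wind = boxplot_fences(q1=1.15, q3=2.93)
print(round(wind["lower_fence"], 2), round(wind["upper_fence"], 2))
print(classify(8.67, wind))  # the maximum
print(classify(0.20, wind))  # the minimum
```

The whiskers would then stop at the most extreme data values that classify as "inside fences", which requires the full data set rather than just the summary.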

Wind Speed: Making Boxplots Compare the histogram and boxplot for daily wind speeds: How does each display represent the distribution?

What Do Boxplots Tell Us? The box in the center of the boxplot shows us the middle half of the data, between the quartiles. The height of the box is equal to the IQR. If the median is roughly centered between the quartiles, then the middle half of the data is roughly symmetric. Thus, if the median is not centered, the distribution is skewed. The whiskers also show skewness if they are not the same length. Outliers are plotted separately so that they do not distort your judgment of skewness, but give them special attention.

Comparing Groups It is almost always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of the two distributions. What does this graphical display tell you?

Comparing Groups (Stem-and-Leaf Displays) In 2004, the average number of SV students to pass the AP Statistics exam was 2.9 out of 15 students in a class. Ms. Halliday collected class data from past AP Statistics classes. Since there were so many, she split them into males and females to compare. The results are to the right. How does the number of students who pass the AP Statistics exam compare between males and females? Key: 1|3| = 3.1

Comparing Groups (cont.) Boxplots offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information. We often plot them side by side for groups or categories we wish to compare. What do these boxplots tell you?

What About Outliers? Day 2 If there is a legitimate reason to believe an outlier is a mistake in the data, disregard the outlier and give your reason for doing so. If there are any clear outliers and you can't tell whether they are errors, it is best to report the mean and standard deviation both with the outliers present and with the outliers removed. The differences may be quite revealing. Note: Remember that the median and IQR are not likely to be affected by the outliers.
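To illustrate reporting summaries with and without an outlier, here is a small sketch using Python's standard `statistics` module. The data values are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from a textbook's quartile rule.

```python
from statistics import mean, median, quantiles, stdev


def describe(values):
    """Return mean, standard deviation, median, and IQR for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "median": median(values),
        "iqr": q3 - q1,
    }


# Hypothetical data with one clear high outlier (40)
scores = [2, 3, 3, 4, 4, 5, 5, 6, 40]
without_outlier = [x for x in scores if x != 40]

with_stats = describe(scores)
without_stats = describe(without_outlier)
for label, stats in (("with outlier", with_stats), ("without outlier", without_stats)):
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

In this made-up example the mean and standard deviation change sharply when the outlier is dropped, while the median stays the same and the IQR moves only a little.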

Timeplots For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data. NOTE: Time is ALWAYS on the horizontal axis.
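The recap for this chapter mentions looking for long-term trends in a timeplot with a data smoother, without saying which one. As one possible illustration, the sketch below uses a simple moving average; that choice is an assumption, and the snowfall values are made up.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly December snowfall totals in inches, in time order
snowfall = [6.0, 9.0, 3.0, 12.0, 6.0, 15.0]
smoothed = moving_average(snowfall, window=3)
print(smoothed)
```

Plotting the smoothed values against time, alongside the raw values, can make a gradual trend easier to see than the raw points alone.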

*Re-expressing Skewed Data to Improve Symmetry When the data are skewed it can be hard to summarize them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched out tail. How can we say anything useful about such data?

*Re-expressing Skewed Data to Improve Symmetry (cont.) One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function). Note the change in skewness from the raw data (previous slide) to the transformed data (right):
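A rough sketch of re-expression with a logarithm follows. The data values are hypothetical, and the moment-based skewness coefficient is only one possible way to quantify skewness; the slides themselves judge it by eye from the displays.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over (second central moment)^1.5."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed data (all values must be positive to take logs)
raw = [1, 2, 2, 3, 5, 8, 20, 100]
logged = [math.log10(x) for x in raw]

print(round(skewness(raw), 2))
print(round(skewness(logged), 2))
```

For this made-up sample the skewness of the logged values is much closer to zero than that of the raw values, which matches the idea that the transformed distribution is more symmetric.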

What Can Go Wrong? Avoid inconsistent scales, either within the display or when comparing two displays. Label clearly so a reader knows what the plot displays. Good intentions, bad plot:

What Can Go Wrong? (cont.) Beware of outliers. Be careful when comparing groups that have very different spreads. Consider these side-by-side boxplots of cotinine levels: Re-express . . .

Histogram vs. Timeplot Imagine we were to examine data for the number of inches of snowfall in December in Bismarck, ND, for the past 100 years. What information could we learn from looking at a histogram that would be difficult to see in a timeplot? What information could we see in a timeplot that would not show up in a histogram?

Recap We’ve learned the value of comparing data groups and looking for patterns among groups and over time. We’ve seen that boxplots are very effective for comparing groups graphically. We’ve experienced the value of identifying and investigating outliers. We’ve graphed data that have been measured over time against a time axis and looked for long-term trends both by eye and with a data smoother.

Assignments: pp. 95 – 103
Day 1: # 5, 7, 10, 11, 12, 15
Day 3: # 27, 3, 35, 36b, 39, 40
Reading: Chapter 6