Describing Distributions Numerically

Slides:



Advertisements
Similar presentations
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 5- 1.
Advertisements

Copyright © 2010 Pearson Education, Inc. Slide
DESCRIBING DISTRIBUTION NUMERICALLY
F IND THE S TANDARD DEVIATION FOR THE FOLLOWING D ATA OF GPA: 4, 3, 2, 3.5,
Understanding and Comparing Distributions
Understanding and Comparing Distributions 30 min.
CHAPTER 4 Displaying and Summarizing Quantitative Data Slice up the entire span of values in piles called bins (or classes) Then count the number of values.
Chapter 5: Understanding and Comparing Distributions
It’s an outliar!.  Similar to a bar graph but uses data that is measured.
Understanding and Comparing Distributions
Understanding and Comparing Distributions
Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Let’s Review for… AP Statistics!!! Chapter 1 Review Frank Cerros Xinlei Du Claire Dubois Ryan Hoshi.
Copyright © 2009 Pearson Education, Inc. Chapter 5 Understanding and Comparing Distributions.
Section 1 Topic 31 Summarising metric data: Median, IQR, and boxplots.
Chapter 6 Displaying and Describing Quantitative Data © 2010 Pearson Education 1.
Chapter 4 Displaying Quantitative Data. Quantitative variables Quantitative variables- record measurements or amounts of something. Must have units or.
Statistics Chapter 1: Exploring Data. 1.1 Displaying Distributions with Graphs Individuals Objects that are described by a set of data Variables Any characteristic.
Chapter 5 Understanding and Comparing Distributions Math2200.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Understanding and Comparing Distributions.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Describing Distributions Numerically.
Understanding and Comparing Distributions
Chapter 5: Boxplots  Objective: To find the five-number summaries of data and create and analyze boxplots CHS Statistics.
Chapter 5 Describing Distributions Numerically Describing a Quantitative Variable using Percentiles Percentile –A given percent of the observations are.
Copyright © 2009 Pearson Education, Inc. Slide 4- 1 Practice – Ch4 #26: A meteorologist preparing a talk about global warming compiled a list of weekly.
Probability & Statistics Box Plots. Describing Distributions Numerically Five Number Summary and Box Plots (Box & Whisker Plots )
1 By maintaining a good heart at every moment, every day is a good day. If we always have good thoughts, then any time, any thing or any location is auspicious.
Exploratory Data Analysis
Probability & Statistics
5-Number Summaries, Outliers, and Boxplots
Understanding and Comparing Distributions
Chapter 5 : Describing Distributions Numerically I
Describing Distributions Numerically
Understanding and Comparing Distributions
Understanding and Comparing Distributions
Understanding and Comparing Distributions
Unit 2 Section 2.5.
Objective: Given a data set, compute measures of center and spread.
U4D3 Warmup: Find the mean (rounded to the nearest tenth) and median for the following data: 73, 50, 72, 70, 70, 84, 85, 89, 89, 70, 73, 70, 72, 74 Mean:
Bell Ringer Create a stem-and-leaf display using the Super Bowl data from yesterday’s example
Laugh, and the world laughs with you. Weep and you weep alone
The histograms represent the distribution of five different data sets, each containing 28 integers from 1 through 7. The horizontal and vertical scales.
Understanding and Comparing Distributions
CHAPTER 1 Exploring Data
Box and Whisker Plots Algebra 2.
2.6: Boxplots CHS Statistics
Displaying and Summarizing Quantitative Data
DAY 3 Sections 1.2 and 1.3.
A Modern View of the Data
Histograms: Earthquake Magnitudes
Numerical Measures: Skewness and Location
Displaying and Summarizing Quantitative Data
Understanding and Comparing Distributions
Lecture 2 Chapter 3. Displaying and Summarizing Quantitative Data
Understanding and Comparing Distributions
Displaying and Summarizing Quantitative Data
Displaying and Summarizing Quantitative Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Describing a Skewed Distribution Numerically
Define the following words in your own definition
Organizing, Summarizing, &Describing Data UNIT SELF-TEST QUESTIONS
Exploratory Data Analysis
Chapter 1: Exploring Data
Understanding and Comparing Distributions
Describing Distributions Numerically
Honors Statistics Review Chapters 4 - 5
Understanding and Comparing Distributions
The Five-Number Summary
Lesson Plan Day 1 Lesson Plan Day 2 Lesson Plan Day 3
Presentation transcript:

Describing Distributions Numerically Statistics 5

We can answer much more interesting questions about variables when we compare distributions for different groups. Below is a histogram of the Average Wind Speed for every day in 1989. The Big Picture

The Big Picture The distribution is unimodal and skewed to the right. The high value may be an outlier Median daily wind speed is about 1.90 mph and the IQR is reported to be 1.78 mph. Can we say more? The Big Picture

The Five-Number Summary The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). Example: The five-number summary for for the daily wind speed is: Max 8.67 Q3 2.93 Median 1.90 Q1 1.15 Min 0.20 The Five-Number Summary

Daily Wind Speed: Making Boxplots A boxplot is a graphical display of the five-number summary. Boxplots are useful when comparing groups. Boxplots are particularly good at pointing out outliers. Daily Wind Speed: Making Boxplots

Constructing Boxplots Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box. Constructing Boxplots

Constructing Boxplots Erect “fences” around the main part of the data. The upper fence is 1.5 IQRs above the upper quartile. The lower fence is 1.5 IQRs below the lower quartile. Note: the fences only help with constructing the boxplot and should not appear in the final display. Constructing Boxplots

Constructing Boxplots Use the fences to grow “whiskers.” Draw lines from the ends of the box up and down to the most extreme data values found within the fences. If a data value falls outside one of the fences, we do not connect it with a whisker. Constructing Boxplots

Constructing Boxplots Add the outliers by displaying any data values beyond the fences with special symbols. We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles. Constructing Boxplots

Boxplot

Henry Aarons home run totals in 23 seasons are shown. Create a boxplot 10, 23, 13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39, 40, 40, 44, 44, 44, 44, 46, 47 Boxplot Practice

Wind Speed: Making Boxplots Compare the histogram and boxplot for daily wind speeds: How does each display represent the distribution? Wind Speed: Making Boxplots

It is almost always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of the two distributions. What does this graphical display tell you? Comparing Groups

Boxplots offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information. We often plot them side by side for groups or categories we wish to compare. What do these boxplots tell you? Comparing Groups

If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with the outliers removed. The differences may be quite revealing. Note: The median and IQR are not likely to be affected by the outliers. What About Outliers?

Timeplots: Order, Please! For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data. Timeplots: Order, Please!

*Re-expressing Skewed Data to Improve Symmetry When the data are skewed it can be hard to summarize them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched out tail. How can we say anything useful about such data? *Re-expressing Skewed Data to Improve Symmetry

*Re-expressing Skewed Data to Improve Symmetry One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function). Note the change in skewness from the raw data (previous slide) to the transformed data (right): *Re-expressing Skewed Data to Improve Symmetry

Practice

Practice

Avoid inconsistent scales, either within the display or when comparing two displays. Label clearly so a reader knows what the plot displays. Good intentions, bad plot: What Can Go Wrong?

What Can Go Wrong? Beware of outliers Be careful when comparing groups that have very different spreads. Consider these side-by-side boxplots of cotinine levels: Re-express . . . What Can Go Wrong?

We’ve learned the value of comparing data groups and looking for patterns among groups and over time. We’ve seen that boxplots are very effective for comparing groups graphically. We’ve experienced the value of identifying and investigating outliers. We’ve graphed data that has been measured over time against a time axis and looked for long-term trends both by eye and with a data smoother. What have we learned?

Pages 91 – 100 12, 15, 20, 24, 31, 36, 48, 49, 50 Homework