Describing Distributions With Numbers

Presentation transcript:

Describing Distributions With Numbers Section 1.3 cont. (five number summary, boxplots, variance, standard deviation) Target Goal: I can calculate a 5 number summary and construct a boxplot. I can describe spread using the standard deviation of a distribution. Hw: pg 71: 92, 93, 95, 96, 97, 103, 105, 107 - 110

Five-Number Summary: the smallest observation, first quartile, median, third quartile, and largest observation of a data set, written in order: Min Q1 M Q3 Max. It gives a quick summary of both center and spread.

Bonds: Min = 16, Q1 = 25, M = 34, Q3 = 41, Max = 73

Box (and whiskers) Plot: a graph of the five-number summary of a distribution. Boxplots are best for side-by-side comparisons because they show less detail than histograms or stemplots. They can be drawn either horizontally or vertically.

Modified Boxplot: Because the regular boxplot conceals outliers, we will use the modified boxplot. It plots outliers as isolated points and extends the “whiskers” out to the largest and smallest data points that are not outliers. Remember: label the axis, title the graph, scale the axis.

[Figure: Regular (a) and modified (b) boxplots comparing Barry Bonds and Hank Aaron home runs, with Min, Q1, M, Q3, Max, and the outlier marked.]

Activity: Acing the First Test. Enter the scores of Mrs. Liao’s students on their first statistics test into L1 from page 71, ex. 92. Sort data (ascending): Inspire: place the cursor on the column title, then select Menu, 1: Actions, 6: Sort, sort by (a). Inspire: Appendix A6.

a. Find the five-number summary and verify your expectation from a. Calculator activity: enter the scores into L1 from page 71. Calculator: 1 VAR STAT(L1) gives Min = 43, Q1 = 82, M = 87.75, Q3 = 93, Max = 98. Mean x̄ = 2544/30 = 84.8, so the median is greater than the mean.

b. What is the range of the middle half of the scores of the statistics students? Between Q1 and Q3: between 82 and 93.

Acing the First Test Cont. Construct by hand a modified boxplot of the stats students’ scores. First find potential outliers. IQR = ____ Q1 - 1.5 x IQR = ____ Q3 + 1.5 x IQR = ____ Outliers: ____ Graph: Mark a small x for the outlier(s), then mark the next lowest value as the minimum, Q1, M, Q3, and the max. Draw the box and whisker plot.
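The outlier check on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the five-number summary reported by the calculator (43, 82, 87.75, 93, 98); the function name `outlier_fences` is made up for this example. Because the individual scores are not listed here, the sketch can only test the minimum and maximum against the cutoffs.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Five-number summary from 1 VAR STAT on the test scores.
minimum, q1, median, q3, maximum = 43, 82, 87.75, 93, 98

lower, upper = outlier_fences(q1, q3)
print("IQR =", q3 - q1)
print("lower cutoff =", lower, "| upper cutoff =", upper)
print("minimum is a potential outlier:", minimum < lower)
print("maximum is a potential outlier:", maximum > upper)
```

Any score below the lower cutoff or above the upper cutoff is plotted as an isolated point on the modified boxplot, and the whisker stops at the most extreme score that is not an outlier.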

Acing the First Test Cont. On your calculator: First define Plot1 to be a modified boxplot using the list. Graph, trace and compare. Is there an outlier? If so, was it the same as in part a? Based on the boxplot, conjecture the shape of the corresponding histogram. Histogram shape:______________________

Acing the First Test Cont. Next, define Plot2 to be a histogram, also using the same list. Trace and compare. Did you guess correctly? Roughly draw the histogram below.

Important Note: If a distribution contains outliers, use the median and the IQR to describe the distribution.

The most common numerical description of the spread of a distribution is the standard deviation (s): it measures spread by looking at how far the observations are from their mean. The standard deviation (s) is the square root of the variance (s²).

Variance (s²) of a set of observations is the average of the squares of the deviations of the observations from their mean, where the "average" divides by n - 1 rather than n: s² = ∑(xi - x̄)² / (n - 1). Note: Most of the time we will use the calculator (STAT:CALC:1VAR STAT).

Why square the deviations? It makes them all nonnegative, so that observations far from the mean in either direction will have large positive squared deviations.

Properties of the Standard Deviation: The sum of the deviations of the observations from their mean will always be zero. Choose s only when the mean is chosen as the measure of center. s = 0 only when there is no spread (all observations have the same value). s, like the mean, is not resistant: strong skewness or a few outliers can make s very large. As a rough rule of thumb, a value more than 2 standard deviations from the mean can be treated as an outlier.
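Two of these properties are easy to check numerically. The sketch below uses small made-up data sets (`scores` and `constant` are hypothetical values chosen for illustration, not data from the slides) and Python's standard `statistics` module, whose `stdev` divides by n - 1.

```python
import statistics

# Hypothetical data, chosen only to illustrate the properties.
scores = [2, 4, 4, 6]
constant = [5, 5, 5, 5]

mean = statistics.mean(scores)
deviations = [x - mean for x in scores]

# Property 1: deviations from the mean always sum to zero.
total_deviation = sum(deviations)
print("deviations:", deviations, "sum:", total_deviation)

# Property 3: s is zero when every observation has the same value.
s_constant = statistics.stdev(constant)
print("s for identical values:", s_constant)
```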

Why divide by (n - 1)? Degrees of freedom: Since x̄ is the exact balancing point of the data, the data will almost always be closer to x̄, on average, than they will be to μ. The sum of the squared deviations from x̄ will therefore underestimate the sum of the squared deviations from µ. To correct this we divide by n - 1 instead of n.
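The balancing-point argument can be checked with a short Python sketch. The sample and the value used for μ below are hypothetical, picked only so the arithmetic is easy to follow. The sketch relies on the algebraic identity ∑(xi - μ)² = ∑(xi - x̄)² + n(x̄ - μ)², which shows that the squared deviations from x̄ can never exceed those from any other value.

```python
# Hypothetical sample and hypothetical population mean (assumptions).
sample = [3, 7, 8, 10]
mu = 6

n = len(sample)
xbar = sum(sample) / n

ss_about_xbar = sum((x - xbar) ** 2 for x in sample)
ss_about_mu = sum((x - mu) ** 2 for x in sample)

print("sum of squares about x-bar:", ss_about_xbar)
print("sum of squares about mu:   ", ss_about_mu)
print("gap n * (x-bar - mu)^2:    ", n * (xbar - mu) ** 2)

# Dividing the smaller sum by n - 1 instead of n enlarges the result.
print("divide by n:    ", ss_about_xbar / n)
print("divide by n - 1:", ss_about_xbar / (n - 1))
```

The gap term is zero only when x̄ happens to equal μ, which is why dividing the sum about x̄ by n tends to understate the spread.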

Example: Roger Maris New York Yankee Roger Maris held the single-season home run record from 1961 until 1998. Here are Maris’s home run counts for his 10 years in the American League: 14 28 16 39 61 33 23 26 8 13

14 28 16 39 61 33 23 26 8 13 a. Maris’s mean number of home runs is x̄ = 26.1. Find the standard deviation s from its definition (by hand). ∑(xi - x̄)² = (14 - 26.1)² + (28 - 26.1)² + … = 2192.9. s² = ∑(xi - x̄)² / (n - 1) = 2192.9/9 = 243.66. s = 15.609.
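The by-hand calculation can be mirrored step by step in Python. This is a minimal sketch that follows the definition directly (sum of squared deviations divided by n - 1); the variable names are chosen for this example.

```python
import math

# Maris's home run counts for his 10 American League seasons.
home_runs = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]

n = len(home_runs)
mean = sum(home_runs) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in home_runs]
sum_sq = sum(squared_deviations)

variance = sum_sq / (n - 1)   # divide by n - 1, not n
s = math.sqrt(variance)

print(round(mean, 1), round(sum_sq, 1), round(variance, 2), round(s, 3))
# 26.1 2192.9 243.66 15.609
```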

14 28 16 39 61 33 23 26 8 13 b. Use your calculator to verify your results (STAT:CALC:1 var stat:L1). Then use your calculator to find the mean and s for the 9 observations that remain when you leave out any outlier(s). Recall the 1.5 x IQR rule. Note: 61 is treated as an outlier here even though the upper bound from the rule is 61.5, so 61 falls just inside it.
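To see where the 61.5 bound comes from, here is a hedged Python sketch of the 1.5 x IQR rule on Maris's data. It assumes the quartile convention in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data (the overall median is left out of both halves when n is odd); other software may use a different quartile rule. The helper name `five_number_summary` is made up for this example.

```python
import statistics


def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles as medians of the halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower_half = xs[:half]
    # Skip the middle observation when n is odd.
    upper_half = xs[half + 1:] if n % 2 else xs[half:]
    return (xs[0], statistics.median(lower_half), statistics.median(xs),
            statistics.median(upper_half), xs[-1])


home_runs = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
minimum, q1, median, q3, maximum = five_number_summary(home_runs)

iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

print("five-number summary:", minimum, q1, median, q3, maximum)
print("IQR:", iqr, "| bounds:", lower_bound, upper_bound)
print("61 flagged by the rule:", maximum > upper_bound)
```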

Without the 61: Mean = 22.2, Sx = 10.244. How does leaving out the “outlier” affect the values of the mean and s? It causes the values of both measures to decrease. Is s a resistant measure of spread? Clearly, s is not a resistant measure of spread.
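The comparison can be reproduced with Python's standard `statistics` module, as a sketch of what the calculator's 1 var stat reports; `statistics.stdev` uses the n - 1 divisor, matching Sx.

```python
import statistics

home_runs = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
without_61 = [x for x in home_runs if x != 61]

mean_all = statistics.mean(home_runs)
s_all = statistics.stdev(home_runs)

mean_trimmed = statistics.mean(without_61)
s_trimmed = statistics.stdev(without_61)

print("all 10 seasons: mean =", round(mean_all, 1), " s =", round(s_all, 3))
print("without the 61: mean =", round(mean_trimmed, 1), " s =", round(s_trimmed, 3))
```

Dropping a single season lowers the mean by about 4 home runs and cuts s by roughly a third, which is what "not resistant" means in practice.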

Key Points of Chapter