The Practice of Statistics, Fourth Edition.

Presentation transcript:

Describing Quantitative Data with Numbers: Measures of Center and Spread. Section 1.3. Reference Text: The Practice of Statistics, Fourth Edition. Starnes, Yates, Moore. Lesson 1.2.1

Today’s Objectives: Define specific measures of center: Mean and Median. Recognize the 5-number summary of data. Determine outliers by the 1.5 x IQR rule, using the Interquartile Range (IQR). Draw boxplots. Break. Describe RESISTANCE as it applies to mean and median. Standard Deviation.

Some Humor…

Measures of Center. The MEAN of a data set is its arithmetic average. Use the symbol x̄ (“x-bar”) for the mean. Calculate it by adding all the numbers and dividing by how many numbers there are. Example: (11 + 72 + 83 + 94 + 25) / 5. The MEDIAN of a data set is its midpoint: half the data fall above the median and half fall below it. Use “med” for median (TI uses “med”). Sort the data from low to high and count to the middle; with an even number of values, average the two middle values. The MODE is the most frequently occurring value. We will rarely be interested in it.

Rainy Days! For the last 15 years, I have kept track of the number of rainy days we had in April. The results are below. Calculate the mean, median, and mode of these data. Here are the data: 16, 3, 16, 15, 13, 26, 15, 13, 14, 3, 10, 8, 9, 2, 9 Mean: _________ Median: _________ Mode: __________ Lesson 1.2.1

Rainy Days! For the last 15 years, I have kept track of the number of rainy days we had in April. The results are below. Calculate the mean, median, and mode of these data. Here are the data: 16, 3, 16, 15, 13, 26, 15, 13, 14, 3, 10, 8, 9, 2, 9 Rearranged: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Mean: 172/15 ≈ 11.47. Median: 13 (the 8th of the 15 sorted values). Mode: 3, 9, 13, 15, 16 (each occurs twice).
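As a quick check of the rainy days answers, here is a minimal Python sketch using only the standard library. The variable names are illustrative and are not part of the original lesson.

```python
from statistics import mean, median, multimode

# Rainy days in April for the last 15 years (data from the slide)
rainy_days = [16, 3, 16, 15, 13, 26, 15, 13, 14, 3, 10, 8, 9, 2, 9]

data_mean = mean(rainy_days)                 # sum of values divided by count
data_median = median(rainy_days)             # middle value after sorting
data_modes = sorted(multimode(rainy_days))   # every value tied for most frequent

print(f"Mean:   {data_mean:.2f}")
print(f"Median: {data_median}")
print(f"Modes:  {data_modes}")
```

Because five values are tied at two occurrences each, `multimode` is used so that all of the modes are reported rather than just one.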

5 Number Summary. Consists of: Minimum, Q1, Median, Q3, and Maximum. Minimum: smallest value of the sample data. Q1: first quartile, the median of the lower half of the data [about 25% of the data fall at or below Q1]. Median: technically Q2, the middle point of the sample data. Q3: third quartile, the median of the upper half of the data [about 25% of the data fall at or above Q3]. Maximum: largest value of the sample data. This 5 number summary can be used to create a boxplot (aka box and whisker plot).

Find the 5 number summary, IQR, and Range Rainy Days! For the last 15 years, I have kept track of the number of rainy days we had in April. The results are below. Find the 5 number summary, IQR, and Range Here are the data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Min:_______ Q1:________ Range:_______ Med:_______ IQR:________ Q3:________ Max:_______ Lesson 1.2.1

Find the 5 number summary, IQR, and Range Rainy Days! For the last 15 years, I have kept track of the number of rainy days we had in April. The results are below. Find the 5 number summary, IQR, and Range Here are the data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Min:___2____ Q1:____8____ Range:_______ Med:___13____ IQR:________ Q3:_____15___ Max:___26____ Lesson 1.2.1

Measures of Spread: 5 number summary RANGE = maximum value – minimum value Inter-quartile range (IQR) = Q3 – Q1 Lesson 1.2.1

Find the 5 number summary, IQR, and Range. Rainy Days! Here are the data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Min: 2. Q1: 8. Med: 13. Q3: 15. Max: 26. Range: 26 – 2 = 24. IQR: 15 – 8 = 7.
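The five-number summary above can be sketched in Python as follows. This assumes the quartile rule described in the lesson (Q1 and Q3 are the medians of the lower and upper halves, leaving out the middle value when the count is odd). The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                # values below the median position
    upper = data[half + 1:] if n % 2 else data[half:]  # skip the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

rainy_days = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]
minimum, q1, med, q3, maximum = five_number_summary(rainy_days)

data_range = maximum - minimum   # RANGE = maximum - minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(minimum, q1, med, q3, maximum, data_range, iqr)
```

Calculators and software packages do not all use the same quartile rule, so other tools may report slightly different values for Q1 and Q3 on the same data.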

Boxplots: Box and Whisker Plot

Construct a Boxplot. Example: Consider our NY travel times data. Values appearing in the data: 10 30 5 25 40 20 15 85 65 60 45. Sorted: 5 10 15 20 25 30 40 45 60 65 85. Five-number summary: Min = 5, Q1 = 15, M = 22.5, Q3 = 42.5, Max = 85. Recall that the maximum, 85, is an outlier by the 1.5 x IQR rule (85 is above Q3 + 1.5 x IQR = 42.5 + 41.25 = 83.75).

Make a box plot! Let’s make a box plot using the same important information from the 5 number summary! Let’s use the data from your 1.2 homework worksheet, number 10: 84, 76, 92, 92, 88, 96, 68, 80, 92, 88, 76, 96

Answer. Data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 The mean is about 11.5 and the median is 13. For the boxplot, we also need: Minimum = 2 and Maximum = 26; First Quartile = 8 and Third Quartile = 15.

Outliers. Outliers are observations (data points) that lie “too far” from the main body of the data. Outliers often distort summary statistics such as the mean. We can identify outliers with the “1.5 x IQR rule”. Upper Outlier: any observation above Q3 + 1.5 x IQR. Lower Outlier: any observation below Q1 – 1.5 x IQR. How does this apply to the “Rainy Days” data? The next slide has the data again.

Rainy Days! Here are the data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Min: 2. Q1: 8. Med: 13. Q3: 15. Max: 26. Range: 24. IQR: 7. Outliers? Upper: Q3 + 1.5(IQR) = ____ Lower: Q1 – 1.5(IQR) = ____

Rainy Days! Here are the data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 Min: 2. Q1: 8. Med: 13. Q3: 15. Max: 26. Range: 24. IQR: 7. Outliers? Upper: Q3 + 1.5(IQR) = 15 + 1.5(7) = 25.5. Lower: Q1 – 1.5(IQR) = 8 – 1.5(7) = -2.5.
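The 1.5 x IQR check can be sketched in a few lines of Python. The quartiles are taken as given from the slide (Q1 = 8, Q3 = 15), and the names `lower_fence` and `upper_fence` are illustrative labels for the two boundaries.

```python
rainy_days = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]

# Quartiles as found on the slide
q1, q3 = 8, 15
iqr = q3 - q1

# Boundaries from the 1.5 x IQR rule
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any observation outside the boundaries is flagged as an outlier
outliers = [x for x in rainy_days if x < lower_fence or x > upper_fence]

print(f"Lower boundary: {lower_fence}")
print(f"Upper boundary: {upper_fence}")
print(f"Outliers: {outliers}")
```

No count of rainy days can fall below the lower boundary, since it is negative, so only the upper boundary matters for this data set.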

Outlier in our data! 26 is an outlier for our data, because it lies above the upper boundary of 25.5. We must take this into account when constructing a boxplot. We now extend each “whisker” only to the most extreme data point that stays within our boundaries for outliers. We represent an outlier on a box plot as a single point (dot). Let’s make a new boxplot taking this into account!

Answer. Data: 2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26 The median is 13. For the boxplot, we also need: Minimum = 2 and Maximum = 26; First Quartile = 8 and Third Quartile = 15.
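To see where the whiskers end once the outlier is set aside, here is a small sketch. The helper name `boxplot_parts` and its dictionary keys are hypothetical. The quartiles are passed in as given on the slide rather than recomputed.

```python
def boxplot_parts(values, q1, q3):
    """Return whisker ends and outlier points under the 1.5 x IQR rule."""
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    inside = [x for x in values if low_limit <= x <= high_limit]
    return {
        "whisker_low": min(inside),    # smallest point that is not an outlier
        "whisker_high": max(inside),   # largest point that is not an outlier
        "outliers": sorted(x for x in values if x < low_limit or x > high_limit),
    }

rainy_days = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]
parts = boxplot_parts(rainy_days, q1=8, q3=15)
print(parts)
```

For the rainy days data the upper whisker stops at 16, and 26 is drawn as a separate dot.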

C.U.S.S. put to use with the rainy days example: After we have crunched the numbers and calculated all of this information describing our quantitative data, we need to communicate it back in context! My write-up: The number of rainy days in April over the past 15 years has a median of 13 rainy days, with one outlier of 26 rainy days. The data appear slightly skewed to the right, with a spread (range) of 24 rainy days; this includes the outlier found in our recorded data. Center: “…median of 13 rainy days…” Unusual points: “…with one outlier of 26 rainy days.” Spread: “…with a spread (range) of 24 rainy days” Shape: “…the data appear slightly skewed to the right…”

Resistance. What does the word “resistant” mean? In statistics: The median is resistant to outliers! A single very large or very small value has little or no effect on the median. Think of the median as Gandalf: “You shall not pass!” The mean is not resistant to outliers! Very large or very small values change the average; it gets “pulled” to the left or right. (The standard deviation is not resistant either.)
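A small experiment illustrates resistance. The modified data set below is made up for illustration only: the largest rainy days value, 26, is replaced with an exaggerated 260 to see how each measure of center reacts.

```python
from statistics import mean, median

rainy_days = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]

# Hypothetical variation: exaggerate the largest value (26 becomes 260)
exaggerated = rainy_days[:-1] + [260]

mean_before, mean_after = mean(rainy_days), mean(exaggerated)
median_before, median_after = median(rainy_days), median(exaggerated)

print(f"Mean:   {mean_before:.2f} -> {mean_after:.2f}")   # pulled toward the extreme value
print(f"Median: {median_before} -> {median_after}")       # the middle value does not move
```

The mean more than doubles while the median stays at 13, which is what it means for the median to be resistant and the mean not to be.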

Bell Curve. Statistics is about representing data and analyzing it in order to report back the findings. Part of representing data is through graphs: pie charts, histograms, stem-and-leaf plots, and box plots. The same data can often be redrawn from one of these graphs into another. One of the most useful visuals in statistics is the Standard Normal Curve, or “bell curve”. Here is what a bell curve looks like, along with a matching box plot.

Notice how the median is centered in the middle of the bell curve. Could we break the box plot up into the percentage of data that falls between each pair of quartiles?

Empirical rule 68-95-99.7%  

Empirical rule
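The empirical rule says that, for a Normal (bell-shaped) distribution, about 68% of the data fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. The sketch below checks those percentages with Python's standard library. It also locates the quartiles of the bell curve, since each of the four sections of a box plot holds about 25% of the data. The function name `share_within` is illustrative.

```python
from statistics import NormalDist

# Standard Normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within = {k: share_within(k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"Within {k} standard deviation(s): {share:.1%}")

# Quartiles of the bell curve, measured in standard deviations from the mean
q1_z = standard_normal.inv_cdf(0.25)
q3_z = standard_normal.inv_cdf(0.75)
print(f"Q1 is at {q1_z:.3f} and Q3 is at {q3_z:.3f} standard deviations")
```

The quartiles sit roughly two thirds of a standard deviation on either side of the mean, so the box of the matching box plot is narrower than the 68% band.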

Standard Deviation. We will be sampling from different areas of interest, such as: Aircraft! Medical Records! Baseball! Insurance Records! Cars!

Standard Deviation. Since we will be sampling from different areas of interest, we need to make sure we talk about standard deviation in the context of the problem! Every set of sample data has its own unique sample standard deviation. Since we are only covering the basics of statistics at this point, we will not worry about distinguishing between the Sample Standard Deviation and the Population Standard Deviation. For now we want to get the basics down; later in our stats career we will distinguish between the two!

Calculating the Standard Deviation  

   

Step 2: Subtract the mean from each and every data point we have. Step 3: Square each of those differences.
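The calculation can be sketched step by step in Python using the rainy days data. Only Steps 2 and 3 appear in the transcript. Steps 1, 4, and 5 below are assumptions that follow the usual definition of the sample standard deviation, which divides by n - 1.

```python
import math

rainy_days = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]
n = len(rainy_days)

# Step 1 (assumed): find the mean
x_bar = sum(rainy_days) / n

# Step 2: subtract the mean from each data point
deviations = [x - x_bar for x in rainy_days]

# Step 3: square each difference
squared_deviations = [d ** 2 for d in deviations]

# Step 4 (assumed): add the squared differences and divide by n - 1
variance = sum(squared_deviations) / (n - 1)

# Step 5 (assumed): take the square root
s = math.sqrt(variance)

print(f"Sample standard deviation s = {s:.2f}")
```

Squaring in Step 3 keeps positive and negative deviations from canceling, and the square root in the last step returns the result to the original units (rainy days).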

   

68-95-99.7 Rule With Pets  

Use 1-Var Stats for s. Copy the list RAIN into L1. Tap STAT : CALC : 1-Var Stats L1. Read the result as Sx. Note: There is also σx, which is the standard deviation calculated by dividing by n rather than n-1. It is used for populations rather than samples, and we will deal with it later. Lesson 1.2.2
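For readers without a TI calculator, the two values can be reproduced with Python's standard library. This is a sketch under the assumption that the list RAIN holds the rainy days data from earlier in the lesson. The names `s_x` and `sigma_x` simply mirror the calculator's Sx and σx labels.

```python
from statistics import stdev, pstdev

# Assumed contents of the calculator list RAIN
rain = [2, 3, 3, 8, 9, 9, 10, 13, 13, 14, 15, 15, 16, 16, 26]

s_x = stdev(rain)        # sample standard deviation: divides by n - 1
sigma_x = pstdev(rain)   # population standard deviation: divides by n

print(f"Sx = {s_x:.2f}")
print(f"σx = {sigma_x:.2f}")
```

Dividing by the smaller number n - 1 always makes Sx a little larger than σx for the same data.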

Today’s Objectives Define specific measures of center: Mean and Median. Recognize the 5-number summary of data. Determine Interquartile Range (IQR) outliers by the 1.5 IQR rule. Draw boxplots. Describe RESISTANCE as it applies to mean and median. Standard Deviation Lesson 1.2.1

Homework: 1.3 Homework Worksheet. Continue working on the Chapter 1 Reading Guide.