Chapter 5 Describing Distributions Numerically.

Describing the Distribution
Center: median (0.5 quantile, 2nd quartile, 50th percentile), mean
Spread: range, interquartile range, standard deviation

Median
Literally the middle number (data value). Has the same units as the data.
When n (the number of observations) is odd:
Order the data from smallest to largest.
The median is the middle number on the list: the value in position (n+1)/2, counting from the smallest value.
Ex: If n = 11, the median is the (11+1)/2 = 6th number from the smallest value.
Ex: If n = 37, the median is the (37+1)/2 = 19th number from the smallest value.

Example – Frank Thomas
Career Home Runs: 4 7 15 18 24 28 29 32 35 38 40 40 41 42 43
Remember to order the values if they aren’t already in order!
15 observations, so (15+1)/2 = 8th observation from the bottom
Median = 32 HRs

Median
When n is even:
Order the data from smallest to largest.
The median is the average of the two middle numbers; (n+1)/2 falls halfway between their positions.
Ex: If n = 10, (10+1)/2 = 5.5, so the median is the average of the 5th and 6th numbers from the smallest value.

Example – Ryne Sandberg
Career Home Runs: 0 5 7 8 9 12 14 16 19 19 25 26 26 26 30 40
Remember to order the values if they aren’t already in order!
16 observations, so (16+1)/2 = 8.5: average the 8th and 9th observations from the bottom
Median = average of 16 and 19 = 17.5 HRs
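The (n+1)/2 position rule for odd and even n can be sketched in a few lines of Python. This is only an illustration of the rule on the slides; the function name `median` and the variable names are choices made here, not part of the original deck.

```python
def median(values):
    """Median via the (n + 1) / 2 position rule."""
    ordered = sorted(values)              # always order the data first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    if n % 2 == 1:
        position = (n + 1) // 2           # 1-based position of the middle value
        return ordered[position - 1]
    # Even n: (n + 1) / 2 falls halfway between positions n/2 and n/2 + 1,
    # so average those two values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]

print(median(thomas))    # 32
print(median(sandberg))  # 17.5
```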

Mean
Ordinary average: add up all observations and divide by the number of observations.
Has the same units as the data.
Formula, for n observations with values y1, y2, y3, …, yn:

$$\bar{y} = \frac{y_1 + y_2 + \cdots + y_n}{n} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

Examples: Thomas and Sandberg
FIND THE MEAN
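One way to check a hand calculation for this exercise is a short Python sketch that adds up the observations and divides by their count. The data are the two career home-run lists from the median examples; the function name `mean` is a choice made here for illustration.

```python
def mean(values):
    """Ordinary average: sum of the observations divided by how many there are."""
    if not values:
        raise ValueError("mean of empty data is undefined")
    return sum(values) / len(values)


thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]

print(round(mean(thomas), 2))  # 29.07  (436 / 15)
print(mean(sandberg))          # 17.625 (282 / 16)
```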

Mean vs. Median
Median = middle number
Mean = value where the histogram balances
Mean and median are similar when the data are symmetric.
Mean and median differ when the data are skewed or there are outliers.

Mean vs. Median
The mean is influenced by unusually high or unusually low values.
Example: Income in a small town of 6 people
$25,000 $27,000 $29,000 $35,000 $37,000 $38,000
The mean income is about $31,833.
The median income is $32,000.

Mean vs. Median
Bill Gates moves to town, so there are now 7 people:
$25,000 $27,000 $29,000 $35,000 $37,000 $38,000 $40,000,000
The mean income is $5,741,571.
The median income is $35,000.
The mean is pulled by the outlier; the median is not.
The mean is not a good center for these data.
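The small-town income example can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names `town` and `with_outlier` are introduced here for illustration.

```python
from statistics import mean, median

town = [25_000, 27_000, 29_000, 35_000, 37_000, 38_000]
with_outlier = town + [40_000_000]     # one very large income joins the data

# Before: mean and median are close together.
print(round(mean(town)), median(town))                  # 31833 32000.0

# After: the mean jumps by millions, the median barely moves.
print(round(mean(with_outlier)), median(with_outlier))  # 5741571 35000
```

The median moves only from $32,000 to $35,000, while the mean is dragged far above every income except the outlier's.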

Mean vs. Median
Skewness pulls the mean in the direction of the tail.
Skewed to the right: mean > median
Skewed to the left: mean < median
Outliers pull the mean in their direction.
Large outlier: mean > median
Small outlier: mean < median

Spread
Range = maximum – minimum
Thomas: Min = 4, Max = 43, Range = 43 - 4 = 39 HRs
Sandberg: Min = 0, Max = 40, Range = 40 - 0 = 40 HRs

Spread
Range is a very basic measure of spread.
It is highly affected by outliers, which make the spread appear larger than it really is.
Ex. The annual numbers of deaths from tornadoes in the U.S. from 1990 to 2000:
53 39 39 33 69 30 25 67 130 94 40
Range with outlier: 130 – 25 = 105 deaths
Range without outlier: 94 – 25 = 69 deaths

Spread
Interquartile Range (IQR): IQR = Q3 – Q1
First Quartile (Q1): larger than about 25% of the data
Third Quartile (Q3): larger than about 75% of the data
The IQR is the spread of the center (middle) 50% of the values.

Finding Quartiles
Order the data.
Split into two halves at the median.
When n is odd, include the median in both halves.
When n is even, do not include the median in either half.
Q1 = median of the lower half
Q3 = median of the upper half

Example – Frank Thomas
Order the values (15 values): 4 7 15 18 24 28 29 32 35 38 40 40 41 42 43
Lower Half = 4 7 15 18 24 28 29 32
Q1 = Median of lower half = 21 HRs
Upper Half = 32 35 38 40 40 41 42 43
Q3 = Median of upper half = 40 HRs
IQR = Q3 – Q1 = 40 – 21 = 19 HRs

Example – Ryne Sandberg
Order the values (16 values): 0 5 7 8 9 12 14 16 19 19 25 26 26 26 30 40
Lower Half = 0 5 7 8 9 12 14 16
Q1 = Median of lower half = 8.5 HRs
Upper Half = 19 19 25 26 26 26 30 40
Q3 = Median of upper half = 26 HRs
IQR = Q3 – Q1 = 26 – 8.5 = 17.5 HRs
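The split-at-the-median rule for quartiles can be sketched as follows. The function names are illustrative choices made here. Statistical libraries use several different quartile conventions, so a library routine may return slightly different values than this hand method.

```python
def _median_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves rule.

    When n is odd the median belongs to both halves;
    when n is even it belongs to neither.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("quartiles of empty data are undefined")
    half = n // 2
    lower = ordered[:half + 1] if n % 2 == 1 else ordered[:half]
    upper = ordered[half:]
    return _median_sorted(lower), _median_sorted(upper)


thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]

for name, data in [("Thomas", thomas), ("Sandberg", sandberg)]:
    q1, q3 = quartiles(data)
    print(name, q1, q3, q3 - q1)   # name, Q1, Q3, IQR
```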

Five Number Summary
Minimum, Q1, Median, Q3, Maximum

Examples
Thomas: Min = 4 HRs, Q1 = 21 HRs, Median = 32 HRs, Q3 = 40 HRs, Max = 43 HRs
Sandberg: Min = 0 HRs, Q1 = 8.5 HRs, Median = 17.5 HRs, Q3 = 26 HRs, Max = 40 HRs
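A compact sketch of the five number summary, assuming the same median-of-halves quartile rule used in the quartile examples (the median is included in both halves only when n is odd). The function name `five_number_summary` is a name chosen here, and `statistics.median` from the standard library does the median arithmetic.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("summary of empty data is undefined")
    half = n // 2
    lower = ordered[:half + n % 2]   # odd n: the median is part of the lower half
    upper = ordered[half:]           # odd n: the median is part of the upper half too
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]

print(five_number_summary(thomas))    # (4, 21.0, 32, 40.0, 43)
print(five_number_summary(sandberg))  # (0, 8.5, 17.5, 26.0, 40)
```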

Graph of Five Number Summary: Boxplot
Box between Q1 and Q3
Line in the box marks the median
Lines extend out to the minimum and maximum
Best used for comparisons
Use this simpler method

Example – Thomas & Sandberg
Boxplot of Thomas Home Runs: box from 21 to 40, line in box at 32, lines extend out from box to 4 and 43
Boxplot of Sandberg Home Runs: box from 8.5 to 26, line in box at 17.5, lines extend out from box to 0 and 40

Side by Side Boxplots of Thomas & Sandberg Home Runs

Spread
Standard deviation: the “average” spread from the mean
Most common measure of spread (although it is influenced by skewness and outliers)
Denoted by the letter s
Make a table when calculating by hand

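Standard Deviation

$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$

where $\bar{y}$ is the mean of the n observations y1, y2, …, yn.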

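Example – Deaths from Tornadoes (11 values, mean = 56.27 deaths)

Value   Deviation from mean     Squared deviation
53      53 - 56.27 = -3.27      10.69
39      39 - 56.27 = -17.27     298.25
39      39 - 56.27 = -17.27     298.25
33      33 - 56.27 = -23.27     541.49
69      69 - 56.27 = 12.73      162.05
30      30 - 56.27 = -26.27     690.11
25      25 - 56.27 = -31.27     977.81
67      67 - 56.27 = 10.73      115.13
130     130 - 56.27 = 73.73     5436.11
94      94 - 56.27 = 37.73      1423.55
40      40 - 56.27 = -16.27     264.71

Sum of squared deviations = 10218.15, so s = sqrt(10218.15 / (11 - 1)) = 31.97 deaths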
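The by-hand table can be mirrored in Python as a rough check. This sketch assumes the 11 annual death counts from the tornado range example (including both years with 39 deaths) and the sample standard deviation with an n - 1 divisor; the variable names are illustrative. Because the code keeps the unrounded mean, individual squared deviations differ slightly from a table built on the rounded mean 56.27.

```python
from math import sqrt

deaths = [53, 39, 39, 33, 69, 30, 25, 67, 130, 94, 40]

n = len(deaths)
y_bar = sum(deaths) / n                      # mean, about 56.27

# One table row per observation: value, deviation, squared deviation.
rows = [(y, y - y_bar, (y - y_bar) ** 2) for y in deaths]
for y, deviation, squared in rows:
    print(f"{y:>4} {deviation:>8.2f} {squared:>9.2f}")

sum_of_squares = sum(squared for _, _, squared in rows)
s = sqrt(sum_of_squares / (n - 1))           # divide by n - 1, then take the root

print(round(y_bar, 2))  # 56.27
print(round(s, 2))      # 31.97
```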

Example – Frank Thomas Find the standard deviation of the number of home runs given the following statistic:

Properties of s
s = 0 only when all observations are equal; otherwise, s > 0
s has the same units as the data
s is not resistant: skewness and outliers affect s, just like the mean
Tornado Example:
s with outlier: 31.97 deaths
s without outlier: 21.70 deaths

Which summaries should you use?
What numbers are affected by outliers? Mean, standard deviation, range
What numbers are not affected by outliers? Median, IQR
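To see this on real numbers, the sketch below summarizes the tornado deaths data with and without the 130-death year. It assumes the median-of-halves quartile rule from the quartile slides and the n - 1 standard deviation (`statistics.stdev`); the helper names `iqr` and `summarize` are made up here for illustration.

```python
from statistics import mean, median, stdev


def iqr(values):
    """IQR via the median-of-halves rule (median in both halves when n is odd)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half + n % 2]
    upper = ordered[half:]
    return median(upper) - median(lower)


def summarize(values):
    return {
        "mean": round(mean(values), 2),
        "sd": round(stdev(values), 2),
        "range": max(values) - min(values),
        "median": median(values),
        "iqr": iqr(values),
    }


deaths = [53, 39, 39, 33, 69, 30, 25, 67, 130, 94, 40]
without_outlier = [y for y in deaths if y != 130]

print(summarize(deaths))           # mean 56.27, sd 31.97, range 105, median 40, iqr 32.0
print(summarize(without_outlier))  # mean 48.9, sd 21.7, range 69, median 39.5, iqr 34.0
```

Dropping one value changes the mean, standard deviation, and range substantially, while the median and IQR shift only slightly.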

Which summaries should you use?
Five Number Summary: skewed data, data with outliers
Mean and Standard Deviation: symmetric data
ALWAYS PLOT YOUR DATA!!