Measures of Center & Spread. Measures of Center.


Measures of Center & Spread

Measures of Center

Example: What if another 5 is added to the data? Now find just the mode. First rearrange the data from least to greatest. Mode: None

Measures of Center

Measures of Center How can we make the calculator do this for us? Press “Stat”, then “Edit”, and enter the data into L1. Press “Stat”, then “Calc”, then “1-Var Stats”. Set List: L1 (or the applicable list), then select “Calculate”.

Measures of Center Example: A teenager recorded the time (in minutes per day) he spent playing Candy Crush over a 2-week period: Using your GDC, determine the mean and median daily time spent playing this game. Mean: 98.4 minutes; median: minutes Be sure to include units!

Measures of Center Outliers: data values that are either much larger or much smaller than the general body of data. Resistance: the sensitivity of an estimator to extreme observations. Estimators that do not change much with the addition or deletion of extreme observations are said to be resistant.

Effects of Outliers Let’s look at an example of each measure of center with and without an outlier. 1. Mean? Example A: 1, 2, 3, 4, 5; Example B: 1, 2, 3, 4, … 2. Median? Example A: 1, 2, 3, 4, 5; Example B: 1, 2, 3, 4, … 3. Mode? Example A: 1, 2, 3, 3, 4, 5; Example B: 1, 2, 3, 3, 4, …
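The comparison on this slide can be tried out in a few lines of Python. This is only a sketch: the outlier value in Example B did not survive in the transcript, so the value 100 used below is a hypothetical stand-in, and the variable names are illustrative.

```python
from statistics import mean, median, mode

# Example A values come from the slide. The outlier 100 in Example B is a
# hypothetical stand-in, because the slide's own value is missing.
example_a = [1, 2, 3, 4, 5]
example_b = [1, 2, 3, 4, 100]

mean_a, mean_b = mean(example_a), mean(example_b)
median_a, median_b = median(example_a), median(example_b)

# The mode comparison uses the slide's third data set (with a repeated 3),
# again with a hypothetical outlier in place of the last value.
mode_a = mode([1, 2, 3, 3, 4, 5])
mode_b = mode([1, 2, 3, 3, 4, 100])

print("mean:  ", mean_a, "->", mean_b)      # the mean is pulled toward the outlier
print("median:", median_a, "->", median_b)  # the median does not move
print("mode:  ", mode_a, "->", mode_b)      # the mode does not move
```

With this stand-in value, only the mean changes, which is what "non-resistant" means in practice.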

Effects of Outliers The mean is a “follower”: it gets pulled toward the tail of a skewed distribution! Skewed left: mean < median. Skewed right: mean > median.

Effects of Outliers And if the data is approximately symmetrical? Mean ≈ Median

Choosing the Best Measure of Center Mean: commonly used & easy to understand; takes all values into account; affected by extreme values (non-resistant). Median: gives the halfway point of the data; only takes middle values into account; not affected by extreme values (resistant). Mode: gives the most frequently occurring value; only takes common values into account; not affected by extreme values (resistant). Which measure of center to use depends on whether or not you have an outlier, as well as on the shape of the data.

Choosing the Best Measure of Center Determine the best measure of center to use in each of the following situations: 1. A shoe store trying to determine which size shoe to restock. 2. A typical house buyer trying to determine the price they should expect to pay in a particular area that has just a handful of very expensive homes. 3. A cookie company trying to determine the amount of sales they should expect per day when looking at previous data, all of which is relatively consistent.

Measures of Center from a Frequency Table

Data value (x) | Frequency (f)
Total | 40

Measures of Center from a Frequency Table
Data value (x) | Frequency (f) | Product (fx)
Total | 40 | 278
Mean = total of fx ÷ total of f = 278 ÷ 40 = 6.95

Measures of Center from a Frequency Table
Number of aces | Frequency
Total | 55

Measures of Center from a Frequency Table
Number of aces | Frequency | Product
Total | 55 | 179
Mean = 179 ÷ 55 ≈ 3.25 aces

Measures of Center from a Frequency Table How can we make the calculator do this for us? Press “Stat”, then “Edit”, and enter the data values into L1 and the frequencies into L2. Press “Stat”, then “Calc”, then “1-Var Stats”. Set List: L1 (or the applicable list) and Frequency: L2 (or the applicable list), then select “Calculate”.
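The same frequency-table calculation can be sketched in Python. The rows of the slide's tables were not preserved, so the small table below is hypothetical; only the totals 40 and 278 come from the slides. The variable names are illustrative.

```python
from statistics import median

# Hypothetical frequency table (the slide's own rows are missing).
values = [1, 2, 3, 4]        # data values x
frequencies = [2, 5, 2, 1]   # frequencies f

total_f = sum(frequencies)
total_fx = sum(x * f for x, f in zip(values, frequencies))
mean_from_table = total_fx / total_f   # mean = (sum of fx) / (sum of f)

# For the median, expand the table back into a list of raw values.
expanded = [x for x, f in zip(values, frequencies) for _ in range(f)]
median_from_table = median(expanded)

# The same formula applied to the totals that survive on the slide.
slide_mean = 278 / 40

print(mean_from_table, median_from_table, slide_mean)
```

Multiplying each value by its frequency is just a shortcut for adding the value that many times, which is why the expanded list gives the same mean.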

Measures of Center for Grouped Data When data has been grouped, we will use the midpoint or mid-interval value of each class to represent all the scores within the interval. Why do we do this? We are assuming that all the scores within each class are evenly distributed throughout the interval. The mean calculated is an approximation of the true value, and we cannot do any better than this without knowing each individual data value.

Measures of Center for Grouped Data
Age (years) | Frequency (f) | Midpoint (x)
Total | 137

Measures of Center for Grouped Data
Age (years) | Frequency (f) | Midpoint (x) | Product (fx)
Total | 137 | | 5,161
Mean ≈ 5,161 ÷ 137 ≈ 37.7 years

Measures of Center for Grouped Data How can we make the calculator do this for us? Press “Stat”, then “Edit”, and enter the midpoints into L1 and the frequencies into L2. Press “Stat”, then “Calc”, then “1-Var Stats”. Set List: L1 (or the applicable list) and Frequency: L2 (or the applicable list), then select “Calculate”.
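A short Python sketch of the midpoint method follows. The age classes on the slide did not survive, so the class boundaries and frequencies below are hypothetical; only the totals 137 and 5,161 are taken from the slide.

```python
# Hypothetical classes as (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 20, 6), (20, 30, 10)]

# Each class is represented by its midpoint.
midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]

total_f = sum(freqs)
total_fx = sum(x * f for x, f in zip(midpoints, freqs))
approx_mean = total_fx / total_f   # an estimate, not the exact mean

# The same formula applied to the totals shown on the slide.
slide_estimate = 5161 / 137

print(midpoints, approx_mean, round(slide_estimate, 1))
```

The result is an approximation because every value in a class is treated as if it sat at the midpoint.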

Measures of Spread Describe the below graphs: Is using a measure of center enough to accurately describe a distribution?

Measures of Spread Range: maximum value – minimum value. Quartiles: values that split the ordered data into 4 equal parts. Interquartile range (IQR): describes the spread of the middle 50% of the data; IQR = upper quartile – lower quartile = Q3 – Q1. Standard deviation: measures the spread by looking at how far the observations are from the mean; a larger standard deviation means the data are spread farther from the mean.

Range Example. A random sample of people determined the number of cats they wished they had: 10, 12, 12, 13, 15, 16, 9, 10, 14, 11 Find the range. 16 – 9 = 7 cats

Quartiles & IQR To calculate quartiles: 1. Put the data in numerical order. 2. Find the median (Q2). 3. Find the median of the numbers below the median (lower quartile or Q1). 4. Find the median of the numbers above the median (upper quartile or Q3). To calculate the IQR (interquartile range): upper quartile – lower quartile, or Q3 – Q1.

Quartiles & IQR Example. Calculate the median, lower quartile (Q1), upper quartile (Q3), and IQR: 7, 3, 1, 7, 6, 9, 3, 8, 5, 8, 6, 3, 7, 1, 9. First put the data in numerical order: 1, 1, 3, 3, 3, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9. Lower half: 1, 1, 3, 3, 3, 5, 6. Upper half: 7, 7, 7, 8, 8, 9, 9. Median: 6; Q1: 3; Q3: 8; IQR: 8 – 3 = 5
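The quartile steps can be checked with a small Python sketch using the data from this example. The function name quartiles_by_halves is illustrative, and the sketch assumes the "median of each half" convention described in the steps, with the median itself left out of both halves when the number of values is odd. Other software may use a different quartile convention and give slightly different results.

```python
from statistics import median

def quartiles_by_halves(data):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [7, 3, 1, 7, 6, 9, 3, 8, 5, 8, 6, 3, 7, 1, 9]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```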

Range, Quartiles & IQR How can we make the calculator do this for us? Press “Stat”, then “Edit”, and enter the data values into L1. Press “Stat”, then “Calc”, then “1-Var Stats”. Set List: L1 (or the applicable list), then select “Calculate”. Range: maxX – minX. Quartiles: Q1, Med (Q2), Q3. IQR: Q3 – Q1.

Outliers To “officially” determine if a value is an outlier, we use the 1.5 × IQR criterion. Lower bound: Q1 – 1.5 × IQR. Upper bound: Q3 + 1.5 × IQR. Any values beyond this range are outliers!

Measures of Spread Example. Using your GDC, calculate the median, lower quartile (Q1), upper quartile (Q3), and IQR of the following data: Median: 34; lower quartile: 25; upper quartile: 41; IQR: 16. Do we have an outlier? Show how you can tell. Lower outlier bound: Q1 – 1.5 × IQR = 25 – 1.5 × 16 = 1. Upper outlier bound: Q3 + 1.5 × IQR = 41 + 1.5 × 16 = 65. The data value greater than 65 is an outlier because it is beyond the upper outlier bound.
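The outlier bounds can be sketched in Python using the quartiles from this example (Q1 = 25, Q3 = 41). The data set itself is missing from the transcript, so the sample list below is hypothetical, and the function names are illustrative.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Return the (lower, upper) bounds of the k x IQR criterion."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, q1, q3):
    """List the values that fall outside the 1.5 x IQR bounds."""
    lower, upper = outlier_bounds(q1, q3)
    return [x for x in data if x < lower or x > upper]

lower_bound, upper_bound = outlier_bounds(25, 41)

# Hypothetical values, used only to show how a value gets flagged.
sample = [12, 25, 34, 41, 70]
flagged = find_outliers(sample, 25, 41)

print(lower_bound, upper_bound, flagged)
```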

Measures of Spread The problem with using the range & IQR as measures of spread is that both of them only use 2 values in the calculation. Standard deviation (sx) takes into account the deviation of each value from the mean. It is a good measure of the dispersion of the data.

Properties of Standard Deviation (sx) sx measures the spread about the mean and therefore should only be used when the mean is chosen as the measure of center. sx (like the mean) is not resistant. sx = 0 only when there is no spread. When does this happen? When all data values are the same.

Calculating Standard Deviation

Score (x)   x – mean      (x – mean)²
2           2 – 5 = -3    (-3)² = 9
4           4 – 5 = -1    (-1)² = 1
5           5 – 5 = 0     (0)² = 0
5           5 – 5 = 0     (0)² = 0
6           6 – 5 = 1     (1)² = 1
6           6 – 5 = 1     (1)² = 1
7           7 – 5 = 2     (2)² = 4
Total                     16
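The table's calculation can be reproduced in Python with the seven scores 2, 4, 5, 5, 6, 6, 7. This sketch assumes the population form of the standard deviation (dividing by n, the calculator's σx), which is the version these slides ask for on the calculator. The sample form (dividing by n – 1, the calculator's Sx) is computed only for comparison.

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [2, 4, 5, 5, 6, 6, 7]

mean_score = sum(scores) / len(scores)                    # 35 / 7 = 5
squared_devs = [(x - mean_score) ** 2 for x in scores]    # 9, 1, 0, 0, 1, 1, 4
total_squared = sum(squared_devs)                         # matches the table total

sigma = sqrt(total_squared / len(scores))   # population SD (divide by n)
sample_sd = stdev(scores)                   # sample SD (divide by n - 1)

print(total_squared, round(sigma, 2), round(sample_sd, 2))
```

The library function pstdev(scores) returns the same value as sigma, so it can be used to check a hand calculation.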

Standard Deviation How can we make the calculator do this for us? Press “Stat”, then “Edit”, and enter the data values into L1. Press “Stat”, then “Calc”, then “1-Var Stats”. Set List: L1 (or the applicable list), then select “Calculate”. Standard deviation: σx (do NOT use Sx!)

Standard Deviation Example. 50 students were asked to total the number of points that they received on their IB diploma. Their results are shown in the table below. Using your GDC, calculate the mean and standard deviation for the boys and girls separately. Comment on what these numbers mean for each gender.

Standard Deviation Boys’ mean: 34 points. Boys’ standard deviation: 1.23 points. Girls’ mean: 34.3 points. Girls’ standard deviation: 2.41 points. Both boys & girls have a mean of about 34 points. The standard deviation for the boys is small, which implies that most boys scored close to 34 points. However, the standard deviation for the girls is larger, which implies that some girls scored much less than 34 points while others scored much more.

Standard Deviation To find the standard deviation of grouped data, use the mid-interval values. Example. Use your GDC to estimate the mean & standard deviation for this distribution of honors biology test scores. (Remember to use the midpoint values) Mean: 59.8 Standard deviation: 16.8

Choosing the Best Measures When should we use mean & standard deviation? If the distribution is close to symmetrical. When should we use median & quartiles? If the distribution is considerably skewed.