Welcome to Week 04 College Statistics


Descriptive Statistics Averages tell where the data tends to pile up

Descriptive Statistics Another good way to describe data is how spread out it is

Suppose you are using the mean “5” to describe each of the observations in your sample Descriptive Statistics

VARIABILITY IN-CLASS PROBLEMS For which sample would “5” be closer to the actual data values?

VARIABILITY IN-CLASS PROBLEMS In other words, for which of the two sets of data would the mean be a better descriptor?

For which of the two sets of data would the mean be a better descriptor? VARIABILITY IN-CLASS PROBLEMS

Variability Numbers that describe the spread of our data values are called “Measures of Variability”

Variability The variability tells how close to the “average” the sample data tend to be

Variability Just like measures of central tendency, there are several measures of variability

Variability Range = max – min

Variability Interquartile range (symbolized IQR): IQR = 3rd quartile – 1st quartile

Variability “Range Rule of Thumb” A quick-and-dirty estimate of the standard deviation: (Max – Min)/4
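
A minimal Python sketch of these three quick measures. The sample values are hypothetical (not from the slides), and the quartiles come from Python's `statistics.quantiles`, whose default convention may differ slightly from Excel's:

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2, 4, 4, 5, 6, 7, 9]

# Range = max - min
data_range = max(data) - min(data)

# quantiles(..., n=4) returns the three quartile cut points (Q1, median, Q3).
# Different tools use different quartile conventions, so results can vary a little.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Range Rule of Thumb: (max - min) / 4 as a rough guess at the standard deviation
rule_of_thumb = data_range / 4

print("range:", data_range)
print("IQR:", iqr)
print("range rule of thumb:", rule_of_thumb)
print("actual sample standard deviation:", statistics.stdev(data))
```

Comparing the last two printed values shows how rough the rule of thumb can be.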

Variability Variance (symbolized $s^2$): $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
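
As a sketch, the sample variance can be built up step by step: find the mean, square each deviation from it, add the squares, and divide by n-1. The data below are hypothetical, and `statistics.variance` is used only to cross-check the hand calculation:

```python
import statistics

# Hypothetical sample, chosen so the arithmetic is easy to follow
data = [4, 5, 5, 6, 10]

n = len(data)
mean = sum(data) / n

# Squared deviation of each value from the mean
squared_deviations = [(x - mean) ** 2 for x in data]

# The "sum of squares"
sum_of_squares = sum(squared_deviations)

# Sample variance: sum of squares divided by n - 1
s2 = sum_of_squares / (n - 1)

print("mean:", mean)
print("squared deviations:", squared_deviations)
print("sample variance by hand:", s2)
print("statistics.variance:", statistics.variance(data))
```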

Variability

Sums of squared deviations are used in the formula for a circle: $r^2 = (x-h)^2 + (y-k)^2$ where r is the radius of the circle and (h,k) is its center

Variability OK… so if it’s sort of an arithmetic mean, how come it is divided by “n-1” and not “n”?

Variability Every time we estimate something in the population using our sample we have used up a bit of the “luck” that we had in getting a (hopefully) representative sample

Variability To make up for that, we give a little edge to the opposing side of the story

Variability Since a small variability means our sample arithmetic mean is a better estimate of the population mean than a large variability is, we bump up our estimate of variability a tad to make up for it

Variability Dividing by “n” would give us a smaller variance than dividing by “n-1”, so we divide by “n-1”
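
One way to see the effect is to list every possible sample from a tiny population and average the variance estimates. This is only an illustrative sketch: the three-value population and the sample size of 2 (drawn with replacement) are assumptions made for the example, not something from the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tiny population
population = [1, 2, 3]
N = len(population)
mu = Fraction(sum(population), N)

# True population variance (sum of squares divided by N)
sigma2 = sum((x - mu) ** 2 for x in population) / N


def sum_of_squares(sample):
    """Sum of squared deviations of a sample from its own mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)


n = 2
# Every possible sample of size 2, drawn with replacement (9 samples)
samples = list(product(population, repeat=n))

# Average estimate when dividing by n versus n - 1
avg_div_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print("population variance:", sigma2)
print("average estimate dividing by n:", avg_div_n)
print("average estimate dividing by n-1:", avg_div_n_minus_1)
```

In this small case, dividing by n gives estimates that are too small on average, while dividing by n-1 lands on the population variance on average.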

Variability Why not “n-2”?

Variability

Trust me…

Variability

The range, interquartile range and standard deviation are in the same units as the original data (a good thing). The variance is in squared units (which can be confusing…)

Variability Naturally, the measure of variability used most often is the hard-to-calculate one… … the standard deviation

Variability Statisticians like it because it is roughly a typical distance of the data values from the center (the arithmetic mean)
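
A short sketch of how the standard deviation relates to the variance: it is the square root, which brings the measure back to the original units. The data are hypothetical.

```python
import math
import statistics

# Hypothetical sample (say, lengths in millimeters)
data = [4, 5, 5, 6, 10]

s2 = statistics.variance(data)   # in squared units (mm squared)
s = math.sqrt(s2)                # back in the original units (mm)

print("sample variance:", s2)
print("sample standard deviation:", s)
print("statistics.stdev:", statistics.stdev(data))
```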

Variability

Questions?

Variability

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS Min Max

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS Q1 Median Q3

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS Min Max

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

Variability What do you get if you add up all of the deviations? Data: 1, 1, 2, 2, 3, 3 Dev: 1-2 = -1, 1-2 = -1, 2-2 = 0, 2-2 = 0, 3-2 = 1, 3-2 = 1

Variability Zero! That’s true for ALL deviations everywhere in all times! That’s why they are squared in the sum of squares!
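
A quick check of this in Python. The first data set is the six values implied by the slide's deviations (an inference, since the data column is garbled in the transcript); the second is a hypothetical set. Exact fractions are used so rounding cannot hide the result.

```python
from fractions import Fraction


def deviations(data):
    """Deviation of each value from the arithmetic mean, as exact fractions."""
    mean = Fraction(sum(data), len(data))
    return [x - mean for x in data]


slide_data = [1, 1, 2, 2, 3, 3]   # inferred from the slide: mean is 2
other_data = [2, 3, 7, 11]        # hypothetical: mean is not a whole number

for data in (slide_data, other_data):
    devs = deviations(data)
    print("data:", data)
    print("  sum of deviations:", sum(devs))
    print("  sum of squared deviations:", sum(d ** 2 for d in devs))
```

The plain sum is zero both times, while the sum of squares is not, which is why the squares are the ones used to measure spread.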

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

YAY!

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

VARIABILITY IN-CLASS PROBLEMS

Variability Aren’t you glad Excel does all this for you???

Questions?

Variability

Naturally, these are going to have funny Greek-y symbols just like the averages …

Variability The population variance is “$\sigma^2$”, called “sigma-squared”. The population standard deviation is “$\sigma$”, called “sigma”.

Variability Again, the sample statistics $s^2$ and $s$ estimate the population parameters $\sigma^2$ and $\sigma$ (which are unknown)

Variability

$s^2$ vs. $\sigma^2$

Variability For $s^2$ the sum of squares is divided by “n-1”; for $\sigma^2$ it is divided by “n”
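
Python's standard `statistics` module has both versions, which makes the difference easy to see. This is a sketch with hypothetical data: `variance` and `stdev` divide by n-1 (sample), while `pvariance` and `pstdev` divide by n (treating the data as the whole population).

```python
import statistics

# Hypothetical data; the sum of squared deviations from the mean (6) is 22
data = [4, 5, 5, 6, 10]

s2 = statistics.variance(data)       # 22 / (5 - 1)
sigma2 = statistics.pvariance(data)  # 22 / 5

print("sample variance (n-1):", s2)
print("population variance (n):", sigma2)
print("sample standard deviation:", statistics.stdev(data))
print("population standard deviation:", statistics.pstdev(data))
```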

Questions?

Variability Outliers! They can really affect your statistics!

Suppose we originally had data: Suppose we now have data: Is the mode affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original mode: 1 New mode: 1 OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the midrange affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original midrange: 3 New midrange: 371 OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the median affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original median: 1.5 New median: 1.5 OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the mean affected? OUTLIERS IN-CLASS PROBLEMS

OUTLIERS IN-CLASS PROBLEMS

Outliers! How about measures of variability?

Suppose we originally had data: Suppose we now have data: Is the range affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original range: 4 New range: 740 OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the interquartile range affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original IQR: 2.5 – 1 = 1.5 New IQR: 1.5 OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the variance affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original $s^2$: ≈2.57 New $s^2$: ≈91, OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Is the standard deviation affected? OUTLIERS IN-CLASS PROBLEMS

Suppose we originally had data: Suppose we now have data: Original s: ≈1.60 New s: ≈ OUTLIERS IN-CLASS PROBLEMS
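
The comparison can be sketched in Python. The actual data sets did not survive in this transcript, so the two lists below are assumptions: values chosen to be consistent with the figures reported on the slides (mode 1, median 1.5, midrange 3 vs. 371, range 4 vs. 740). The IQR is left out because quartile conventions differ between tools.

```python
import statistics


def summarize(data):
    """Return the measures discussed in the outlier slides for one data set."""
    return {
        "mode": statistics.mode(data),
        "midrange": (min(data) + max(data)) / 2,
        "median": statistics.median(data),
        "mean": statistics.mean(data),
        "range": max(data) - min(data),
        "variance": statistics.variance(data),
        "stdev": statistics.stdev(data),
    }


# Assumed data, picked to match the values reported on the slides
original = [1, 1, 1, 2, 3, 5]
with_outlier = [1, 1, 1, 2, 3, 741]

before = summarize(original)
after = summarize(with_outlier)

for name in before:
    changed = "affected" if before[name] != after[name] else "not affected"
    print(f"{name:>9}: {before[name]:>12.2f} -> {after[name]:>12.2f}  ({changed})")
```

With these assumed values, the mode and median stay put, while the midrange, mean, range, variance and standard deviation all jump.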

Questions?

Descriptive Statistics Last week we got this summary table from Excel - Descriptive Statistics

                     Beans          Liquor        Butter     BEQ
Mean                 72,836.8       5,…           …          …,030.2
Standard Error       1,…            …             …          …,528.7
Median               72,539.0       5,…           …          …,617.2
Mode                 #N/A           …             …          …
Standard Deviation   9,359.4        1,580.2       3,024.1    7,794.8
Sample Variance      87,599,301.8   2,496,988.9   9,145,…    …,759,154.8
Kurtosis             …              …             …          …
Skewness             …              …             …          …
Range                32,359.4       6,477.2       9,…        …,075.8
Midrange             71,625.3       5,…           …          …,849.2
Minimum              55,445.6       1,…           …          …,311.3
Maximum              87,805.0       8,…           …          …,387.1
Sum                  1,893,…        …             …,975.2    2,704,784.1
Count                26.0

[“…” marks digits missing from the slide text]

Descriptive Statistics Which are Measures of Central Tendency? (Same Excel summary table as on the previous slide.)

Descriptive Statistics Which are Measures of Variability? (Same Excel summary table as before.)

Questions?

Variability Ok… swell… but WHAT DO YOU USE THESE MEASURES OF VARIABILITY FOR???

Variability From last week – THE BEANS! We wanted to know – could you use sieves to separate the beans?
[Table: columns Moong-L, Moong-W, Moong-D, Black-L, Black-W, Black-D, Cran-L, Cran-W, Cran-D, Lima-L, Lima-W, Lima-D, Fava-L, Fava-W, Fava-D; rows Mean, Standard Deviation, Sample Variance, Range, Minimum, Maximum]

You could have plotted the mean measurement for each bean type: Variability

This might have helped you tell whether sieves could separate the types of beans Variability

But… beans are not all “average” – smaller beans might slip through the holes of the sieve! How could you tell if the beans were totally separable? Variability

Make a graph that includes not just the average, but also the spread of the measurements! Variability
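
The same question can also be checked with numbers instead of a graph: two bean types are cleanly separable on one measurement only if their min-to-max intervals do not overlap. The sketch below uses made-up measurements (the bean table values are not available in this transcript), and the helper name `overlaps` is just for illustration.

```python
from itertools import combinations

# Hypothetical (minimum, maximum) measurements in mm for one dimension
bean_ranges = {
    "Moong": (4.0, 5.5),
    "Black": (7.0, 9.5),
    "Lima": (9.0, 14.0),
}


def overlaps(a, b):
    """True if the intervals a = (min, max) and b = (min, max) share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Compare every pair of bean types
for (name_a, range_a), (name_b, range_b) in combinations(bean_ranges.items(), 2):
    if overlaps(range_a, range_b):
        verdict = "overlap, so a sieve cannot fully separate them"
    else:
        verdict = "no overlap, so a sieve could separate them"
    print(f"{name_a} vs {name_b}: {verdict}")
```

Comparing only the means would miss the overlapping case, which is the point of looking at spread as well as average.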

New Excel Graph: hi-lo-close

Variability Rearrange your data so that the labels are followed by the maximums, then the minimums, then the means:
[Table: columns Moong-L, Moong-W, Moong-D, Black-L, Black-W, Black-D, Cran-L, Cran-W, Cran-D, Lima-L, Lima-W, Lima-D, Fava-L, Fava-W, Fava-D; rows Maximum, Minimum, Mean]

Highlight this data. Click “Insert”. Click “Other Charts”. Click the first Stock chart: “Hi-Lo-Close”.

Ugly… as usual …but informative!

Left click the graph area. Click on “Layout”.

Enter title and y-axis label:

Click one of the “mean” markers on the graph. Click Format Data Series.

Click Marker Options to adjust the markers

Repeat for the max (top of black vertical line) and min (bottom of black vertical line)

TAH DAH!

Which beans can you sieve?

Questions?

How to Lie with Statistics #4: You can probably guess… It involves using the type of measure of variability that serves your purpose best. This is almost always the smallest one.

Questions?