Measures of Dispersion

Presentation transcript:

Chapter 3, Section 2: Measures of Dispersion

Chapter 3 – Section 2. Comparing two sets of data: the measures of central tendency (mean, median, mode) describe differences between the "average" or "typical" values of two data sets, while the measures of dispersion in this section describe differences in how far "spread out" the data values are.

Chapter 3 – Section 2. The range of a variable is the largest data value minus the smallest data value. Compute the range of 6, 1, 2, 6, 11, 7, 3, 3: the largest value is 11 and the smallest value is 1, so the range is 11 – 1 = 10.
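The range calculation can be checked with a few lines of Python; this is a minimal sketch using only the built-in max and min functions:

```python
data = [6, 1, 2, 6, 11, 7, 3, 3]

# Range = largest data value minus smallest data value
data_range = max(data) - min(data)
print(data_range)  # 11 - 1 = 10
```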

Chapter 3 – Section 2. The range uses only two values in the data set, the largest value and the smallest value, so the range is not resistant. If we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2, the range would become 6000 – 1 = 5999.

Chapter 3 – Section 2. The variance is based on the deviations from the mean: (xᵢ – μ) for populations and (xᵢ – x̄) for samples. So that positive and negative differences do not cancel, we square the deviations: (xᵢ – μ)² for populations and (xᵢ – x̄)² for samples.

Chapter 3 – Section 2. The population variance of a variable is the sum of these squared deviations divided by the number of values in the population: σ² = Σ(xᵢ – μ)² / N. The population variance is represented by σ². Note: for accuracy, use as many decimal places as your calculator allows.

Chapter 3 – Section 2. Compute the population variance of 6, 1, 2, 11. Compute the population mean first: μ = (6 + 1 + 2 + 11) / 4 = 5. Now compute the squared deviations: (1 – 5)² = 16, (2 – 5)² = 9, (6 – 5)² = 1, (11 – 5)² = 36. Average the squared deviations: (16 + 9 + 1 + 36) / 4 = 15.5. The population variance σ² is 15.5.
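The same arithmetic can be reproduced in Python; the sketch below follows the slide's steps directly rather than calling a library routine:

```python
population = [6, 1, 2, 11]
N = len(population)

# Population mean: (6 + 1 + 2 + 11) / 4 = 5
mu = sum(population) / N

# Squared deviations from the mean: 1, 16, 9, 36
squared_deviations = [(x - mu) ** 2 for x in population]

# Population variance: divide the sum of squared deviations by N
sigma_squared = sum(squared_deviations) / N
print(sigma_squared)  # 15.5
```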

Chapter 3 – Section 2. The sample variance of a variable is the sum of these squared deviations divided by one less than the number in the sample: s² = Σ(xᵢ – x̄)² / (n – 1). The sample variance is represented by s². We say that this statistic has n – 1 degrees of freedom.

Chapter 3 – Section 2. Compute the sample variance of 6, 1, 2, 11. Compute the sample mean first: x̄ = (6 + 1 + 2 + 11) / 4 = 5. Now compute the squared deviations: (1 – 5)² = 16, (2 – 5)² = 9, (6 – 5)² = 1, (11 – 5)² = 36. Sum the squared deviations and divide by n – 1: (16 + 9 + 1 + 36) / 3 ≈ 20.7. The sample variance s² is 20.7.
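For the sample variance, only the final division changes (n – 1 instead of N). A minimal sketch, which also shows that the standard library's statistics.variance uses the same n – 1 denominator:

```python
import statistics

sample = [6, 1, 2, 11]
n = len(sample)

# Sample mean: (6 + 1 + 2 + 11) / 4 = 5
x_bar = sum(sample) / n

# Sample variance: divide the sum of squared deviations by n - 1
s_squared = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
print(round(s_squared, 1))                    # 20.7

# statistics.variance also divides by n - 1
print(round(statistics.variance(sample), 1))  # 20.7
```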

Chapter 3 – Section 2. Why are the population variance (15.5) and the sample variance (20.7) different for the same set of numbers? In the first case, { 6, 1, 2, 11 } was the entire population (divide by N); in the second case, { 6, 1, 2, 11 } was just a sample from the population (divide by n – 1). These are two different situations.

Chapter 3 – Section 2. Why do we use different formulas? The reason is that using the sample mean is not quite as accurate as using the population mean. If we used "n" in the denominator for the sample variance calculation, we would get a "biased" result; bias here means that we would tend to underestimate the true variance.

Chapter 3 – Section 2. The standard deviation is the square root of the variance. The population standard deviation is the square root of the population variance (σ²) and is represented by σ. The sample standard deviation is the square root of the sample variance (s²) and is represented by s.

Chapter 3 – Section 2. If the population is { 6, 1, 2, 11 }, the population variance is σ² = 15.5 and the population standard deviation is σ = √15.5 ≈ 3.94. If the sample is { 6, 1, 2, 11 }, the sample variance is s² = 20.7 and the sample standard deviation is s = √20.7 ≈ 4.55. The population standard deviation and the sample standard deviation apply in different situations.
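The two standard deviations are simply the square roots of the variances computed above; a short sketch using the standard library's pstdev (population) and stdev (sample) functions:

```python
import statistics

data = [6, 1, 2, 11]

# Population standard deviation: sqrt(15.5) ~ 3.94
sigma = statistics.pstdev(data)

# Sample standard deviation: sqrt(62 / 3) ~ 4.55
s = statistics.stdev(data)

print(round(sigma, 2), round(s, 2))  # 3.94 4.55
```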

Chapter 3 – Section 2 The standard deviation is very useful for estimating probabilities

Chapter 3 – Section 2. The empirical rule: if the distribution is roughly bell shaped, then approximately 68% of the data will lie within 1 standard deviation of the mean, approximately 95% of the data will lie within 2 standard deviations of the mean, and approximately 99.7% of the data (i.e. almost all) will lie within 3 standard deviations of the mean.

Chapter 3 – Section 2. For a variable with mean 17 and standard deviation 3.4: approximately 68% of the values will lie between (17 – 3.4) and (17 + 3.4), i.e. 13.6 and 20.4; approximately 95% of the values will lie between (17 – 2 × 3.4) and (17 + 2 × 3.4), i.e. 10.2 and 23.8; approximately 99.7% of the values will lie between (17 – 3 × 3.4) and (17 + 3 × 3.4), i.e. 6.8 and 27.2. A value of 2.1 and a value of 33.2 would both be very unusual.
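The three intervals can be generated mechanically from the mean and standard deviation; a minimal sketch using the slide's numbers (mean 17, standard deviation 3.4):

```python
mean = 17
std_dev = 3.4

# Empirical rule for roughly bell-shaped data: about 68%, 95%, and 99.7%
# of the values lie within 1, 2, and 3 standard deviations of the mean.
for k, pct in [(1, 68), (2, 95), (3, 99.7)]:
    low = mean - k * std_dev
    high = mean + k * std_dev
    print(f"about {pct}% of values between {low:.1f} and {high:.1f}")

# about 68% of values between 13.6 and 20.4
# about 95% of values between 10.2 and 23.8
# about 99.7% of values between 6.8 and 27.2
```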

Chapter 3 – Section 2. Chebyshev's inequality gives a lower bound on the percentage of observations that lie within k standard deviations of the mean (where k > 1). This lower bound, 1 – 1/k² expressed as a percentage, is an estimated percentage: the actual percentage for any variable cannot be lower than this number, so the actual percentage must be this value or higher.

Chapter 3 – Section 2. Chebyshev's inequality: for any data set, at least (1 – 1/k²) × 100% of the observations will lie within k standard deviations of the mean, where k is any number greater than 1.

Chapter 3 – Section 2. How much of the data lies within 1.5 standard deviations of the mean? From Chebyshev's inequality, 1 – 1/1.5² = 1 – 1/2.25 ≈ 0.556, so at least 55.6% of the data will lie within 1.5 standard deviations of the mean.

Chapter 3 – Section 2. If the mean is equal to 20 and the standard deviation is equal to 4, how much of the data lies between 14 and 26? The values 14 and 26 are each 1.5 standard deviations (1.5 × 4 = 6) from the mean of 20, so at least 55.6% of the data will lie between 14 and 26.
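The bound 1 – 1/k² is straightforward to compute; this sketch reproduces both examples (k = 1.5, and the interval from 14 to 26 for a mean of 20 and standard deviation of 4). The function name chebyshev_bound is just an illustrative choice:

```python
def chebyshev_bound(k):
    """Lower bound on the fraction of data within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

# Within 1.5 standard deviations of the mean: at least 55.6%
print(round(chebyshev_bound(1.5) * 100, 1))  # 55.6

# Mean 20, standard deviation 4: the interval 14 to 26 is mean +/- 1.5 standard deviations
k = (26 - 20) / 4  # 1.5
print(round(chebyshev_bound(k) * 100, 1))    # 55.6, so at least 55.6% lies between 14 and 26
```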

Summary: Chapter 3 – Section 2. Range: the maximum minus the minimum; not a resistant measurement. Variance and standard deviation: measure deviations from the mean. Empirical rule: about 68% of the data is within 1 standard deviation of the mean and about 95% is within 2 standard deviations.