What is variability in data? Variability measures how much the data as a whole deviate from the center, which indicates the spread of the data.

Presentation transcript:

2.4 Measures of Variation
What is variability in data? Variability measures how much the data as a whole deviate from the center, which indicates the spread of the data. The common measures of variation are the range, deviation, variance and standard deviation.

Range
The range is the simplest measure of variation. It is the difference between the largest and smallest entries in the data set:
Range = maximum value - minimum value
The range has the advantage of being easy to compute. Its disadvantage is that it uses only two entries from the entire data set.
Age based on class survey data: 26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23, 48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38.
Range = maximum - minimum = 48 - 16 = 32
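The range calculation above can be reproduced in a few lines. This is a minimal sketch; the helper name `data_range` is an illustrative choice, not something from the slides.

```python
# Ages from the class survey quoted on the slide.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]


def data_range(values):
    """Return maximum - minimum; only the two extreme entries are used."""
    return max(values) - min(values)


print(data_range(ages))  # 48 - 16 = 32
```

Because only the minimum and maximum enter the result, changing any of the other 32 ages leaves the range unchanged, which is the disadvantage the slide points out.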

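Deviation, Variance and Standard Deviation
The deviation of an entry $x_i$ in a population data set is the difference between that entry and the mean $\mu$ of the data set, i.e. $x_i - \mu$.
The population variance of a population data set of $N$ entries is:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
The population standard deviation is the square root of the population variance, i.e.
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
The sample variance of a sample data set of $n$ entries with sample mean $\bar{x}$ is:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
The sample standard deviation is the square root of the sample variance, i.e.
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$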

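Deviation, Variance and Standard Deviation
Age based on class survey: 26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23, 48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38.
Population size $N = 34$, population mean $\mu = 1024/34 \approx 30.12$.
[Table: one row per entry, with columns for the age $x_i$, the deviation $x_i - \mu$ and the squared deviation $(x_i - \mu)^2$; the squared deviations are summed ($\Sigma$) to obtain $\sigma^2$ and $\sigma$.]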
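The table computation for the age data can be sketched as follows, treating the 34 ages as a population as the slide does. The variable names are illustrative, and the sample versions (dividing by N - 1) are shown only for comparison.

```python
import math

# Ages from the class survey quoted on the slide.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]

N = len(ages)                        # population size
mu = sum(ages) / N                   # population mean, 1024 / 34
deviations = [x - mu for x in ages]  # x_i - mu, one per entry
sum_sq = sum(d * d for d in deviations)  # sum of squared deviations

pop_var = sum_sq / N                 # population variance (divide by N)
pop_sd = math.sqrt(pop_var)          # population standard deviation

sample_var = sum_sq / (N - 1)        # sample variance (divide by N - 1)
sample_sd = math.sqrt(sample_var)    # sample standard deviation

print(f"N = {N}, mean = {mu:.2f}")
print(f"population variance = {pop_var:.2f}, sd = {pop_sd:.2f}")
print(f"sample variance = {sample_var:.2f}, sd = {sample_sd:.2f}")
```

For these data the population variance comes out at about 82.28 and the population standard deviation at about 9.07. The deviations themselves always sum to zero, which is why they are squared before averaging.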

Deviation, Variance and Standard Deviation
Variance and standard deviation take all the data into account. However, both are easily influenced by extreme values, because the deviations are squared. Variance is hard to interpret since it is measured in squared units; the standard deviation is in the units of the data and is interpreted as the typical deviation from the mean.

Interpreting Standard Deviation
When interpreting the standard deviation, remember that it is a measure of the typical amount an entry deviates from the mean. The more the entries are spread out, the greater the standard deviation.

Interpreting Standard Deviation
Empirical Rule (or the 68-95-99.7 rule): For a bell-shaped, symmetric distribution, about 68% of the data lie within one standard deviation of the mean, about 95% lie within two standard deviations of the mean, and about 99.7% lie within three standard deviations of the mean.

Interpreting Standard Deviation
Chebychev's theorem: When the distribution is not bell-shaped or symmetric, this theorem gives a lower bound on the proportion of data that lies within $k$ standard deviations of the mean. It states that the proportion of any data set lying within $k$ standard deviations ($k > 1$) of the mean is at least
$$1 - \frac{1}{k^2}$$
For $k = 2$: in any data set, at least $1 - \frac{1}{2^2} = \frac{3}{4}$, i.e. 75%, of the data lie within 2 standard deviations of the mean.
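As a rough check, Chebychev's lower bound of 1 - 1/k^2 (for k > 1) can be compared with the proportion actually observed in the age data from the earlier slides. The function names below are illustrative, and the ages are treated as a population.

```python
import statistics

# Ages from the class survey quoted on the earlier slides.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]


def chebyshev_bound(k):
    """Lower bound on the proportion within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebychev's theorem requires k > 1")
    return 1 - 1 / k**2


def proportion_within(values, k):
    """Observed proportion of values within k population SDs of the mean."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    inside = [x for x in values if abs(x - mu) <= k * sigma]
    return len(inside) / len(values)


for k in (1.5, 2, 3):
    print(k, round(chebyshev_bound(k), 3), round(proportion_within(ages, k), 3))
```

The observed proportion is never below the bound. For the age data every entry lies within two standard deviations of the mean, well above the guaranteed 75%, which shows that the theorem is a conservative guarantee rather than an estimate.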

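Standard Deviation of Grouped Data
The sample standard deviation for a frequency distribution is:
$$s = \sqrt{\frac{\sum_{i=1}^{c} (x_i - \bar{x})^2 f_i}{n - 1}}$$
where $c$ is the number of classes, $x_i$ is the data value of the $i$th class, $f_i$ is the corresponding frequency, $\bar{x}$ is the sample mean and $n = \sum_{i=1}^{c} f_i$ is the sample size.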
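A minimal sketch of the grouped-data calculation follows. It assumes each class is a single distinct value with its frequency, so the result matches the ungrouped sample standard deviation exactly; with binned classes, x_i would typically be the class midpoint and the result would be an approximation. The function name `grouped_sample_sd` is illustrative.

```python
import math
import statistics
from collections import Counter


def grouped_sample_sd(freq):
    """Sample SD from a mapping {value x_i: frequency f_i}."""
    n = sum(freq.values())                             # sample size
    mean = sum(x * f for x, f in freq.items()) / n     # weighted mean
    ss = sum((x - mean) ** 2 * f for x, f in freq.items())
    return math.sqrt(ss / (n - 1))


# Ages from the class survey quoted on the earlier slides.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]

freq = Counter(ages)  # e.g. the age 25 occurs 3 times
print(round(grouped_sample_sd(freq), 4))
print(round(statistics.stdev(ages), 4))  # ungrouped result for comparison
```

Weighting each squared deviation by its frequency is equivalent to repeating the value that many times in the ungrouped formula, so the two printed numbers agree.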

Measures of Position
What are measures of position? A measure of position gives you some idea of where a particular data value would rank in an ordering of the data set, or where a data value falls with respect to the mean of the sample or population.

Quartiles
Quartiles divide the data into 4 equal parts. We need three quartiles, $Q_1$, $Q_2$ and $Q_3$, to divide any data set into 4 equal parts.
About a quarter of the data falls below the first quartile, $Q_1$.
About a half of the data falls below the second quartile, $Q_2$.
About three quarters of the data falls below the third quartile, $Q_3$.
The interquartile range (IQR) of a data set is the difference between the third and first quartiles, $Q_3 - Q_1$.

Quartiles
In essence, five values can be used to describe a data set: the minimum data value, the three quartiles $Q_1$, $Q_2$, $Q_3$, and the maximum data value. These five numbers are called the five-number summary, since together they describe the center and the spread of the data.
Drawing a box-and-whisker plot:
1. Find the five-number summary of the data set.
2. Construct a horizontal scale that spans the range of the data.
3. Plot the five numbers above the horizontal scale.
4. Draw a box above the horizontal scale from $Q_1$ to $Q_3$ and draw a vertical line in the box at $Q_2$.
5. Draw whiskers from the box to the minimum and maximum entries.
For the age data: Min = 16, $Q_1$ = 23.25, $Q_2$ = 27.5, $Q_3$ = 37.5, Max = 48.
[Figure: box-and-whisker plot with the two whiskers, the box, the minimum entry, $Q_1$, $Q_2$ (median), $Q_3$ and the maximum entry labeled.]
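The five-number summary for the age data can be reproduced with Python's standard library. The slides do not say which quartile convention they use; the sketch below assumes the "inclusive" interpolation method of `statistics.quantiles`, which happens to reproduce the quoted values. Other conventions can give slightly different quartiles. The function name `five_number_summary` is illustrative.

```python
import statistics

# Ages from the class survey quoted on the earlier slides.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max) using inclusive interpolation."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


minimum, q1, q2, q3, maximum = five_number_summary(ages)
iqr = q3 - q1  # interquartile range, Q3 - Q1

print(minimum, q1, q2, q3, maximum)  # 16 23.25 27.5 37.5 48
print(iqr)                           # 14.25
```

The box in the plot spans Q1 to Q3, so its width is the IQR, and the whiskers extend to the minimum and maximum entries.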

Percentiles and Other Fractiles
Fractiles are numbers that divide an ordered data set into equal parts. Some commonly used fractiles are:
Quartiles: divide a data set into 4 equal parts. Symbols: $Q_1, Q_2, Q_3$
Deciles: divide a data set into 10 equal parts. Symbols: $D_1, D_2, D_3, \ldots, D_9$
Percentiles: divide a data set into 100 equal parts. Symbols: $P_1, P_2, P_3, \ldots, P_{99}$

z-score
The standard score, or z-score, represents the number of standard deviations a given value $x$ falls from the mean $\mu$. To find the z-score for a given value, use:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$
A z-score can be positive, negative or zero. If $z$ is positive, the data point is greater than the mean; if $z$ is negative, the data point is less than the mean; if $z = 0$, the data point equals the mean.
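A short sketch of the z-score, z = (x - mu) / sigma, applied to the age data from the earlier slides. The ages are treated as a population, and the function name `z_score` is illustrative.

```python
import statistics

# Ages from the class survey quoted on the earlier slides.
ages = [26, 25, 35, 35, 40, 41, 21, 19, 20, 20, 30, 25, 24, 47, 36, 16, 23,
        48, 40, 21, 27, 22, 39, 34, 26, 25, 16, 24, 33, 32, 28, 48, 40, 38]


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma


mu = statistics.mean(ages)       # population mean
sigma = statistics.pstdev(ages)  # population standard deviation

for age in (16, 30, 48):
    print(age, round(z_score(age, mu, sigma), 2))
```

The oldest respondent (48) gets a positive z-score of about 1.97, and the youngest (16) gets a negative one, matching the sign rule on the slide.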