# NUMERICAL DESCRIPTIVE STATISTICS Measures of Variability.


Another Description of the Data: Variability

For Data Set A below, the mean of the 10 observations is 2.60.

- SET A: 4, 2, 3, 3, 2, 2, 1, 4, 3, 2

But each of the following two data sets with 10 observations also has a mean of 2.60.

- SET B: 2, 2, 2, 2, 3, 3, 3, 3, 3, 3
- SET C: 0, 0, 1, 1, 4, 4, 4, 4, 4, 4

Although sets A, B, and C all have the same mean, the “spread” of the data differs from set to set.
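As a quick check of the claim above, the following Python sketch computes the mean of each set together with one simple indicator of spread, the distance between the largest and smallest value. The helper name `summarize` is an assumption made for illustration and is not part of the presentation.

```python
from statistics import mean

# Data sets A, B, and C as listed on the slide.
set_a = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
set_b = [2, 2, 2, 2, 3, 3, 3, 3, 3, 3]
set_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]


def summarize(data):
    """Return (mean, largest value minus smallest value) for a list of numbers."""
    return mean(data), max(data) - min(data)


for name, data in [("A", set_a), ("B", set_b), ("C", set_c)]:
    m, spread = summarize(data)
    print(f"Set {name}: mean = {m:.2f}, max - min = {spread}")
```

All three means come out to 2.60, while the distance between the extremes is 3 for set A, 1 for set B, and 4 for set C.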

The “Spread” of the Data

[Figure: plots of Data Set A, Data Set B, and Data Set C, with one set labeled as having the most “spread” and another as having the least “spread”.]

Measures of Variability

Population

- Variance σ²
- Standard Deviation σ

Sample

- Range
- Variance s²
- Standard Deviation s

The Range

When we are talking about a sample, the range is the difference between the highest and lowest observations. In the sample, the highest values were A’s (4’s) and the lowest value was a D (1).

- Sample range = 4 - 1 = 3

Another Approach to Variability

The range takes into account only the two most extreme values. A better approach:

- Look at the variability of all the data.
- In some sense, find the “average” deviation from the mean.
- The value of an observation minus the mean can be positive or negative.
- The pluses and minuses cancel each other out, giving an average deviation of 0.
- We need another measure.
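The cancellation follows directly from the definition of the mean. Written for a population of N values with mean μ (the indexing notation is an assumption; the slide states the result in words only):

$$\sum_{i=1}^{N}(x_i-\mu)=\sum_{i=1}^{N}x_i-N\mu=N\mu-N\mu=0$$

The same argument holds for a sample, with n in place of N and the sample mean in place of μ.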

How to Average Only Positive Deviations

MEAN ABSOLUTE DEVIATION (MAD)

- Averages the absolute values of the differences from the mean
- Used in quality control/inventory analyses
- But this quantity is hard to work with algebraically and analytically

POPULATION VARIANCE (σ²)

- Averages the squares of the differences from the mean
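To make the two averages concrete, here is a small Python sketch. It treats the ten values of Data Set A as if they were a complete population, which is an assumption made only for illustration; the variable names are likewise hypothetical.

```python
from statistics import mean

# Data Set A, treated here as a whole population (an illustrative assumption).
values = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]

center = mean(values)
deviations = [x - center for x in values]

# The raw deviations cancel, so their sum is zero (up to floating-point rounding).
print(f"sum of deviations = {sum(deviations):.10f}")

# Mean absolute deviation: average of the absolute deviations.
mad = mean([abs(d) for d in deviations])

# Average of the squared deviations (divides by N, as for a population variance).
mean_squared = mean([d ** 2 for d in deviations])

print(f"MAD = {mad:.2f}, mean squared deviation = {mean_squared:.2f}")
```

For these values the MAD is 0.80 and the mean squared deviation is 0.84.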

Population Variance Formulas
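The formulas themselves are not preserved in the transcript. The standard definitional and shortcut forms, given here on the assumption that they match what the slide showed, are:

$$\sigma^2=\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}=\frac{\sum_{i=1}^{N}x_i^2-N\mu^2}{N}$$

Here N is the population size and μ is the population mean. The second form avoids computing each deviation individually.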

EXAMPLE: Calculation of σ²

Using the numbers from the population of 2000 GPA’s: 4, 2, 1, 3, 3, 3, 2, …, 2

Standard Deviation

But the unit of measurement for σ² is square grade points. What is a square grade point? To get back to the original units (grade points), take the square root of σ².

STANDARD DEVIATION (σ): the square root of the variance, σ²

Calculation of the Standard Deviation (σ) For the grade point data:

Estimating σ²

SAMPLE VARIANCE (s²)

- The best estimate for σ² is s².
- s² is found by using the sample data and the formula for σ², except that the sample mean x̄ replaces μ and the sum of squared deviations is divided by n - 1 instead of N.

Sample Variance Formulas
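The formulas are not preserved in the transcript. The standard forms, which agree with the n - 1 divisor used in the Excel worksheet later in the presentation, are given here as an assumption about what the slide showed:

$$s^2=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}=\frac{\sum_{i=1}^{n}x_i^2-n\bar{x}^2}{n-1}$$

Here n is the sample size and x̄ is the sample mean.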

Calculations for s²

The data from the sample are: 4, 2, 3, 3, 2, 2, 1, 4, 3, 2
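The worked calculation is not preserved in the transcript, so the following Python sketch reproduces it under the standard n - 1 definition. It computes s² in both the definitional and the shortcut form and cross-checks the result against the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

# The sample of 10 grade points from the slide.
sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
n = len(sample)
x_bar = mean(sample)

# Definitional form: sum of squared deviations from the sample mean, over n - 1.
s2_definition = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Shortcut form: (sum of squares - n * mean^2) over n - 1.
s2_shortcut = (sum(x * x for x in sample) - n * x_bar ** 2) / (n - 1)

# Sample standard deviation is the square root of the sample variance.
s = sqrt(s2_definition)

print(f"x_bar = {x_bar:.2f}")
print(f"s^2 (definition) = {s2_definition:.4f}")
print(f"s^2 (shortcut)   = {s2_shortcut:.4f}")
print(f"s = {s:.4f}")

# statistics.variance and statistics.stdev also divide by n - 1.
print(f"library check: {variance(sample):.4f}, {stdev(sample):.4f}")
```

The sum of squared deviations is 8.4, so s² = 8.4/9 ≈ 0.9333 and s ≈ 0.9661.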

Sample Standard Deviation, s

The best estimate for σ is denoted s and is called the sample standard deviation. s is found by taking the square root of s².

s² for Grouped Data

For the grade point example:

- 4 occurs 2 times
- 3 occurs 3 times
- 2 occurs 4 times
- 1 occurs 1 time

To calculate the sample variance, s², rather than write each term down every time it occurs, multiply the squared deviations by their class frequencies.

Calculation of s² for Grouped Data
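The calculation on this slide is not preserved in the transcript. A minimal Python sketch of the grouped approach, using the frequencies listed for the grade point sample, might look like this (the dictionary layout and names are assumptions for illustration):

```python
from math import sqrt

# Grade value -> number of times it occurs in the sample of 10.
frequencies = {4: 2, 3: 3, 2: 4, 1: 1}

# Sample size is the total of the frequencies.
n = sum(frequencies.values())

# Grouped mean: each value weighted by its frequency.
x_bar = sum(value * count for value, count in frequencies.items()) / n

# Each squared deviation is multiplied by its class frequency, then summed.
weighted_sq_dev = sum(
    count * (value - x_bar) ** 2 for value, count in frequencies.items()
)

s2 = weighted_sq_dev / (n - 1)
s = sqrt(s2)

print(f"n = {n}, x_bar = {x_bar:.2f}, s^2 = {s2:.4f}, s = {s:.4f}")
```

The result matches the ungrouped calculation, since the grouped form only collects identical terms.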

Extension to Class Data

Sometimes data are given in classes instead of individual observations. In this case, the midpoints of the classes can be used to get an approximation of the mean and standard deviation using the grouped data approach.

Example

Suppose a sample of 100 salaries of CFOs of small market firms is summarized below.

| Salary Range | Midpoint | Frequency |
|---|---|---|
| \$80,000-\$100,000 | \$90,000 | 12 |
| \$100,000-\$120,000 | \$110,000 | 38 |
| \$120,000-\$140,000 | \$130,000 | 40 |
| \$140,000-\$160,000 | \$150,000 | 10 |
| Total | | 100 |

Approximations for x̄ and s
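The computed values are not preserved in the transcript. Applying the grouped data approach to the class midpoints and frequencies of the salary example gives the following sketch (the list layout and variable names are illustrative assumptions):

```python
from math import sqrt

# (class midpoint in dollars, frequency) for the CFO salary sample.
classes = [(90_000, 12), (110_000, 38), (130_000, 40), (150_000, 10)]

n = sum(freq for _, freq in classes)

# Approximate mean: every observation is treated as sitting at its class midpoint.
x_bar = sum(mid * freq for mid, freq in classes) / n

# Shortcut form of the sample variance applied to the midpoints.
sum_sq = sum(freq * mid ** 2 for mid, freq in classes)
s2 = (sum_sq - n * x_bar ** 2) / (n - 1)
s = sqrt(s2)

print(f"n = {n}")
print(f"approximate mean = ${x_bar:,.0f}")
print(f"approximate s    = ${s:,.0f}")
```

With these figures the approximate mean is \$119,600 and the approximate standard deviation is about \$16,692. Both are approximations because the individual salaries within each class are unknown.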

Interpreting s (Mound-Shaped Distribution)

If the data form a mound-shaped distribution:

- Within ±1s of the mean: approximately 68% of the measurements
- Within ±2s of the mean: approximately 95% of the measurements
- Within ±3s of the mean: approximately all of the measurements

Interpreting s (Any Distribution)

If the data are not mound-shaped (or the shape is unknown):

- Within ±2s of the mean: at least 75% of the measurements
- Within ±3s of the mean: at least 88.9% of the measurements
- Within ±ks of the mean (k > 1): at least 1 - 1/k² of the measurements
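The general bound can be evaluated for any k > 1 and compared with what a data set actually shows. The sketch below does this for Data Set C, which is not mound-shaped; the function names `chebyshev_bound` and `fraction_within` are hypothetical helpers, not part of the presentation.

```python
from statistics import mean, stdev


def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of the data within k sample standard deviations of the mean."""
    center = mean(data)
    spread = stdev(data)
    inside = [x for x in data if abs(x - center) <= k * spread]
    return len(inside) / len(data)


# Data Set C from the opening slide.
set_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]

for k in (1.5, 2, 3):
    print(
        f"k = {k}: bound = {chebyshev_bound(k):.3f}, "
        f"observed = {fraction_within(set_c, k):.3f}"
    )
```

The bound is a guaranteed minimum, so the observed fraction can be well above it, as it is here.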

Coefficient of Variation

Another measure of variability that is frequently used to compare different data sets (even if measured in different units) is the Coefficient of Variation (CV):

CV = (Standard Deviation/Mean) x 100%

CV Comparison of Return on Investment

\$1000 invested in Stock Fund A:

- Average Weekly Return = \$1.90
- Standard Deviation = \$0.38

\$5000 invested in Stock Fund B:

- Average Weekly Return = \$12.50
- Standard Deviation = \$1.75

CV A = (0.38/1.90) x 100% = 20%

CV B = (1.75/12.50) x 100% = 14%

Conclusion: Even though the risk (as measured by s) is larger for Stock Fund B, the variability of the returns relative to the mean (as measured by the CV) is less for Stock Fund B.
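The comparison can be reproduced with a one-line function. This is a minimal sketch; the function name `coefficient_of_variation` is an assumption chosen for readability.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Return the CV as a percentage: (standard deviation / mean) x 100."""
    return std_dev / mean_value * 100


# Figures from the Stock Fund A and Stock Fund B slide.
cv_a = coefficient_of_variation(0.38, 1.90)
cv_b = coefficient_of_variation(1.75, 12.50)

print(f"CV A = {cv_a:.0f}%")
print(f"CV B = {cv_b:.0f}%")
print("Fund B has the lower relative variability:", cv_b < cv_a)
```

Because the CV divides out the units, the two funds can be compared even though the amounts invested differ.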

Range Approximation for σ

If the data are relatively mound-shaped, a “good” approximation for σ is σ ≈ (range)/4. Sometimes, when one is more certain that the sample range captures the entire population of data, statisticians use σ ≈ (range)/6.
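As a rough illustration, the sketch below applies the range/4 rule to the ten-observation grade point sample and compares it with the computed sample standard deviation. With so few observations the agreement is only approximate, and the variable names are illustrative.

```python
from statistics import stdev

# The grade point sample used earlier in the presentation.
sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]

sample_range = max(sample) - min(sample)

# Quick estimate of the standard deviation from the range alone.
range_estimate = sample_range / 4

# Sample standard deviation computed from all the data, for comparison.
s = stdev(sample)

print(f"range = {sample_range}")
print(f"range / 4 = {range_estimate:.2f}")
print(f"s = {s:.2f}")
```

Here the range is 3, so the quick estimate is 0.75, against a computed s of about 0.97.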

Using Excel

Suppose population data is in cells A2 to A2001:

- Population variance (σ²): =VARP(A2:A2001)
- Population standard dev. (σ): =STDEVP(A2:A2001)

Suppose sample data is in cells A2 to A11:

- Sample variance (s²): =VAR(A2:A11)
- Sample standard dev. (s): =STDEV(A2:A11)

Alternatively, use Data Analysis.

[Screenshot callouts for the Data Analysis dialog: the input range is where the data values are stored; check Labels; check both Summary Statistics and Confidence Level; enter the name of the output worksheet.]

[Screenshot callouts for the output worksheet: drag to make column A wider; the output includes the sample standard deviation and the sample variance.]

Using Excel for Grouped Data

The Excel function SUMPRODUCT multiplies the corresponding numbers of two columns together and then adds the products, e.g. =SUMPRODUCT(A2:A4,B2:B4) is equivalent to =A2*B2+A3*B3+A4*B4. This function aids in calculating the mean, variance, and standard deviation for grouped data in the following worksheet.
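For readers working outside Excel, the same operation can be sketched in Python. The `sumproduct` function below is a hypothetical stand-in for the Excel function, and the midpoints and frequencies are made-up illustrative values, not data from the presentation.

```python
def sumproduct(xs, ys):
    """Multiply corresponding entries of two equal-length lists and add the products."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    return sum(x * y for x, y in zip(xs, ys))


# Hypothetical grouped data: class midpoints and their frequencies.
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

# 10*2 + 20*5 + 30*3 = 210
total = sumproduct(midpoints, frequencies)

# Grouped mean: the sum of products divided by the total frequency.
grouped_mean = total / sum(frequencies)

print(total, grouped_mean)
```

Dividing the sum of products by the total frequency mirrors the worksheet formula for the grouped mean.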

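[Worksheet callouts:]

- Enter midpoints (column A) and frequencies (column B)
- Total frequencies (cell B8)
- Square the midpoint in cell A2 with =A2^2, then drag down to D6
- Mean: =SUMPRODUCT(A2:A6,B2:B6)/B8
- Variance: =(SUMPRODUCT(B2:B6,D2:D6)-200*B10^2)/199, where B10 holds the mean and the worksheet has n = 200 observations
- Standard deviation: =SQRT(B11)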

Review

Measures of variability for populations and samples:

- Range
- Variance
- Standard Deviation

Formulas for:

- Complete Data
- Frequency Data

Excel:

- Functions
- Data Analysis