
Business Statistics Spring 2005 Summarizing and Describing Numerical Data.


1 Business Statistics Spring 2005 Summarizing and Describing Numerical Data

2 Topics Measures of Central Tendency: Mean, Median, Mode, Midrange, Midhinge. Quartiles. Measures of Variation: The Range, Interquartile Range, Variance and Standard Deviation, Coefficient of Variation. Shape: Symmetric, Skewed, using Box-and-Whisker Plots.

3 Numerical Data Properties: Central Tendency (Location), Variation (Dispersion), Shape

4 Measures of Central Tendency for Ungrouped Data Raw Data

5 Summary Measures Central Tendency: Mean, Median, Mode, Midrange, Midhinge. Quartile. Variation: Range, Variance, Standard Deviation, Coefficient of Variation.

6 Measures of Central Tendency: Mean, Median, Mode, Midrange, Midhinge

7 Population Mean For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values: $\mu = \frac{\sum X}{N}$, where $\mu$ stands for the population mean, $N$ is the total number of observations in the population, $X$ stands for a particular value, and $\sum$ indicates the operation of adding.

8 Population Mean Example Parameter: a measurable characteristic of a population. The Kane family owns four cars. The following is the mileage attained by each car: 56,000, 23,000, 42,000, and 73,000. Find the average mileage per car. The mean is (56,000 + 23,000 + 42,000 + 73,000)/4 = 48,500.
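The Kane family calculation can be reproduced in a few lines of Python. This is only a minimal sketch of the population mean formula; the variable names are illustrative and do not come from the slides.

```python
# Mileage of the four Kane family cars (values from the slide).
mileages = [56_000, 23_000, 42_000, 73_000]

# Population mean: sum of all values divided by the number of values N.
population_mean = sum(mileages) / len(mileages)

print(population_mean)  # 48500.0
```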

9 Sample Mean For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values: $\bar{X} = \frac{\sum X}{n}$, where $\bar{X}$ stands for the sample mean and $n$ is the total number of values in the sample.

10 Return on Stock
Year: 1998, 1997, 1996, 1995, 1994
Stock X: 10%, 8%, 12%, 2%, 8% (total 40%)
Stock Y: 17%, -2%, 16%, 1%, 8% (total 40%)
Average Return on Stock = 40 / 5 = 8% for both stocks

11 The Mean (Arithmetic Average) It is the Arithmetic Average of data values. Sample Mean: $\bar{X} = \frac{\sum X}{n}$. The Most Common Measure of Central Tendency. Affected by Extreme Values (Outliers). [Figure: two number lines, one with Mean = 5 and one, extending to 14, with Mean = 6]

12 Properties of the Arithmetic Mean Every set of interval-level and ratio-level data has a mean. All the values are included in computing the mean. A set of data has a unique mean. The mean is affected by unusually large or small data values. The mean is relatively reliable. The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.

13 EXAMPLE Consider the set of values: 3, 8, and 4. The mean is 5. Illustrating the last property, (3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0. In other words, $\sum (X - \bar{X}) = 0$.
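The zero-sum property of deviations from the mean can be checked directly. The sketch below uses the three values from the example; the variable names are illustrative.

```python
values = [3, 8, 4]

# Arithmetic mean of the three values.
mean = sum(values) / len(values)

# Deviation of each value from the mean, and the sum of those deviations.
deviations = [x - mean for x in values]
total = sum(deviations)

print(mean, deviations, total)  # 5.0 [-2.0, 3.0, -1.0] 0.0
```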

14 The Median Median: The midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest. There are as many values above the median as below it in the data array. Note: For an even number of values, the median will be the arithmetic average of the two middle numbers.

15 Position of Median in Sequence: Median Positioning Point = $\frac{n + 1}{2}$

16 The Median Important Measure of Central Tendency. In an ordered array, the median is the “middle” number. If n is odd, the median is the middle number. If n is even, the median is the average of the 2 middle numbers. Not Affected by Extreme Values. [Figure: two number lines, the second extending to 14; Median = 5]
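The contrast between the mean and the median under an extreme value can be sketched with Python's standard `statistics` module. The data sets below are hypothetical values chosen to give means of 5 and 6; the data behind the slide figures is not recoverable from the transcript.

```python
from statistics import mean, median

# Hypothetical data: the second list replaces 9 with the extreme value 14.
base = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]

# The mean shifts from 5 to 6, while the median stays at 5.
base_mean, base_median = mean(base), median(base)
outlier_mean, outlier_median = mean(with_outlier), median(with_outlier)

# With an even number of values, the median averages the two middle numbers.
even_median = median([1, 3, 5, 7])  # (3 + 5) / 2

print(base_mean, base_median, outlier_mean, outlier_median, even_median)
```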

17

18 The Mode A Measure of Central Tendency. Value that Occurs Most Often. Not Affected by Extreme Values. There May Not be a Mode. There May be Several Modes. Used for Either Numerical or Categorical Data. [Figure: two number lines, one from 0 to 14 with Mode = 9 and one from 0 to 6 with No Mode]

19 Midrange A Measure of Central Tendency. Average of Smallest and Largest Observation: Midrange = $\frac{X_{largest} + X_{smallest}}{2}$. Affected by Extreme Values. [Figure: number line from 0 to 10; Midrange = 5]
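The three cases on this slide (one mode, several modes, no mode) can be sketched as below. The helper `modes` is a hypothetical function, and treating "every value occurs exactly once" as "no mode" is an assumption about the convention the slide uses.

```python
from collections import Counter

def modes(values):
    """Return the most frequent values, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumed convention: if nothing repeats, there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 9, 9, 5]))         # [9]
print(modes([1, 1, 2, 2, 3]))         # [1, 2]
print(modes([0, 1, 2, 3, 4, 5, 6]))   # []
print(modes(["red", "blue", "red"]))  # ['red']
```

The last call shows that the same counting approach works for categorical data.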

20 Quartiles Not a Measure of Central Tendency. Split Ordered Data into 4 Quarters (25% each) at $Q_1$, $Q_2$, $Q_3$. Position of i-th Quartile: position of point $Q_i = \frac{i(n+1)}{4}$. Data in Ordered Array: 11 12 13 16 16 17 18 21 22. Position of $Q_1 = \frac{1(9 + 1)}{4} = 2.50$, so $Q_1 = 12.5$

21 Quartiles See text page 107 for “rounding rules” for position of the i-th quartile. Position (not value) of i-th Quartile: $Q_i = \frac{i(n+1)}{4}$
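The positioning formula can be turned into a small function. The rounding rules on text page 107 are not reproduced in this transcript, so the sketch below assumes a common textbook convention: a whole-number position selects that ranked value, a position ending in .5 averages the two neighbouring values, and any other position is rounded to the nearest integer. The function name `quartile` is illustrative.

```python
def quartile(sorted_data, i):
    """Value of the i-th quartile (i = 1, 2, 3) of already-sorted data."""
    n = len(sorted_data)
    pos = i * (n + 1) / 4          # 1-based position from the slide formula
    whole = int(pos)
    frac = pos - whole
    if frac == 0:
        # Assumed rule 1: whole-number position, take that ranked value.
        return sorted_data[whole - 1]
    if frac == 0.5:
        # Assumed rule 2: halfway position, average the two neighbours.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Assumed rule 3: otherwise round the position to the nearest integer.
    return sorted_data[round(pos) - 1]

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # ordered array from slide 20
print(quartile(data, 1))  # 12.5 (position 2.5, average of 12 and 13)
print(quartile(data, 2))  # 16   (position 5)
```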

22 Midhinge A Measure of Central Tendency. The Middle Point of 1st and 3rd Quartiles. Used to Overcome Extreme Values. Midhinge = $\frac{Q_1 + Q_3}{2}$. Data in Ordered Array: 11 12 13 16 16 17 18 21 22

23 Summary Measures Central Tendency: Mean, Median, Mode, Midrange, Midhinge. Quartile. Variation: Range, Variance, Standard Deviation, Coefficient of Variation.

24 Measures of Variation: Range, Interquartile Range, Variance (Population Variance, Sample Variance), Standard Deviation (Population Standard Deviation, Sample Standard Deviation), Coefficient of Variation

25 The Range Measure of Variation. Difference Between Largest & Smallest Observations: Range = $X_{largest} - X_{smallest}$. Ignores How Data Are Distributed. [Figure: two number lines from 7 to 12, each with Range = 12 - 7 = 5]

26 Return on Stock
Year: 1998, 1997, 1996, 1995, 1994
Stock X: 10%, 8%, 12%, 2%, 8%
Stock Y: 17%, -2%, 16%, 1%, 8%
Range on Stock X = 12 - 2 = 10%
Range on Stock Y = 17 - (-2) = 19%
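The two stocks have the same average return but different ranges, which a short sketch makes explicit. The returns are the slide values in percent; the helper name `value_range` is illustrative.

```python
# Annual returns in percent, 1998 back to 1994 (values from the slide).
stock_x = [10, 8, 12, 2, 8]
stock_y = [17, -2, 16, 1, 8]

def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)

mean_x = sum(stock_x) / len(stock_x)
mean_y = sum(stock_y) / len(stock_y)

print(mean_x, value_range(stock_x))  # 8.0 10
print(mean_y, value_range(stock_y))  # 8.0 19
```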

27 Interquartile Range Measure of Variation. Also Known as Midspread: Spread in the Middle 50%. Difference Between Third & First Quartiles: Interquartile Range = $Q_3 - Q_1$. Data in Ordered Array: 11 12 13 16 16 17 17 18 21; Interquartile Range = 17.5 - 12.5 = 5

28 Interquartile Range IQR = 75th percentile - 25th percentile. The IQR is useful for checking for outliers. Not Affected by Extreme Values. Data in Ordered Array: 11 12 13 16 16 17 17 18 21; IQR = 17.5 - 12.5 = 5
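The interquartile range for this ordered array can be reproduced with `statistics.quantiles` from the Python standard library (Python 3.8 or later). Its default "exclusive" method places the i-th quartile at position i(n + 1)/4 and interpolates linearly; for this data set that agrees with the slide's values, though it may differ from the textbook's rounding rules for other sample sizes.

```python
from statistics import quantiles

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]  # ordered array from the slide

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 12.5 16.0 17.5 5.0
```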

29 Variance & Standard Deviation Measures of Dispersion. Most Common Measures. Consider How Data Are Distributed. Show Variation About Mean ($\bar{X}$ or $\mu$). [Figure: number line from 4 to 12 with $\bar{X} = 8.3$]

30 Variance Important Measure of Variation. Shows Variation About the Mean. For the Population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$ (use N in the denominator). For the Sample: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$ (use n - 1 in the denominator).

31 Population Variance The population variance for ungrouped data is the arithmetic mean of the squared deviations from the population mean: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$.

32 Population Variance EXAMPLE The ages of the Dunn family are 2, 18, 34, and 42 years. What is the population variance?
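The Dunn family question can be worked through step by step. The sketch below applies the definition of population variance (mean of squared deviations from the population mean) to the four ages; the variable names are illustrative.

```python
ages = [2, 18, 34, 42]  # Dunn family ages from the slide

# Population mean: (2 + 18 + 34 + 42) / 4 = 24
mu = sum(ages) / len(ages)

# Squared deviations from the mean: 484, 36, 100, 324 (sum 944)
squared_deviations = [(x - mu) ** 2 for x in ages]

# Population variance: divide by N, the number of values.
population_variance = sum(squared_deviations) / len(ages)

print(mu, population_variance)  # 24.0 236.0
```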

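33 Population Standard Deviation The population standard deviation is the square root of the population variance: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

34 Population Standard Deviation EXAMPLE The ages of the Dunn family are 2, 18, 34, and 42 years. What is the population standard deviation?

35 Standard Deviation Most Important Measure of Variation. Shows Variation About the Mean. For the Population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$ (use N in the denominator). For the Sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$ (use n - 1 in the denominator).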
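For the Dunn family ages, the population standard deviation is the square root of the population variance. A minimal sketch using `statistics.pstdev` from the standard library, cross-checked against the definition:

```python
import math
from statistics import pstdev

ages = [2, 18, 34, 42]  # Dunn family ages from the slide

# Library routine for the population standard deviation (divides by N).
sigma = pstdev(ages)

# Cross-check: square root of the mean squared deviation from the mean.
mu = sum(ages) / len(ages)
sigma_by_hand = math.sqrt(sum((x - mu) ** 2 for x in ages) / len(ages))

print(round(sigma, 2), round(sigma_by_hand, 2))  # 15.36 15.36
```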

36 Sample Variance and Standard Deviation The sample variance estimates the population variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$. NOTE: important computational formula: $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$. The sample standard deviation: $s = \sqrt{s^2}$

37 Example of Standard Deviation $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \dots = 129.71$

38 Example of Standard Deviation (Computational Version) $s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \dots = 129.71$

39 Sample Standard Deviation NOTE: For the Sample: use n - 1 in the denominator. Data: 10 12 14 15 17 18 18 24; n = 8; Mean = 16; $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095$

40 Interpretation and Uses of the Standard Deviation Chebyshev’s theorem: For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least $1 - 1/k^2$, where k is any constant greater than 1. Multiply by 100% to get the minimum percentage of values within k standard deviations of the mean.

41 Interpretation and Uses of the Standard Deviation Empirical Rule: For any symmetrical, bell-shaped distribution, approximately 68% of the observations will lie within $\pm 1\sigma$ of the mean ($\mu \pm 1\sigma$); approximately 95% of the observations will lie within $\pm 2\sigma$ of the mean ($\mu \pm 2\sigma$); approximately 99.7% will lie within $\pm 3\sigma$ of the mean ($\mu \pm 3\sigma$).
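Chebyshev's bound is easy to tabulate for a few values of k. The function name `chebyshev_lower_bound` below is illustrative, not from the slides.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

# Multiply by 100 to express the bound as a percentage.
for k in (1.5, 2, 3):
    print(k, round(100 * chebyshev_lower_bound(k), 1))
# k = 2 gives at least 75.0%, k = 3 gives at least 88.9%
```

Because the theorem holds for any set of observations, these bounds are weaker than the percentages the Empirical Rule gives for bell-shaped distributions.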

42 Comparing Standard Deviations Data: 10 12 14 15 17 18 18 24; N = 8; Mean = 16. As a Sample: $s = \sqrt{130/7} = 4.3095$. As a Population: $\sigma = \sqrt{130/8} = 4.0311$. Value for the Standard Deviation is larger for data considered as a Sample.

43 Comparing Standard Deviations Data A: Mean = 15.5, s = 3.338. Data B: Mean = 15.5, s = 0.9258. Data C: Mean = 15.5, s = 4.57. [Figure: the three data sets plotted on number lines from 11 to 21]
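The sample and population standard deviations of the eight listed values can be compared with the standard library. Computing directly from those values gives a sum of squared deviations of 130, so s is about 4.3095 (dividing by n - 1 = 7) and σ is about 4.0311 (dividing by N = 8).

```python
from statistics import mean, pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]  # values from the slide

center = mean(data)                                 # 16
sum_sq_dev = sum((x - center) ** 2 for x in data)   # 130

s = stdev(data)       # sample standard deviation, divides by n - 1
sigma = pstdev(data)  # population standard deviation, divides by N

print(center, sum_sq_dev, round(s, 4), round(sigma, 4))
```

The n - 1 denominator always makes the sample value the larger of the two for the same data.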

44 Coefficient of Variation Measure of Relative Variation. Always a %. Shows Variation Relative to Mean. Used to Compare 2 or More Groups. Formula (for Sample): $CV = \left(\frac{s}{\bar{X}}\right) \cdot 100\%$

45 Comparing Coefficients of Variation Stock A: Average Price last year = $50, Standard Deviation = $5. Stock B: Average Price last year = $100, Standard Deviation = $5. Coefficient of Variation: Stock A: CV = 10%; Stock B: CV = 5%
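The stock comparison follows directly from the coefficient of variation formula (standard deviation divided by the mean, expressed as a percentage). A minimal sketch, with an illustrative function name:

```python
def coefficient_of_variation(std_dev, mean):
    """Relative variation as a percentage of the mean."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)   # Stock A: $5 on a $50 average
cv_b = coefficient_of_variation(5, 100)  # Stock B: $5 on a $100 average

print(cv_a, cv_b)  # 10.0 5.0
```

Both stocks have the same standard deviation in dollars, but Stock A varies twice as much relative to its average price.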

46 Shape Describes How Data Are Distributed. Measures of Shape: Symmetric or Skewed. Left-Skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-Skewed: Mode < Median < Mean.

47 Box-and-Whisker Plot Graphical Display of Data Using 5-Number Summary: $X_{smallest}$, $Q_1$, Median, $Q_3$, $X_{largest}$. [Figure: box-and-whisker plot on an axis from 4 to 12]

48 Distribution Shape & Box-and-Whisker Plots [Figure: box-and-whisker plots showing $Q_1$, Median, and $Q_3$ for Left-Skewed, Symmetric, and Right-Skewed distributions]

49 Summary Discussed Measures of Central Tendency: Mean, Median, Mode, Midrange, Midhinge. Quartiles. Addressed Measures of Variation: The Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation. Determined Shape of Distributions: Symmetric, Skewed, Box-and-Whisker Plot.

