Presentation on theme: "Describing Data: Measures of Central Tendency"— Presentation transcript:
1 Describing Data: Measures of Central Tendency Topic-3Describing Data:Measures of Central Tendency
2 Average: A Measure of Central Tendency Average: A single numerical characteristic that represents a specific feature of a set of data - the center of the values.Population Mean (Overall Average): Computed by the Total Sum divided by the total number of items in the population. It is a Parameter - a measurable characteristic of a population.Sample Mean (Sample Average): Computed by the Total Sum of the sample divided by the total number of items in the sample.It is a Statistic - a measurable characteristic of a sample.Both Population Mean and Sample Mean are the Arithmetic Mean.
3 Important Properties of Arithmetic Mean Every set of data (interval and ratio level) has a unique mean.All values are included in computing the mean.The "sum of the deviations" of each value from the mean will always be equal to zero.Mean can be viewed as a "balance" point.Mean can be easily affected by unusually (large or small) values (called as "outliers").There is no "mean" if open-end classes are included in the data.
4 MedianThe value of midpoint - after the data have been ordered from the smallest to the largest (vise verse).There are the same number of values above the median as below it.For an even set, the median is computed as the arithmetic average of the two middle numbers.The median is not affected by extreme values in the set.Median is unique, can be determined even with open-end classes.Median can be computed for all levels of data (except nominal level data).
5 Mode The value of the observation that appears most frequently. Mode can be determined for all levels of data sets.Mode is not affected by extreme values in the set.Mode can be determined even with open-end classes.However, mode may not be unique for some data sets, while may not exist in other data sets.Another measure of central tendency.
6 Weighted MeanTo allow each item weighted differently according to its importance in the calculation of the mean.All weights (Wi) must be determined carefully.Different weight sets may result in different weighted mean.Weighted mean allows subjective managerial considerations to be considered and evaluated.Widely used in many business statistical applications.Examples
7 Geometric MeanThe Nth root of the product of N numbers for a set of N positive numbers, normally used:to average percents, indexes, and relatives, andto determine average percent changes (in sales, production) from one time period to another.Geometric mean is a more conservative measure of average.Examples
8 Mean, Median, Mode for Grouped Data In practice, we have to, more often, make estimates about some thing that are of interest from a sample, in a form of grouped data, as frequency distribution.Mean: (Arithmetic) the sum of the products of the midpoint by its frequency of each group divided by the total number of frequencies. (Examples)This mean may be different from the population mean, because it is only an estimate of the population mean.Median: only can be estimated by locating the median group and then calculating the midpoint assuming this group is evenly spaced. (Examples)This median is an estimate, may be different from population median.In the calculation, either frequency or its percentage is enough and open-end class at either side has no effect.Mode: can be estimated by the midpoint of the group with the largest group frequency. (Examples)There may be more than one "mode" (multimodal) within a group of data."Bimodal data" is for having two modes co-existing.
9 Symmetric Distribution Symmetric Distribution: symmetrical with bell-shaped, having the same shape on both sides of the center axis.Its mean, median, and mode are equal, located at the center.Asymmetric Distribution: bell-shaped but skewed toward one side, having different shapes on two sides of the center axis.If positively skewed, the mean will be the largest among three averages (Mean > Median > Mode), because the different influences of extreme values on the three averages. (Examples)If negatively skewed, the mean will be the lowest among three averages (Mean < Median < Mode). (Examples)Approximate relationship among the three: when the data set is large enough and not skewed too much, then the median is 1/3 from the mean to the mode.If two averages are known, the third can be estimated by given formulas.
10 Population MeanDefinition: For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values. To compute the population mean, use the following formula:Where:µ is muΣ is SigmaΧ is the Individual valueΝ is the Population size
11 Sample MeanDefinition: For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values. To compute the sample mean, use the following formula:Where:is X-barΣ is SigmaΧ is the Individual valuen is the Sample size
12 Example 1 Parameter: a measurable characteristic of a population. The Kiers family owns 4 cars. The following is the mileage attained by each car: 56,000; 23,000; 42,000; and 73,000. Find the average miles covered by each car.The mean is (56, , , ,000)/4 = 48,500.
13 Example 2 Statistic: a measurable characteristic of a sample. A sample of 5 executives received the following bonuses last year: $14,000; $15,000; $17,000; $16,000 and $15,000. Find the average bonus for these 5 executives.Since these values represent a sample size of 5, the sample mean is ($14,000 + $15,000 + $17,000 + $16,000 + $15,000)/5 = $15,400.
14 IllustrationConsider the set of values: 3, 8, and 4. The mean is 5. So, (3 – 5) + (8 – 5) + (4 – 5) = – 1 = 0. Symbolically, we write:
15 The Weighted MeanDefinition: The weighted mean of a set of numbers X1, X2, …, Xn, with corresponding weights w1, w2, …, wn, is computed from the following formulas:
16 Example 3 Compute the median for the following data. The age of a sample of 5 college students is: 21, 25, 19, 20, and 22.Arranging the data in ascending order gives: 19, 20, 21, 22, and 25. Thus, the median is 21.The height of 4 basketball players, in inches, is 76, 73, 80, and 75.Arranging the data is ascending order gives: 73, 75, 76, and 80.Thus, the median is 75.5.
17 Example 4The exam scores for 10 students are: 81, 93, 84, 75, 68, 87, 81, 75, 81, and 87.Since the score of 81 occurs the most, the modal score is 81.
18 Example 5During a 1-hour period on a hot Saturday afternoon, cabana boy Chris served 50 drinks.Compute the weighted mean of the price of the drinks.Price ($), Number sold): (0.50, 5); (0.75, 15); (0.90, 15); (1.10, 15).The weighted mean is: $((0.50 x 5) + (0.75 x 15) + (0.90 x 15) + (1.10 x 15)) = $43.75/50 = $0.875.
19 The Geometric Mean (I)Definition: The geometric mean (GM) of a set of n numbers is defined as the nth root of the product of n numbers. The formula for the geometric mean is given by:One main use of the geometric mean is to average percents, indexes, and relatives.
20 The Geometric Mean (II) The other main use of the geometric mean is to determine the average percent increase in sales, production, or other business or economic series from one time period to another.The formula for the geometric mean as applied to this type of problem is:
21 Example 6 The interest rates on 3 bonds were 5, 7, and 4 percent. The geometric mean is:The arithmetic mean is: ( )/3 =The GM gives a more conservative profit figure because it is not heavily weighted by the rate of 7%.
22 Example 7The total number of females enrolled in American colleges increased from 755,000 in 1986 to 835,000 in 1995.Here n = 10, so (n – 1) = 9.That is, the geometric mean rate of increase is 1.27%.
23 The Mean of Grouped Data The mean of a sample of data organized in a frequency distribution is computed by the following formula:Where:is X-barFor ΣXf, the X value is the class midpoint and the f value is the class frequencyFor Σf is the sum of the frequenciesn is the sample size
24 The Median of Grouped Data The median of a sample of data organized in a frequency distribution is computed by the following formula:Where:L is the lower limit of the median classn is the sample sizeCF is the cumulative frequency preceding the median classf is the frequency of the median classi is the median class interval.
25 Example 8A sample of 10 movie theatres in a large metropolitan area tallied the total number of movies showing last week. Compute the mean number of movies showing.To determine the median class for grouped data:Construct a cumulative frequency distribution.Divide the total number of data values by 2.Determine which class will contain this value. For example, if n = 50, 50/2 = 25, then determine which class will contain the 25th value – the median class.
26 Example 8 Movies Showing Frequency, f Class Midpoint, X (f)(X) 1 – 2 1 1.53 – 423.57.05 – 635.516.57 – 87.59 – 109.528.5Total106161/10 = 6.1 movies
27 Example 8Movies ShowingFrequency, fCumulative Frequency1 – 213 – 4235 – 667 – 879 – 1010The median class is 5 – 6, since it contains the 5th value (n/2=5).From the table, L = 5, n = 10, f = 3, i = 2, & CF = 3.Thus, the median
28 Example 9A sample of 20 appliance stores in a large metropolitan area revealed the following number of DVD players sold last week. Compute the mean number sold. The formula and computation is shown below:
29 Example 9 Number Sold Frequency, f Class Midpoint, X (f)(X) The table also gives the necessary computations:Number SoldFrequency, fClass Midpoint, X(f)(X)5 – 9271410 – 144124815 – 19101717020 – 243226625 – 29127Σf = 20ΣfX = 325
30 Table: Comparison of the Three Averages Mean (M)Median (Mdn)Mode (Mo)DefinitionBalance pointMiddle pointMost frequent scoreWhen to UseSymmetrical distributions of interval/ratio dataWhen mean is inappropriate (except for nominal data)Nominal dataFrequency of Use in Scientific ReportingVery frequentSomewhat frequentVery infrequent1Relative Values for Positive Skew2Higher than Mdn and MoBetween M and MoLower than M and MdnRelative Values for Negative SkewLower than Mdn and MoHigher than M and Mdn1Although nominal data (for which the mode is appropriate) is frequently reported in scientific writing, most researchers report percentages for each category rather than reporting the modal category. Thus, the mode is seldom used.2See Section 9 to review skewed distributions
31 Symmetric Distribution Zero skewness Mode = Median = Mean
32 Right Skewed Distribution Positively skewed: Mean and Median are to the right of the Mode Mode < Median < Mean
33 Left Skewed Distribution Negatively skewed: Mean and Median are to the left of the Mode Mean < Median < Mode
34 SkewnessIf two averages of a moderately skewed frequency distribution are known, the third can be approximated.Mode = mean – 3(mean – median)Mean = [3(median) – mode]/2Median = [2(mean) + mode]/3
35 Exercise 3ASpringers sold 95 Antonelli men’s suits for the regular price of $400. For the spring sale the suits were reduced to $200 and 126 were sold. At the final clearance, the price was reduced to $100 and the remaining 79 suits were sold.What was the weighted mean price of an Antonelli suit?Springers paid $200 a suit for the 300 suits. Comment on the store’s profit per suit if a salesperson receives a $25 commission for each suit sold.
36 Exercises 3B & 3CIn 1950 there were 51 countries that belonged to the United Nations. By 1996 this number had increased to What was the geometric mean annual rate of increase in membership during this period?Darenfest and Associates stated that in 1988 hospitals spent $3.9 billion on computer systems. They estimated that by the year 2000 this would increase to $14.0 billion. If expenditures did increase to $14.0 billion, what is the geometric mean annual rate of increase for the period?