Measures of Central Tendency

1 Measures of Central Tendency
By Rahul Jain

2 The Motivation Measures of central tendency are used to describe the typical member of a population. Depending on the type of data, “typical” can have several “best” meanings. We will discuss four of these possible choices.

3 Four Measures of Central Tendency
Mean – the arithmetic average. This is used for continuous data.
Median – a value that splits the data into two halves, that is, one half of the data is smaller than that number, the other half larger. May be used for continuous or ordinal data.
Mode – this is the category that has the most data. As the description implies, it is used for categorical data.
Midrange – not used as often as the other three, it is found by taking the average of the lowest and highest number in the data set. Also primarily used for continuous data.

4 Measures of Central Tendency
The central tendency is measured by averages. These describe the point about which the various observed values cluster. In mathematics, an average, or central tendency of a data set refers to a measure of the "middle" or "expected" value of the data set.

5 Mean To find the mean, add all of the values, then divide by the number of values. The lowercase Greek letter mu (μ) is used for the population mean. An “x” with a bar over it (x̄), read x-bar, is used for the sample mean.

6 Mean Example

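7 Arithmetic Mean of Grouped Data
If $x_1, x_2, \dots, x_k$ are the mid-values and $f_1, f_2, \dots, f_k$ are the corresponding frequencies, where the subscript k stands for the number of classes, then the mean is $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{\sum f_i x_i}{N}$, where $N = \sum f_i$.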

8 Exercise-1: Find the Arithmetic Mean
Class    Frequency (f)    x       fx
20-29    3                24.5    73.5
30-39    5                34.5    172.5
40-49    20               44.5    890
50-59    10               54.5    545
60-69    5                64.5    322.5
Sum      N = 43                   2003.5
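The grouped-mean calculation for Exercise-1 can be sketched in a few lines of Python. This is only an illustration: the function name `grouped_mean` is made up here, and the frequency of class 60-69 is taken as 5, which is the value implied by fx = 322.5 and N = 43.

```python
# Class mid-values (x) and frequencies (f) from Exercise-1.
# Assumption: the frequency of class 60-69 is 5 (implied by fx = 322.5, N = 43).
midpoints = [24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [3, 5, 20, 10, 5]


def grouped_mean(midpoints, frequencies):
    """Return sum(f * x) / sum(f) for grouped data."""
    total = sum(frequencies)
    weighted_sum = sum(f * x for f, x in zip(frequencies, midpoints))
    return weighted_sum / total


mean = grouped_mean(midpoints, frequencies)
print(round(mean, 2))  # 2003.5 / 43
```

Dividing the column total 2003.5 by N = 43 gives a mean of about 46.59.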

9 Median The median is a number chosen so that half of the values in the data set are smaller than that number, and the other half are larger. To find the median:
List the numbers in ascending order.
If there is a number in the middle (odd number of values), that is the median.
If there is no middle number (even number of values), take the two in the middle; their average is the median.

10 Median Example

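11 Median The implication of this definition is that the median is the middle value of the ordered observations, such that the number of observations above it is equal to the number of observations below it.
If n is even, the median is the average of the $\left(\frac{n}{2}\right)$-th and $\left(\frac{n}{2}+1\right)$-th ordered values.
If n is odd, the median is the $\left(\frac{n+1}{2}\right)$-th ordered value.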

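12 Median of Grouped Data
$\text{Median} = L_0 + \frac{h}{f_0}\left(\frac{N}{2} - F\right)$
L0 = Lower class boundary of the median class
h = Width of the median class
f0 = Frequency of the median class
F = Cumulative frequency of the pre-median class
N = Total number of observations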

13 Steps to find Median of group data
Compute the less-than type cumulative frequencies.
Determine N/2, one-half of the total number of cases.
Locate the median class: the first class whose cumulative frequency is at least N/2.
Determine the lower class boundary of the median class. This is L0.
Sum the frequencies of all classes prior to the median class. This is F.
Determine the frequency of the median class. This is f0.
Determine the class width of the median class. This is h.
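The steps above can be sketched in Python. This sketch assumes the usual interpolation formula Median = L0 + (h / f0)(N/2 − F). The function name `grouped_median` is hypothetical, and the example reuses the Exercise-1 classes with class boundaries assumed at 19.5, 29.5, and so on.

```python
from itertools import accumulate


def grouped_median(lower_boundaries, width, frequencies):
    """Median of grouped data with equal class widths."""
    n = sum(frequencies)
    half = n / 2
    # Less-than type cumulative frequencies.
    cumulative = list(accumulate(frequencies))
    # Median class: first class whose cumulative frequency reaches N/2.
    idx = next(i for i, c in enumerate(cumulative) if c >= half)
    l0 = lower_boundaries[idx]                            # lower class boundary
    f_before = cumulative[idx - 1] if idx > 0 else 0      # F
    f0 = frequencies[idx]                                 # frequency of median class
    return l0 + (width / f0) * (half - f_before)


# Exercise-1 classes 20-29, ..., 60-69 (boundaries are an assumption).
boundaries = [19.5, 29.5, 39.5, 49.5, 59.5]
freqs = [3, 5, 20, 10, 5]
print(grouped_median(boundaries, 10, freqs))
```

Here N/2 = 21.5, the median class is 40-49 (cumulative frequency 28), F = 8 and f0 = 20, so the median is 39.5 + (10/20)(21.5 − 8) = 46.25.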

14 Example: Find Median
Age in years    Number of births    Cumulative number of births
...             677                 677
...             1908                2585
...             1737                4322
...             1040                5362
...             294                 5656
...             91                  5747
...             16                  5763
All ages        5763                -

15 Mode The mode is simply the category or value that occurs most often in a data set. If a category has radically more entries than the others, it is a mode. Generally speaking, we do not consider more than two modes in a data set. No clear guideline exists for deciding how many more entries a category must have than the others to constitute a mode.

16 Obvious Example There is obviously more yellow than red or blue.
Yellow is the mode. The mode is the class, not the frequency.

17 Bimodal

18 No Mode
Category    Frequency
1           51
2           51
3           66
4           62
5           65
6           57
7           47
8           43
...         64
Although the third category is the largest, it is not sufficiently different to be called the mode.

19 Example-2: Find Mean, Median and Mode of Ungrouped Data
The weekly pocket money for 9 first-year pupils was found to be: 3, 12, 4, 6, 1, 4, 2, 5, 8
Mean = 5
Median = 4
Mode = 4
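These three values can be checked with Python's standard `statistics` module. The variable name `pocket_money` is chosen here only for illustration.

```python
import statistics

# Weekly pocket money of the 9 pupils in Example-2.
pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]

mean = statistics.mean(pocket_money)      # sum 45 divided by 9 values
median = statistics.median(pocket_money)  # middle of the sorted list
mode = statistics.mode(pocket_money)      # most frequent value

print(sorted(pocket_money))
print(mean, median, mode)
```

Sorted, the data are 1, 2, 3, 4, 4, 5, 6, 8, 12. The fifth value is the median, and 4 is the only value that occurs twice.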

20 Mode of Grouped Data
$\text{Mode} = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times H$
L1 = Lower boundary of the modal class
Δ1 = difference in frequency between the modal class and the class before it
Δ2 = difference in frequency between the modal class and the class after it
H = class interval

21 Steps of Finding Mode
Find the modal class, the class with the highest frequency.
L1 = Lower class boundary of the modal class
H = Class interval of the modal class
Δ1 = difference between the frequency of the modal class and that of the class before it
Δ2 = difference between the frequency of the modal class and that of the class after it
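A minimal Python sketch of these steps follows. It assumes the usual interpolation formula: mode = lower boundary of the modal class + Δ1 / (Δ1 + Δ2) × class interval. The function name `grouped_mode` is hypothetical. The example reuses the Exercise-1 frequencies, with class boundaries assumed at 19.5, 29.5, and so on.

```python
def grouped_mode(lower_boundaries, width, frequencies):
    """Mode of grouped data with equal class widths."""
    # Modal class: the class with the highest frequency (first one on ties).
    idx = max(range(len(frequencies)), key=lambda i: frequencies[i])
    f_modal = frequencies[idx]
    # A missing neighbour (first or last class) is treated as frequency 0.
    f_before = frequencies[idx - 1] if idx > 0 else 0
    f_after = frequencies[idx + 1] if idx < len(frequencies) - 1 else 0
    delta1 = f_modal - f_before
    delta2 = f_modal - f_after
    # Note: fails with ZeroDivisionError if delta1 + delta2 == 0.
    return lower_boundaries[idx] + delta1 / (delta1 + delta2) * width


boundaries = [19.5, 29.5, 39.5, 49.5, 59.5]
freqs = [3, 5, 20, 10, 5]
print(grouped_mode(boundaries, 10, freqs))
```

For these data the modal class is 40-49, Δ1 = 20 − 5 = 15 and Δ2 = 20 − 10 = 10, so the mode is 39.5 + (15/25) × 10 = 45.5.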

22 Example-4: Find Mode
Slope Angle (°)    Midpoint (x)    Frequency (f)    Midpoint × frequency (fx)
0-4                2               6                12
5-9                7               12               84
10-14              12              7                84
15-19              17              5                85
20-24              22              0                0
Total                              n = 30           ∑(fx) = 265

23 Midrange The midrange is the average of the lowest and highest value in the data set. This measure is not often used since it is based strictly on the two extreme values in the data.

24 Midrange Example

25 Measures of Variation Same mean, but y varies more than x.

26 Three Measures of Variation
While there are other measures, we will look at only three:
Variance
Standard deviation
Coefficient of variation
Population mean and sample mean use an identical formula for calculation. There is a minor difference in the formulas for variation.

27 Population Variance The population variance, σ², is found using either of the following formulas: $\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2}{N} - \mu^2$. The differences are squared because the unsquared deviations from the mean always sum to zero. N is the size of the population, μ is the population mean. Note that the variance is always positive if x takes on more than one value.

28 Population Standard Deviation
The standard deviation can be thought of as the average amount we could expect the x’s in the population to differ from the mean value of the population. To get the standard deviation, simply take the square root of the variance.

29 Sample Variance The sample variance, s², is found using either of the following formulas: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - (\sum x)^2 / n}{n - 1}$. The differences are squared because the unsquared deviations from the mean always sum to zero. The sample size is n, x-bar is the sample mean. Note that n − 1 is used rather than n. This adjustment prevents bias in the estimate of the population variance.

30 Sample Standard Deviation
Just like the standard deviation of a population, to find the standard deviation of a sample, take the square root of the sample variance.
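The difference between the population and sample formulas can be seen in a short Python sketch. It reuses the pocket-money data from Example-2 purely as an illustration. The helper names are made up here, and the results are cross-checked against the standard `statistics` module.

```python
import math
import statistics

data = [3, 12, 4, 6, 1, 4, 2, 5, 8]  # pocket-money data from Example-2


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = math.sqrt(pop_var)    # population standard deviation
samp_sd = math.sqrt(samp_var)  # sample standard deviation
print(pop_var, samp_var)
```

The squared deviations sum to 90, so treating the nine values as a population gives 90/9 = 10, while treating them as a sample gives 90/8 = 11.25.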

31 Coefficient of Variation
The measures discussed so far are primarily useful when comparing members from the same population, or comparing similar populations. When looking at two or more dissimilar populations, it doesn’t make any more sense to compare standard deviations than it does to compare means.

32 Coefficient of Variation Cont.
Example 1: Weight loss programs A and B. Two different programs with the same goal and target population. While program B averages more weight loss, it also has less consistent results.
                                A     B
Mean (weight loss per month)    20    25
Standard deviation              15    30

33 Coefficient of Variation Cont.
Example 2: Weight loss program A and tax refund B. Two different programs with different goals and different target populations. We know that average weight loss and average tax refund are not comparable. Are the standard deviations comparable?
                      A     B
Mean                  20    650
Standard deviation    15    30

34 Coefficient of Variation Cont.
The last example shows that the standard deviation alone does not give the complete picture. The coefficient of variation addresses this issue by establishing a ratio of the standard deviation to the mean. This ratio is expressed as a percentage: $CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$.

35 Coefficient of Variation Cont.
Looking at the two examples, we see that in both cases the standard deviation for B is twice that of A. In the first example, B has 1.6 times the relative variation of A. In the second example, A has a little over 16 times as much relative variation as B.
                A      B
CV Example 1    75%    120%
CV Example 2    75%    4.6%
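The CV values for both examples can be reproduced with a small Python sketch. The function name `coefficient_of_variation` is chosen here for illustration; the means and standard deviations are the ones given in Examples 1 and 2.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


# Example 1: weight loss programs A and B.
cv_a = coefficient_of_variation(15, 20)
cv_b1 = coefficient_of_variation(30, 25)
# Example 2: weight loss program A and tax refund B.
cv_b2 = coefficient_of_variation(30, 650)

print(cv_a, cv_b1, round(cv_b2, 1))
print(round(cv_b1 / cv_a, 2), round(cv_a / cv_b2, 2))  # relative-variation ratios
```

Program A has the same CV of 75% in both examples, since its mean and standard deviation do not change. Only the comparison partner B differs.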

36 Measures of Position The dot on the left is at about -1, the dot on the right is at approximately 0.8. But where are they relative to the rest of the values in this distribution?

37 Quartiles, Percentiles and Other Fractiles
We will only consider the quartile, but the same concept is often extended to percentages or other fractions. The median is a good starting point for finding the quartiles. Recall that to find the median, we wanted to locate a point so that half of the data was smaller, and the other half larger than that point.

38 Quartile For quartiles, we want to divide our data into 4 equal pieces. Suppose we had the following data set (already in order). Choosing the numbers 7.5, 8.5, and 18.5 as markers would divide the data into 4 groups, each with three elements. These numbers would be the three quartiles for this data set.

39 Quartiles Continued Conceptually, this is easy: find the median, then treat the left-hand side as if it were a data set and find its median; then do the same for the right-hand side. In practice, this is not always simple. Consider the following data set. The first difficulty is that the data set does not divide nicely. Using the rules for finding a median, we would get quartiles of 3, 6 and 8. The second difficulty is how many of the 3’s are in the first quartile, and how many in the second?
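The median-of-halves idea can be sketched in Python as follows. The data sets are hypothetical. Leaving the median itself out of both halves when the number of values is odd is one convention among several, assumed here for illustration, so other textbooks and software may give slightly different quartiles.

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of values, the median itself is
    excluded from both halves. Requires at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


# Hypothetical data: twelve values split into four groups of three.
print(quartiles_by_halves(range(1, 13)))
# Hypothetical data with an odd number of values.
print(quartiles_by_halves([1, 2, 3, 4, 5, 6, 7]))
```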

40 Quartiles Continued For this course, let’s pretend that this is not an issue. I will give you the quartiles. I will not ask how many are in a quartile.

41 Interquartile Range One method for identifying outliers involves the use of quartiles. The interquartile range (IQR) is Q3 – Q1. All numbers less than Q1 – 1.5(IQR) are probably too small. All numbers greater than Q3 + 1.5(IQR) are probably too large.
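The outlier rule can be sketched in Python. Since the quartiles are supplied rather than computed in this course, the hypothetical function `find_outliers` takes Q1 and Q3 as inputs. The data values below are invented for illustration; the quartiles 3 and 8 are borrowed from the earlier quartile slide.

```python
def find_outliers(values, q1, q3):
    """Split out values beyond the 1.5 * IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    too_small = [x for x in values if x < lower_fence]
    too_large = [x for x in values if x > upper_fence]
    return too_small, too_large


# Hypothetical data with given quartiles Q1 = 3 and Q3 = 8.
data = [-5, 1, 3, 3, 6, 8, 9, 20]
too_small, too_large = find_outliers(data, q1=3, q3=8)
print(too_small, too_large)
```

With Q1 = 3 and Q3 = 8 the IQR is 5, so the fences are 3 − 7.5 = −4.5 and 8 + 7.5 = 15.5. The values −5 and 20 fall outside them.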

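42 Measures of Variation: Variance & Standard Deviation for GROUPED DATA
The grouped variance is $s^2 = \frac{\sum f \cdot (X_m - \bar{X})^2}{n - 1}$.
The grouped standard deviation is $s = \sqrt{s^2}$.
Here Xm is the class midpoint, f is the class frequency, X̄ is the grouped mean, and n = Σf.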

43 Example 3-24 (p130): Miles Run per Week
Find the variance and the standard deviation for the frequency distribution below. The data represent the number of miles that 20 runners ran during one week.
Class          f     Xm    f·Xm           f·(Xm – X̄)²
5.5 – 10.5     1     8     1·8 = 8        1(8 – 24.5)² = 272.25
10.5 – 15.5    2     13    2·13 = 26      2(13 – 24.5)² = 264.50
15.5 – 20.5    3     18    3·18 = 54      3(18 – 24.5)² = 126.75
20.5 – 25.5    5     23    5·23 = 115     5(23 – 24.5)² = 11.25
25.5 – 30.5    4     28    4·28 = 112     4(28 – 24.5)² = 49.00
30.5 – 35.5    3     33    3·33 = 99      3(33 – 24.5)² = 216.75
35.5 – 40.5    2     38    2·38 = 76      2(38 – 24.5)² = 364.50
Total          20          Σf·Xm = 490    Σf·(Xm – X̄)² = 1305.00
The grouped mean is X̄ = 490 / 20 = 24.5.
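The grouped variance and standard deviation for this example can be worked through in Python. The sketch assumes the sample form of the formula, dividing Σf·(Xm − X̄)² by n − 1. The function name `grouped_variance` is hypothetical. The frequencies are read from the f·Xm column, and 4·28 is taken as 112, so Σf·Xm = 490 and the mean is 24.5.

```python
import math

# Class midpoints (Xm) and frequencies (f) for the miles-run example.
midpoints = [8, 13, 18, 23, 28, 33, 38]
frequencies = [1, 2, 3, 5, 4, 3, 2]


def grouped_variance(midpoints, frequencies):
    """Sample variance of grouped data: sum(f * (Xm - mean)^2) / (n - 1)."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    squared_dev = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))
    return squared_dev / (n - 1)


variance = grouped_variance(midpoints, frequencies)
std_dev = math.sqrt(variance)
print(round(variance, 1), round(std_dev, 1))
```

The sum of the weighted squared deviations is 1305, so the variance is 1305 / 19 ≈ 68.7 and the standard deviation is about 8.3.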

44 Mean Deviation The mean deviation is an average of absolute deviations of individual observations from the central value of a series.
Average deviation about the mean: $MD(\bar{x}) = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{\sum_{i=1}^{k} f_i}$
k = Number of classes
xi = Mid point of the i-th class
fi = frequency of the i-th class
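A short Python sketch of the mean deviation about the mean for grouped data follows. It assumes the formula Σ fi·|xi − x̄| / Σ fi. The function name `grouped_mean_deviation` is hypothetical, and the midpoints and frequencies are borrowed from the miles-run example as an illustration.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Mean absolute deviation about the mean for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    abs_dev = sum(f * abs(x - mean) for f, x in zip(frequencies, midpoints))
    return abs_dev / n


# Midpoints and frequencies from the miles-run example (mean 24.5).
midpoints = [8, 13, 18, 23, 28, 33, 38]
frequencies = [1, 2, 3, 5, 4, 3, 2]
md = grouped_mean_deviation(midpoints, frequencies)
print(round(md, 2))
```

The weighted absolute deviations sum to 133, so the mean deviation is 133 / 20 = 6.65.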

45 Coefficient of Mean Deviation
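The third relative measure is the coefficient of mean deviation. As the mean deviation can be computed from the mean, median, mode, or from any arbitrary value, a general formula for computing the coefficient of mean deviation may be put as follows: $\text{Coefficient of mean deviation} = \frac{\text{Mean deviation about } A}{A}$, where A is the mean, median, mode, or other value from which the deviations are measured.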

46 Coefficient of Range The coefficient of range is a relative measure corresponding to range and is obtained by the following formula: $\text{Coefficient of range} = \frac{L - S}{L + S}$, where “L” and “S” are respectively the largest and the smallest observations in the data set.

47 Coefficient of Quartile Deviation
The coefficient of quartile deviation is computed from the first and the third quartiles using the following formula: $\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$.
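The relative measures of range and quartile deviation can be sketched in Python. The sketch assumes the common ratio forms (L − S) / (L + S) and (Q3 − Q1) / (Q3 + Q1). Some texts multiply these by 100 to express them as percentages. The function names and the input values are hypothetical.

```python
def coefficient_of_range(largest, smallest):
    """(L - S) / (L + S) for the largest and smallest observations."""
    return (largest - smallest) / (largest + smallest)


def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1) for the first and third quartiles."""
    return (q3 - q1) / (q3 + q1)


# Hypothetical values: marks between 40 and 100, quartiles 3 and 8.
print(round(coefficient_of_range(100, 40), 3))
print(round(coefficient_of_quartile_deviation(3, 8), 3))
```

Both measures are unit-free, which is what allows dispersion to be compared across data sets measured on different scales.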

48 Assignment-1 Find the following measures of dispersion from the data set given on the next page:
Range, Percentile range, Quartile range
Quartile deviation, Mean deviation, Standard deviation
Coefficient of variation, Coefficient of mean deviation, Coefficient of range, Coefficient of quartile deviation

49 Data for Assignment-1
Marks     No. of students    Cumulative frequencies
40-50     6                  6
50-60     11                 17
60-70     19                 36
70-80     17                 53
80-90     13                 66
90-100    4                  70
Total     70

