Presentation is loading. Please wait.

Presentation is loading. Please wait.

Measure of Central Tendency

Similar presentations


Presentation on theme: "Measure of Central Tendency"— Presentation transcript:

1 Measure of Central Tendency
Dr. Anshul Singh Thapa

2 Introduction When scores or other measures have been tabulated into a frequency distribution, usually the next task is to calculate a measure of central tendency, or central position. The value of a measure of central tendency is twofold. First, it is an average which represent all of the scores made by the group, and as such gives a concise description of the performance of the group as a whole; and second it enable us to compare two o more groups in terms of typical performance. There are several statistical measures of central tendency or “averages”. The three most commonly used averages are: Arithmetic Mean Median Mode There are two more types of averages i.e. Geometric Mean and Harmonic Mean, which are suitable in certain situations.

3 ARITHMETIC MEAN FOR UNGROUPED DATA ARITHMETIC MEAN FOR GROUPED DATA
ASSUMED MEAN METHOD STEP DEVIATION METHOD DISCRETE SERIES CONTINUOUS SERIES DIRECT METHOD DIRECT METHOD ASSUMED MEAN METHOD STEP DEVIATION METHOD STEP DEVIATION METHOD DIRECT METHOD

4 Arithmetic mean Arithmetic mean is the most commonly used measure of central tendency. It is defined as the sum of the values of all observations divided by the number of observations and is usually denoted by X . In general, if there are N observations as X1, X2, X3, ..., XN, then the Arithmetic Mean is given by X = X1 + X2 + X3 +…+ XN N = ΣX Where, ΣX = sum of all observations and N = total number of observations.

5 How Arithmetic Mean is Calculated
The calculation of arithmetic mean can be studied under two broad categories: Arithmetic Mean for Ungrouped Data. Arithmetic Mean for Grouped Data.

6 Arithmetic Mean for Series of Ungrouped Data
Direct Method Arithmetic mean by direct method is the sum of all observations in a series divided by the total number of observations.

7 Example The following table gives the monthly income of 10 employees in an office: Employee Monthly income 1 1780 6 1920 2 1760 7 1100 3 1690 8 1810 4 1750 9 1050 5 1840 10 1950 X= ΣX, ΣX = 16650, N=10 N X = = 1665 10

8 Assumed Mean Method If the number of observations in the data is more and/or figures are large, it is difficult to compute arithmetic mean by direct method. The computation can be made easier by using assumed mean method. In order to save time in calculating mean from a data set containing a large number of observations as well as large numerical figures, you can use assumed mean method. Here you assume a particular figure in the data as the arithmetic mean on the basis of logic/experience. Then you may take deviations of the said assumed mean from each of the observation. You can, then, take the summation of these deviations and divide it by the number of observations in the data. The actual arithmetic mean is estimated by taking the sum of the assumed mean and the ratio of sum of deviations to number of observations. Symbolically, Let, A = assumed mean X = individual observations N = total numbers of observations d = deviation of assumed mean from individual observation, i.e. d = X – A Then sum of all deviations is taken as Σd = Σ(X − A) Then find Σd N Then add A and Σd to get X Therefore, X = A + Σd Any value, whether existing in the data or not, can be taken as assumed mean. However, in order to simplify the calculation, centrally located value in the data can be selected as assumed mean.

9 Example X = A + Σd N A = 1800, Σd = - 1350, N = 10
Deviation (D = X – A) (X ) 1780 – 1800 = -20 1760 – 1800 = -40 1690 – 1800 = -110 1750 – 1800 = -50 1840 – 1800 = +40 1920 – 1800 = +120 1100 – 1800 = -700 1810 – 1800 = +10 1050 – 1800 = -750 1950 – 1800 = +150 Σd = -1350 Income (X) 1780 1760 1690 1750 1840 1920 1100 1810 1050 1950 Employee 1 2 3 4 5 6 7 8 9 10 N = 10 X = A + Σd N A = 1800, Σd = , N = 10 X = 1800 – 1350 = 1800 – 135 = 1665 10

10 Step Deviation Method The calculations can be further simplified by dividing all the deviations taken from assumed mean by the common factor ‘c’. The objective is to avoid large numerical figures, i.e., if d = X – A is very large, then find d'. This can be done as follows: d’ = d = X – A c c The formula is given below: X = A + Σd’ x c N Where d' = (X – A)/c, c = common factor, N = number of observations, A= Assumed mean.

11 Example X = A + Σd’ x c N A = 1800, Σd’ = - 135, N = 10, c = 10
Income (X) 1780 1760 1690 1750 1840 1920 1100 1810 1050 1950 Deviation (D = X – A) (X ) 1780 – 1800 = -20 1760 – 1800 = -40 1690 – 1800 = -110 1750 – 1800 = -50 1840 – 1800 = +40 1920 – 1800 = +120 1100 – 1800 = -700 1810 – 1800 = +10 1050 – 1800 = -750 1950 – 1800 = +150 Σd = -1350 (d’ = X – A) C -20/ 10 = -2 -40/10 = -4 -110/10 = -11 -50/10 = -5 +40/10 = 4 +120/10 = 12 -700/10 = -70 +10/10 = 1 -750/ 10 = -75 +150/10 = 15 Σd = -135 Employee 1 2 3 4 5 6 7 8 9 10 N = 10 X = A + Σd’ x c N A = 1800, Σd’ = - 135, N = 10, c = 10 X = 1800 – 135 x 10 = 1800 – 135 = 1665 10

12 ARITHMETIC MEAN FOR UNGROUPED DATA
ASSUMED MEAN METHOD X = A + Σd N STEP DEVIATION METHOD X = A + Σd’ x c N DIRECT METHOD X = ΣX N

13 Calculation of arithmetic mean for Grouped data Discrete Series
Direct Method In case of discrete series, frequency against each observation is multiplied by the value of the observation. The values, so obtained, are summed up and divided by the total number of frequencies. Symbolically X = ΣfX Σf Where, ΣfX = sum of the product of variables and frequencies. Σf = sum of frequencies.

14 Example Calculate mean farm size of cultivating households in a village for the following data. Farm Size (in acres): 64, 63, 62, 61, 60, 59 No. of Cultivating Households: 8, 18, 12, 9, 7, 6

15 No. of cultivating households (f) (2)
Farm Size (X) in acres (1) No. of cultivating households (f) (2) X (1 × 2) (3) D (X - 62) (4) Fd (5) 64 8 512 +2 +16 63 18 1134 +1 +18 62 12 744 61 9 549 -1 -9 60 7 420 -2 -14 59 6 354 -3 -18 Σf = 60 Σfx = 3713 -7 X = ΣfX = = acres Σf

16 Assumed Mean Method As in case of individual series the calculations can be simplified by using assumed mean method, as described earlier, with a simple modification. Since frequency (f) of each item is given here, we multiply each deviation (d) by the frequency to get fd. Then we get Σfd. The next step is to get the total of all frequencies i.e. Σf. Then find out Σfd/ Σf. Finally the arithmetic mean is calculated by X = A + Σfd Σf

17 No. of cultivating households (f) (2)
Farm Size (X) in acres (1) No. of cultivating households (f) (2) d (X - 62) Fd 64 8 64 – 62 = +2 +16 63 18 64 = 63 = +1 +18 62 12 62 – 62 = 0 61 9 61 – 62 = -1 -9 60 7 60 – 62 = -2 -14 59 6 59 – 62 = -3 -18 Σf = 60 -3 -7 X = A + Σfd = (-7) = 61.88 Σf

18 Step Deviation Method In this case the deviations are divided by the common factor ‘c’ which simplifies the calculation. Here we estimate d’ = d = X – A c c in order to reduce the size of numerical figures for easier calculation. Then get fd' and Σfd'. The formula for arithmetic mean using step deviation method is given as, X = A + Σfd’ x c Σf

19 ARITHMETIC MEAN FOR GROUPED DATA
DISCRETE SERIES ASSUMED MEAN METHOD X = A + Σfd Σf STEP DEVIATION METHOD X = A + Σfd’ x c Σf DIRECT METHOD X = ΣfX Σf

20 Continuous Series Here, class intervals are given. The process of calculating arithmetic mean in case of continuous series is same as that of a discrete series. The only difference is that the mid-points of various class intervals are taken. We have already known that class intervals may be exclusive or inclusive or of unequal size. Class intervals may be exclusive or inclusive or of unequal size Example of exclusive class interval is, say, 0–10, 10–20 and so on. Example of inclusive class interval is, say, 0–9, 10–19 and so on. Example of unequal class interval is, say, 0–20, 20–50 and so on. In all these cases, calculation of arithmetic mean is done in a similar way.

21 Example Calculate average marks of the following students using (a) Direct method (b) Step deviation method. Marks 0–10, 10–20, 20–30, 30–40, 40–50, 50–60, 60–70 No. of Students 5, 12, 15, 25, 8, 3, 2

22 Direct Method Marks (x) No. of Students (f) Mid value (m) fm 0–10 5 25
10–20 12 15 180 20–30 375 30–40 35 875 40–50 8 45 360 50–60 3 55 165 60–70 2 65 130 70 2110 X = Σfm = 2110 = 30.14 Σf

23 Step deviation Method Marks (x) No. of Students (f) Mid value (m) fm
10 fd’ 0–10 5 25 5 – 35 = -30 -3 -15 10–20 12 15 180 15 – 35 = -20 -2 -24 20–30 375 25 – 35 = -10 -1 30–40 35 875 35 – 35 = 0 40–50 8 45 360 45 – 35 = 10 1 50–60 3 55 165 55 – 35 = 20 2 6 60–70 65 130 65 – 35 = 30 70 2110 -34 X = A + Σfd’ x c = (-34) x 10 = 30.14 Σf

24 ARITHMETIC MEAN FOR GROUPED DATA
CONTINUOUS SERIES DIRECT METHOD X = Σfm Σf STEP DEVIATION METHOD X = A + Σfd’ x c Σf

25 MEDIAN The arithmetic mean is affected by the presence of extreme values in the data. If you take a measure of central tendency which is based on middle position of the data, it is not affected by extreme items. Median is that positional value of the variable which divides the distribution into two equal parts, one part comprises all values greater than or equal to the median value and the other comprises all values less than or equal to it. The Median is the “middle” element when the data set is arranged in order of the magnitude.

26 Position of median = (N+1)th item 2 Median = size of (N+1)th item
FOR UNGROUPED DATA MEDIAN FOR GROUPED DATA EVEN NUMBER ODD NUMBER DISCRETE SERIES CONTINUOUS SERIES Position of median = (N+1)th item 2 Median = size of (N+1)th item L + (N/2 - c.f.) × i f (N+1)/2

27 Computation of Median The median can be easily computed by sorting the data from smallest to largest and counting the middle value. Example Suppose we have the following observation in a data set: 5, 7, 6, 1, 8,10, 12, 4, and 3. Arranging the data, in ascending order you have: 1, 3, 4, 5, 6, 7, 8, 10, 12. The “middle score” is 6, so the median is 6. Half of the scores are larger than 6 and half of the scores are smaller. If there are even numbers in the data, there will be two observations which fall in the middle. The median in this case is computed as the If there are even numbers in the data, there will be two observations which fall in the middle. The median in this case is computed as the

28 Example The following data provides marks of 20 students. You are required to calculate the median marks. 25, 72, 28, 65, 29, 60, 30, 54, 32, 53, 33, 52, 35, 51, 42, 48, 45, 47, 46, 33. Arranging the data in an ascending order, you get 25, 28, 29, 30, 32, 33, 33, 35, 42, 45, 46, 47, 48, 51, 52, 53, 54, 60,65, 72. You can see that there are two observations in the middle, 45 and 46. The median can be obtained by taking the mean of the two observations: Median = = 45.5 marks 2

29 In order to calculate median it is important to know the position of the median i.e. item/items at which the median lies. The position of the median can be calculated by the following formula: Position of median = (N+1)th item 2 Where N = number of items. You may note that the above formula gives you the position of the median in an ordered array, not the median itself. Median is computed by the formula: Median = size of (N+1)th item = (20+1) = 10.5th item Size of 10.5th item = 10th item + 11th item = = 45.5

30 Discrete Series In case of discrete series the position of median i.e. (N+1)/2th item can be located through cumulative frequency. The corresponding value at this position is the value of median.

31 Example The frequency distribution of the number of persons and their respective incomes (in Rs) are given below. Calculate the median income. INCOME (Rs) 10 20 30 40 NO. OF PERSON 2 4

32 Income No. of Person Cumulative Frequency 10 2 20 4 6 30 16 40 The median is located in the (N+1)/2 = (20+1)/2 = 10.5th observation. This can be easily located through cumulative frequency. The 10.5th observation lies in the c.f. of 16. The income corresponding to this is Rs 30, so the median income is Rs 30.

33 Continuous Series In case of continuous series you have to locate the median class where N/2th item [not (N+1)/2th item] lies. The median can then be obtained as follows: Median = L + (N/2 - c.f.) × i f Where, L = lower limit of the median class, c.f. = cumulative frequency of the class preceding the median class, f = frequency of the median class, i = class interval of the median class. No adjustment is required if frequency is of unequal size or magnitude.

34 Example Following data relates to daily wages of persons working in a factory. Compute the median daily wage. Daily wages 55–60 50–55 45–50 40–45 35–40 30–35 25–30 20–25 No. of workers 7 13 15 20 30 33 28 14

35 Median = L + (N/2 - c.f.) × i f = 35 + (80 - 75) x 5 = 35.83 30
Daily Wages No. of Workers Cumulative frequency 20–25 14 25–30 28 42 30–35 33 75 35–40 30 105 40–45 20 125 45–50 15 140 50–55 13 153 55–60 7 160 Median = L + (N/2 - c.f.) × i f = 35 + ( ) x 5 =

36 Thus, the median daily wage is Rs 35. 83
Thus, the median daily wage is Rs This means that 50% of the workers are getting less than or equal to Rs and 50% of the workers are getting more than or equal to this wage.

37 MODE Sometimes, you may be interested in knowing the most typical value of series or the value around which maximum concentration of items occurs. For example, a manufacturer would like to know the size of shoes that has maximum demand or style of the shirt that is more frequently demanded. Here, Mode is the most appropriate measure. The word mode has been derived from the French word “la Mode” which signifies the most fashionable values of a distribution, because it is repeated the highest number of times in the series. Mode is the most frequently observed data value. It is denoted by Mo.

38 Value which occurs with the highest frequency L + (D1) × i D1 – D2
MODE MODE FOR UNGROUPED DATA MODE FOR GROUPED DATA EVEN NUMBER ODD NUMBER DISCRETE SERIES CONTINUOUS SERIES Value which occurs with the highest frequency L (D1) × i D1 – D2 Grouping Table and Analysis Table L (D1) × i D1 – D2 Unimodal, Bimodal and Multimodal

39 Computation of Mode For determining mode count the number of times the various values repeat themselves and the value occurring maximum number of times is the modal value. The process of determining mode in case of ungrouped data essentially involves grouping of data.

40 Example Sr. No. Marks Obtained 1 10 2 27 3 24 4 12 5 6 7 20 8 18 9 15
30 Since the item 27 occurs the maximum number of times, i.e., 3, hence the modal marks are 27

41 Discrete Series Consider the data set 1, 2, 3, 4, 4, 5. The mode for this data is 4 because 4 occurs most frequently (twice) in the data. Example Look at the following discrete series: Marks Frequency

42 Here, as you can see the maximum frequency is 36, the value of mode is 20. In this case, as there is a unique value of mode, the data is Unimodal. But, the mode is not necessarily unique, unlike arithmetic mean and median. You can have data with two modes (bi-modal) or more than two modes (multi-modal). It may be possible that there may be no mode if no value appears more frequent than any other value in the distribution. For example, in a series 1, 1, 2, 2, 3, 3, 4, 4, there is no mode.

43 From the above data we can clearly say that the modal size is 20 because the value 20 has occurred the maximum numbers of times, i.e., 36. However, where the mode is determined just by inspection, an error of maximum frequency and the frequency preceding it or succeeding it is very small and items are heavily concentrated on either side. In such cases it is desirable to prepare a grouping table and an analysis table. A grouping table has six columns. In column 1 the maximum frequency is marked or put in a circle; in column 2 frequency are grouped in two’s; in column 3 leave the first frequency and then grouped the remaining in two’s; in column 4 group the frequencies in three’s; in column 5 leave the first frequency and group the frequencies in three’s; and in column 6 leave the first two frequencies and then group the remaining in three’s. in each of these cases take the maximum total and mark it in a circle or by bold type.

44 After preparing the grouping table, prepare an analysis table
After preparing the grouping table, prepare an analysis table. While preparing this table put column number on the left hand side and the various probable values of mode on the right hand side. The values against which frequencies are the highest are marked in the grouping table and then entered by means of a bar in the relevant ‘box’ corresponding to the value they represent.

45 GROUPING TABLE x f II III IV V VI 10 08 20 56 15 12 48 20 36 83 71 99
25 35 63 30 28 81 46 55 35 18 27 40 09

46 ANALYSIS TABLE Col. No. 20 25 30 I 1 II 1 1 1 1 III 1 1 IV 1 V 1 VI 1 1 1 4 5 3 Corresponding to the maximum total 5, the value of the variable is 25

47 Continuous Series In case of continuous frequency distribution, modal class is the class with largest frequency. Mode can be calculated by using the formula: Mo = L = D1 x i D1 + D2 Where L = lower limit of the modal class, D1= difference between the frequency of the modal class and the frequency of the class preceding the modal class (ignoring signs). D2 = difference between the frequency of the modal class and the frequency of the class succeeding the modal class (ignoring signs), i = class interval of the distribution.

48 Marks No. of Students 0 – 10 3 10 – 20 5 20 – 30 7 30 – 40 10 40 – 50 12 50 – 60 15 60 – 70 70 – 80 6 80 – 90 2 90 – 100 8

49 By inspection the model class is 50 – 60.
Mo = L = D1 x i D1 + D2 L = 50, D1 = ( ) = 3, D2 = ( ) = 3, i = 10 Mo = x 10 = = 55 3 + 3

50 Weight (Kg.) No. of Students 93 – 97 2 98 – 102 5 103 – 107 12 108 – 112 17 113 – 117 14 118 – 122 6 123 – 127 3 128 – 132 1

51 By inspection mode lies in the class 108 – 112
By inspection mode lies in the class 108 – 112. but the real limits of this class are – 112.5 Mo = L = D1 x i D1 + D2 L = 107.5, D1 = ( ) = 5, D2 = ( ) = 3, i = 5 Mo = x 5 = = 5 + 3

52 Weight (Ibs.) No. of person 100 – 110 4 110 – 120 6 120 – 130 20 130 – 140 32 140 – 150 33 150 – 160 17 160 – 170 8 170 – 180 2

53 GROUPING TABLE x f II III IV V VI 100 - 110 04 10 30 110 - 120 06 26
20 58 52 85 32 65 33 82 50 58 17 25 27 08 10 02

54 ANALYSIS TABLE Col. No. I 1 II 1 1 1 1 III 1 1 IV V 1 1 1 VI 1 1 1 3 5 5 This is a bi-modal series. Hence mode has to be determined by applying the formula

55 Empirical mode Mode = 3 Median – 2 Mean Mode = (3 X ) – (2 X ) Mode = – = Hence the modal weight is Ib.

56 Empirical Mode Where mode is ill-defined, its value may be ascertained by the following formula base upon the relationship between mean, median and mode. Mode = 3 Median – 2 Mean This measure is called the empirical mode.

57 Types of Series Method of Calculation 1. Ungrouped Data (individual Series) The value that occurs the most in the series is identified as Mode of the series by inspection method. If the frequency of all values is equal the same are changed into discrete frequency distribution and then mode is calculated. 2. Discrete Series/ Frequency Array If the items with highest frequency are more than one, then the grouping method is used. 3. Continuous Series Exclusive: Series with highest frequency is called modal class. The actual value of mode is determined using the following formula: Mo = L D1 x i D1 + D2 (b) Inclusive: Inclusive series are converted into exclusive. 4. Moderately asymmetrical distributions Mode = 3 Median – 2 Mean

58 RELATIVE POSITION OF ARITHMETIC MEAN, MEDIAN AND MODE
Suppose we express, Arithmetic Mean = Me, Median = Mi, Mode = Mo The relative magnitude of the three are Me>Mi>Mo or Me<Mi<Mo. The median is always between the arithmetic mean and the mode.

59 Relationship among Mean, Median and Mode
A distribution in which the values of mean, median and mode coincide (i.e., mean = median = mode)is known as a symmetrical distribution. Conversely stated, when the values of mean, median and mode are not equal the distribution is known as asymmetrical or skewed. in case of asymmetrical distribution, value of mean, mode and median will be different. As such, frequency curve will not be bell shaped. It may be skewed either to the right or to the left. If the distribution is skewed more to the right i.e. positive, the mean and the median will be grater than mode.. In order words mode s the minimum. If the distribution is skewed more to the left, i.e., negative, the mean and the median will be less than mode. In other words mode will be maximum.

60


Download ppt "Measure of Central Tendency"

Similar presentations


Ads by Google