Measures of Central Tendency. These measures indicate a value, which all the observations tend to have, or a value where all the observations can be assumed.

Measures of Central Tendency

These measures indicate a value that all the observations tend to have, or a value where all the observations can be assumed to be located or concentrated. There are three such measures:
i) Mean
ii) Median, Quartiles, Percentiles and Deciles
iii) Mode

Mean
Types of mean: Arithmetic Mean, Harmonic Mean, Geometric Mean.

1) Ungrouped data

$$\text{Mean} = \frac{\text{Sum of Observations}}{\text{Number of Observations}} = \frac{x_1 + x_2 + \dots + x_n}{n}$$

2) Grouped data
When the data is grouped, prepare a frequency table:

| Class Interval | Mid-point of Class Interval ($x_i$) | Frequency ($f_i$) |
|---|---|---|
| -- | $x_1$ | $f_1$ |
| -- | -- | -- |
| -- | $x_k$ | $f_k$ |

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

where $x_i$ is the middle point of the ith class interval, $f_i$ is the frequency of the ith class interval, $f_i x_i$ is the product of $f_i$ and $x_i$, and $k$ is the number of class intervals.

Median
Whenever there are some extreme values in the data, calculation of the arithmetic mean (A.M.) is not desirable, because the mean is pulled towards those values. The median of a set of values is defined as the middle-most value of a series of values arranged in ascending or descending order. If the number of observations is odd, the middle-most value is the median. If the number of observations is even, the average of the two middle-most values is the median.

Example
3144 4784 4923 5034 5424 5561 6505 6707 6874 4187 4310 4506 4745 5071 2717 2796 3144 3527 3098 3534

Ascending order:
2717 2796 3098 3144 3527 3534 3862 4187 4310 4506 4745 4784 4923 5034 5071 5424 5561 6505 6707 6874

The number of observations is 20, so there is no single middle observation. The two middle-most observations are the 10th and 11th.

$$\text{Median} = \frac{4506 + 4745}{2} = \frac{9251}{2} = 4625.5$$
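The odd/even rule for the median can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `median` and the variable `data` are assumptions, and the values are the ascending list from the example above.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle observation exists.
        return ordered[mid]
    # Even count: average the two middle-most observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Ascending list from the example (20 observations).
data = [2717, 2796, 3098, 3144, 3527, 3534, 3862, 4187, 4310, 4506,
        4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874]

print(median(data))  # 4625.5
```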

Quartiles
The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it. Similarly, there are "quartiles". There are three quartiles, viz. Q1, Q2 and Q3, referred to as the first, second and third quartiles.
The first quartile, Q1, divides the data into two parts such that 25% of the observations are less than it and 75% are more than it.
The second quartile, Q2, is the same as the median.
The third quartile, Q3, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.

Percentiles
Percentiles split the data into several parts, expressed in percentages. A percentile, also known as a centile, divides the data in such a way that a given percentage of the observations are less than it. For example, 95% of the observations are less than the 95th percentile. Note that the 50th percentile, denoted P50, is the same as the median.

Deciles
The deciles divide the data into ten parts: the first decile corresponds to 10%, the second to 20%, and so on.

Mode
The mode represents the "fashion" of the observations in the data. It is defined as the most fashionable value, that is, the value which the maximum number of observations have or tend to have as compared to any other value.
Observations are 2, 4, 4, 6, 8, 8, 8, 10, 12.
Here the mode is 8 because 3 observations have this value, more than any other value.
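Counting how often each value occurs makes the definition of the mode concrete. The sketch below is one possible way to do it in Python; the helper name `modes` is an assumption, and it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs the maximum number of times."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

observations = [2, 4, 4, 6, 8, 8, 8, 10, 12]
print(modes(observations))  # [8]
```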

Measures of Variation / Dispersion
Measures of variation (dispersion) provide an idea of the extent of variation present among the observations. These are:
i) Range
ii) Mean Deviation
iii) Standard Deviation
iv) Coefficient of Variation

Range
The range is the simplest measure of variation, defined as the difference between the maximum and the minimum values of the observations:

$$\text{Range} = \text{Maximum Value} - \text{Minimum Value}$$

Since the range depends only on two values, viz. the minimum and the maximum, and does not utilize the full information in the given data, it is not considered very reliable or efficient.
The coefficient of scatter is another measure based on the range of the data:

$$\text{Coefficient of scatter} = \frac{\text{Range}}{\text{Maximum} + \text{Minimum}} = \frac{\text{Maximum} - \text{Minimum}}{\text{Maximum} + \text{Minimum}}$$

It gives an indication of the variability in the data.

Mean Deviation
To study the variation in data, one method is to take into consideration the deviations of all the observations from their mean.
Example (Mean 50)

| Observation | Deviation from Mean |
|---|---|
| 50 | 0 |
| 49 | -1 |
| 51 | +1 |
| 40 | -10 |
| 10 | -40 |
| 90 | +40 |

Mean Deviation for Ungrouped Data
If the data is ungrouped and the observations for a certain variable $x$ are $x_1, x_2, x_3, \dots, x_n$, then

$$\text{Mean Deviation} = \frac{\sum \lvert x_i - \bar{x} \rvert}{n}$$

For the data comprising the observations 1, 2, 3, it can be calculated as follows:

| | $x_i$ | $x_i - \bar{x}$ | $\lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 |

Thus the mean deviation is 2/3 = 0.67.
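As a rough check of the calculation for the observations 1, 2, 3, the formula can be written directly in Python. The function name `mean_deviation` is an assumption made for this sketch.

```python
def mean_deviation(values):
    """Average absolute deviation of the observations from their arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

print(round(mean_deviation([1, 2, 3]), 2))  # 0.67
```

Without `abs`, the positive and negative deviations would cancel and the sum would be zero, which is why the absolute values are taken.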

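Mean Deviation for Grouped Data

| Class Interval | Middle Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 2050 | 4100 |
| 3000-4000 | 3500 | 5 | 17500 | 1050 | 5250 |
| 4000-5000 | 4500 | 6 | 27000 | 50 | 300 |
| 5000-6000 | 5500 | 4 | 22000 | 950 | 3800 |
| 6000-7000 | 6500 | 3 | 19500 | 1950 | 5850 |
| Sum | | 20 | 91000 | 6050 | 19300 |
| Average | | | 4550 | | 965 |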

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{19300}{20} = 965$$

where $x_i$ is the middle point of the ith class interval, $\bar{x}$ is the mean, and $f_i$ is the frequency of the ith class interval.
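The grouped calculation can be reproduced from the mid-points and frequencies alone. The following is a minimal sketch; the function names `grouped_mean` and `grouped_mean_deviation` are assumptions, and the lists are taken from the grouped-data table (class mid-points 2500 to 6500, frequencies 2, 5, 6, 4, 3).

```python
def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum of f_i * x_i divided by the total frequency."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

def grouped_mean_deviation(midpoints, freqs):
    """Frequency-weighted average of |x_i - mean| over the class mid-points."""
    mean = grouped_mean(midpoints, freqs)
    return sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / sum(freqs)

midpoints = [2500, 3500, 4500, 5500, 6500]
freqs = [2, 5, 6, 4, 3]

print(grouped_mean(midpoints, freqs))            # 4550.0
print(grouped_mean_deviation(midpoints, freqs))  # 965.0
```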

Variance and Standard Deviation
While calculating the mean deviation, the absolute values of the deviations of the observations from the mean were taken because, without doing so, the total deviation was zero for the data comprising the values 1, 2 and 3, even though there was variation present among these observations. Another way of getting over this problem of the total deviation being zero is to take the squares of the deviations of the observations from the mean.

| | $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 (= 0.67) |

Calculation of variance and standard deviation for ungrouped data

$$\text{Variance } (\sigma^2) = \frac{1}{n} \sum (x_i - \bar{x})^2 = \frac{1}{3} \times 2 = 0.67$$

The square root of $\sigma^2$, i.e. $\sigma$, is known as the standard deviation.

$$\text{Standard Deviation } (\sigma) = \sqrt{0.67} = 0.82$$
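Python's standard `statistics` module provides `pvariance` and `pstdev`, which divide by n as in the formula here, so they can serve as a quick check of the values for 1, 2, 3. The variable names below are assumptions for this sketch.

```python
import statistics

data = [1, 2, 3]

# Population variance and standard deviation (divisor n, not n - 1).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(round(variance, 2), round(std_dev, 2))  # 0.67 0.82
```

Note that `statistics.variance` and `statistics.stdev` use the divisor n - 1 (sample versions) and would give different numbers for the same data.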

Calculation of Variance and Standard Deviation for Grouped Data

| Class Interval | Mid Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $f_i x_i^2$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ | $f_i (x_i - \bar{x})^2$ |
|---|---|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 12500000 | -2050 | 4202500 | 8405000 |
| 3000-4000 | 3500 | 5 | 17500 | 61250000 | -1050 | 1102500 | 5512500 |
| 4000-5000 | 4500 | 6 | 27000 | 121500000 | -50 | 2500 | 15000 |
| 5000-6000 | 5500 | 4 | 22000 | 121000000 | 950 | 902500 | 3610000 |
| 6000-7000 | 6500 | 3 | 19500 | 126750000 | 1950 | 3802500 | 11407500 |
| Sum | | 20 | 91000 | 443000000 | | 10012500 | 28950000 |

Average ($\bar{x}$) = 4550
Variance = 28950000 / 20 = 1447500

$$\text{S.D.} = \sqrt{\text{Variance}} = \sqrt{1447500} = 1203.12$$
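The grouped variance and standard deviation can be checked with a short sketch that weights each squared deviation by its class frequency. The function name `grouped_variance` is an assumption; the mid-points and frequencies are those of the grouped-data table.

```python
import math

def grouped_variance(midpoints, freqs):
    """Population variance of grouped data using class mid-points."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / total

midpoints = [2500, 3500, 4500, 5500, 6500]
freqs = [2, 5, 6, 4, 3]

variance = grouped_variance(midpoints, freqs)
print(variance)                       # 1447500.0
print(round(math.sqrt(variance), 2))  # 1203.12
```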

Combining Variances of Two Populations
The mean and S.D. of the "lives" of tyres manufactured by two factories of the "Durable" tyre company, each making 50,000 tyres annually, are given below. Calculate the mean and standard deviation of all the 100,000 tyres produced in a year.

| Group | Mean ('000 kms.) | S.D. ('000 kms.) |
|---|---|---|
| 1 | 60 | 8 |
| 2 | 55 | 7 |

We know that if there is one set of data having $n_1$ observations with mean $m_1$ and s.d. $\sigma_1$, and another set of data having $n_2$ observations with mean $m_2$ and s.d. $\sigma_2$, then the mean ($m$) and variance ($\sigma^2$) of the combined data with $(n_1 + n_2)$ observations are given as

$$m = \frac{n_1 m_1 + n_2 m_2}{n_1 + n_2}$$

$$\sigma^2 = \frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

where $d_1 = m_1 - m$, $d_2 = m_2 - m$, and $m$ is the combined mean of both sets of data.

Factory 1: $n_1 = 50$, $m_1 = 60$ and $\sigma_1 = 8$
Factory 2: $n_2 = 50$, $m_2 = 55$ and $\sigma_2 = 7$
(with $n_1$ and $n_2$ in thousands of tyres, and the means and S.D.s in '000 kms.)

Substituting these values in the above formulas:

$$\text{Mean} = \frac{(50 \times 60) + (50 \times 55)}{50 + 50} = \frac{3000 + 2750}{100} = \frac{5750}{100} = 57.5$$

Thus the mean life of the tyres manufactured by the company is 57,500 kms.

Therefore,
$d_1 = m_1 - m = 60 - 57.5 = 2.5$
$d_2 = m_2 - m = 55 - 57.5 = -2.5$

$$\text{Variance } (\sigma^2) = \frac{50 \times (8^2 + 2.5^2) + 50 \times (7^2 + 2.5^2)}{50 + 50} = \frac{(50 \times 70.25) + (50 \times 55.25)}{100} = \frac{3512.5 + 2762.5}{100} = \frac{6275}{100} = 62.75$$

Variance ($\sigma^2$) = 62.75

$$\text{S.D. } (\sigma) = \sqrt{62.75} = 7.92$$

Thus the S.D. of the lives of tyres produced by the company is 7,920 kms.
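The combination formulas translate directly into a small function. This is an illustrative sketch only: the function name `combine` and its argument names are assumptions, and the inputs are the two factories' figures (n in thousands of tyres, means and S.D.s in '000 kms.).

```python
import math

def combine(n1, m1, s1, n2, m2, s2):
    """Combined mean and variance of two groups from their sizes, means and S.D.s."""
    n = n1 + n2
    m = (n1 * m1 + n2 * m2) / n
    d1, d2 = m1 - m, m2 - m  # deviation of each group mean from the combined mean
    variance = (n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / n
    return m, variance

mean, variance = combine(50, 60, 8, 50, 55, 7)
print(mean, variance, round(math.sqrt(variance), 2))  # 57.5 62.75 7.92
```

The terms d1 and d2 matter: simply averaging the two variances would ignore the spread between the two factory means.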

Mean Deviation
The mean deviation is defined as

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i}$$

where $x_i$ is the middle point of the ith class interval, $f_i$ is the frequency of the ith class interval, and $\bar{x}$ is the arithmetic mean of the I.Q. scores.

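| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 26.5 | 265.0 |
| 50-60 | 20 | 55 | 1100 | 16.5 | 330.0 |
| 60-70 | 20 | 65 | 1300 | 6.5 | 130.0 |
| 70-80 | 15 | 75 | 1125 | 3.5 | 52.5 |
| 80-90 | 15 | 85 | 1275 | 13.5 | 202.5 |
| 90-100 | 20 | 95 | 1900 | 23.5 | 470.0 |
| Summation | 100 | - | 7150 | | 1450.0 |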

From this data, we get

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{1450}{100} = 14.5$$

Thus the average score is 71.5 and the mean deviation of the scores is 14.5.

Suppose we are required to calculate only the standard deviation for the above data; then the table is constructed as below.

| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $x_i^2$ | $f_i x_i^2$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 2025 | 20250 |
| 50-60 | 20 | 55 | 1100 | 3025 | 60500 |
| 60-70 | 20 | 65 | 1300 | 4225 | 84500 |
| 70-80 | 15 | 75 | 1125 | 5625 | 84375 |
| 80-90 | 15 | 85 | 1275 | 7225 | 1,08,375 |
| 90-100 | 20 | 95 | 1900 | 9025 | 1,80,500 |
| Summation | 100 | | 7150 | | 5,38,500 |

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{S.D.} = \sqrt{\frac{\sum f_i x_i^2 - \left(\sum f_i\right) \bar{x}^2}{\sum f_i}} = \sqrt{\frac{538500 - 100 \, (71.5)^2}{100}} = \sqrt{272.75} = 16.5$$

Thus the s.d. of the I.Q. scores is 16.5.
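The shortcut formula only needs the column totals of f, f·x and f·x², which the sketch below recomputes from the mid-points and frequencies of the I.Q. table. The variable names are assumptions chosen for illustration.

```python
import math

midpoints = [45, 55, 65, 75, 85, 95]
freqs = [10, 20, 20, 15, 15, 20]

n = sum(freqs)                                             # 100
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))      # 7150
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs)) # 538500

mean = sum_fx / n
# Shortcut form: (sum f x^2 - n * mean^2) / n, no per-class deviations needed.
variance = (sum_fx2 - n * mean ** 2) / n
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 1))  # 71.5 272.75 16.5
```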

Coefficient of Variation
The coefficient of variation is a relative measure of dispersion that enables us to compare two distributions. It relates the standard deviation and the mean by expressing the standard deviation as a percentage of the mean:

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$

Example
For the data 103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103 find the range, the coefficient of range and the coefficient of quartile deviation.

1) Range = H - L = 350 - 50 = 300

2) Coefficient of range:

$$\frac{H - L}{H + L} = \frac{300}{350 + 50} = 0.75$$
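The range and the coefficient of range for this data can be verified in a few lines of Python; dividing 300 by 400 gives 0.75. The variable names are assumptions for this sketch.

```python
data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]

highest, lowest = max(data), min(data)
data_range = highest - lowest
coefficient_of_range = data_range / (highest + lowest)

print(data_range, coefficient_of_range)  # 300 0.75
```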

To find Q1 and Q3 we arrange the data in ascending order:

50, 68, 103, 103, 103, 105, 108, 110, 150, 174, 200, 225, 350

With n = 13, the positions of the quartiles are

$$\frac{n+1}{4} = \frac{14}{4} = 3.5 \qquad \frac{3(n+1)}{4} = 10.5$$

Q1 lies halfway between the 3rd and 4th observations, and Q3 halfway between the 10th and 11th:

Q1 = 103 + 0.5 (103 - 103) = 103
Q3 = 174 + 0.5 (200 - 174) = 187

$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{187 - 103}{187 + 103} = \frac{84}{290} \approx 0.2897$$
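The quartile positions (n+1)/4 = 3.5 and 3(n+1)/4 = 10.5 used in this example can be turned into a small interpolation routine. This is a sketch of that one positional convention only (other textbooks and software use slightly different rules); the function name `value_at_position` is an assumption.

```python
def value_at_position(values, fraction):
    """Value at the 1-based position fraction * (n + 1), interpolating linearly."""
    ordered = sorted(values)
    n = len(ordered)
    position = fraction * (n + 1)
    lower = int(position)      # whole part: index of the lower neighbour (1-based)
    weight = position - lower  # fractional part: how far towards the next value
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + weight * (ordered[lower] - ordered[lower - 1])

data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]
q1 = value_at_position(data, 0.25)
q3 = value_at_position(data, 0.75)
coefficient_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, round(coefficient_qd, 4))  # 103.0 187.0 0.2897
```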

Example
A purchasing agent obtained a sample of incandescent lamps from each of two suppliers. He had the samples tested in his laboratory for length of life, with the following results.

| Length of life in hours | Sample A | Sample B |
|---|---|---|
| 700 - 900 | 10 | 3 |
| 900 - 1100 | 16 | 42 |
| 1100 - 1300 | 26 | 12 |
| 1300 - 1500 | 8 | 3 |

Which company's lamps are more uniform?

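Sample A

| Class interval | Sample A ($f$) | Midpoint ($x$) | $u = \frac{x - 1000}{200}$ | $fu$ | $fu^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 10 | 800 | -1 | -10 | 10 |
| 900 - 1100 | 16 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 26 | 1200 | 1 | 26 | 26 |
| 1300 - 1500 | 8 | 1400 | 2 | 16 | 32 |
| Total | 60 | | | 32 | 68 |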

$$\bar{u}_A = \frac{32}{60} = 0.533$$

$$\bar{x}_A = 1000 + 200 \, \bar{u}_A = 1000 + 200 \, (0.533) = 1106.67$$

$$\sigma_u^2 = \frac{1}{N} \sum f u^2 - (\bar{u})^2 = \frac{68}{60} - (0.533)^2 = 1.1333 - 0.2844 = 0.8489$$

$$\sigma_u = \sqrt{0.8489} = 0.9214$$

$$\sigma_x = 200 \times 0.9214 = 184.27$$

$$\text{C.V. for sample A} = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{184.27}{1106.67} \times 100 = 16.65\%$$
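The step-deviation method used for Sample A (coding each mid-point as u = (x - 1000) / 200, then scaling back) can be sketched as a reusable function. The function name `step_deviation_cv` and its parameter names are assumptions; the code carries full precision through every step, so the last digits may differ slightly from values rounded by hand at intermediate steps.

```python
import math

def step_deviation_cv(midpoints, freqs, origin, width):
    """Mean, S.D. and coefficient of variation via the coded variable u = (x - origin) / width."""
    n = sum(freqs)
    u = [(x - origin) / width for x in midpoints]
    mean_u = sum(f * ui for ui, f in zip(u, freqs)) / n
    var_u = sum(f * ui ** 2 for ui, f in zip(u, freqs)) / n - mean_u ** 2
    mean_x = origin + width * mean_u   # undo the shift and the scaling
    sd_x = width * math.sqrt(var_u)    # the shift does not affect the spread
    return mean_x, sd_x, 100 * sd_x / mean_x

# Sample A: mid-points and frequencies from the table above.
mean_a, sd_a, cv_a = step_deviation_cv([800, 1000, 1200, 1400], [10, 16, 26, 8], 1000, 200)
print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 2))  # approximately 1106.67 184.27 16.65
```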

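Sample B

| Class interval | Sample B ($f$) | Midpoint ($x$) | $u = \frac{x - 1000}{200}$ | $fu$ | $fu^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 3 | 800 | -1 | -3 | 3 |
| 900 - 1100 | 42 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 12 | 1200 | 1 | 12 | 12 |
| 1300 - 1500 | 3 | 1400 | 2 | 6 | 12 |
| Total | 60 | | | 15 | 27 |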
