GROUPED DATA LECTURE 5 OF 6 8.DATA DESCRIPTIVE SUBTOPIC


Presentation on theme: "GROUPED DATA LECTURE 5 OF 6 8.DATA DESCRIPTIVE SUBTOPIC"— Presentation transcript:

1 GROUPED DATA LECTURE 5 OF 6 8.DATA DESCRIPTIVE SUBTOPIC
8.3 : Measures of Location 8.4 : Measures of Dispersion

2 LEARNING OUTCOMES
8.3(b) Find and interpret the mean, mode, median, quartiles and percentiles for grouped data
8.3(c) Describe the symmetry and skewness of a data distribution
8.4(b) Find and interpret the variance, standard deviation and coefficient of variation for grouped data

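3 Sketch of Median, Quartiles, Deciles and Percentiles from an ogive
[Figure: ogive with cumulative frequency on the vertical axis and class boundaries on the horizontal axis. The points X1, X2 and X3 are marked on the horizontal axis. Labelled levels: P25 = Q1 = D2.5, Median = P50 = Q2 = D5, P75 = Q3 = D7.5.]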

4 Example 1
Using the ogive drawn below, determine the
(a) Median
(b) First quartile
(c) Third decile
(d) Seventieth percentile
[Figure: ogive; axis ticks at 5, 10, 15, 20, 25, 30, 35, 40]

5 Solution
(a) Median: 60/2 = 30th observation. From the ogive, the median = 20.
(b) First quartile: 60/4 = 15th observation. From the ogive, the first quartile = 12.5.
(c) Third decile: 3/10 × 60 = 18th observation. From the ogive, the third decile = 14.
(d) Seventieth percentile: 70/100 × 60 = 42nd observation. From the ogive, the seventieth percentile = 24.5.
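The position arithmetic in this solution can be sketched in Python. This is only an illustration: the helper names `position` and `read_ogive` are hypothetical, and because the slide's ogive points are not available in the transcript, the interpolation is demonstrated on made-up points that do not reproduce the readings above.

```python
def position(n, numerator, denominator):
    """Cumulative-frequency position for the fraction numerator/denominator of n."""
    return n * numerator / denominator


def read_ogive(boundaries, cum_freq, target):
    """Read a value off an ogive by linear interpolation between plotted points."""
    for i in range(1, len(boundaries)):
        lo_cf, hi_cf = cum_freq[i - 1], cum_freq[i]
        if hi_cf > lo_cf and lo_cf <= target <= hi_cf:
            frac = (target - lo_cf) / (hi_cf - lo_cf)
            return boundaries[i - 1] + frac * (boundaries[i] - boundaries[i - 1])
    raise ValueError("target lies outside the ogive")


n = 60  # total frequency used in the solution
positions = {
    "median": position(n, 1, 2),
    "first quartile": position(n, 1, 4),
    "third decile": position(n, 3, 10),
    "seventieth percentile": position(n, 70, 100),
}
print(positions)

# Hypothetical ogive points (NOT the slide's data), used only to show the reading step.
demo_boundaries = [0, 10, 20, 30]
demo_cum_freq = [0, 10, 30, 40]
demo_median = read_ogive(demo_boundaries, demo_cum_freq, position(40, 1, 2))
print(demo_median)
```

Reading a value off a drawn ogive by eye corresponds to the interpolation step, so a hand reading and a computed value may differ slightly.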

6 [Figure: the ogive from Example 1 with the readings marked: first quartile = 12.5, third decile = 14, median = 20, seventieth percentile = 24.5]

7 Shape of data distribution Symmetry and Skewness
The general shape of a data distribution can be determined from the mean, median and mode, as illustrated in a histogram or frequency curve. For a strongly skewed distribution, the median is the more appropriate measure of central tendency. For a symmetrical or nearly symmetrical distribution, the mean is the appropriate measure of central tendency.

8 Shape of data distribution Symmetry and Skewness
Three important shapes: i. Symmetry ii. Positively skewed or right skewed distribution iii. Negatively skewed or left-skewed distribution

9 Mean = Median = Mode SYMMETRICAL
(i) Symmetrical
~The values of the mean, median and mode are identical.
~They lie at the center.
[Figure: symmetrical frequency curve, frequency against variable, with the mean, median and mode at the central peak]

10 IN DETAIL A set of observations is symmetrically distributed if its graphical representation (histogram, bar chart) is symmetric with respect to a vertical axis passing through the mean. For a symmetrically distributed population or sample, the mean, median and mode have the same value. Half of all measurements are greater than the mean, while half are less than the mean.

11 Mean > Median > Mode POSITIVELY SKEWED
(ii) Positively skewed or Skewed to the right
~The value of the mean is the largest
~The mode is the smallest
~The median lies between these two values
[Figure: frequency curve with a long right tail, frequency against variable, with the mode, median and mean marked from left to right]

12 IN DETAIL A set of observations that is not symmetrically distributed is said to be skewed. It is positively skewed if a greater proportion of the observations are less than or equal to (as opposed to greater than or equal to) the mean; this indicates that the mean is larger than the median. The histogram of a positively skewed distribution will generally have a long right tail; thus, this distribution is also known as being skewed to the right.

13 Mean < Median < Mode NEGATIVELY SKEWED
(iii) Negatively skewed or Skewed to the left
~The value of the mean is the smallest
~The mode is the largest
~The median lies between these two values
[Figure: frequency curve with a long left tail, frequency against variable, with the mean, median and mode marked from left to right]

14 IN DETAIL A negatively skewed distribution has more observations that are greater than or equal to the mean. Such a distribution has a mean that is less than the median. The histogram of a negatively skewed distribution will generally have a long left tail; thus, the phrase skewed to the left is applied here.
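The three shapes can be told apart by comparing the mean, median and mode. The following is a minimal sketch of that rule of thumb; the function name `describe_shape`, the tolerance and the sample values are illustrative assumptions, not part of the lecture.

```python
def describe_shape(mean, median, mode, tol=1e-9):
    """Classify a distribution's shape from the ordering of mean, median and mode."""
    if abs(mean - median) <= tol and abs(median - mode) <= tol:
        return "symmetrical"
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    return "undetermined"  # the three measures do not follow either ordering


# Hypothetical values, chosen only to exercise each branch.
print(describe_shape(10, 10, 10))
print(describe_shape(12, 10, 8))
print(describe_shape(8, 10, 12))
```

Real data rarely gives exactly equal measures, so in practice the tolerance would be chosen to suit the scale of the data.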

15 MEASURES OF DISPERSION
~Range
~Inter-quartile range
~Variance
~Standard deviation

16 RANGE AND INTERQUARTILE RANGE
RANGE
Range = upper boundary of the last class - lower boundary of the first class
INTERQUARTILE RANGE
Defined as the difference between the third quartile and the first quartile
Interquartile range = Q3 - Q1

17 Variance and standard deviation
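The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the usual grouped-data formulas in terms of class marks x and frequencies f are given below. The sample form with divisor n - 1 is an assumption; the lecture may have used the population form with divisor n.

```latex
n = \sum f, \qquad
\bar{x} = \frac{\sum f x}{\sum f}, \qquad
s^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{n}}{n - 1}, \qquad
s = \sqrt{s^2}
```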

18 Example 2: Find the range, variance and standard deviation
Class interval   Frequency f   Class mark x   fx   fx²
1-3              5             2              10   20
4-6              3             5              15   75
7-9              2             8              16   128
10-12            1             11             11   121
13-15            6             14             84   1176
16-18            4             17             68   1156

19 Solution:
Range = upper boundary of the last class - lower boundary of the first class = 18.5 - 0.5 = 18
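The variance and standard deviation parts of the solution are missing from the transcript. The sketch below computes them under stated assumptions: the frequencies 5, 3, 2, 1, 6, 4 are inferred from the fx and fx² columns of Example 2, class boundaries extend 0.5 beyond the class limits (as in the range calculation above), and both the sample (n - 1) and population (n) divisors are shown because the slide's choice is not known.

```python
from math import sqrt

# (lower limit, upper limit, frequency); frequencies inferred from the fx and fx² columns.
classes = [(1, 3, 5), (4, 6, 3), (7, 9, 2), (10, 12, 1), (13, 15, 6), (16, 18, 4)]

marks = [(lo + hi) / 2 for lo, hi, _ in classes]  # class marks x
freqs = [f for _, _, f in classes]

n = sum(freqs)                                           # sum of f
sum_fx = sum(f * x for f, x in zip(freqs, marks))        # sum of f*x
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))   # sum of f*x^2

# Range from class boundaries: upper boundary of last class - lower boundary of first class.
data_range = (classes[-1][1] + 0.5) - (classes[0][0] - 0.5)

corrected_ss = sum_fx2 - sum_fx ** 2 / n        # sum of squares about the mean
sample_variance = corrected_ss / (n - 1)        # assumed divisor n - 1
population_variance = corrected_ss / n          # alternative divisor n

print(n, sum_fx, sum_fx2, data_range)
print(sample_variance, sqrt(sample_variance))
print(population_variance, sqrt(population_variance))
```

With these inputs the totals are n = 21, Σfx = 204 and Σfx² = 2676. The sample variance is about 34.71 with standard deviation about 5.89, and the population versions are about 33.06 and 5.75.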

20 REMARK Sometimes we would like to compare the variability of two different data sets that have different units of measurement. Standard deviation is not suitable since it is a measure of absolute variability and not of relative variability. The most appropriate measure is the coefficient of variation (CV) which expresses standard deviation as a percentage of the mean.

21 Coefficient of variation
Note: A larger coefficient of variation means that the data is more dispersed and less consistent.
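The formula itself is not in the transcript. Following the remark that the CV expresses the standard deviation as a percentage of the mean, it can be written as below, where s and x̄ are assumed symbols for the standard deviation and the mean.

```latex
CV = \frac{s}{\bar{x}} \times 100\%
```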

22 Example: Suppose we want to compare two production processes that fill containers with products
Process A is filling fertilizer bags, which have a nominal weight of 80 pounds. For process A:
Process B is filling cornflakes boxes, which have a nominal weight of 24 ounces. For process B:

23 Is process A much more variable than process B because 1.2 is three times larger than 0.4?
No, because the two processes have very similar variability relative to the size of their means.
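The comparison can be checked with a rough calculation. This sketch rests on two assumptions: that 1.2 and 0.4 are the standard deviations of processes A and B (in pounds and ounces), and that each process mean is close to its nominal weight, since the actual means on the slide were not recovered. The function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100


# Assumed inputs: standard deviations 1.2 lb and 0.4 oz, means taken as the nominal weights.
cv_a = coefficient_of_variation(1.2, 80)  # process A, fertilizer bags
cv_b = coefficient_of_variation(0.4, 24)  # process B, cornflakes boxes

print(f"CV for process A: {cv_a:.2f}%")
print(f"CV for process B: {cv_b:.2f}%")
```

Under these assumptions the two coefficients come out at roughly 1.5% and 1.7%, which is consistent with the statement that the processes have very similar relative variability.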

