Presentation on theme: "Skewness and choice of data analysis"— Presentation transcript:
1 Skewness and choice of data analysis
S1 Representing data
2 Skewness
The first distribution shown has a positive skew. This means that it has a long tail in the positive direction. The distribution below it has a negative skew, since it has a long tail in the negative direction. Finally, the third distribution is symmetric and has no skew. Distributions with positive skew are sometimes called "skewed to the right", whereas distributions with negative skew are called "skewed to the left".
3 Skewness – visuals and calculations
Calculate Q1, Q2, Q3, the mode, the mean and the standard deviation.
Draw all 3 boxplots on one piece of graph paper.
Data set , 3, 5, 5, 5, 7, 10
Data set , 7, 7, 8, 12, 14, 20
Data set , 6, 7, 9, 10, 10, 11
For each data set, find a relationship between the mode, median and mean using the symbols =, >, <.
For each data set, find a relationship between Q2 - Q1 and Q3 - Q2.
Work out 3(mean - median) / standard deviation.
5 Skewness – Using the quartiles
Q2 - Q1 = Q3 - Q2: symmetric
Q2 - Q1 < Q3 - Q2: positive skew
Q2 - Q1 > Q3 - Q2: negative skew
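The quartile comparison above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `quartile_skew` and the three data sets are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may differ from the quartile rule used in a particular course. It assumes the usual reading that a longer upper half (Q3 - Q2 larger) indicates positive skew.

```python
import math
import statistics

def quartile_skew(data):
    """Classify skew by comparing Q2 - Q1 with Q3 - Q2."""
    # Default 'exclusive' method; other quartile conventions can give
    # slightly different values (assumption, not from the slides).
    q1, q2, q3 = statistics.quantiles(data, n=4)
    lower, upper = q2 - q1, q3 - q2
    if math.isclose(lower, upper):
        return "symmetric"
    return "positive skew" if lower < upper else "negative skew"

# Hypothetical data sets, chosen only to illustrate the three cases.
print(quartile_skew([1, 2, 3, 4, 5, 6, 7]))
print(quartile_skew([1, 2, 3, 4, 6, 9, 15]))
print(quartile_skew([1, 7, 10, 12, 13, 14, 15]))
```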
6 Skewness – Using mode, median, mean
Symmetric: Q2 - Q1 = Q3 - Q2 and mode = median = mean
Positive skew: Q2 - Q1 < Q3 - Q2 and mode < median < mean
Negative skew: Q2 - Q1 > Q3 - Q2 and mode > median > mean
7 Skewness calculations
You can calculate 3(mean - median) / standard deviation.
This gives you a value that tells you how skewed the data are.
The closer the value is to zero, the more symmetrical the data.
A negative value means the data have a negative skew; a positive value means a positive skew.
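As a rough sketch of this calculation in Python: the function name `skew_coefficient` and the sample data are hypothetical, and the population standard deviation (`statistics.pstdev`, dividing by n) is an assumption, since the slides do not say which form of the standard deviation is intended.

```python
import statistics

def skew_coefficient(data):
    """Return 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    # Assumption: population standard deviation (divide by n).
    sd = statistics.pstdev(data)
    return 3 * (mean - median) / sd

# Hypothetical data sets, chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_tail = [1, 2, 3, 4, 6, 9, 15]
left_tail = [1, 7, 10, 12, 13, 14, 15]

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(name, round(skew_coefficient(data), 3))
```

The symmetric set gives zero because its mean and median coincide; the set with a long right tail gives a positive value and the set with a long left tail gives a negative one.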
8 Comparing data sets
You should always compare data sets using:
- a measure of location (mean, median, mode)
- a measure of spread (range, IQR, standard deviation)
- skewness
Range: gives a rough idea of spread, but is affected by extreme values. Generally only used with small data groups.
IQR: not affected by extreme values. Tells you the spread of the middle 50%. Often used in conjunction with the median.
Mean and standard deviation: generally used when the data are fairly symmetrical and the data size is reasonably large.
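The effect of an extreme value on these measures can be illustrated with a short Python sketch. The helper `summarise` and both data sets are hypothetical; the quartiles use the default method of `statistics.quantiles` and the standard deviation is the population form, both of which are assumptions rather than anything stated in the slides.

```python
import statistics

def summarise(data):
    """Return measures of location and spread for one data set."""
    # Default quartile method of the statistics module (assumption).
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "sd": statistics.pstdev(data),  # population form (assumption)
    }

# Hypothetical data: the second set differs only in one extreme value.
without_outlier = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 70]

for name, data in [("without outlier", without_outlier),
                   ("with outlier", with_outlier)]:
    print(name, summarise(data))
```

In this example the single extreme value moves the range from 6 to 69 and the mean from 4 to 13, while the median stays at 4 and the IQR stays at 4, which is why the median and IQR are usually paired for skewed data.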