Summary descriptive statistics: means and standard deviations:


1 Summary descriptive statistics: means and standard deviations:
Measures of central tendency ("averages") Measures of dispersion (spread of scores)

2 1. The Mode: The most frequent score in a set of scores. 6, 11, 22, 22, 96, 98. Mode = 22

3 Advantages of the mode:
(i) Simple to calculate, easy to understand. (ii) The only average which can be used with nominal data. Disadvantages of the mode: (i) May be unrepresentative and hence misleading. e.g.: 3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96. Mode is 96, but most of the scores are low numbers. (ii) There may be more than one mode in a set of scores. e.g.: 3, 3, 3, 4, 4, 4, 6, 6, 6 has three modes!
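As a minimal sketch of the mode examples above, Python's standard `statistics.multimode` returns every value tied for the highest frequency, so it also exposes the "more than one mode" case. The `colours` list is made-up nominal data added only for illustration; the numeric lists come from the slides.

```python
from statistics import multimode

# Scores taken from the slides.
single_mode = [6, 11, 22, 22, 96, 98]
misleading = [3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96]
three_modes = [3, 3, 3, 4, 4, 4, 6, 6, 6]

# Hypothetical nominal data (not from the slides): the mode still works here.
colours = ["red", "blue", "red", "green"]

# multimode returns a list of all values that share the highest count.
modes_single = multimode(single_mode)
modes_misleading = multimode(misleading)
modes_three = multimode(three_modes)
modes_colours = multimode(colours)

print(modes_single, modes_misleading, modes_three, modes_colours)
```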

4 2. The Median: When scores are arranged in order of size, the median is either
(a) the middle score (if there is an odd number of scores). 4, 5, 6, 7, 8, 8, 96. Median = 7. or (b) the average of the middle two scores (if there is an even number of scores). 4, 5, 6, 7, 8, 8, 96, 96. Median = (7+8)/2 = 7.5.
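The two cases can be checked with a short sketch using the standard `statistics.median`, which sorts the scores and averages the middle two when the count is even. The variable names are illustrative only.

```python
from statistics import median

odd_scores = [4, 5, 6, 7, 8, 8, 96]        # 7 scores: take the middle one
even_scores = [4, 5, 6, 7, 8, 8, 96, 96]   # 8 scores: average the middle two

median_odd = median(odd_scores)
median_even = median(even_scores)

print(median_odd, median_even)
```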

5 Advantages of the median:
(i) Resistant to the distorting effects of extreme high or low scores. Disadvantages of the median: (i) Ignores scores' numerical values, which is wasteful if data are interval or ratio. e.g.: 50, 50, 60, 70, 80, 80, 100 and 0, 0, 0, 70, 800, 900, 1000 both have a median of 70. (ii) More susceptible to sampling fluctuations than the mean. (iii) Less mathematically useful than the mean.
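To illustrate disadvantage (i), the following sketch compares the two sets of scores from this slide. It assumes only the standard `statistics` module; the names `set_a` and `set_b` are illustrative.

```python
from statistics import mean, median

set_a = [50, 50, 60, 70, 80, 80, 100]
set_b = [0, 0, 0, 70, 800, 900, 1000]

# The median only looks at the middle position, so it is identical for both sets.
median_a, median_b = median(set_a), median(set_b)

# The mean uses every value, so it separates the two sets.
mean_a, mean_b = mean(set_a), mean(set_b)

print(median_a, median_b)
print(mean_a, round(mean_b, 1))
```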

6 3. The Mean:
Add all the scores together and divide by the total number of scores: $\bar{X} = \sum X / n$.

7 Advantages of the mean:
(i) Uses information from every single score. (ii) Resistant to sampling fluctuation - i.e., varies the least from sample to sample. (Important since we normally want to extrapolate from samples to populations). Disadvantages of the mean: (i) Susceptible to distortion from extreme scores. e.g.: 4, 5, 5, 6 : mean = 5. 4, 5, 5, 106: mean = 30. (ii) Can only be used with interval or ratio data, not with ordinal or nominal data.
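A small sketch of disadvantage (i), using the two sets of scores from this slide. The median is included only for comparison; the variable names are illustrative.

```python
from statistics import mean, median

typical = [4, 5, 5, 6]
with_outlier = [4, 5, 5, 106]   # one extreme score replaces the 6

mean_typical = mean(typical)
mean_outlier = mean(with_outlier)

# The median barely notices the extreme score; the mean is dragged upwards.
median_typical = median(typical)
median_outlier = median(with_outlier)

print(mean_typical, mean_outlier, median_typical, median_outlier)
```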

8 1. The Range: The difference between the highest and lowest scores. (i.e. range = highest - lowest). Advantages: Quick and easy to calculate, easy to understand. Disadvantages: Unduly influenced by extreme scores. 3, 4, 4, 5, 100. Range = (100-3) = 97. 3, 4, 4, 5, 5. Range = (5-3) = 2. Conveys no information about the spread of scores between the highest and lowest scores. e.g. 2, 2, 2, 2, 2, 20 and 2, 20, 20, 20, 20, 20 have exactly the same range (18) but very different distributions.
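The range examples can be reproduced with a short sketch. `score_range` is a hypothetical helper name, not a standard-library function.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# One extreme score inflates the range.
range_with_extreme = score_range([3, 4, 4, 5, 100])
range_without = score_range([3, 4, 4, 5, 5])

# Very different distributions can share the same range.
range_mostly_low = score_range([2, 2, 2, 2, 2, 20])
range_mostly_high = score_range([2, 20, 20, 20, 20, 20])

print(range_with_extreme, range_without, range_mostly_low, range_mostly_high)
```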

9 2. The Standard Deviation (SD):
The "average difference of scores from the mean". The bigger the SD, the more scores differ from the mean and between themselves, and the less satisfactory the mean becomes as a summary of the data. (Figure: two sets of scores, one with mean = 6 and SD = 1.69, the other with mean = 6 and SD = 5.32.) Advantages: Like the mean, uses information from every score. Disadvantages: Not intuitively easy to understand! Can only be used with interval or ratio data.

10 How to calculate the standard deviation:
For the set of scores 5, 6, 7, 9, 11: (a) Work out the mean: $\bar{X}$ = (5 + 6 + 7 + 9 + 11) / 5 = 38 / 5 = 7.6

11 (b) Subtract the mean from each score:
$s = \sqrt{\sum (X - \bar{X})^2 / n}$
5 - 7.6 = -2.6, 6 - 7.6 = -1.6, 7 - 7.6 = -0.6, 9 - 7.6 = 1.4, 11 - 7.6 = 3.4

12 (c) Square the differences just obtained:
(-2.6)^2 = 6.76, (-1.6)^2 = 2.56, (-0.6)^2 = 0.36, 1.4^2 = 1.96, 3.4^2 = 11.56

13 (d) Add up the squared differences:
6.76 + 2.56 + 0.36 + 1.96 + 11.56 = 23.20

14 (e) Divide this by the total number of scores, to get the variance: 23.20 / 5 = 4.64

15 (f) Standard deviation is the square root of the variance (we do this to get back to the original units): √4.64 = 2.15 is our sample standard deviation.
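Steps (a) to (f) can be followed in code. This is a minimal sketch that mirrors the hand calculation for the scores 5, 6, 7, 9, 11 and divides by n, as the slides do at this point; the variable names are illustrative.

```python
import math

scores = [5, 6, 7, 9, 11]
n = len(scores)

mean = sum(scores) / n                        # (a) work out the mean
deviations = [x - mean for x in scores]       # (b) subtract the mean from each score
squared = [d ** 2 for d in deviations]        # (c) square the differences
sum_of_squares = sum(squared)                 # (d) add up the squared differences
variance = sum_of_squares / n                 # (e) divide by the number of scores
sd = math.sqrt(variance)                      # (f) square root of the variance

# Rounded to two decimals to match the slide values despite floating-point noise.
print(round(mean, 2), round(sum_of_squares, 2), round(variance, 2), round(sd, 2))
```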

16 Complications in using the mean and SD:
We usually obtain the mean and SD from a sample - very rarely from the parent population. Sometimes we are content to describe our sample per se, but usually we want to extrapolate to the population from our sample. A sample mean is a good estimate of the population mean. A sample SD tends to underestimate the population SD. Hence, when using the sample SD as a description of the sample, divide by n. When using the sample SD as an estimate of the population SD, divide by n-1 (to make the SD larger than it would otherwise have been).

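17 Two versions of the SD formula:
Sample SD as a description of a sample (σn, "sigma n", on calculators): $s = \sqrt{\sum (X - \bar{X})^2 / n}$. Sample SD as an estimate of the population SD (σn-1 on calculators): $s = \sqrt{\sum (X - \bar{X})^2 / (n - 1)}$. In most cases, we use the n-1 version of the SD formula.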
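Python's standard `statistics` module offers both versions, which makes the difference easy to see: `pstdev` divides by n and `stdev` divides by n-1. The sketch below reuses the scores 5, 6, 7, 9, 11 from the worked example; the variable names are illustrative.

```python
from statistics import pstdev, stdev

scores = [5, 6, 7, 9, 11]

sd_n = pstdev(scores)          # divides by n: describes this sample
sd_n_minus_1 = stdev(scores)   # divides by n-1: estimates the population SD

# The n-1 version is always the larger of the two.
print(round(sd_n, 2), round(sd_n_minus_1, 2))
```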

18 The Standard Error of the Mean: This is the standard deviation of a set of sample means. It shows how much variation there is within a set of sample means, and hence how likely our particular sample mean is to be in error as an estimate of the true population mean. (Figure labels: means of different samples; actual population mean.)

19 Formula for the standard error:
SE = standard deviation / square root of n (where n = sample size) (NB: we usually estimate this from our available data, so use the n-1 version of the SD formula)
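As a hedged illustration of the formula, the sketch below estimates the standard error for the scores 5, 6, 7, 9, 11 used in the worked example, taking the n-1 version of the SD as the note recommends. The variable names are illustrative.

```python
import math
from statistics import stdev

scores = [5, 6, 7, 9, 11]
n = len(scores)

sd = stdev(scores)            # n-1 version of the SD
se = sd / math.sqrt(n)        # SE = SD / square root of n

print(round(se, 2))
```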

20 If the SE is small, our obtained sample mean is more likely to be similar to the true population mean than if the SE is large.   Increasing n (the sample size) reduces the size of the SE: A sample mean based on 100 scores is probably closer to the population mean than a sample mean based on 10 scores.
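The effect of sample size can be sketched with a small simulation. The population here (normally distributed, mean 100, SD 15) is a made-up assumption for illustration and does not come from the slides, as are the helper name and the number of replicates.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100, 15  # hypothetical population, not from the slides


def spread_of_sample_means(n, replicates=2000):
    """Return the SD of the means of many random samples of size n."""
    means = [
        statistics.fmean([random.gauss(POP_MEAN, POP_SD) for _ in range(n)])
        for _ in range(replicates)
    ]
    return statistics.pstdev(means)


spread_10 = spread_of_sample_means(10)
spread_100 = spread_of_sample_means(100)

# Means of samples of 100 cluster more tightly around the population mean.
print(round(spread_10, 2), round(spread_100, 2))
```

Under these assumptions the two spreads should come out close to 15 / √10 and 15 / √100 respectively, which is what the SE formula predicts.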

21 Error bars show the mean plus and minus 1 standard deviation.
This graph shows variability of scores within each group.

22 Error bars show the mean plus and minus 1 standard error of the mean.
This graph shows how much each mean is likely to vary if you did the study many times over. It indicates the reliability of each sample mean as an estimate of the true population mean.

23 Conclusions: Mean shows "typical" performance - but that is only half the story! Need to also know about the spread of scores - how representative is the mean? Standard deviation - spread of scores around a sample mean. Tells us how well the mean summarises the sample. Standard error - spread of sample means around a "true" population mean. Tells us how reliable our sample mean is likely to be as an estimate of the population mean.

