Published by Scott Glenn. Modified over 9 years ago.
1
Revision Analysing data
2
Measures of central tendency such as the mean and the median can be used to determine the location of the distribution of data values.
3
Measures of spread such as the range (maximum minus minimum), the standard deviation, and the variance tell you how spread out the data are.
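As a rough illustration of the measures named so far, the sketch below computes them with Python's standard `statistics` module. The `values` list is a made-up sample, not data from the presentation, and the sample (n - 1) forms of variance and standard deviation are an assumption about which version is wanted.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
values = [2, 4, 4, 5, 7, 8]

# Central tendency: where the distribution is located.
mean = statistics.mean(values)
median = statistics.median(values)

# Spread: how far the values are scattered.
value_range = max(values) - min(values)   # maximum minus minimum
variance = statistics.variance(values)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)        # square root of the sample variance

print(f"mean={mean}, median={median}")
print(f"range={value_range}, variance={variance:.2f}, sd={std_dev:.2f}")
```

If the data are the whole population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the matching calls.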
4
In a statistical investigation, you should discuss each variable in terms of its central tendency and spread.
5
Another important step in evaluating a set of data is to look at the overall shape of the distribution of each set of data.
6
Think about:
SHAPE: unimodal, bimodal, multimodal, uniform; symmetry, skewness; outliers, clusters, gaps
CENTER: mean, median
SPREAD: standard deviation, variance, range, interquartile range (IQR)
7
A good way to portray the shape of the distribution is with a histogram. You would look for evidence of distinct groupings (e.g. male/female or different species) and outliers.
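To make the idea concrete, here is a minimal sketch of how a histogram groups values into equal-width bins, printed as text. The data and the bin width of 1.0 are assumptions chosen so that two groupings and a gap are visible; a plotting library would normally draw this for you.

```python
from collections import Counter

# Hypothetical measurements with two distinct groupings (invented for illustration).
values = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 7.1, 7.6, 8.0, 8.2, 8.8]
bin_width = 1.0  # assumed bin width

# Count how many values fall in each bin [b * width, (b + 1) * width).
counts = Counter(int(v // bin_width) for v in values)

# Print every bin between the lowest and highest, including empty ones,
# so that gaps in the distribution remain visible.
for b in range(min(counts), max(counts) + 1):
    low = b * bin_width
    print(f"{low:4.1f}-{low + bin_width:4.1f} | {'#' * counts[b]}")
```

The empty bins between the two clusters are what you would read as a gap, and each cluster's tallest bar as a separate mode.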
8
No! No! No! This is not a histogram!
10
The following are examples of histograms and box and whisker plots of various distributions. (Note: the box and whisker plot does not relate to the histogram above it. It is just an example of what it could look like.)
11
Negatively skewed (unimodal)
12
Positively skewed
13
If the shape is skewed: Report the median and IQR. You may want to include the mean and standard deviation, but you should point out why the mean and median differ. The fact that the mean and median do not agree is a sign that the distribution may be skewed.
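The comparison described here can be sketched as follows. The sample is invented and deliberately has a long right tail; the quartiles come from `statistics.quantiles` with its default method, which is only one of several common quartile conventions, so other tools may give slightly different IQR values.

```python
import statistics

# Hypothetical positively skewed sample (invented for illustration).
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]

mean = statistics.mean(values)
median = statistics.median(values)

# Quartiles using the module's default ("exclusive") method.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

print(f"median={median}, IQR={iqr}")
print(f"mean={mean}")

# A mean well above the median hints at a long right tail (positive skew);
# a mean well below it hints at a long left tail (negative skew).
if mean > median:
    print("mean > median: distribution may be positively skewed")
elif mean < median:
    print("mean < median: distribution may be negatively skewed")
else:
    print("mean equals median: no sign of skew from this check")
```

This check is only a hint. The histogram remains the better guide to the shape.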
14
Symmetric
15
If the shape is symmetric: Report the mean and standard deviation, and possibly the median and IQR as well.
16
Uniform
17
Groupings (bimodal)
18
Outlier
19
If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with outliers removed. The differences may be revealing. (Of course, the median and IQR are not likely to be affected by the outliers.)
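A minimal sketch of reporting with and without outliers follows. The presentation does not say how to decide what counts as an outlier; the 1.5 x IQR fences used below are a common convention assumed here for illustration, and the data are invented.

```python
import statistics

# Hypothetical sample with one high outlier (invented for illustration).
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]

# Assumed rule: flag values beyond 1.5 * IQR from the quartiles.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < low_fence or v > high_fence]
trimmed = [v for v in values if low_fence <= v <= high_fence]

# Report each summary both ways so the effect of the outliers is visible.
for label, data in (("with outliers", values), ("without outliers", trimmed)):
    print(
        f"{label}: mean={statistics.mean(data):.2f}, "
        f"sd={statistics.stdev(data):.2f}, "
        f"median={statistics.median(data)}"
    )
print("flagged as outliers:", outliers)
```

In this sample the mean and standard deviation shift noticeably once the flagged value is removed, while the median stays the same.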
20
An outlier can be the most informative part of your data. (Or it might be just an error.)
21
An example of what to say…
22
The main body of the distribution is unimodal and nearly symmetric around $500,000, with slightly more than half of CEOs earning salaries higher than that. But there are some high outliers. The outliers are CEOs whose salaries are higher than what is typical for most CEOs of large corporations. Even though the vast majority of CEOs have salaries below $1,000,000 a year, there are a few with salaries between $2,500,000 and $3,000,000 a year.
23
Unimodal
24
High outliers
25
Most have salaries below $100,000