Central Tendency and Variability


1 Central Tendency and Variability
The two most essential features of a distribution

2 Questions Define mean, median, and mode. What is the effect of distribution shape on measures of central tendency? When might we prefer one measure of central tendency to another?

3 Questions (2) Define range, average deviation, variance, and standard deviation. When might we prefer one measure of variability to another? What is a z score? What is the point of Tchebycheff’s inequality?

4 Variables have distributions
A variable is something that changes or has different values (e.g., anger). A distribution is a collection of measures, usually across people. Distributions of numbers can be summarized with numbers (called statistics or parameters).

5 Central Tendency refers to the Middle of the Distribution

6 Variability is about the Spread

7 1. Central Tendency: Mode, Median, & Mean
The mode – the most frequently occurring score. Midpoint of most populous class interval. Can have bimodal and multimodal distributions.

8 Median Score that separates top 50% from bottom 50%
With an even number of scores, the median is halfway between the two middle scores (in the slide's example, the median is 4.5). With an odd number of scores, the median is the middle score (in the slide's example, the median is 4).
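The even and odd cases can be checked with a short sketch. The slide's original example scores did not survive, so the two lists below are hypothetical values chosen only to reproduce medians of 4.5 and 4.

```python
import statistics

# Hypothetical scores (the slide's own example data is not available).
even_scores = [1, 3, 4, 5, 6, 8]   # six scores: two middle scores are 4 and 5
odd_scores = [1, 3, 4, 5, 8]       # five scores: the middle score is 4

# statistics.median sorts the data and averages the two middle
# values when the number of scores is even.
even_median = statistics.median(even_scores)
odd_median = statistics.median(odd_scores)

print(even_median)  # 4.5
print(odd_median)   # 4
```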

9 Mean Sum of scores divided by the number of people. Population mean is $\mu$ (mu) and sample mean is $\bar{X}$ (X-bar). We calculate the sample mean by $\bar{X} = \frac{\sum X}{N}$. We calculate the population mean by $\mu = \frac{\sum X}{N}$.

10 Deviation from the mean
Deviation score – deviation from the mean: $x = X - \bar{X}$. Deviations sum to zero.
Raw scores   Deviation scores
 9            0
 8           -1
10            1
 7           -2
11            2
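A minimal sketch of the deviation scores, using the raw scores 9, 8, 10, 7, 11 from the slide (their mean is 9, so the first deviation is 0). The variable names are illustrative only.

```python
raw_scores = [9, 8, 10, 7, 11]

# The mean is the sum of the scores divided by the number of scores.
mean = sum(raw_scores) / len(raw_scores)

# A deviation score is the raw score minus the mean.
deviations = [x - mean for x in raw_scores]

print(mean)             # 9.0
print(deviations)       # [0.0, -1.0, 1.0, -2.0, 2.0]
print(sum(deviations))  # 0.0
```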

11 Comparison of mean, median and mode
Mode: Good for nominal variables. Good if you need to know the most frequent observation. Quick and easy.
Median: Good for “bad” distributions. Good for distributions with an arbitrary ceiling or floor.

12 Comparison of mean, median & mode
Mean: Used for inference as well as description; best estimator of the parameter. Based on all data in the distribution. Generally preferred except for “bad” distributions. Most commonly used statistic for central tendency.

13 Best Guess interpretations
Mean – the average signed error will be zero. Mode – will be exactly right with the greatest frequency. Median – smallest average absolute error.
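The three "best guess" properties can be illustrated on a small skewed data set. This is only a sketch: the scores and the helper function names are hypothetical and not from the slides.

```python
import statistics

# Hypothetical, positively skewed scores.
scores = [1, 1, 2, 3, 13]

def mean_signed_error(guess, data):
    """Average of (score - guess), keeping the sign."""
    return sum(x - guess for x in data) / len(data)

def mean_absolute_error(guess, data):
    """Average of |score - guess|."""
    return sum(abs(x - guess) for x in data) / len(data)

def exact_hits(guess, data):
    """Number of scores the guess matches exactly."""
    return sum(1 for x in data if x == guess)

guesses = {
    "mean": statistics.mean(scores),      # 4
    "median": statistics.median(scores),  # 2
    "mode": statistics.mode(scores),      # 1
}

for name, guess in guesses.items():
    print(
        name,
        guess,
        mean_signed_error(guess, scores),
        mean_absolute_error(guess, scores),
        exact_hits(guess, scores),
    )
```

In this example the mean has a signed error of zero, the median has the smallest mean absolute error, and the mode is exactly right most often.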

14 Expectation Discrete and continuous variables
Mean is expected value either way. Discrete: $E(X) = \sum x\,p(x)$. Continuous: $E(X) = \int x\,f(x)\,dx$. (The integral looks bad but just means take the average.)
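For the discrete case, the expected value is a probability-weighted sum. The sketch below uses a fair six-sided die as an assumed example (it is not from the slides) and exact fractions to avoid rounding.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# Discrete expectation: sum of each value times its probability.
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```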

15 Influence of Distribution Shape

16 Review What is central tendency? Mode Median Mean

17 2. Variability aka Dispersion
4 statistics: range, average deviation, variance, and standard deviation. Range = high score minus low score; for example, range = 20 − 12 = 8. Average deviation – mean of absolute deviations from the median: $AD = \frac{\sum |X - Mdn|}{N}$. Note the difference between this definition and the undergraduate text: deviation from the median vs. the mean.
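A short sketch of the range and the average deviation about the median. The scores are hypothetical; they were chosen only so that the high score is 20 and the low score is 12, matching the range example on the slide.

```python
import statistics

# Hypothetical scores with a low of 12 and a high of 20.
scores = [12, 14, 15, 17, 20]

# Range: high score minus low score.
score_range = max(scores) - min(scores)

# Average deviation: mean of absolute deviations from the median.
median = statistics.median(scores)
average_deviation = sum(abs(x - median) for x in scores) / len(scores)

print(score_range)        # 8
print(median)             # 15
print(average_deviation)  # 2.2
```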

18 Variance Population variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$,
where $\sigma^2$ means population variance, $\mu$ means population mean, and the other terms have their usual meaning. The variance is equal to the average squared deviation from the mean. To compute, take each score and subtract the mean. Square the result. Find the average over scores. Ta da! The variance.

19 Computing the Variance
X        μ     X − μ    (X − μ)²
 5       15    -10      100
10       15     -5       25
15       15      0        0
20       15      5       25
25       15     10      100
Total:   75             250
Mean: 15. Variance is 250 / 5 = 50.
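The computation can be sketched step by step, assuming the scores are 5, 10, 15, 20, and 25, which is consistent with the totals of 75 and 250 shown on the slide. The result is cross-checked against `statistics.pvariance`, the standard-library population variance.

```python
import statistics

# Assumed scores, consistent with the slide's totals (75 and 250).
scores = [5, 10, 15, 20, 25]

mu = sum(scores) / len(scores)               # population mean
deviations = [x - mu for x in scores]        # score minus mean
squared = [d ** 2 for d in deviations]       # squared deviations
variance = sum(squared) / len(scores)        # average squared deviation

print(sum(scores))   # 75
print(sum(squared))  # 250.0
print(variance)      # 50.0

# Cross-check with the standard library's population variance.
library_variance = statistics.pvariance(scores)
```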

20 Standard Deviation Variance is average squared deviation from the mean. To return to original, unsquared units, we just take the square root of the variance. This is the standard deviation. Population formula: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

21 Standard Deviation Sometimes called the root-mean-square deviation from the mean. This name says how to compute it from the inside out. Find the deviation (difference between the score and the mean). Find the deviations squared. Find their mean. Take the square root.
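Reading "root-mean-square deviation" from the inside out gives the following sketch. The scores 5, 10, 15, 20, 25 are an assumed example, and the result is compared with `statistics.pstdev`, the standard-library population standard deviation.

```python
import math
import statistics

# Assumed example scores.
scores = [5, 10, 15, 20, 25]
mu = sum(scores) / len(scores)

# 1. Deviation: difference between each score and the mean.
deviations = [x - mu for x in scores]
# 2. Square: square each deviation.
squared = [d ** 2 for d in deviations]
# 3. Mean: average the squared deviations (this is the variance).
mean_square = sum(squared) / len(squared)
# 4. Root: take the square root to return to the original units.
sd = math.sqrt(mean_square)

print(round(sd, 2))  # 7.07

# Cross-check with the standard library's population SD.
library_sd = statistics.pstdev(scores)
```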

22 Computing the Standard Deviation
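X        μ     X − μ    (X − μ)²
 5       15    -10      100
10       15     -5       25
15       15      0        0
20       15      5       25
25       15     10      100
Total:   75             250
Mean: 15. Variance is 250 / 5 = 50. The standard deviation is the square root of the variance: $\sqrt{50} \approx 7.07$.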

23 Example: Age Distribution

24 Review Range Average deviation Variance Standard Deviation

25 Standard or z score A z score indicates distance from the mean in standard deviation units. Formula: $z = \frac{X - \mu}{\sigma}$. Converting to standard or z scores does not change the shape of the distribution. Z scores are not normalized; that is, the conversion does not make a distribution normal.
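A minimal sketch of converting raw scores to z scores, using the population mean and standard deviation. The scores are an assumed example. The z scores end up with mean 0 and standard deviation 1, and they keep the same ordering as the raw scores because the conversion is only a shift and a rescale.

```python
import statistics

# Assumed example scores.
scores = [5, 10, 15, 20, 25]

mu = statistics.fmean(scores)      # population mean
sigma = statistics.pstdev(scores)  # population standard deviation

# z score: distance from the mean in standard deviation units.
z_scores = [(x - mu) / sigma for x in scores]

print([round(z, 2) for z in z_scores])
print(statistics.fmean(z_scores))   # mean of the z scores (0 up to rounding)
print(statistics.pstdev(z_scores))  # SD of the z scores (1 up to rounding)
```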

26 Tchebycheff’s Inequality (1)
General form: $P(|X - \mu| \ge b) \le \frac{\sigma^2}{b^2}$. Suppose we know mean height in inches is 66 and SD is 4 inches. We assume nothing about the shape of the distribution of height. What is the probability of finding people taller than 74 inches? (Note that b is a deviation from the mean; in this case 74 − 66 = 8.) Also, 74 inches is 2 SDs above the mean; therefore, z = 2. [If we assume height is normally distributed, p is much smaller. But we will get to that later.]

27 Tchebycheff (2) Z-score form
Probability of a z score from any distribution being more than k SDs from the mean is at most $1/k^2$, that is, $P(|z| \ge k) \le \frac{1}{k^2}$. Z scores from the worst distributions are rarely more than 5 or less than -5. For symmetric, unimodal distributions, |z| is rarely more than 3. For the problem in the previous slide: $P(|z| \ge 2) \le \frac{1}{2^2} = .25$.
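The height problem can be worked through in a short sketch. The function name `chebyshev_bound` is hypothetical; the normal-distribution comparison uses `statistics.NormalDist` (Python 3.8 or later) and only illustrates the remark that the probability is much smaller under a normality assumption.

```python
from statistics import NormalDist

def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|z| >= k) for any distribution with finite variance."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, 1.0 / k ** 2)

mean_height = 66.0  # inches
sd_height = 4.0     # inches
cutoff = 74.0       # inches

# 74 inches expressed in standard deviation units.
k = (cutoff - mean_height) / sd_height
bound = chebyshev_bound(k)

# For comparison only: upper-tail probability if height were normal.
normal_tail = 1.0 - NormalDist().cdf(k)

print(k)      # 2.0
print(bound)  # 0.25
print(round(normal_tail, 4))
```

The bound of .25 covers both tails and holds for any distribution shape, so it is a ceiling on the probability rather than an estimate of it.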

28 Review Z-score in words Z-score in symbols
Meaning of Tchebycheff’s theorem

29 Median House Price Data
Find data Show Univariate Show plots

