Published by August Freeney. Modified over 2 years ago.

1
MEASURES OF CENTRALITY

2
Last lecture summary: mode, distribution

3
Life expectancy data

4
Minimum: Sierra Leone, minimum = 47.8

5
Maximum: Japan, maximum = 83.4

6
Life expectancy data all countries

7
Life expectancy data: countries ranked from 1 to 197. Egypt sits at rank 99 (the middle) with 73.2; half of the values are larger, half are smaller.

8
Life expectancy data: Minimum = 47.8, Maximum = 83.4, Median = 73.2

9
Q1: countries ranked from 1 to 197. Sao Tomé & Príncipe sits at rank 50 (¼ of the way); 1st quartile = 64.7

10
Q1: 1st quartile = 64.7; ¾ of the values are larger, ¼ are smaller

11
Q3: countries ranked from 1 to 197. Netherlands Antilles sits at rank 148 (¾ of the way); 3rd quartile = 76.7

12
Q3: 3rd quartile = 76.7; ¾ of the values are smaller, ¼ are larger

13
Life expectancy data: Minimum = 47.8, Maximum = 83.4, Median = 73.2, 1st quartile = 64.7, 3rd quartile = 76.7

14
Box Plot

15
Box plot (figure labels): minimum, 1st quartile, median, 3rd quartile, maximum

16
Quartiles, median: how to do it? Data: 79, 68, 88, 69, 90, 74, 87, 93, 76. Find the min, max, median, Q1 and Q3 of these data. Then draw the box plot.

18
Another example: 78, 93, 68, 84, 90, 74
   Min. 1st Qu.  Median 3rd Qu.    Max.
  68.00   75.00   81.00   88.50   93.00
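The summary values on this slide are what linear-interpolation quartiles produce (the convention used by R's default `summary()` and by Python's `statistics.quantiles` with `method="inclusive"`). The following is a minimal sketch under that assumption; the helper name `five_number_summary` is ours and does not come from the lecture. Other textbook conventions, such as taking the median of each half, can give slightly different Q1 and Q3.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using linear-interpolation quartiles."""
    # method="inclusive" treats the data as containing its own min and max,
    # which reproduces the summary printed on slide 18.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))


exercise = [79, 68, 88, 69, 90, 74, 87, 93, 76]  # data from slide 16
another = [78, 93, 68, 84, 90, 74]               # data from slide 18

print(five_number_summary(exercise))
print(five_number_summary(another))
```

The five numbers returned are exactly the values a box plot draws: the whisker ends, the box edges and the line inside the box.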

19
Percentiles (axis: age [years]) http://www.rustovyhormon.cz/on-line-rustove-grafy

20
Skeleton data. Estimate age at death from skeletal remains: a common problem in forensic anthropology. Estimates are based on wear and deterioration of certain bones. Measurements on 400 skeletons. Two estimation methods: DiGangi et al. (aspects of the first rib) and Suchey-Brooks (most common, pubic bone). Image: http://www.bestcoloringpagesforkids.com/wp-content/uploads/2013/07/Skeleton-Coloring-Page.gif

21
400 skeletons: the estimated and the actual age at death

22
DiGangi

23
Modified boxplot
   Min.      Q1  Median      Q3    Max.
 -60.00  -23.00  -13.00   -5.00   32.00

24
Mean

25
Robust statistic: Median = -13, Mean = -14.2. The mean is not a robust statistic. The median is a robust statistic.

26
Trimmed mean: Median = -13, Mean = -14.2. 10% trimmed mean: eliminate the upper and the lower 10% of the data (i.e. 40 points from each end). 10% trimmed mean = mean of the 320 middle data values = -13.8. The trimmed mean is more robust than the mean.
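The trimming step can be sketched as follows. This is an illustration only: the function name `trimmed_mean` and the ten sample values are hypothetical (the skeleton measurements themselves are not given on the slides), and the percentage is taken as a whole number so that the count of dropped points is computed with integer arithmetic.

```python
import statistics


def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the points from EACH end."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # points removed per end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)


# Hypothetical data (not from the lecture): nine ordinary values, one extreme.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]

print(statistics.mean(sample))    # pulled far upward by the extreme value
print(statistics.median(sample))  # barely notices it
print(trimmed_mean(sample, 10))   # drops 1 point per end, then averages
```

With 400 values and `percent=10`, `k` is 40, so 320 values remain, matching the count on the slide.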

27
Salaries of 25 players of the American soccer team New York Red Bulls in 2012: 33 750, 44 000, 45 566, 65 000, 95 000, 103 500, 112 495, 138 188, 141 666, 181 500, 185 000, 190 000, 194 375, 195 000, 205 000, 292 500, 301 999, 4 600 000, 5 600 000. Median = 112 495, mean = 518 311, 8% trimmed mean = 128 109.

28
MEASURES OF VARIABILITY

29
Setting the mood

30
QUESTION (figure labels): Mean1, Mean2; Mode1, Mode2; Median1, Median2

31
Range = max - min

32
Range: The range changes when we add new data to the dataset. Always? Sometimes? Never?
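A minimal sketch of the range on made-up numbers (the values and the helper name `value_range` are hypothetical, not from the lecture). It shows that a new value inside the current min-to-max span leaves the range unchanged, while a value outside that span widens it.

```python
def value_range(data):
    """Range = largest value minus smallest value."""
    return max(data) - min(data)


# Hypothetical values, for illustration only.
data = [3, 7, 8, 12]

print(value_range(data))         # 9
print(value_range(data + [10]))  # 9: 10 lies between the current min and max
print(value_range(data + [50]))  # 47: 50 becomes the new max
```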

33
Adding Mark Zuckerberg

34
Cut off data: IQR, interquartile range

35
Interquartile range, IQR
Let's take this quiz; answer yes or no.
1. About 50% of the data fall within the IQR.
2. The IQR is affected by every value in the data set.
3. The IQR is not affected by outliers.
4. The mean is always between Q1 and Q3.
Data: 0 1 1 1 2 2 2 2 2 3 3 3 90, with Q1 = 1, Q2 (median), Q3 = 3
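The slide's data set can be used to probe statements 3 and 4 numerically. The sketch below assumes linear-interpolation quartiles (`method="inclusive"`), which reproduce Q1 = 1 and Q3 = 3 for these data; the helper `iqr` and the modified list `tamed` are our own illustration.

```python
import statistics


def iqr(data):
    """Interquartile range Q3 - Q1, using linear-interpolation quartiles."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1


data = [0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 90]  # data from the slide
tamed = data[:-1] + [4]  # hypothetical variant: the 90 replaced by 4

print(iqr(data), statistics.mean(data))
print(iqr(tamed), statistics.mean(tamed))
```

The IQR is the same for both lists, while the mean of the original data lies above Q3 = 3 because of the single value 90.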

36
Define outlier. Sample: $38,946, $43,420, $49,191, $50,430, $50,557, $52,580, $53,595, $54,135, $60,181, $10,000,000. What values are outliers for this data set? 1. $60,000 2. $80,000 3. $100,000 4. $200,000
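The slide does not state which outlier rule it has in mind. One widely used convention, assumed here, is Tukey's rule: a value is an outlier if it lies more than 1.5 × IQR below Q1 or above Q3. The sketch below applies that rule to the sample; the function name `tukey_fences` is ours, and the exact fence values depend on the quartile convention (linear interpolation is assumed).

```python
import statistics


def tukey_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


sample = [38_946, 43_420, 49_191, 50_430, 50_557,
          52_580, 53_595, 54_135, 60_181, 10_000_000]
candidates = [60_000, 80_000, 100_000, 200_000]  # the four quiz options

low, high = tukey_fences(sample)
flagged = [c for c in candidates if c < low or c > high]

print(low, high)
print(flagged)
```

Because the fences are built from quartiles, the $10,000,000 value has almost no influence on where they fall.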

37
Problem with IQR (figure panels): normal, bimodal, uniform

38
Options for measuring variability: 1. Find the average distance between all pairs of data values. 2. Find the average distance between each data value and either the max or the min. 3. Find the average distance between each data value and the mean.

39
Average distance from mean. Sample: 10, 5, 3, 2, 19, 1, 7, 11, 1, 1

41
Average distance from mean
Sample  Deviation
    10          4
     5
     3         -3
     2         -4
    19         13
     1         -5
     7          1
    11          5
     1         -5
     1
Find the average distance between each data value and the mean.
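The deviations on this slide can be reproduced with a few lines. This is a small sketch, with variable names of our choosing, that computes the mean of the sample, the signed deviation of every value from it, and the average of those deviations.

```python
import statistics

sample = [10, 5, 3, 2, 19, 1, 7, 11, 1, 1]  # data from the slide

mean = statistics.mean(sample)
deviations = [x - mean for x in sample]  # signed distance from the mean

print(mean)
print(deviations)
print(sum(deviations) / len(deviations))  # average signed deviation
```

The signed deviations sum to zero, so their average says nothing about how spread out the data are.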

42
Preventing cancellation: How can we prevent the negative and positive deviations from cancelling each other out? 1. Ignore (i.e. delete) the negative sign. 2. Multiply each deviation by two. 3. Square each deviation. 4. Take the absolute value of each deviation.
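The options can be compared directly on the sample 10, 5, 3, 2, 19, 1, 7, 11, 1, 1 from the preceding slides. The sketch below is illustrative, with names of our choosing. Dropping the sign (option 1) and taking the absolute value (option 4) are the same operation, so they share one line.

```python
sample = [10, 5, 3, 2, 19, 1, 7, 11, 1, 1]  # data from the preceding slides

mean = sum(sample) / len(sample)
deviations = [x - mean for x in sample]


def average(values):
    return sum(values) / len(values)


doubled = average([2 * d for d in deviations])    # option 2
absolute = average([abs(d) for d in deviations])  # options 1 and 4
squared = average([d ** 2 for d in deviations])   # option 3

print(doubled)   # still zero: doubling keeps the signs, so they still cancel
print(absolute)  # average absolute deviation
print(squared)   # average squared deviation
```

The average absolute deviation is commonly called the mean absolute deviation, and the average squared deviation is the (population) variance.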
