Published by Clarence Lambert. Modified over 2 years ago.

1 Standardization of variables
Maarten Buis
5-12-2005

2 Recap
Central tendency
Dispersion
SPSS

3 Standardization
Standardization is used to improve the interpretability of variables.
Some variables have a naturally interpretable metric, e.g. income, age, gender, country.
Others, primarily ordinal variables, do not, e.g. education, attitude items, intelligence.
Standardizing these variables makes them more interpretable.

4 Standardization
Transforming the variable to a comparable metric:
– known unit
– known mean
– known standard deviation
– known range
Three ways of standardizing:
– P-standardization (percentile scores)
– Z-standardization (z-scores)
– D-standardization (dichotomize a variable)

5 When you should always standardize
When averaging multiple variables, e.g. when creating a socioeconomic status variable out of income and education.
When comparing the effects of variables with unequal units, e.g. does age or education have a larger effect on income?
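The averaging case can be sketched in a few lines of Python. This is only an illustration: the data and the names `z_scores` and `ses` are made up for the example, and z-scores are just one of the ways of standardizing covered in this lecture.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return z-scores: (x - mean) / standard deviation."""
    m = mean(values)
    sd = pstdev(values)  # population SD; the sample SD would work as well
    return [(x - m) / sd for x in values]

# Hypothetical data: monthly income in guilders and years of education.
income = [1200, 1800, 2500, 3100, 5400]
education = [8, 10, 12, 16, 18]

# Averaging the raw values would let income dominate because of its larger unit.
# Averaging the z-scores gives both variables the same weight.
ses = [(zi + ze) / 2 for zi, ze in zip(z_scores(income), z_scores(education))]
print([round(s, 2) for s in ses])
```

Because both inputs have mean 0 after standardization, the combined score also has mean 0.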

6 P-standardization
Every observation is assigned a number between 0 and 100, indicating the percentage of observations below it.
Can be read from the cumulative distribution.
In case of ties (knots): assign midpoints.
The median, quartiles, quintiles, and deciles are special cases of P-scores.
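A minimal sketch of how P-scores could be computed by hand. It assumes one common convention: every observation counts itself and its ties (the slide's "knots") as half below, which is one way to read "assign midpoints". The function name `p_scores` is made up for the example.

```python
def p_scores(values):
    """Percentile score per observation: the percentage of observations
    below it, counting tied observations (including itself) as half below."""
    n = len(values)
    scores = []
    for x in values:
        below = sum(1 for v in values if v < x)
        tied = sum(1 for v in values if v == x)  # includes x itself
        scores.append(100 * (below + 0.5 * tied) / n)
    return scores

print(p_scores([10, 20, 20, 30]))  # [12.5, 50.0, 50.0, 87.5]
```

The two tied observations share the midpoint of the range they occupy in the cumulative distribution (25 to 75), so both get 50.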

8 P-standardization
Turns the variable into a ranking, i.e. it turns the variable into an ordinal variable.
It is a non-linear transformation: relative distances change.
Results in a fixed mean, range, and standard deviation: M=50, SD=28.6.
This can change slightly due to ties (knots).
A histogram of a P-standardized variable approximates a uniform distribution.
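The fixed mean and standard deviation can be checked numerically. The sketch below assumes a tie-free sample and the midpoint convention (rank - 0.5) / n * 100, neither of which is stated on the slide. Under those assumptions the SD approaches 100/√12 ≈ 28.87, the SD of a uniform distribution on 0 to 100; the slide's 28.6 is slightly lower, presumably because of ties or the particular data used in the course.

```python
from statistics import mean, pstdev

n = 1000
# P-scores of a tie-free sample of size n under the midpoint convention.
scores = [100 * (rank - 0.5) / n for rank in range(1, n + 1)]

m = mean(scores)
sd = pstdev(scores)
print(round(m, 2), round(sd, 2))  # 50.0 28.87
```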

9 Linear transformation
Say you want income in thousands of guilders instead of guilders.
You divide INCMID by ƒ 1000,-

              M          SD
Incmid        ƒ 2543,-   ƒ 1481,-
Incmid/1000   kƒ 2,543   kƒ 1,481

10 Linear transformation
Say you want to know the deviation from the mean.
Subtract the mean (ƒ 2543,-) from INCMID.

              M          SD
Incmid        ƒ 2543,-   ƒ 1481,-
Incmid-M      ƒ 0,-      ƒ 1481,-

11 Recap: multiplication and addition and the number line

12 Linear transformation
Adding a constant (X’ = X+c)
– M(X’) = M(X)+c
– SD(X’) = SD(X)
Multiplying by a constant (X’ = X*c)
– M(X’) = M(X)*c
– SD(X’) = SD(X) * |c|
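These four rules can be verified on any small data set. The incomes below are hypothetical, and `shift` and `scale` are helper names made up for this sketch. A negative multiplier is used on purpose to show why the SD rule needs |c|.

```python
from statistics import mean, pstdev

x = [1200, 1800, 2500, 3100, 5400]  # hypothetical incomes in guilders

def shift(values, c):
    """X' = X + c"""
    return [v + c for v in values]

def scale(values, c):
    """X' = X * c"""
    return [v * c for v in values]

added = shift(x, 500)
scaled = scale(x, -2)

print(mean(added) - mean(x))       # the mean moves by c (500)
print(pstdev(added) / pstdev(x))   # the SD is unchanged (ratio 1)
print(mean(scaled) / mean(x))      # the mean is multiplied by c (-2)
print(pstdev(scaled) / pstdev(x))  # the SD is multiplied by |c| (2)
```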

13 Z-standardization
Z = (X-M)/SD
Two steps:
– center the variable (the mean becomes zero)
– divide by the standard deviation (the unit becomes the standard deviation)
Results in a fixed mean and standard deviation: M=0, SD=1
Not in a fixed range!
Z-standardization is a linear transformation: relative distances remain intact.
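The two steps translate directly into code. A minimal sketch, assuming the population SD (`pstdev`) and hypothetical income data; the function name `z_standardize` is made up for the example.

```python
from statistics import mean, pstdev

def z_standardize(values):
    """Z = (X - M) / SD, carried out in the two steps named above."""
    m = mean(values)
    sd = pstdev(values)
    centered = [x - m for x in values]  # step 1: the mean becomes zero
    return [x / sd for x in centered]   # step 2: the unit becomes one SD

x = [1200, 1800, 2500, 3100, 5400]  # hypothetical incomes in guilders
z = z_standardize(x)

print(round(mean(z), 6), round(pstdev(z), 6))  # mean 0 and SD 1, up to rounding
print(round(min(z), 2), round(max(z), 2))      # the range depends on the data

# Relative distances remain intact: ratios of differences are unchanged.
print((x[1] - x[0]) / (x[2] - x[1]), (z[1] - z[0]) / (z[2] - z[1]))
```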

14 Z-standardization
Step 1: subtract the mean
c = -M(X)
M(X’) = M(X)+c
M(X’) = M(X)-M(X) = 0
SD(X’) = SD(X)

15 Z-standardization
Step 2: divide by the standard deviation
c = 1/SD(X)
M(Z) = M(X’) * c
M(Z) = 0 * 1/SD(X) = 0
SD(Z) = SD(X’) * c
SD(Z) = SD(X) * 1/SD(X) = 1 (because SD(X’) = SD(X) after step 1)

16 Normal distribution
Normal distribution = Gauss curve = Bell curve
Formula (McCall p. 120)
– Note the (x-μ)² part
– apart from that, all you have to remember is that the formula is complicated
The normal distribution occurs when a large number of small random events cause the outcome, e.g. measurement error.
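The claim about many small random events can be illustrated with a simulation. This is a sketch under made-up assumptions: each outcome is the sum of 50 independent disturbances drawn uniformly between -1 and 1, and the seed is fixed only to make the run reproducible.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def outcome(n_events=50):
    """One outcome = the sum of many small, independent random disturbances."""
    return sum(rng.uniform(-1, 1) for _ in range(n_events))

outcomes = [outcome() for _ in range(20000)]
m, sd = mean(outcomes), pstdev(outcomes)

# For a normal distribution about 68% of the observations lie within 1 SD of the mean.
share_within_1sd = sum(1 for v in outcomes if abs(v - m) <= sd) / len(outcomes)
print(round(share_within_1sd, 2))
```

Although each single disturbance is uniformly distributed, the share of sums within one SD of the mean should come out close to the 68% expected for a normal distribution.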

17 Normal distribution
Other examples: the height of individuals, intelligence, attitude
But: the variables education, income and age in Eenzaam98 are not normally distributed

18 Z-scores and the normal distribution
Z-standardization will not result in a normally distributed variable.
Standardization is NOT the same as normalization.
We will not discuss normalization (but it does exist).
But: if the original variable is normally distributed, then the z-standardized variable will have a standard normal distribution.
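That z-standardization leaves the shape of a distribution untouched can be checked with a shape measure such as skewness. The sketch below uses a made-up, right-skewed data set; the helper names and the choice of skewness as the shape measure are assumptions for illustration only.

```python
from statistics import mean, pstdev

def z_standardize(values):
    m, sd = mean(values), pstdev(values)
    return [(x - m) / sd for x in values]

def skewness(values):
    """Average cubed z-score: 0 for a symmetric distribution, > 0 if right-skewed."""
    m, sd = mean(values), pstdev(values)
    return mean([((x - m) / sd) ** 3 for x in values])

skewed = [1, 1, 2, 2, 3, 4, 20]  # hypothetical right-skewed variable
z = z_standardize(skewed)

# The z-scores are exactly as skewed as the original variable.
print(round(skewness(skewed), 2), round(skewness(z), 2))
```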

19 Standard normal distribution
Normal distribution with M=0 and SD=1.
Table A in Appendix 2 of McCall
Important numbers (to be remembered):
– 68% of the observations lie within ± 1 SD of the mean
– 90% of the observations lie within ± 1.64 SD of the mean
– 95% of the observations lie within ± 1.96 SD of the mean
– 99% of the observations lie within ± 2.58 SD of the mean
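These numbers can be reproduced without the table. A small sketch using the error function from Python's standard library; the function name `share_within` is made up for the example.

```python
from math import erf, sqrt

def share_within(z):
    """Share of a normal distribution lying within ±z SD of the mean."""
    return erf(z / sqrt(2))

for z in (1, 1.64, 1.96, 2.58):
    print(z, round(100 * share_within(z), 1))
# prints roughly 68.3, 89.9, 95.0 and 99.0 percent
```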

20 Why bother?
If you know:
– that a variable is normally distributed
– the mean and standard deviation
Then you know the percentage of observations above or below an observation.
These numbers are a good approximation, even if the variable is not exactly normally distributed.
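A worked example of this idea, using `statistics.NormalDist` from Python's standard library. The variable (a test score with M=100 and SD=15) and the observation (130) are hypothetical.

```python
from statistics import NormalDist

# Hypothetical variable: normally distributed with M=100 and SD=15.
scores = NormalDist(mu=100, sigma=15)
observation = 130

z = (observation - scores.mean) / scores.stdev  # 2 SD above the mean
share_below = scores.cdf(observation)
share_above = 1 - share_below

print(z, round(100 * share_below, 1), round(100 * share_above, 1))  # 2.0 97.7 2.3
```

Knowing only the mean and the standard deviation is enough to say that roughly 98% of the observations lie below this one.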

21 P & Z standardization
Both give a distribution with a fixed mean, standard deviation, and unit.
P-standardization also gives a fixed range.
Both are relative to the sample: if you take observations out, then you have to recompute the standardized variables.
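The dependence on the sample is easy to see with a small, made-up example: dropping one extreme observation changes the z-score of every remaining observation. The helper name `z_standardize` is an assumption of this sketch.

```python
from statistics import mean, pstdev

def z_standardize(values):
    m, sd = mean(values), pstdev(values)
    return [(x - m) / sd for x in values]

sample = [2, 4, 6, 8, 30]              # hypothetical data with one extreme value
z_full = z_standardize(sample)
z_reduced = z_standardize(sample[:-1])  # same data without the value 30

# The observation 8 is below average in the full sample,
# but the highest value once 30 is removed.
print(round(z_full[3], 2), round(z_reduced[3], 2))  # -0.2 1.34
```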

22 P & Z-standardization
When interpreting Z-standardized variables one uses percentiles.
With P-standardization one decreases the scale of measurement to ordinal, BUT this improves interpretability.

23 Student recap

24 Do before Wednesday
Read McCall chapter 5
Understand Appendix 2, table A
Do exercises 5.7-5.28
