Centre for Longitudinal Studies Incorporating information about non-response into analyses of NCDS data Ian Plewis Centre for Longitudinal Studies Bedford.


1 Centre for Longitudinal Studies
Incorporating information about non-response into analyses of NCDS data
Ian Plewis
Centre for Longitudinal Studies, Bedford Group for Lifecourse and Statistical Studies
Institute of Education, University of London
29 June 2006
www.ioe.ac.uk/bedfordgroup

2 NCDS longitudinal target sample, sweeps 0 to 6

Sweep (age)             0 (0)          1 (7)          2 (11)         3 (16)         4 (23)         5 (33)         6 (42)
Target sample           17634 (100%)   16500 (93.6%)  16253 (92.2%)  16068 (91.1%)  15885 (90.1%)  15567 (88.3%)  15451 (87.6%)
Permanent emigrants     0              322 (1.8%)     552 (3.1%)     705 (4.0%)     869 (4.9%)     1090 (6.2%)    1190 (6.7%)
Deaths                  0              812 (4.6%)     829 (4.7%)     861 (4.9%)     880 (5.0%)     977 (5.5%)     993 (5.6%)
Total                   17634 at every sweep (target sample + permanent emigrants + deaths)

3 NCDS longitudinal target and observed samples, sweeps 0 to 6

Sweep (age)             0 (0)          1 (7)          2 (11)         3 (16)         4 (23)         5 (33)         6 (42)
Observed sample         17415 (98.8%)  15051 (91.2%)  14757 (90.8%)  13917 (86.6%)  12044 (75.8%)  10986 (70.6%)  10979 (71.1%)
Non-response: refusal   0              80 (0.5%)      783 (4.8%)     1114 (6.9%)    1130 (7.1%)    1735 (11.1%)   2043 (13.2%)
Non-response: other     219 (1.2%)     1178 (7.1%)    491 (3.0%)     708 (4.4%)     1705 (10.7%)   1100 (7.1%)    308 (2.0%)
Uncertain eligibility   0              191 (1.2%)     222 (1.4%)     329 (2.0%)     1006 (6.3%)    1746 (11.2%)   2121 (13.7%)
Target sample           17634 (100%)   16500 (100%)   16253 (100%)   16068 (100%)   15885 (100%)   15567 (100%)   15451 (100%)

4 The substantive question of interest is whether, and how well, we can predict whether or not someone has any educational qualifications at age 23 (i.e. at sweep 4) from circumstances in early childhood (up to age 7, or sweep 1).

The target sample at age 23 = 15885
- attrition 1837
- wave non-response 2001
- status not known 3

5 So, observed sample at age 23 = 12044
Item non-response 1765
So, analysis sample = 10279
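The sample accounting on slides 4 and 5 can be checked with a few lines of arithmetic. This is only a sketch: the counts are copied from the slides, while the variable names are invented here for illustration.

```python
# Counts copied from slides 3 to 5 (sweep 4, age 23); names are illustrative.
target_sample = 15885
attrition = 1837
wave_non_response = 2001
status_not_known = 3
item_non_response = 1765

# Unit non-response takes the target sample down to the observed sample.
observed_sample = target_sample - attrition - wave_non_response - status_not_known

# Item non-response then takes the observed sample down to the analysis sample.
analysis_sample = observed_sample - item_non_response

print(observed_sample)   # 12044
print(analysis_sample)   # 10279

# Share of the age-23 target sample that reaches the substantive model.
retained_pct = round(100 * analysis_sample / target_sample, 1)
print(retained_pct)
```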

6 SUBSTANTIVE MODEL
Estimates from probit model for no qualifications at age 23 (n = 10279):

Explanatory variable        Estimate    s.e.
Constant                    -1.6        0.078
In care                     0.65        0.096
In social housing           0.58        0.035
Inverse birthweight         73          8.5
Mother’s age at birth       -0.017      0.0037
Mother’s age squared        0.0020      0.00047
Age*housing                 0.013       0.0054
Age squared*housing         -0.00078    0.00068
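A quick way to read the table above is to divide each estimate by its standard error. The sketch below does that for the figures on slide 6; the 1.96 cut-off is a conventional two-sided 5% threshold assumed here, not something stated on the slide.

```python
# (estimate, standard error) pairs copied from the slide 6 table.
substantive_model = {
    "Constant": (-1.6, 0.078),
    "In care": (0.65, 0.096),
    "In social housing": (0.58, 0.035),
    "Inverse birthweight": (73, 8.5),
    "Mother's age at birth": (-0.017, 0.0037),
    "Mother's age squared": (0.0020, 0.00047),
    "Age*housing": (0.013, 0.0054),
    "Age squared*housing": (-0.00078, 0.00068),
}

THRESHOLD = 1.96  # assumed conventional cut-off, not taken from the slides


def z_ratio(estimate, se):
    """Ratio of an estimate to its standard error."""
    return estimate / se


z_ratios = {name: z_ratio(est, se) for name, (est, se) in substantive_model.items()}

for name, z in z_ratios.items():
    note = "" if abs(z) >= THRESHOLD else "  (|z| below threshold)"
    print(f"{name:24s} z = {z:7.2f}{note}")
```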

7 RESPONSE MODEL (1)
Estimates from multivariate logistic model for response at age 23 (n = 12853):

                            Attrition               Wave non-response
Explanatory variable        Estimate    s.e.        Estimate    s.e.
Constant                    -0.53       0.24        -1.3        0.24
Single mother               0.38        0.17        0.18        0.16
SEN help                    -0.50       0.12        -0.20       0.12
No. children                0.0086      0.017       0.038       0.015
No. of moves, birth to 7    0.062       0.021       0.10        0.017
Reading score, age 7        -0.034      0.0044      -0.017      0.0040
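To see how coefficients like these turn into probabilities, the sketch below evaluates a three-outcome logistic model for one person. It assumes that "response" is the reference category (linear predictor fixed at zero) and that the binary covariates are coded 0/1; the covariate values and all names are hypothetical and chosen only for illustration.

```python
import math

# Coefficients copied from the slide 7 table. "response" is ASSUMED to be the
# reference category, so its linear predictor is zero.
COEFS = {
    "attrition": {"const": -0.53, "single_mother": 0.38, "sen_help": -0.50,
                  "n_children": 0.0086, "moves_0_7": 0.062, "reading_7": -0.034},
    "wave_non_response": {"const": -1.3, "single_mother": 0.18, "sen_help": -0.20,
                          "n_children": 0.038, "moves_0_7": 0.10, "reading_7": -0.017},
}


def outcome_probabilities(person):
    """Return P(response), P(attrition), P(wave non-response) for one person."""
    exp_eta = {"response": 1.0}  # exp(0) for the reference category
    for outcome, beta in COEFS.items():
        eta = beta["const"] + sum(beta[name] * value for name, value in person.items())
        exp_eta[outcome] = math.exp(eta)
    total = sum(exp_eta.values())
    return {outcome: value / total for outcome, value in exp_eta.items()}


# Hypothetical covariate values (the scale of the reading score is assumed).
person = {"single_mother": 0, "sen_help": 0, "n_children": 2,
          "moves_0_7": 1, "reading_7": 20}
probs = outcome_probabilities(person)

# Same person with more family moves: both non-response coefficients on moves
# are positive, so the response probability falls.
probs_more_moves = outcome_probabilities({**person, "moves_0_7": 5})

print(probs)
print(probs_more_moves)
```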

8 RESPONSE MODEL (1)
This model generates an estimate of the probability of a response at age 23, and we can use the inverse of this probability as a weight. The application of inverse probability weights assumes that data are ‘missing at random’ or that missingness is ignorable.
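As a minimal sketch of the weighting step: each respondent's weight is one over their estimated response probability, so respondents who resemble non-respondents count for more. The outcomes and probabilities below are made up for illustration, and the function names are not from the slides.

```python
def inverse_probability_weights(response_probs):
    """Weight for each respondent = 1 / estimated probability of responding."""
    for p in response_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"response probability out of range: {p}")
    return [1.0 / p for p in response_probs]


def weighted_proportion(outcomes, weights):
    """Weighted proportion of cases with outcome == 1."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)


# Hypothetical respondents: 1 = no qualifications at age 23, 0 = otherwise.
outcomes = [1, 0, 0, 1, 0]
# Hypothetical estimated response probabilities from a response model.
response_probs = [0.5, 0.8, 0.8, 0.5, 1.0]

weights = inverse_probability_weights(response_probs)
unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_proportion(outcomes, weights)

print(weights)
print(unweighted, weighted)
```

In this toy example the two cases with the outcome also have the lowest response probabilities, so the weighted proportion is higher than the unweighted one.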

9 SUBSTANTIVE MODEL WEIGHTED FOR NON-RESPONSE FROM (1)
N.B. n = 10279 for ‘no weights’; 9767 for ‘response weights’

                            Estimate                            s.e.
Explanatory variable        No weights        Response weights  No weights        Response weights
Constant                    -1.6              -1.5              0.078             0.082
In care                     0.65              0.75              0.096             0.11
In social housing           0.58              0.57              0.035             0.036
Inverse birthweight         73                74                8.5               9.0
Mother’s age at birth       -0.017            -0.016            0.0037            0.0040
Mother’s age squared        0.0020            0.0017            0.00047           0.00050
Age*housing                 0.013             0.011             0.0054            0.0057
Age squared*housing         -0.00078          -0.00028          0.00068           0.00071

10 RESPONSE MODEL (2)
Estimates from multivariate logistic model for response at age 23 (n = 8072):

                            Attrition               Wave non-response
Explanatory variable        Estimate    s.e.        Estimate    s.e.
Constant                    -1.7        0.20        -2.1        0.18
Sex                         -0.33       0.098       -0.20       0.076
Social adjustment, age 11   0.021       0.0052      0.016       0.0044
No. of moves, birth to 16   0.083       0.025       0.13        0.019
Reading score, age 16       -0.046      0.0067      -0.012      0.0057

From Hawkes and Plewis, JRSS(A), 2006, 3, 479-492.

11 SUBSTANTIVE MODEL WEIGHTED FOR NON-RESPONSE FROM (2)
N.B. n = 10279 for ‘no weights’; 5996 for ‘response weights’

                            Estimate                            s.e.
Explanatory variable        No weights        Response weights  No weights        Response weights
Constant                    -1.6              -1.7              0.078             0.11
In care                     0.65              0.73              0.096             0.14
In social housing           0.58              0.51              0.035             0.046
Inverse birthweight         73                80                8.5               12
Mother’s age at birth       -0.017            -0.011            0.0037            0.0053
Mother’s age squared        0.0020            0.00083           0.00047           0.00066
Age*housing                 0.013             0.0026            0.0054            0.0075
Age squared*housing         -0.00078          -0.00017          0.00068           0.00094

12 HECKMAN SELECTION MODEL
Jointly modelling:
(i) the probability of no qualifications at age 23 (probit), and
(ii) the probability of being included in the sample at age 23 (probit).
Need ‘instruments’ for the selection model: use ‘sex’ and ‘number of family moves, birth to 7’.

13 HECKMAN SELECTION MODEL
Model allows for correlated residuals, i.e. for non-ignorable or informative non-response.
Obtain ML estimates from ‘heckprob’ in STATA.
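The simulation below is a sketch of why correlated residuals matter; it is not an implementation of ‘heckprob’ or of the model fitted in the talk. It draws two latent probit equations (outcome and selection) whose residuals have correlation rho, then compares the outcome rate in the full population with the rate among selected cases. The intercepts, sample size, and seed are arbitrary assumptions; the value rho = -0.58 is used only because it is the residual correlation reported on slide 14.

```python
import math
import random


def simulate(rho, n=50_000, seed=1):
    """Return (outcome rate in everyone, outcome rate among selected cases)."""
    rng = random.Random(seed)
    n_outcome_all = n_selected = n_outcome_selected = 0
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)                      # selection residual
        e = rng.gauss(0.0, 1.0)
        u = rho * v + math.sqrt(1.0 - rho ** 2) * e  # outcome residual, corr(u, v) = rho
        has_outcome = (-1.0 + u) > 0   # hypothetical outcome equation
        is_selected = (0.5 + v) > 0    # hypothetical selection equation
        n_outcome_all += has_outcome
        if is_selected:
            n_selected += 1
            n_outcome_selected += has_outcome
    return n_outcome_all / n, n_outcome_selected / n_selected


# Correlated residuals: selection is informative about the outcome.
full_rate, selected_rate = simulate(rho=-0.58)
# Uncorrelated residuals: selection is ignorable for this outcome.
full_rate_0, selected_rate_0 = simulate(rho=0.0)

print(full_rate, selected_rate)
print(full_rate_0, selected_rate_0)
```

With a negative correlation, selected cases tend to have lower outcome residuals, so the rate among selected cases understates the population rate; with rho = 0 the two rates agree up to sampling noise.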

14 SUBSTANTIVE MODEL ALLOWING FOR SELECTION
N.B. n = 10279 for ‘no weights’; 10150 for ‘selection’

                            Estimate                            s.e.
Explanatory variable        No weights        Selection         No weights        Selection
Constant                    -1.6              -1.3              0.078             0.15
In care                     0.65              0.80              0.096             0.10
In social housing           0.58              0.53              0.035             0.047
Inverse birthweight         73                72                8.5               8.8
Mother’s age at birth       -0.017            [blank]           0.0037            0.0036
Mother’s age squared        0.0020            [blank]           0.00047           0.00045
Age*housing                 0.013             -0.012            0.0054            0.0051
Age squared*housing         -0.00078          -0.00077          0.00068           0.00064

Residual correlation = -0.58

(Only one estimate appears in the transcript for each of the two mother’s-age rows; it is shown under ‘No weights’, where it matches the earlier slides.)
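One way to compare the three adjustments side by side is to express each adjusted estimate's shift from the unweighted estimate in units of the unweighted standard error. The sketch below does this for two rows only, using figures copied from slides 9, 11 and 14; the choice of yardstick is an assumption made here, not a method taken from the talk.

```python
# Estimates copied from slides 9, 11 and 14; "se" is the 'no weights' s.e.
ROWS = {
    "In care": {"unweighted": 0.65, "se": 0.096,
                "weights (1)": 0.75, "weights (2)": 0.73, "selection": 0.80},
    "In social housing": {"unweighted": 0.58, "se": 0.035,
                          "weights (1)": 0.57, "weights (2)": 0.51, "selection": 0.53},
}
METHODS = ("weights (1)", "weights (2)", "selection")


def shift_in_se_units(row, method):
    """(adjusted estimate - unweighted estimate) / unweighted standard error."""
    return (row[method] - row["unweighted"]) / row["se"]


shifts = {
    name: {method: shift_in_se_units(row, method) for method in METHODS}
    for name, row in ROWS.items()
}

for name, by_method in shifts.items():
    for method, value in by_method.items():
        print(f"{name:18s} {method:12s} {value:+.2f} s.e.")
```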

15 CONCLUSIONS
Applying these corrections for non-response has little effect on the substantive conclusions for this particular model.
Methodological issues:
Inverse probability weighting:
(1) Standard errors of the estimates should be adjusted to allow for the fact that the weights are themselves estimated.
(2) Might a better adjustment take account of the differences between the attrition cases and the wave non-respondents?

16 CONCLUSIONS
(3) Missing weights – assume they are one (rather than zero)?
Selection models:
(1) Vulnerable to mis-specification.
(2) Depend on the validity of the instruments.
Other approaches:
(1) Imputation, especially multiple imputation.
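The question in point (3) about missing weights can be made concrete with a small sketch. A case whose weight cannot be computed (presumably because a response-model covariate is missing, which would also explain the smaller n on slides 9 and 11) either stays in the analysis with weight one or drops out with weight zero. The data and function names below are hypothetical.

```python
def fill_missing_weights(weights, fill_value):
    """Replace missing weights (None) with fill_value."""
    return [fill_value if w is None else w for w in weights]


def weighted_proportion(outcomes, weights):
    """Weighted proportion of cases with outcome == 1."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)


def cases_in_analysis(weights):
    """Number of cases that actually contribute (non-zero weight)."""
    return sum(1 for w in weights if w > 0)


# Hypothetical data: two cases have no estimated response weight.
outcomes = [1, 0, 1, 0]
weights = [2.0, None, None, 1.25]

as_one = fill_missing_weights(weights, 1.0)    # keep the cases, unweighted
as_zero = fill_missing_weights(weights, 0.0)   # drop the cases

print(cases_in_analysis(as_one), weighted_proportion(outcomes, as_one))
print(cases_in_analysis(as_zero), weighted_proportion(outcomes, as_zero))
```

Setting missing weights to one keeps the full analysis sample at the cost of leaving those cases unadjusted; setting them to zero adjusts every retained case but shrinks the sample.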

