What you don’t see can’t matter? The effects of unobserved differences on causal attributions
Robert Coe, CEM, Durham University
Randomised Controlled Trials in the Social Sciences, York, September 2009
2 The problem …
Evaluations of education policy initiatives often use regression methods to control for initial differences.
They often seem to make clear and confident causal claims about programme effects.
They seldom seem to consider other possible explanations (e.g. unobserved differences).
If regression models incorporate lots of explanatory variables and have high R², does it matter if there might have been initial unobserved differences?
3 Are you a researcher or a policy maker?
Would you rather have an answer to your question that is
– Simple, clear and confident, but probably wrong
– Complex, nuanced, and carefully justified, but completely useless for making a decision
4 Model simulated
Variables: Measured variable (M); Unmeasured variable (U); Outcome variable (Y); Treatment group (X)
Path labels in the diagram: r, p, q, s, d
Observed relationship: $Y = \beta_0 M + d\,(X - \bar{X}) + e$
True relationship: $Y = \beta_0 M + \beta_1 U + e$
5 The simulation
1. Generate random variables Y, M, U with desired inter-correlations (p, q, r)
2. Assign group membership (X) to be correlated with U
3. Apply regression model (Y on M and X) to estimate spurious ‘effect’ of X on Y if U were omitted
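The three steps can be sketched in Python roughly as follows. This is a minimal illustration, not the author's code. The function name, the parameter names (`r_ym`, `r_yu`, `r_mu`, `s`) and the assignment rule are assumptions: the transcript does not say which of p, q, r belongs to which pair of variables, nor how group membership was made to correlate with U. Here X is set to 1 when a latent score that correlates `s` with U is above zero, so about half the sample is "treated".

```python
import numpy as np


def simulate_spurious_effect(r_ym, r_yu, r_mu, s, n=100_000, seed=0):
    """Estimate the spurious 'effect' of X on Y when U is omitted.

    r_ym, r_yu, r_mu: assumed correlations between Y and M, Y and U, M and U.
    s: assumed correlation between U and the latent score used to assign X.
    X has no true effect on Y, so any non-zero result is an artefact.
    """
    rng = np.random.default_rng(seed)

    # Step 1: draw standardised Y, M, U with the requested inter-correlations.
    corr = np.array([
        [1.0, r_ym, r_yu],
        [r_ym, 1.0, r_mu],
        [r_yu, r_mu, 1.0],
    ])
    y, m, u = rng.multivariate_normal(np.zeros(3), corr, size=n).T

    # Step 2 (assumed rule): group membership depends on U plus independent noise.
    latent = s * u + np.sqrt(1.0 - s**2) * rng.standard_normal(n)
    x = (latent > 0).astype(float)

    # Step 3: regress Y on M and centred X, leaving U out of the model.
    design = np.column_stack([np.ones(n), m, x - x.mean()])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]  # coefficient on X, in standard-deviation units of Y


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the slides.
    for s in (0.0, 0.3, 0.6):
        d = simulate_spurious_effect(r_ym=0.7, r_yu=0.5, r_mu=0.3, s=s)
        print(f"s = {s:.1f}: spurious effect d = {d:.3f}")
```

Because Y is standardised, the returned coefficient is on the same scale as a standardised mean difference. Varying `r_yu`, `r_mu` and `s` over plausible values gives a range of spurious effects for a given study.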
6 Example evaluations
Effects of study support on achievement
– MacBeath et al. (2001) ‘The Impact of Study Support: A report of a longitudinal study into the impact of participation in out-of-school-hours learning on the academic attainment, attitudes and school attendance of secondary school students’. Published by the Department for Education and Skills.
Effects of gifted and talented provision on achievement
– Kendall et al. (2005) ‘Excellence in Cities: The National Evaluation of a Policy to Raise Standards in Urban Schools 2000-2003’. Published by the Department for Education and Skills.
Effects of Assisted Places on achievement
– Power et al. (2006) ‘The Educational and Career Trajectories of Assisted Place Holders’. Published by the Sutton Trust.
7 Summary of the examples
Study | MacBeath et al. (2001) | Kendall et al. (2005) | Power et al. (2006)
Intervention / programme | Study support (Y11 Easter School) | Gifted & Talented provision | Assisted Places Scheme
Outcome(s) | GCSE English; GCSE maths | KS3 average level; GCSE capped 8 score | A-level points
Covariates | KS3 SATs average, gender, school type | Prior attainment, FSM status, gender, ethnicity | Parents’ SES and education
R² in the model | 63%; 70% | 66%; 80% | 30%
Estimate of the effect, from regression model | 0.18; 0.11 | 0.30; 0.22 | 0.51
9 Summary of simulations
Study | MacBeath et al. (2001) | Kendall et al. (2005) | Power et al. (2006)
Possible relevant unmeasured variable(s) | Socioeconomic status; Motivation; Self-discipline | Attainment used to identify G&T status | Performance on entrance test
Range of possible spurious effects | 0.0 – 0.19 | 0.0 – 0.45; -0.06 – 0.40 | 0.0 – 1.0
Range of likely spurious effects | 0.05 – 0.13 | 0.16 – 0.32; 0.10 – 0.21 | 0.3 – 0.5
Best guess at spurious effect | 0.08 | 0.26; 0.17 | 0.37
10 Interpretations
Interpretation given by the researchers
– MacBeath et al. (2001): ‘Study support can improve attainment in Maths and English by half a grade’
– Kendall et al. (2005): ‘Pupils designated as gifted and talented had higher levels of attainment at the end of Key Stages 3 and 4 than those of otherwise similar pupils not designated.’
– Power et al. (2006): ‘AP holders did better attending a private school than if they had gone to a state school’
Justified conclusion, taking account of bias due to omitted factors
– MacBeath et al. (2001): Possible small residual effect (0.10) on English but pretty much no genuine effect on maths
– Kendall et al. (2005): Any genuine effect for both outcomes is very close to zero
– Power et al. (2006): Possible positive effect (0.14), but much caution and uncertainty surrounds this
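The residual effects quoted above (0.10 and 0.14) are consistent with subtracting the best-guess spurious effect from the reported regression estimate. The short sketch below makes that arithmetic explicit using the figures from the two summary slides. Two things are assumptions rather than statements from the slides: that the adjustment is a simple subtraction, and that a single best guess (0.08) applies to both MacBeath outcomes while the two Kendall outcomes take 0.26 and 0.17 respectively.

```python
# (study, outcome): (reported regression estimate, best-guess spurious effect)
# The pairing of spurious effects with outcomes is an assumed reading of the slides.
figures = {
    ("MacBeath et al. (2001)", "GCSE English"): (0.18, 0.08),
    ("MacBeath et al. (2001)", "GCSE maths"): (0.11, 0.08),
    ("Kendall et al. (2005)", "KS3 average level"): (0.30, 0.26),
    ("Kendall et al. (2005)", "GCSE capped 8 score"): (0.22, 0.17),
    ("Power et al. (2006)", "A-level points"): (0.51, 0.37),
}

# Residual effect = reported estimate minus best-guess spurious effect.
residual = {
    key: round(estimate - spurious, 2)
    for key, (estimate, spurious) in figures.items()
}

for (study, outcome), value in residual.items():
    print(f"{study}, {outcome}: residual effect = {value:.2f}")
```

On this reading, only the English outcome for study support and the A-level outcome for Assisted Places keep a residual of 0.10 or more, which matches the wording of the justified conclusions.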
11 Conclusions
Even a high R² with the variables you have does not necessarily mean you can ignore the ones you do not have.
The size of spurious effects is quite sensitive to assumptions about parameter values, which are themselves contentious, and that is all the more reason to consider unobserved variables.
The examples considered are otherwise of relatively high quality.
12 Recommendations
Need to replicate, simplify application, extend to other methods
Be more cautious about making – and believing – causal claims based on statistical control
Before interpreting as causal:
– List possible alternative factors
– Evaluate the case for their impact
Use stronger designs