What could go wrong? Deon Filmer Development Research Group, The World Bank Evidence-Based Decision-Making in Education Workshop Africa Program for Education Impact Evaluation (APEIE) Accra, Ghana May
What could go wrong? A lot! – Threats to internal and external validity Randomization is undermined Hawthorne effect Spillover effects Attrition Non-compliance Pilot phase “startup” problems … Let’s just focus on a few areas
Just a reminder Internal validity – Extent to which the evaluation is indeed estimating the parameter of interest E.g. Impact of a scholarship program on attendance External validity – Extent to which the evaluation provides relevant information about the likely effectiveness of a program in a different setting, or if implemented at a different scale
Internal validity – Extent to which the evaluation is indeed estimating the parameter of interest Are you estimating what you think you are? Some issues – Attrition – Spillover effects – Partial compliance and sample selection bias
Internal validity: Attrition Attrition is a generic problem in data analysis Some people drop out of the analysis or the sample Is this a problem in impact evaluation? Yes, if the attrition is related to the intervention, or its likely impact.
Attrition bias: An Example Study: Impact of a scholarship program on learning achievement – Baseline = students in grade 5 – Intervention = offer of a scholarship for grade 6 – Follow-up = test scores of students at end of grade 6 Impact evaluation question: – Did the scholarship program improve test scores? Can you see a problem with this setup? How might you go about addressing any potential bias?
Addressing attrition bias Check that attrition is not different between treatment and control groups Also check that it is not correlated with observables. Try to bound the extent of the bias – For example: Recalculate impact assuming everyone who dropped out from the treatment got the lowest score that anyone got Recalculate impact assuming everyone who dropped out of control got the highest score that anyone got
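The bounding exercise above can be sketched in a few lines. This is a minimal illustration, not the presenter's own code: the function name, the input format (plain lists of test scores plus counts of dropouts) and the sample numbers are all assumptions. The lower bound follows the two recalculations on the slide; the upper bound is the mirror-image assumption, added here only for symmetry.

```python
from statistics import mean


def attrition_bounds(treat_scores, control_scores, n_treat_dropped, n_control_dropped):
    """Return (naive, lower, upper) impact estimates under extreme attrition assumptions.

    All names and the input layout are hypothetical, chosen for illustration.
    """
    observed = treat_scores + control_scores
    lowest, highest = min(observed), max(observed)

    # Naive estimate: ignore the students who dropped out.
    naive = mean(treat_scores) - mean(control_scores)

    # Pessimistic case (as on the slide): treatment dropouts get the lowest
    # observed score, control dropouts get the highest observed score.
    lower = mean(treat_scores + [lowest] * n_treat_dropped) - mean(
        control_scores + [highest] * n_control_dropped
    )

    # Optimistic case (mirror image, an assumption not stated on the slide).
    upper = mean(treat_scores + [highest] * n_treat_dropped) - mean(
        control_scores + [lowest] * n_control_dropped
    )
    return naive, lower, upper


# Made-up scores: three students observed per group, one dropout per group.
naive, lower, upper = attrition_bounds([60, 70, 80], [50, 60, 70], 1, 1)
print(naive, lower, upper)
```

If the lower bound is still above zero, attrition alone is unlikely to explain the estimated impact; if the bounds straddle zero, the result is sensitive to who dropped out.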
Internal validity: Spillover effects/externalities Spillover effects or Externalities occur when people other than the target population are affected by the intervention How is this a problem?
Spillover effects/externalities: Example A teacher training program affects all teachers in a school, not just those who were randomly selected to go through the training
Addressing spillover effects/externalities Main way to address problem is to Randomize in such a way as to encompass externalities – E.g. Randomize at the school level, so that all teachers in a selected school are offered the training
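One way to picture randomizing so as to encompass the externality is to draw the lottery over schools rather than over individual teachers. The sketch below assumes a hypothetical list of (teacher, school) pairs and an even split of schools; neither detail comes from the presentation.

```python
import random


def assign_by_school(teachers, seed=0):
    """Assign treatment at the school level.

    teachers: list of (teacher_id, school_id) pairs (hypothetical layout).
    Returns a dict mapping teacher_id to "treatment" or "control".
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    schools = sorted({school for _, school in teachers})
    rng.shuffle(schools)
    # Half of the schools (rounded down) are selected for training.
    treated_schools = set(schools[: len(schools) // 2])
    # Every teacher inherits the status of his or her school, so spillovers
    # between colleagues stay inside a single arm of the study.
    return {
        teacher: "treatment" if school in treated_schools else "control"
        for teacher, school in teachers
    }


roster = [("t1", "A"), ("t2", "A"), ("t3", "B"), ("t4", "B"), ("t5", "C"), ("t6", "D")]
print(assign_by_school(roster))
```

Because the unit of randomization is now the school, the analysis should also treat the school as the unit (for example, by clustering standard errors), which typically requires more schools than an individual-level design would require teachers.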
Internal validity: Partial compliance and sample selection bias Occurs when the treatment and control groups are not comparable, either – Because the randomization “didn’t work” – Or because the study population’s behavior “undermined” the randomization
Randomization “didn’t work” The validity of randomization as an approach to ensure equal characteristics depends on sufficiently large samples In any single application the samples could be different – That is, average characteristics might differ between treatment and control groups How might you address this in the analysis?
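A common first step toward answering the question above is to compare baseline characteristics across the two groups. The sketch below computes a standardized difference in means for one baseline variable; the function name and the sample data are assumptions for illustration, and what counts as a "large" difference is a judgment call.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treat_values, control_values):
    """Difference in baseline means, scaled by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treat_values) + variance(control_values)) / 2)
    return (mean(treat_values) - mean(control_values)) / pooled_sd


# Made-up baseline test scores for each group.
baseline_treat = [10, 12, 14]
baseline_control = [9, 11, 13]
print(standardized_difference(baseline_treat, baseline_control))
```

If a characteristic turns out to be unbalanced, one option is to include it as a control variable when estimating the impact, so that the comparison is made between otherwise similar units.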
Behavior “undermined” the randomization For example, – Example 1: Students who were offered a scholarship drop out after 1 year (and could therefore be considered “control” children) – Example 2: Students from non-recipient schools move to schools that were randomly chosen to receive a grant How might you address this in the analysis?
Addressing partial compliance and sample selection bias Use the “Intent To Treat” (ITT) approach – Frame the evaluation in terms of the original design For example – Example 1: Study the impact of offering a scholarship on outcomes – Example 2: Study the impact of the grants program in terms of the school they were originally enrolled in
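In code, the Intent To Treat comparison groups units only by what they were originally assigned, whatever they did afterwards. The record layout below (an `assigned` flag, a `received` flag and an `outcome`) is a hypothetical one chosen for illustration.

```python
from statistics import mean


def intent_to_treat(records):
    """Difference in mean outcomes by ORIGINAL assignment.

    records: list of dicts with keys "assigned" (bool) and "outcome" (number).
    Any information about actual take-up is deliberately ignored.
    """
    offered = [r["outcome"] for r in records if r["assigned"]]
    not_offered = [r["outcome"] for r in records if not r["assigned"]]
    return mean(offered) - mean(not_offered)


# Made-up data: two offered students never took up the scholarship,
# but they still count as part of the treatment group.
students = [
    {"assigned": True, "received": True, "outcome": 80},
    {"assigned": True, "received": True, "outcome": 80},
    {"assigned": True, "received": False, "outcome": 60},
    {"assigned": True, "received": False, "outcome": 60},
    {"assigned": False, "received": False, "outcome": 60},
    {"assigned": False, "received": False, "outcome": 60},
    {"assigned": False, "received": False, "outcome": 60},
    {"assigned": False, "received": False, "outcome": 60},
]
print(intent_to_treat(students))
```

Keeping the original assignment preserves the comparability that randomization created; regrouping students by what they actually did would reintroduce selection bias.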
Addressing partial compliance and sample selection bias The “Intent To Treat” approach is powerful, but does raise issues of its own – It often captures what is in the control of those implementing the intervention E.g. offering a scholarship – But it may not reflect what is likely to happen if the program goes to scale E.g. if all schools were in the grants program then students wouldn’t switch One can use the results to estimate the “Average Treatment on the Treated” but this requires further modeling
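The slides do not say which further modeling is meant. One widely used option, shown here purely as an illustration, scales the Intent To Treat estimate by the difference in take-up between the two groups (often called the Wald or instrumental-variables estimator). It rests on assumptions that go beyond randomization: the offer affects outcomes only through actual take-up, and the offer never makes anyone less likely to take up the program. The record layout and function name are hypothetical.

```python
from statistics import mean


def treatment_on_treated(records):
    """Scale the ITT estimate by the gap in take-up rates (Wald estimator).

    records: list of dicts with keys "assigned" (bool), "received" (bool)
    and "outcome" (number). This layout is an assumption for illustration.
    """
    offered = [r for r in records if r["assigned"]]
    not_offered = [r for r in records if not r["assigned"]]

    # Intent To Treat: compare outcomes by original assignment.
    itt = mean(r["outcome"] for r in offered) - mean(r["outcome"] for r in not_offered)

    # Share of each group that actually received the treatment.
    takeup_offered = sum(r["received"] for r in offered) / len(offered)
    takeup_not_offered = sum(r["received"] for r in not_offered) / len(not_offered)
    takeup_gap = takeup_offered - takeup_not_offered
    if takeup_gap == 0:
        raise ValueError("assignment did not change take-up; estimate is undefined")

    return itt / takeup_gap


# Made-up data: half of the offered students took up the scholarship.
students = (
    [{"assigned": True, "received": True, "outcome": 80}] * 2
    + [{"assigned": True, "received": False, "outcome": 60}] * 2
    + [{"assigned": False, "received": False, "outcome": 60}] * 4
)
print(treatment_on_treated(students))
```

With full compliance the take-up gap is 1 and the estimate equals the ITT; the lower the take-up gap, the more the ITT is scaled up and the noisier the result becomes.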
External validity Extent to which the evaluation provides relevant information about the likely effectiveness of a program in a different setting, or if implemented at a different scale Are the findings meaningful for policy?
External validity: Behavioral responses The behaviors of the study population may be affected by the study itself (as opposed to the intervention) – Treatment group behavior changes: Hawthorne effect People in the treatment group are being closely observed and studied, so alter their behavior (e.g. teachers being observed). – Control group behavior changes: John Henry effect People in the control group view themselves as being in competition with the treatment group and so change their behavior (e.g. students denied a scholarship).
External validity: Generalizability Is the program, as evaluated, truly possible to replicate at scale? – Did the intervention require a lot of careful attention in order to make it work? Was the evaluation carried out on a truly representative population? – Was it restricted to a province that was “ready” for the intervention … and therefore not like other places it might be carried out in?
Conclusion Many “threats” to both internal and external validity Some may come from pressure – From study population – From higher levels of government – From donors The “technocratic” approach becomes something of an “art” in needing to balance these various goals