Evaluating Impacts: An Overview of Quantitative Methods
Evaluating the Impact of Projects and Programs Beijing, China, April 10-14, 2006 Shahid Khandker, WBI
A Case: China's Village Development Program
There are 150,000 poor villages out of 700,000 villages; 50,000 villages were treated. Did the program matter? New villages are to be treated; how do we design an impact evaluation?
Impact evaluation focuses on the latter stages of the logframe
The rationale of a program is to alter an outcome or impact from what it would have been without the program (long- and short-term impacts). The logframe chain runs from inputs (allocation) through outputs to outcomes (the objective) and impact (the purpose). How do you measure the impact (effects) of a program? This is one of the key methodological issues for impact evaluation.
Evaluation Using a Simple Post-Implementation Observation
[Figure: revenue level of beneficiaries over time, with a single post-program observation of 30.] With a single observation it is impossible to reach a conclusion regarding the impact: it is possible to say whether the objective has been reached, but the result cannot be attributed to the program.
Evaluation without a Comparison Group, Using a Before/After Comparison
[Figure: beneficiaries' income level over time, measured by a baseline study and a broad descriptive survey; the time series (10, 14, 17, 21) rises to 30 after the program, but the effect or impact remains uncertain.] Findings on the impact lack precision and soundness; a time series makes it possible to reach better conclusions.
The Counterfactual
The evaluator's key question: what would have happened to the beneficiaries if the program had not existed? How do you rewrite history? How do you get baseline data?
Impact Assessment Methods
Solutions must be found for two problems:
To what extent is it possible to identify the effect (revenues increase, the prevalence of a disease goes down, etc.)?
To what extent can this effect be attributed to the program (and not to some other cause)?
To find the best possible answers to these two questions, methods specific to impact evaluation are used.
Alternative impact methods
Experimental design (randomization)
Non-experimental design:
- Cross-sectional data: matching methods (propensity score method); instrumental variable (IV) method
- Panel data: difference-in-difference; matched double difference
Randomization
Select units into treatment and comparison groups in a random way, then take the difference between the outcomes of the treated and the non-treated.
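As a minimal sketch on simulated data (the sample size, income distribution, and true effect of +3 are illustrative assumptions), the randomized estimator is just a difference in mean outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated village-level data: 1,000 candidate villages.
n = 1000
baseline_income = rng.normal(10, 2, n)

# Randomize: assign exactly half of the villages to treatment.
treated = rng.permutation(n) < n // 2

# Simulate outcomes under an assumed true program effect of +3.
outcome = baseline_income + 3 * treated + rng.normal(0, 1, n)

# Impact estimate: mean outcome of treated minus mean of non-treated.
impact = outcome[treated].mean() - outcome[~treated].mean()
print(round(impact, 2))
```

Because assignment is random, the comparison group's mean outcome is a valid counterfactual and no further adjustment is needed.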
The Solution Beyond Doubt: Experiment
The Ideal Experiment
With an equivalent control group. [Figure: income level over time; beneficiaries rise from 10 to 30 while the equivalent control group rises from 10 to 17, so effect or impact = 30 − 17 = 13.] In theory, a single observation is not enough, and extreme care must be taken when selecting the control group to ensure comparability.
Problems with Experimentation
In practice, it is extremely difficult to assemble an exactly comparable control group:
- Ethical problems (condemning a group to not being beneficiaries)
- Difficulties in finding an equivalent group outside the project
- Costs
Therefore, this solution is hardly ever used.
1. Matching
Matched comparators identify the counterfactual. Match participants to non-participants from a larger survey; the matches are chosen on the basis of similarities in observed characteristics. This assumes no selection bias based on unobservable heterogeneity, and the validity of matching methods depends heavily on data quality.
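A minimal nearest-neighbor matching sketch on simulated data (the single covariate, sample sizes, and true effect of +4 are illustrative assumptions; real applications match on many characteristics):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: participants tend to have a lower observed characteristic x.
n_t, n_c = 200, 800
x_t = rng.normal(8, 2, n_t)    # participants
x_c = rng.normal(12, 3, n_c)   # non-participants from a larger survey
y_t = 2 + 1.5 * x_t + 4 + rng.normal(0, 1, n_t)  # outcome with true effect +4
y_c = 2 + 1.5 * x_c + rng.normal(0, 1, n_c)

# For each participant, find the non-participant closest in x.
idx = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)

# Average gain over matched pairs estimates the effect on participants,
# valid only if selection is driven by the observed x alone.
att = (y_t - y_c[idx]).mean()
print(round(att, 2))
```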
2. Propensity-Score Matching (PSM)
Propensity-score matching matches on the basis of the probability of participation. Ideally we would match on the entire vector X of observed characteristics; however, this is practically impossible, since X could be huge. Rosenbaum and Rubin: match on the basis of the propensity score P(X) = Pr(D = 1 | X). This assumes that participation is independent of outcomes given X; if there is no bias given X, then there is no bias given P(X).
Steps in Propensity-Score Matching
(i) Obtain representative, highly comparable surveys of the non-participants and the participants;
(ii) Pool the two samples and estimate a logit/probit model of program participation; the predicted values are the "propensity scores";
(iii) Restrict the samples to assure common support;
(iv) Note that failure of common support is an important source of bias in observational studies (Heckman et al.);
[Figure: density of propensity scores for participants]
[Figure: density of propensity scores for non-participants]
(v) For each participant, find a sample of non-participants that have similar propensity scores;
(vi) Compare the outcome indicators; the difference is the estimate of the gain due to the program for that observation;
(vii) Calculate the mean of these individual gains to obtain the average overall gain (various weighting schemes are possible).
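Steps (ii)-(vii) can be sketched end-to-end on simulated data, fitting the participation logit by Newton-Raphson with NumPy (the two covariates, their coefficients, and the true effect of +3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated pooled survey: covariates x, participation d, outcome y,
# with selection on observables only.
n = 2000
x = rng.normal(0, 1, (n, 2))
p_true = 1 / (1 + np.exp(-(0.5 * x[:, 0] - 0.8 * x[:, 1])))
d = rng.random(n) < p_true
y = 1 + x[:, 0] + 0.5 * x[:, 1] + 3 * d + rng.normal(0, 1, n)  # true effect +3

# (ii) Estimate a logit of participation by Newton-Raphson; fitted values
# are the propensity scores P(X) = Pr(D = 1 | X).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (d - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)
score = 1 / (1 + np.exp(-X @ beta))

# (iii) Common support: keep participants whose score lies within the
# range of non-participant scores.
lo, hi = score[~d].min(), score[~d].max()
keep = d & (score >= lo) & (score <= hi)

# (v)-(vii) Nearest-neighbor match on the score, then mean of the gains.
idx = np.abs(score[keep][:, None] - score[~d][None, :]).argmin(axis=1)
att = (y[keep] - y[~d][idx]).mean()
print(round(att, 2))
```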
The Mean Impact Estimator
With matched comparators, the estimated mean impact on the treated is the average of the individual gains:
ATT = (1/N_T) Σ_{i ∈ T} [ Y_i − Σ_{j ∈ C} W_{ij} Y_j ]
where T is the set of matched participants, C the set of matched non-participants, and W_{ij} the weight given to non-participant j when constructing participant i's counterfactual.
5. Instrumental Variables
Identify exogenous variation in participation using a third variable.
Outcome regression: Y_i = α + β D_i + ε_i, where D = 0, 1 indicates our program and is not random.
An "instrument" Z influences participation but does not affect outcomes given participation (the "exclusion restriction"). This identifies the exogenous variation in outcomes due to the program.
Treatment regression: D_i = γ + δ Z_i + u_i
Reduced-form outcome regression: Y_i = (α + βγ) + βδ Z_i + (ε_i + β u_i)
Instrumental-variables (two-stage least squares) estimator of impact: β_IV = cov(Y, Z) / cov(D, Z), i.e. the reduced-form effect of Z on Y divided by its first-stage effect on D.
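A sketch of the IV estimator on simulated data, where an unobservable u drives both participation and outcomes so the naive treated-vs-untreated comparison is biased, while the instrument (an assumed exogenous eligibility draw) recovers the true effect (all coefficients are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated setup: participation d depends on the instrument z and on an
# unobservable u that also raises outcomes, so naive comparisons are biased.
n = 20000
z = rng.random(n) < 0.5                        # instrument: eligibility draw
u = rng.normal(0, 1, n)                        # unobserved heterogeneity
d = (1.0 * z + 0.5 * u + rng.normal(0, 1, n)) > 0
y = 2 + 3 * d + 2 * u + rng.normal(0, 1, n)    # true program effect +3

# IV (Wald) estimator: beta_IV = cov(Y, Z) / cov(D, Z).
beta_iv = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

# Naive comparison of treated and untreated means is biased upward by u.
beta_naive = y[d].mean() - y[~d].mean()
print(round(beta_iv, 2), round(beta_naive, 2))
```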
3. Difference-in-Difference
Observed changes over time for non-participants provide the counterfactual for participants:
Collect baseline data on non-participants and (probable) participants before the program;
Compare with data collected after the program;
Subtract the two differences, or use a regression with a dummy variable for participation.
This allows for selection bias, but the bias must be time-invariant and additive.
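The double difference can be sketched on simulated panel data (the common trend of +4, the baseline gap between groups, and the true effect of +3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated panel: income before/after for participants and non-participants.
# Participants start poorer (time-invariant selection); both groups share a
# common trend of +4; the program adds +3.
n = 500
trend, effect = 4.0, 3.0
y0_t = rng.normal(10, 1, n)                       # participants, baseline
y0_c = rng.normal(14, 1, n)                       # non-participants, baseline
y1_t = y0_t + trend + effect + rng.normal(0, 1, n)
y1_c = y0_c + trend + rng.normal(0, 1, n)

# Double difference: (after - before) for participants minus the same change
# for non-participants; the additive, time-invariant gap cancels out.
dd = (y1_t.mean() - y0_t.mean()) - (y1_c.mean() - y0_c.mean())
print(round(dd, 2))
```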
Selection bias
DD requires that the bias is additive and time-invariant.
The method fails if the comparison group is on a different trajectory.
Outcome indicator Y; impact ("gain") for participant i: G_i = Y_i^T − Y_i^{C*}, where Y^{C*} is the (unobserved) counterfactual outcome and Y^C is the observed outcome of the comparison group used to approximate it.
Difference-in-Difference
(i) If the change over time for the comparison group reveals the counterfactual, and (ii) if the baseline is uncontaminated by the program, then the double difference (Y_after^T − Y_before^T) − (Y_after^C − Y_before^C) estimates the mean impact on participants.
4. Matched Double Difference
Matching helps control for bias in diff-in-diff: score-match participants and non-participants based on observed characteristics in the baseline, then do a double difference. This deals with observable heterogeneity in initial conditions that can influence subsequent changes over time.
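A sketch contrasting plain and matched double difference on simulated data where growth depends on a baseline characteristic x (all numbers are illustrative assumptions): plain DD is biased because the groups differ in x, while matching on baseline x before differencing removes the bias:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated panel where the change over time depends on a baseline
# characteristic x, and participants differ from non-participants in x.
n_t, n_c = 300, 1200
x_t = rng.normal(2, 1, n_t)                      # participants' baseline x
x_c = rng.normal(4, 1.5, n_c)                    # non-participants' baseline x
effect = 3.0
dy_t = x_t + effect + rng.normal(0, 1, n_t)      # change over time, treated
dy_c = x_c + rng.normal(0, 1, n_c)               # change over time, comparison

# Plain double difference: biased, since mean growth differs across groups.
dd_plain = dy_t.mean() - dy_c.mean()

# Matched double difference: pair each participant with the non-participant
# closest in baseline x, then difference the changes.
idx = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)
dd_matched = (dy_t - dy_c[idx]).mean()
print(round(dd_plain, 2), round(dd_matched, 2))
```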
Difference-in-difference technique: example
[Figure: income level over time for beneficiaries and a control group. Beneficiaries reach 30, while the projection for beneficiaries had there been no policy reaches only 17; effect or impact = 30 − 17 = 13.] Data on both the before and after situations are needed.
China Village Development Fund
Ex-post impact evaluation:
- Propensity-score method based on a comparison of treated villages and matched non-treated villages
- Instrumental-variable method based on any exogenous selection criteria of villages for treatment
Ex-ante impact evaluation:
- Randomly select treated and non-treated villages
- Use the PSM technique to match treated and non-treated villages
- Use the IV method through an exclusion restriction on village eligibility
- Collect a baseline for treated and non-treated villages, then do a follow-up after treatment
Four Broad Categories of Impact Evaluation Methods (by quality of evidence)
- Evaluation without a control group, post-implementation observation only: −
- Evaluation without a control group, observation before and after: +
- Non-equivalent control group: ++
- Equivalent control group (true experimentation): +++