Evaluating Impacts: An Overview of Quantitative Methods
Evaluating the Impact of Projects and Programs Beijing, China, April 10-14, 2006 Shahid Khandker, WBI
A Case: China's Village Development Program
There are 150,000 poor villages out of 700,000 villages; 50,000 villages were treated. Did the program matter? New villages are to be treated; how do we design an impact evaluation?
Impact evaluation focuses on the latter stages of the logframe
The rationale of a program is to alter an outcome or impact from what it would have been without the program (long- and short-term impacts). The logframe chain runs from inputs (allocation) through outputs to outcomes (the objective) and impact (the purpose). How do you measure the impact (effects) of a program? This is one of the key methodological issues for impact evaluation.
Evaluation Using a Simple Post-Implementation Observation
[Figure: revenue level of beneficiaries over time, with a single post-program observation of 30.] With a single observation it is impossible to reach a conclusion regarding the impact: it is possible to say whether the objective has been reached, but the result cannot be attributed to the program.
Evaluation without a Comparison Group, Using a Before/After Comparison
[Figure: beneficiaries' income level over time, measured by a baseline study and a broad descriptive survey; the time series (10, 14, 17, 21) rises to 30 after the program, but the effect or impact remains uncertain.] Findings on the impact lack precision and soundness; a time series makes it possible to reach better conclusions.
The Counterfactual
The evaluator's key question: what would have happened to the beneficiaries if the program had not existed? How do you rewrite history? How do you get baseline data?
Impact Assessment Methods
Solutions must be found for two problems:
To what extent is it possible to identify the effect (revenues increase, the prevalence of a disease goes down, etc.)?
To what extent can this effect be attributed to the program (and not to some other cause)?
To find the best possible answers to these two questions, methods specific to impact evaluation are used.
Alternative impact methods
Experimental design (randomization)
Non-experimental design:
- Cross-sectional data: matching methods (propensity score method); instrumental variable (IV) method
- Panel data: difference-in-difference; matched double difference
Randomization
Select units into treatment and comparison groups in a random way, then take the difference between the outcomes of the treated and the non-treated.
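As a minimal sketch on simulated data (the sample size, income distribution, and true effect of +3 are illustrative assumptions), the randomized estimator is just a difference in mean outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated village-level data: 1,000 candidate villages.
n = 1000
baseline_income = rng.normal(10, 2, n)

# Randomize: assign exactly half of the villages to treatment.
treated = rng.permutation(n) < n // 2

# Simulate outcomes under an assumed true program effect of +3.
outcome = baseline_income + 3 * treated + rng.normal(0, 1, n)

# Impact estimate: mean outcome of treated minus mean of non-treated.
impact = outcome[treated].mean() - outcome[~treated].mean()
print(round(impact, 2))
```

Because assignment is random, the comparison group's mean outcome is a valid counterfactual and no further adjustment is needed.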
The Solution Beyond Doubt: Experiment
The Ideal Experiment
With an equivalent control group. [Figure: income level over time; beneficiaries rise from 10 to 30 while the equivalent control group rises from 10 to 17, so effect or impact = 30 − 17 = 13.] In theory, a single observation is not enough, and extreme care must be taken when selecting the control group to ensure comparability.
Problems with Experimentation
In practice, it is extremely difficult to assemble an exactly comparable control group:
- Ethical problems (condemning a group to not being beneficiaries)
- Difficulties in finding an equivalent group outside the project
- Costs
Therefore, this solution is hardly ever used.
1. Matching
Matched comparators identify the counterfactual. Match participants to non-participants from a larger survey; the matches are chosen on the basis of similarities in observed characteristics. This assumes no selection bias based on unobservable heterogeneity, and the validity of matching methods depends heavily on data quality.
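A minimal nearest-neighbor matching sketch on simulated data (the single covariate, sample sizes, and true effect of +4 are illustrative assumptions; real applications match on many characteristics):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: participants tend to have a lower observed characteristic x.
n_t, n_c = 200, 800
x_t = rng.normal(8, 2, n_t)    # participants
x_c = rng.normal(12, 3, n_c)   # non-participants from a larger survey
y_t = 2 + 1.5 * x_t + 4 + rng.normal(0, 1, n_t)  # outcome with true effect +4
y_c = 2 + 1.5 * x_c + rng.normal(0, 1, n_c)

# For each participant, find the non-participant closest in x.
idx = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)

# Average gain over matched pairs estimates the effect on participants,
# valid only if selection is driven by the observed x alone.
att = (y_t - y_c[idx]).mean()
print(round(att, 2))
```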
2. Propensity-Score Matching (PSM)
Propensity-score matching matches on the basis of the probability of participation. Ideally we would match on the entire vector X of observed characteristics; however, this is practically impossible, since X could be huge. Rosenbaum and Rubin: match on the basis of the propensity score P(X) = Pr(D = 1 | X). This assumes that participation is independent of outcomes given X; if there is no bias given X, then there is no bias given P(X).
Steps in Propensity-Score Matching
(i) Obtain representative, highly comparable surveys of the non-participants and the participants;
(ii) Pool the two samples and estimate a logit/probit model of program participation; the predicted values are the "propensity scores";
(iii) Restrict the samples to assure common support;
(iv) Note that failure of common support is an important source of bias in observational studies (Heckman et al.);
[Figure: density of propensity scores for participants]
[Figure: density of propensity scores for non-participants]
(v) For each participant, find a sample of non-participants that have similar propensity scores;
(vi) Compare the outcome indicators; the difference is the estimate of the gain due to the program for that observation;
(vii) Calculate the mean of these individual gains to obtain the average overall gain (various weighting schemes are possible).
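Steps (ii)-(vii) can be sketched end-to-end on simulated data, fitting the participation logit by Newton-Raphson with NumPy (the two covariates, their coefficients, and the true effect of +3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated pooled survey: covariates x, participation d, outcome y,
# with selection on observables only.
n = 2000
x = rng.normal(0, 1, (n, 2))
p_true = 1 / (1 + np.exp(-(0.5 * x[:, 0] - 0.8 * x[:, 1])))
d = rng.random(n) < p_true
y = 1 + x[:, 0] + 0.5 * x[:, 1] + 3 * d + rng.normal(0, 1, n)  # true effect +3

# (ii) Estimate a logit of participation by Newton-Raphson; fitted values
# are the propensity scores P(X) = Pr(D = 1 | X).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (d - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)
score = 1 / (1 + np.exp(-X @ beta))

# (iii) Common support: keep participants whose score lies within the
# range of non-participant scores.
lo, hi = score[~d].min(), score[~d].max()
keep = d & (score >= lo) & (score <= hi)

# (v)-(vii) Nearest-neighbor match on the score, then mean of the gains.
idx = np.abs(score[keep][:, None] - score[~d][None, :]).argmin(axis=1)
att = (y[keep] - y[~d][idx]).mean()
print(round(att, 2))
```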
The Mean Impact Estimator
With matched comparators, the estimated mean impact on the treated is the average of the individual gains:
ATT = (1/N_T) Σ_{i ∈ T} [ Y_i − Σ_{j ∈ C} W_{ij} Y_j ]
where T is the set of matched participants, C the set of matched non-participants, and W_{ij} the weight given to non-participant j when constructing participant i's counterfactual.
5. Instrumental Variables
Identify exogenous variation in participation using a third variable.
Outcome regression: Y_i = α + β D_i + ε_i, where D = 0, 1 indicates our program and is not random.
An "instrument" Z influences participation but does not affect outcomes given participation (the "exclusion restriction"). This identifies the exogenous variation in outcomes due to the program.
Treatment regression: D_i = γ + δ Z_i + u_i
Reduced-form outcome regression: Y_i = (α + βγ) + βδ Z_i + (ε_i + β u_i)
Instrumental-variables (two-stage least squares) estimator of impact: β_IV = cov(Y, Z) / cov(D, Z), i.e. the reduced-form effect of Z on Y divided by its first-stage effect on D.
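A sketch of the IV estimator on simulated data, where an unobservable u drives both participation and outcomes so the naive treated-vs-untreated comparison is biased, while the instrument (an assumed exogenous eligibility draw) recovers the true effect (all coefficients are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated setup: participation d depends on the instrument z and on an
# unobservable u that also raises outcomes, so naive comparisons are biased.
n = 20000
z = rng.random(n) < 0.5                        # instrument: eligibility draw
u = rng.normal(0, 1, n)                        # unobserved heterogeneity
d = (1.0 * z + 0.5 * u + rng.normal(0, 1, n)) > 0
y = 2 + 3 * d + 2 * u + rng.normal(0, 1, n)    # true program effect +3

# IV (Wald) estimator: beta_IV = cov(Y, Z) / cov(D, Z).
beta_iv = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

# Naive comparison of treated and untreated means is biased upward by u.
beta_naive = y[d].mean() - y[~d].mean()
print(round(beta_iv, 2), round(beta_naive, 2))
```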
3. Difference-in-Difference
Observed changes over time for non-participants provide the counterfactual for participants:
Collect baseline data on non-participants and (probable) participants before the program;
Compare with data collected after the program;
Subtract the two differences, or use a regression with a dummy variable for participation.
This allows for selection bias, but the bias must be time-invariant and additive.
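The double difference can be sketched on simulated panel data (the common trend of +4, the baseline gap between groups, and the true effect of +3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated panel: income before/after for participants and non-participants.
# Participants start poorer (time-invariant selection); both groups share a
# common trend of +4; the program adds +3.
n = 500
trend, effect = 4.0, 3.0
y0_t = rng.normal(10, 1, n)                       # participants, baseline
y0_c = rng.normal(14, 1, n)                       # non-participants, baseline
y1_t = y0_t + trend + effect + rng.normal(0, 1, n)
y1_c = y0_c + trend + rng.normal(0, 1, n)

# Double difference: (after - before) for participants minus the same change
# for non-participants; the additive, time-invariant gap cancels out.
dd = (y1_t.mean() - y0_t.mean()) - (y1_c.mean() - y0_c.mean())
print(round(dd, 2))
```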
Selection bias
DD requires that the bias is additive and time-invariant.
The method fails if the comparison group is on a different trajectory.
Outcome indicator Y; impact ("gain") for participant i: G_i = Y_i^T − Y_i^{C*}, where Y^{C*} is the (unobserved) counterfactual outcome and Y^C is the observed outcome of the comparison group used to approximate it.
Difference-in-Difference
(i) If the change over time for the comparison group reveals the counterfactual, and (ii) if the baseline is uncontaminated by the program, then the double difference (Y_after^T − Y_before^T) − (Y_after^C − Y_before^C) estimates the mean impact on participants.
4. Matched Double Difference
Matching helps control for bias in diff-in-diff: score-match participants and non-participants based on observed characteristics in the baseline, then do a double difference. This deals with observable heterogeneity in initial conditions that can influence subsequent changes over time.
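A sketch contrasting plain and matched double difference on simulated data where growth depends on a baseline characteristic x (all numbers are illustrative assumptions): plain DD is biased because the groups differ in x, while matching on baseline x before differencing removes the bias:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated panel where the change over time depends on a baseline
# characteristic x, and participants differ from non-participants in x.
n_t, n_c = 300, 1200
x_t = rng.normal(2, 1, n_t)                      # participants' baseline x
x_c = rng.normal(4, 1.5, n_c)                    # non-participants' baseline x
effect = 3.0
dy_t = x_t + effect + rng.normal(0, 1, n_t)      # change over time, treated
dy_c = x_c + rng.normal(0, 1, n_c)               # change over time, comparison

# Plain double difference: biased, since mean growth differs across groups.
dd_plain = dy_t.mean() - dy_c.mean()

# Matched double difference: pair each participant with the non-participant
# closest in baseline x, then difference the changes.
idx = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)
dd_matched = (dy_t - dy_c[idx]).mean()
print(round(dd_plain, 2), round(dd_matched, 2))
```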
Difference-in-difference technique: example
[Figure: income level over time for beneficiaries and a control group. Beneficiaries reach 30, while the projection for beneficiaries had there been no policy reaches only 17; effect or impact = 30 − 17 = 13.] Data on both the before and after situations are needed.
China Village Development Fund
Ex-post impact evaluation:
- Propensity-score method based on a comparison of treated villages and matched non-treated villages
- Instrumental-variable method based on any exogenous selection criteria of villages for treatment
Ex-ante impact evaluation:
- Randomly select treated and non-treated villages
- Use the PSM technique to match treated and non-treated villages
- Use the IV method through an exclusion restriction on village eligibility
- Collect a baseline for treated and non-treated villages, then do a follow-up after treatment
Four Broad Categories of Impact Evaluation Methods (by quality of evidence)
- Evaluation without a control group, post-implementation observation only: −
- Evaluation without a control group, observation before and after: +
- Non-equivalent control group: ++
- Equivalent control group (true experimentation): +++