Evaluating Impacts: An Overview of Quantitative Methods

Evaluating Impacts: An Overview of Quantitative Methods
Evaluating the Impact of Projects and Programs
Beijing, China, April 10-14, 2006
Shahid Khandker, WBI

A Case: China's Village Development Program
150,000 poor villages out of 700,000 villages; 50,000 villages were treated. Did the program matter?
New villages are to be treated. How should an impact evaluation be designed?

Impact Evaluation Focuses on the Latter Stages of the Logframe
The rationale of a program is to alter an outcome or impact from what it would have been without the program, with both long- and short-term impacts.
[Logframe diagram: Allocation → Inputs → Outputs → Outcomes (objective/purpose) → Impact]
How do you measure the impact (effects) of a program? This is one of the key methodological issues for impact evaluation.

Evaluation Using a Simple Post-Implementation Observation
[Figure: revenue level of program beneficiaries over time, with a single post-program observation of 30. Effect or impact?]
It is impossible to reach a conclusion regarding the impact. It is possible to say whether the objective has been reached, but the result cannot be attributed to the program.

Evaluation without a Comparison Group, Using a Before/After Comparison
[Figure: income level of beneficiaries over time, from a baseline study value of 10 to a post-program value of 30, with a broad descriptive survey providing intermediate time-series points of 14, 17, and 21. Effect or impact?]
Findings on the impact lack precision and soundness; a time series makes it possible to reach better conclusions.

The Counterfactual
The evaluator's key question: what would have happened to the beneficiaries if the program had not existed?
How do you rewrite history? How do you get baseline data?

Impact Assessment Methods
Solutions must be found for two problems:
1. To what extent is it possible to identify the effect (revenues increase, the prevalence of a disease goes down, etc.)?
2. To what extent can this effect be attributed to the program (and not to some other cause)?
To find the best possible answers to these two questions, methods specific to impact evaluation are used.

Alternative Impact Methods
Experimental design (randomization)
Non-experimental design:
- Cross-sectional data: matching methods (propensity score matching); instrumental variable (IV) method
- Panel data: difference-in-differences; matched double difference

Randomization
Select units into treatment and comparison groups at random, then take the difference between the mean outcomes of the treated and the non-treated.
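
As a minimal sketch of this estimator (not from the original deck), the following Python snippet simulates a randomized assignment and takes the difference in mean outcomes; the sample size and the true effect of 5 are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Randomly assign half the units to treatment (d = 1) and half to comparison.
d = rng.permutation(np.repeat([0, 1], n // 2))

# Simulated outcomes: a common baseline plus a hypothetical +5 treatment effect.
y = 20 + 5 * d + rng.normal(0, 3, n)

# Under randomization, the difference in mean outcomes estimates the impact.
impact = y[d == 1].mean() - y[d == 0].mean()
print(f"Estimated impact: {impact:.2f}")  # close to the true effect of 5
```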

The Solution Beyond Doubt: Experiment

The Ideal Experiment, with an Equivalent Control Group
[Figure: income level over time; both groups start at 10, the beneficiaries reach 30 while the equivalent control group reaches 17, so the effect or impact = 13.]
In theory, a single observation is not enough. Extreme care must be taken when selecting the control group to ensure comparability.

Problem with Experimentation (Equivalent Control Group)
In practice, it is extremely difficult to assemble an exactly comparable control group:
- Ethical problem (condemning a group to not being beneficiaries)
- Difficulties associated with finding an equivalent group outside the project
- Costs
Therefore, this solution is hardly ever used.

1. Matching
Matched comparators identify the counterfactual:
- Match participants to non-participants from a larger survey;
- The matches are chosen on the basis of similarities in observed characteristics;
- This assumes no selection bias based on unobservable heterogeneity;
- The validity of matching methods depends heavily on data quality.

2. Propensity-Score Matching (PSM)
Propensity-score matching matches on the basis of the probability of participation. Ideally we would match on the entire vector X of observed characteristics; however, this is practically impossible, since X could be huge. Rosenbaum and Rubin: match on the basis of the propensity score
$$P(X) = \Pr(D = 1 \mid X).$$
This assumes that participation is independent of outcomes given X: if there is no bias given X, then there is no bias given P(X).

Steps in Propensity-Score Matching
(i) Obtain representative, highly comparable surveys of the non-participants and participants;
(ii) Pool the two samples and estimate a logit/probit model of program participation; the predicted values are the "propensity scores";
(iii) Restrict the samples to assure common support;
(iv) Note that failure of common support is an important source of bias in observational studies (Heckman et al.);

[Figures: density of propensity scores for participants and for non-participants, illustrating the region of common support.]

(v) For each participant, find a sample of non-participants with similar propensity scores;
(vi) Compare the outcome indicators; the difference is the estimate of the gain due to the program for that observation;
(vii) Calculate the mean of these individual gains to obtain the average overall gain (various weighting schemes can be used).
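
A minimal Python sketch of steps (ii) through (vii), assuming a pooled dataset with a covariate matrix X, a participation dummy d, and an outcome y (all names are illustrative; the logit model and one-to-one nearest-neighbour matching are common but not the only choices):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def psm_impact(X, d, y):
    """Nearest-neighbour propensity-score matching estimate of the mean impact."""
    # (ii) Estimate a participation model; predicted values are propensity scores.
    pscore = LogisticRegression(max_iter=1000).fit(X, d).predict_proba(X)[:, 1]

    # (iii) Restrict both samples to the region of common support.
    lo = max(pscore[d == 1].min(), pscore[d == 0].min())
    hi = min(pscore[d == 1].max(), pscore[d == 0].max())
    keep = (pscore >= lo) & (pscore <= hi)

    p_t, y_t = pscore[keep & (d == 1)], y[keep & (d == 1)]
    p_c, y_c = pscore[keep & (d == 0)], y[keep & (d == 0)]

    # (v)-(vi) For each participant, take the non-participant with the closest
    # score and record the outcome difference (the individual "gain").
    gains = [y1 - y_c[np.argmin(np.abs(p_c - p1))] for p1, y1 in zip(p_t, y_t)]

    # (vii) Average the individual gains.
    return np.mean(gains)
```

With one matched comparator per participant, this corresponds to the mean impact estimator below with weights equal to 1 for the nearest neighbour and 0 otherwise.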

The Mean Impact Estimator
$$\hat{G} = \frac{1}{N_T}\sum_{i=1}^{N_T}\Big[Y_i^T - \sum_{j=1}^{N_C} W_{ij}\,Y_j^C\Big],$$
where each of the $N_T$ participants is compared with a weighted ($W_{ij}$) average over the $N_C$ matched non-participants.

5. Instrumental Variables (IV)
Identify exogenous variation using a third variable.
Outcome regression: $Y_i = \alpha + \beta D_i + \varepsilon_i$, where $D_i \in \{0,1\}$ indicates our program and is not random.
An "instrument" $Z$ influences participation but does not affect outcomes given participation (the "exclusion restriction"); this identifies the exogenous variation in outcomes due to the program.
Treatment regression: $D_i = \gamma + \delta Z_i + u_i$.

Reduced-form outcome regression: substituting the treatment regression into the outcome regression gives
$$Y_i = \pi_0 + \pi_1 Z_i + v_i,$$
where $\pi_0 = \alpha + \beta\gamma$, $\pi_1 = \beta\delta$, and $v_i = \varepsilon_i + \beta u_i$.
Instrumental variables (two-stage least squares) estimator of impact:
$$\hat{\beta}_{IV} = \hat{\pi}_1 / \hat{\delta}.$$
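
A minimal sketch of this logic on simulated data, computing the IV estimate by hand as the ratio of the reduced-form and first-stage effects (the instrument, the effect size, and the selection process are illustrative assumptions, not from the deck):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

z = rng.binomial(1, 0.5, n)              # instrument, e.g. a random eligibility rule
u = rng.normal(0, 1, n)                  # unobserved factor driving selection
d = (0.8 * z + u > 0.5).astype(float)    # participation depends on Z and on u
y = 2.0 + 3.0 * d + 1.5 * u + rng.normal(0, 1, n)  # true impact beta = 3

# The naive difference in means is biased because u affects both D and Y.
print("Naive:", y[d == 1].mean() - y[d == 0].mean())

# IV estimate: beta_IV = cov(Y, Z) / cov(D, Z), i.e. the reduced-form effect
# of Z on Y divided by the first-stage effect of Z on D.
beta_iv = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]
print("IV estimate:", beta_iv)  # close to 3
```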

3. Difference-in-Difference (DD)
Observed changes over time for non-participants provide the counterfactual for participants:
- Collect baseline data on non-participants and (probable) participants before the program;
- Compare with data collected after the program;
- Subtract the two differences, or run a regression of the change in outcome on a dummy variable for participation (both variants are sketched below);
- This allows for selection bias, but the bias must be time-invariant and additive.
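
A minimal sketch of both variants on simulated before/after data; a time-invariant, additive selection bias is built in to show that DD removes it, and all numbers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000

d = rng.binomial(1, 0.5, n)                      # (probable) participants
bias = 4.0 * d                                   # time-invariant selection bias
y0 = 10 + bias + rng.normal(0, 2, n)             # baseline outcome
y1 = 14 + bias + 6.0 * d + rng.normal(0, 2, n)   # follow-up; true impact = 6

# Variant 1: subtract the two before/after differences.
dd = (y1[d == 1] - y0[d == 1]).mean() - (y1[d == 0] - y0[d == 0]).mean()
print("Double difference:", dd)  # close to 6; the level bias cancels out

# Variant 2: regress the change in outcome on a participant dummy.
dy = y1 - y0
X = np.column_stack([np.ones(n), d])
coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
print("Regression estimate:", coef[1])  # same logic, same answer
```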

Selection Bias
[Figure: outcome paths of participants and non-participants over time; the pre-program gap between the two groups is the selection bias.]

DD requires that the bias is additive and time-invariant

Method fails if the comparison group is on a different trajectory

Notation for the outcome indicator:
$G_i = Y_i^T - Y_i^C$ = impact ("gain"), where $Y_i^C$ = the (unobserved) counterfactual outcome and $Y_i^{\tilde{C}}$ = the outcome of the observed comparison group.

Difference-In-Difference
$$DD = \big(\bar{Y}_1^T - \bar{Y}_0^T\big) - \big(\bar{Y}_1^{\tilde{C}} - \bar{Y}_0^{\tilde{C}}\big)$$
This estimates the mean impact (i) if the change over time for the comparison group reveals the counterfactual, and (ii) if the baseline is uncontaminated by the program.

4. Matched Double Difference
Matching helps control for bias in diff-in-diff:
- Score-match participants and non-participants based on observed characteristics in the baseline;
- Then take the double difference (see the sketch below);
- This deals with observable heterogeneity in initial conditions that can influence subsequent changes over time.
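
A minimal sketch combining the two previous pieces: match on baseline propensity scores, then double-difference within matched pairs. The helper name and data conventions reuse the earlier illustrative sketches:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def matched_dd(X, d, y0, y1):
    """Matched double difference: match on baseline X, then diff-in-diff."""
    # Propensity scores estimated from baseline characteristics only.
    p = LogisticRegression(max_iter=1000).fit(X, d).predict_proba(X)[:, 1]
    p_t, p_c = p[d == 1], p[d == 0]
    dy_t, dy_c = (y1 - y0)[d == 1], (y1 - y0)[d == 0]

    # For each participant, difference the change over time against that of
    # the closest-scoring non-participant, then average the gains.
    gains = [dt - dy_c[np.argmin(np.abs(p_c - pt))] for pt, dt in zip(p_t, dy_t)]
    return np.mean(gains)
```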

Difference-in-Difference Technique: Example
[Figure: income level over time. Beneficiaries rise from 10 to 30; the control group rises from 14 to 21, a gain of 7; the projection for beneficiaries if no policy is therefore 10 + 7 = 17, so the effect or impact = 30 − 17 = 13.]
Data on both the before and after situations is needed.

China Village Development Fund
Ex-post impact evaluation:
- Propensity-score method based on a comparison of treated villages and matched non-treated villages
- Instrumental-variable method based on any exogenous selection criteria of villages for treatment
Ex-ante impact evaluation:
- Randomly select treated and non-treated villages
- Use the PSM technique to match treated and non-treated villages
- Use the IV method through an exclusion restriction on village eligibility
- Collect a baseline for treated and non-treated villages, and then do a follow-up survey after treatment

Four Broad Categories of Impact Evaluation Methods (quality from lowest to highest):
(−)   Evaluation without a control group: post-implementation observation only
(+)   Evaluation without a control group: observation before and after
(++)  Evaluation by comparison with a non-equivalent control group
(+++) Equivalent control group (true experimentation)