
1 Introduction to Randomised Evaluations. Kamilla Gumede, Martin Abel. University of Cape Town, 11th of October, 2010.

2 J-PAL Africa – Fight poverty. A new research programme within SALDRU and the regional office of a global network. Specialises in RANDOMISED IMPACT EVALUATIONS. Does 3 things: – Run evaluations – Disseminate results (a public good) – Train others to run evaluations

3 I. Why do we evaluate social programmes? II. What is an IMPACT? III. Impact evaluation methodologies IV. How to run an RCT: – Advantages of randomised evaluations – Theory of Change – Randomisation Design – External vs. Internal Validity Overview

4 Surprisingly little hard evidence on what works. Need #1: with better evidence, we can do more with a given budget. Need #2: if people knew money was going to programs that work, it could help increase the pot for anti-poverty programs. Instead of asking “do aid/development programs work?” we should be asking: – Which work best, why and when? – How can we scale up what works? Evidence-based policy making

5 Example Aid: Optimists. “I have identified the specific investments that are needed [to end poverty]; found ways to plan and implement them; [and] shown that they can be affordable.” Jeffrey Sachs, The End of Poverty

6 Example Aid: Pessimists. “After $2.3 trillion over 5 decades, why are the desperate needs of the world's poor still so tragically unmet? Isn't it finally time for an end to the impunity of foreign aid?” Bill Easterly, The White Man’s Burden

7 Accountability Lesson learning – Program – Organization – Beneficiaries – World So that we can reduce poverty through more effective programs Different types of evaluation contribute to these different objectives of evaluation Objective of evaluation

8 The different types of evaluation [Diagram: from broadest to narrowest, Evaluation (M&E), Program Evaluation, Impact Evaluation, Randomized Evaluation]

9 Evaluating Social Programmes

10 What is the outcome after the programme? What would have happened in the absence of the programme? Take the difference: what happened (with the program) minus what would have happened (without the program) = IMPACT of the program. How to measure impact? (I)

11 Impact is defined as a comparison between: 1. the outcome some time after the program has been introduced, and 2. the outcome at that same point in time had the program not been introduced (the “counterfactual”). How to measure impact? (II)
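In notation, a minimal restatement of the definition above; the symbols are illustrative and not from the deck:

```latex
% Impact of the program for unit i, measured at time t after introduction:
%   Y_i^T(t) = observed outcome with the program
%   Y_i^C(t) = counterfactual outcome without the program
\[
  \text{Impact}_i(t) = Y_i^T(t) - Y_i^C(t)
\]
```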

12 Impact: What is it? [Chart: primary outcome plotted over time; after the intervention, the gap between the observed outcome and the counterfactual trajectory is the impact]

13 Impact: What is it? [Same chart as slide 12, repeated]

14 The counterfactual represents the state of the world that program participants would have experienced in the absence of the program (i.e. had they not participated in the program) Problem: Counterfactual cannot be observed Solution: We need to “mimic” or construct the counterfactual Counterfactual

15 Counterfactual is often constructed by selecting a group not affected by the program Randomized: – Use random assignment of the program to create a control group which mimics the counterfactual. Non-randomized: – Argue that a certain excluded group mimics the counterfactual. 15 Constructing the counterfactual

16 Methodologies in impact evaluation. Experimental: – Randomized Evaluations. Quasi-experimental: – Instrumental Variables – Regression Discontinuity Design. Non-experimental: – Pre-post – Difference in differences – Cross Sectional Regression – Fixed Effects Analysis – Statistical Matching. Running example: the South African Old Age Pension (OAP) and labour supply.

17 Non-experimental evaluations – Cross Sectional Regression. Bertrand et al. (2003); Posel et al. (2006). We can control for observable differences (age, gender, education, ...), but there are also unobservable characteristics we cannot control for (motivation, etc.). → Which people does a household with a pension attract?
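A minimal sketch of this approach, assuming a hypothetical cross-sectional household survey; the file and variable names are illustrative and this is not the actual specification from Bertrand et al. or Posel et al.:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical cross-section: 'works' = labour supply indicator,
# 'pension' = household receives an old-age pension, plus observable controls.
df = pd.read_csv("household_survey.csv")  # placeholder file name

# Cross-sectional OLS: the controls remove only *observable* differences;
# unobservables (motivation, household composition choices) stay in the error term.
model = smf.ols("works ~ pension + age + female + education", data=df).fit()
print(model.summary())
```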

18 Non-experimental evaluations – Panel Data Analysis with Fixed Effects. Ardington et al. (2009). Fixed effects analysis limits the sample to households that changed pension status over time. We can control for unobservable characteristics that do not change, but unobservable characteristics may change over time. Data requirements: panel data, and a sizeable proportion of households switching.
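A minimal sketch of household fixed effects under the same hypothetical naming; identification comes only from households whose pension status changes between waves:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: one row per household and survey wave.
panel = pd.read_csv("household_panel.csv")  # placeholder file name

# Household fixed effects via dummies: time-invariant unobservables are absorbed,
# but anything unobservable that changes over time can still bias the estimate.
fe_model = smf.ols("works ~ pension + C(wave) + C(household_id)", data=panel).fit()
print(fe_model.params["pension"])
```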

19 How to randomise

20 A. The basics. Randomly assign units from the evaluation sample to either: – Treatment Group: offered the treatment – Control Group: not allowed to receive the treatment (during the evaluation period). [Diagram: Target Population → Evaluation Sample (the rest is not in the evaluation) → Random Assignment → Treatment group / Control group]

21 A. Why randomize? – Conceptual Argument. If properly designed and conducted, randomized experiments provide the most credible method to estimate the impact of a program. Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the program rather than to other factors.
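A minimal simulation of this argument (all numbers, the +5 impact, and the variable names are illustrative, not from the deck): each person has two potential outcomes but only one is observed, and because assignment is random the treatment/control difference in means recovers the true average impact.

```python
import random

random.seed(2010)
n = 10_000

# Potential outcomes: each person has an outcome without the program and one with it.
y_without = [random.gauss(50, 10) for _ in range(n)]
y_with = [y + 5 for y in y_without]          # true impact of +5 for everyone (illustrative)

# Random assignment: half to treatment, half to control.
assignment = [True] * (n // 2) + [False] * (n // 2)
random.shuffle(assignment)

# Only one potential outcome per person is ever observed.
treated_outcomes = [yt for yt, t in zip(y_with, assignment) if t]
control_outcomes = [yc for yc, t in zip(y_without, assignment) if not t]

estimate = (sum(treated_outcomes) / len(treated_outcomes)
            - sum(control_outcomes) / len(control_outcomes))
print(f"Difference in means: {estimate:.2f}  (true impact: 5.00)")
```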

22 Example: Primary vs Secondary

23 Returns to Secondary Education (?) Standard way to measure this: – an earnings regression (see the sketch below). But are people who complete school “the same” as those who don’t? – They tend to be more patient, more ambitious, to come from better-resourced families, and to have fewer immediate economic opportunities. Example: 1,200 teens who qualified for secondary school but cannot afford it: – 300 boys and 300 girls get a 4-year scholarship – Followed for 10 years
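The slide's "Equation" bullet is left unspecified; a plausible reconstruction is the standard Mincer-style earnings regression below. The notation is mine, not from the deck.

```latex
% Standard earnings regression for returns to schooling (illustrative notation):
%   w_i = earnings, S_i = secondary schooling, X_i = observed controls
\[
  \ln(w_i) = \alpha + \beta \, S_i + \gamma' X_i + \varepsilon_i
\]
% OLS estimates of \beta are biased if S_i is correlated with unobservables
% (patience, ambition, family resources) left in \varepsilon_i.
```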

24 In-class test

25 BREAK

26 Basic setup of a randomized evaluation [Diagram: Target Population → Evaluation Sample (the rest is not in the evaluation) → Baseline survey → Random Assignment → Treatment group / Control group → Endline survey]

27 Roadmap to Randomized Evaluations [Diagram with five components, revised iteratively: 1. Environment / Context: willing partner, sufficient time, interesting policy question / theory, sufficient resources. 2. Theory of Change: mechanism of change (log frame), state assumptions, identify research hypothesis, identify target population, identify indicators, identify threats to validity. 3. Randomization Design: intervention (simple, or program packages; check on competing interventions), unit of randomization (individual, cluster design, block randomization), randomization mechanism (simple lottery, gradual rollout, rotation design, encouragement). 4. Sufficient Sample Size: statistical validity, cluster correlation. 5. Strategy to Manage Threats: spillovers, discouragement, attrition, political interference.]

28 Willing partner Sufficient time Interesting policy question / theory Sufficient resources B. Environment / Context

29 II. Evaluations: Providing evidence for policymaking [Diagram: programs and policies are shaped by knowledge (evidence; experience, personal and collective), ideology (own and external), support (budget, political), and capacity]

30 What are the possible chains of outcomes in the case of the intervention? What are the assumptions underlying each chain of causation? What are the critical intermediary steps needed to obtain the final results? What variables should we try to obtain at every step of the way to discriminate between various models? C. Theory of Change (I)

31 C. Theory of Change (II) – SA Pension System. Bertrand et al. (2003); Posel et al. (2006). Different theories of change determine what indicators we measure and whom we include in our evaluation.

32 C. Indicators. Based on the Theory of Change, we identify indicators to test the different lines of causation and measure outcomes. ...room for creativity... How to measure women's empowerment? – Measure the fraction of time they speak during village council meetings. How to measure corruption in infrastructure projects? – Drill holes in the asphalt of newly built roads and measure the difference between the actual and the official thickness.

33 Roadmap to Randomized Evaluations [Roadmap diagram repeated; see slide 27]

34 D. Basic setup of a randomized evaluation [Diagram: Target Population → Evaluation Sample (the rest is not in the evaluation) → Random Assignment → Treatment group / Control group]

35 Case Study: Microfinance and/or Financial Literacy Training. Evidence on the effectiveness of providing microfinance loans to the poor has been mixed. Some argue that financial literacy training is more effective, while others propose that both loans and training need to be provided to alleviate poverty. → How can you design a randomised evaluation to assess which of these claims is true?

36 D. Forms of Intervention [Diagram of four designs: 1. Simple Treatment / Control: random assignment to Microfinance vs. Control group. 2. Multiple Treatment: random assignment to Microfinance, Financial Literacy, or Control group. 3. Cross-cutting Design: random assignment to Microfinance, Financial Literacy, Financial Literacy AND Microfinance, or Control group. 4. Varying levels of Treatment: random assignment to 1-month Financial Literacy, 6-month Financial Literacy, or Control group.]
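A minimal sketch of the cross-cutting design for this case study; the 50/50 assignment probabilities and the group labels are illustrative assumptions. Because loan and training status are assigned independently, one sample identifies the loan effect, the training effect, and their combination.

```python
import random

def cross_cutting_assignment(sample, seed=2010):
    """Assign each unit independently to microfinance (yes/no) x training (yes/no)."""
    rng = random.Random(seed)
    arms = {"control": [], "microfinance": [], "training": [], "both": []}
    for unit in sample:
        gets_loan = rng.random() < 0.5      # illustrative 50% probability
        gets_training = rng.random() < 0.5  # illustrative 50% probability
        if gets_loan and gets_training:
            arms["both"].append(unit)
        elif gets_loan:
            arms["microfinance"].append(unit)
        elif gets_training:
            arms["training"].append(unit)
        else:
            arms["control"].append(unit)
    return arms

arms = cross_cutting_assignment(range(1_000))
print({name: len(units) for name, units in arms.items()})
```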

37 E. Unit of Randomization: Individual, or Cluster (classroom, school, district, ...). Generally, it is best to randomize at the level at which the treatment is administered. Also weigh ethical and practical concerns.

38 Case Study: Extra Teachers in Kenya. Confronted with overcrowded schools and a shortage of teachers, in 2005 the NGO ICS offered to provide funds to hire 140 extra teachers each year. → What is the best unit of randomisation for our RCT?
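Because the extra teacher is delivered to a whole school, the natural unit here is the school (a cluster). A minimal sketch of cluster-level assignment; the 280-school sampling frame and the one-teacher-per-treated-school mapping are assumptions for illustration.

```python
import random

def randomize_schools(school_ids, n_treated, seed=2010):
    """Cluster-level assignment: a whole school is treated or not, and every
    pupil in a treated school is exposed to the extra teacher."""
    rng = random.Random(seed)
    return set(rng.sample(school_ids, n_treated))

schools = [f"school_{i}" for i in range(280)]                 # hypothetical frame
treated_schools = randomize_schools(schools, n_treated=140)   # 140 extra teachers, one per school
control_schools = [s for s in schools if s not in treated_schools]
```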

39 F. Method of Randomization: Lottery (pull out of a hat or bucket, or use a random number generator in a spreadsheet or Stata), Phase-in design, Rotation design, Encouragement design.
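A minimal sketch of the random-number-generator lottery mentioned above, mirroring the spreadsheet approach of attaching a random draw to each unit and sorting on it; the applicant list and the 50/50 split are illustrative.

```python
import random

rng = random.Random(2010)
applicants = [f"applicant_{i}" for i in range(600)]

# Spreadsheet-style lottery: give every unit a random number, sort, treat the top half.
ranked = sorted(applicants, key=lambda _: rng.random())
treatment, control = ranked[:300], ranked[300:]
```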

40 Random assignment through lottery [Bar chart: 2006 income per person per month, in rupees: Treat 1457, Compare 1442]

41 Alternative Mechanism: Phase-in design [Diagram: units are split into three groups. Round 1: 1/3 treated, 2/3 control. Round 2: 2/3 treated, 1/3 control. Round 3: everyone treated, no control. The randomized evaluation ends before all units are treated.]
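A minimal sketch of the phase-in assignment described above: every unit is eventually treated, and the not-yet-treated groups serve as controls in the earlier rounds. The village identifiers and group sizes are illustrative.

```python
import random

def phase_in_assignment(units, n_rounds=3, seed=2010):
    """Split units into equally sized rounds; group r starts treatment in round r,
    so later groups serve as controls until their round arrives."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_rounds
    return {r + 1: shuffled[r * size:(r + 1) * size] for r in range(n_rounds)}

rounds = phase_in_assignment([f"village_{i}" for i in range(99)])
# Round 1: only rounds[1] is treated (1/3). Round 2: rounds[1] + rounds[2] are treated (2/3).
# The evaluation ends before round 3, when all villages would be treated.
```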

42 Roadmap to Randomized Evaluations [Roadmap diagram repeated; see slide 27. The threats to manage listed on this version are spillovers, sample bias, and attrition.]

43 G. Internal vs. External Validity. Internal Validity: Can we estimate the treatment effect for our particular sample? – Fails when there are differences between the two groups (other than the treatment itself) that affect the outcome. External Validity: Can we extrapolate our estimates to other populations? – Fails when the treatment has a different effect outside our evaluation environment.

44 G. Threats to Internal Validity. Threats to internal validity arise when the control group differs from the counterfactual: – Spill-overs – Sample Selection Bias – Attrition. Examples: – Individuals assigned to the comparison group could attempt to move into the treatment group (cross-over), and vice versa. – Individuals assigned to the treatment group could drop out of the program (attrition).

45 G. External Validity: Generalisability of results. Depends on three factors: Program implementation: can it be replicated at a large scale? Study sample: is it representative? – Does de-worming have the same effects in Kenya and South Africa? Sensitivity of results: would a similar but slightly different program have the same impact?

46 Interested? Become part of the J-PAL research team! [Cartoon caption: “You get to spend a year in Siberia, while I have to stay here in Hawaii, to apply for grants to extend your research time there.”]

