Why Use Randomized Evaluation? Isabel Beltran, World Bank.

Fundamental Question
- What is the effect of a program or intervention?
- How can vulnerable groups partake in the state and peace building process?
- What political and social accountability mechanisms are most effective in a fragile state?
- What measures secure stability and reduce ethnic conflict at the local level?

Objective
- To identify the causal effect of an intervention
  - Isolate the impact of the program from other factors
- Need to find out what would have happened without the program
  - Cannot observe the same person with and without the program at the same point in time
  - Create a valid counterfactual

Correlation is not causation
Question: Does providing credit increase firm profits?
Suppose we observe that firms with more credit also earn higher profits. Two explanations are possible:
1) Credit use leads to higher profits?
OR
2) Business skills lead to both credit and higher profits?

Illustration: Credit Program (Before-After)
- A credit program was offered in 2008.
- (+6) increase in gross operating margin
- Why did operating margin increase?

Motivation
- Hard to distinguish causation from correlation by analyzing existing (retrospective) data
  - However complex, statistics can only see that X moves with Y
  - Hard to correct for unobserved characteristics, like motivation/ability
  - These may be very important and also affect the outcomes of interest
- Selection bias is a major issue for impact evaluation
  - Projects are started at specific times and places for particular reasons
  - Participants may be selected or self-select into programs
  - People who have access to credit are likely to be very different from the average entrepreneur, so looking at their profits will give a misleading impression of the benefits of credit

Illustration: Credit Program (Valid Counterfactual)
- (+4) Impact of the program
- (+2) Impact of other (external) factors
- Macroeconomic environment affects control group
- Program impact easily identified
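
The arithmetic behind the two credit-program illustrations can be sketched as below. This is only an illustration using the figures on the slides (+6 before-after change, +2 from external factors); the function and variable names are hypothetical, and reading the +2 as the change observed in the control group is an assumption.

```python
# Changes in gross operating margin, taken from the two illustration slides.
treated_change = 6  # before-after change for firms offered credit
control_change = 2  # assumed change for comparable firms without credit (external factors)


def before_after_estimate(treated_change):
    """Naive estimate: attributes the whole change to the program."""
    return treated_change


def counterfactual_estimate(treated_change, control_change):
    """Subtracts the change the control group experienced anyway."""
    return treated_change - control_change


naive = before_after_estimate(treated_change)
impact = counterfactual_estimate(treated_change, control_change)

print(f"Before-after estimate: {naive:+d}")
print(f"Estimate with a valid counterfactual: {impact:+d}")
print(f"Overstatement from ignoring external factors: {naive - impact:+d}")
```

The before-after comparison credits the program with the macroeconomic change as well, which is why the control group is needed.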

Experimental Design
- All those in the study have the same chance of being in the treatment or comparison group
- By design, treatment and comparison groups have the same characteristics (observed and unobserved), on average
  - The only difference is the treatment
- With a large sample, all characteristics average out
- Unbiased impact estimates
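
A small simulation can make the "characteristics average out" point concrete. This is a sketch under invented assumptions: the outcome equation, the true effect of 4, the sample size, and the name `ability` for the unobserved characteristic are all hypothetical and not taken from the presentation.

```python
import random
import statistics


def simulate(n=20000, true_effect=4.0, seed=0):
    """Randomly assign n units and compare treatment and comparison groups."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)          # unobserved characteristic
        in_treatment = rng.random() < 0.5  # same chance for everyone
        outcome = 10 + 2 * ability + rng.gauss(0, 1)
        if in_treatment:
            outcome += true_effect
        (treated if in_treatment else control).append((ability, outcome))

    # Balance check: gap in the unobserved characteristic between groups.
    ability_gap = (statistics.mean([a for a, _ in treated])
                   - statistics.mean([a for a, _ in control]))
    # Impact estimate: simple difference in average outcomes.
    impact = (statistics.mean([y for _, y in treated])
              - statistics.mean([y for _, y in control]))
    return ability_gap, impact


ability_gap, impact = simulate()
print(f"Gap in unobserved ability: {ability_gap:.3f}")
print(f"Estimated impact (true effect is 4.0): {impact:.3f}")
```

With a large sample the gap in the unobserved characteristic should be close to zero, so the difference in averages should land near the true effect.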

Options for Randomization
- Lottery (only some receive)
  - Lottery to receive new loans, credit for community
- Random phase-in (everyone gets it eventually)
  - Some groups or individuals get credit each year
- Variation in treatment
  - Some get matching grant, others get credit, others get business development services, etc.
- Encouragement design
  - Some farmers get home visit to explain loan product, others do not
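
The first two options could be implemented along these lines. This is a minimal sketch, not a prescribed procedure: the firm identifiers, the number of winners, and the number of phases are hypothetical, and a real assignment would normally be documented and run with a recorded seed.

```python
import random


def lottery(units, n_winners, seed=0):
    """Lottery: only n_winners of the units receive the program."""
    rng = random.Random(seed)
    winners = set(rng.sample(units, n_winners))
    return {u: ("treatment" if u in winners else "comparison") for u in units}


def random_phase_in(units, n_phases, seed=0):
    """Random phase-in: everyone gets the program, in a random order."""
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)
    # Deal the shuffled units round-robin into phases 1..n_phases.
    return {u: (i % n_phases) + 1 for i, u in enumerate(order)}


firms = [f"firm_{i}" for i in range(12)]  # hypothetical study units
assignment = lottery(firms, n_winners=6)
phases = random_phase_in(firms, n_phases=3)

print(assignment)
print(phases)
```

In the phase-in design, units in later phases serve as the comparison group until their own phase begins.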

Lottery among the qualified
- Must receive the program
- Not suitable for the program
- Randomize who gets the program

Opportunities for Randomization
- Budget constraint prevents full coverage
  - Random assignment (lottery) is fair and transparent
- Limited implementation capacity
  - Phase-in gives all the same chance to go first
- No evidence on which alternative is best
  - Random assignment to alternatives with equal ex ante chance of success

Opportunities for Randomization
- Take-up of existing program is not complete
  - Provide information or incentive for some to sign up: randomize encouragement
- Pilot a new program
  - Good opportunity to test design before scaling up
- Operational changes to ongoing programs
  - Good opportunity to test changes before scaling them up

Different levels you can randomize at
- Individual/owner/firm
- Business association
- Village level
- School level
- Women’s association
- Youth groups
- Regulatory jurisdiction/administrative district

Group or individual randomization?
- If a program impacts a whole group, usually randomize the whole community to treatment or comparison
- Easier to get a big enough sample if you randomize individuals
[Figure panels: Individual randomization; Group randomization]

Unit of Randomization
- Randomizing at a higher level is sometimes necessary:
  - Political constraints on differential treatment within community
  - Practical constraints: confusing to implement different versions
  - Spillover effects may require higher-level randomization
- Randomizing at group level requires many groups because of within-community correlation
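
One common way to quantify why group-level randomization needs many groups is the design effect, 1 + (m - 1) * rho, where m is the number of units per group and rho is the within-group (intra-cluster) correlation. The formula is a standard approximation for equal-sized groups and is not stated in the presentation; the group counts, sizes, and correlation below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling correlated units within groups."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Same 1,000 interviews, two hypothetical designs with icc = 0.05.
few_big = effective_sample_size(n_clusters=2, cluster_size=500, icc=0.05)
many_small = effective_sample_size(n_clusters=100, cluster_size=10, icc=0.05)

print(f"2 districts x 500 people: about {few_big:.0f} effective observations")
print(f"100 villages x 10 people: about {many_small:.0f} effective observations")
```

Under these assumptions, interviewing more people in the same few groups adds very little information, while adding groups adds a lot.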

Elements of an experimental design
- Target population: SMEs
- Potential participants: tailors, furniture manufacturers
- Evaluation sample
- Random assignment: treatment group, control group
- Participants, non-participants

External and Internal Validity (1)
- External validity
  - The evaluation sample is representative of the total population
  - The results in the sample represent the results in the population
  - We can apply the lessons to the whole population
- Internal validity
  - The intervention and comparison groups are truly comparable
  - The estimated effect of the intervention/program on the evaluated population reflects the real impact on that population

External and Internal Validity (2)
- An evaluation can have internal validity without external validity
  - Example: A randomized evaluation of encouraging informal firms to register in urban areas may not tell us much about the impact of a similar program in rural areas
- An evaluation without internal validity can’t have external validity
  - If you don’t know whether a program works in one place, then you have learnt nothing about whether it works elsewhere.

Internal & external validity
[Diagram labels: National population; Random sample; Representative sample of national population; Randomization]

Internal validity
[Diagram labels: Population; Stratification; Population stratum; Samples of population stratum; Randomization]
Example: Evaluating a program that targets women

Representative but biased: useless
[Diagram labels: National population; Randomization; Non-random assignment; USELESS!]

Efficacy & Effectiveness
- Efficacy
  - Proof of concept
  - Smaller scale
  - Pilot in ideal conditions
- Effectiveness
  - At scale
  - Prevailing implementation arrangements (“real life”)
- Higher or lower impact?
- Higher or lower costs?

Advantages of “experiments”
- Clear and precise causal impact
- Relative to other methods:
  - Provide correct estimates
  - Much easier to analyze: difference in averages
  - Easier to explain
  - More convincing to policymakers
  - Methodologically uncontroversial

Machines do NOT
- Raise ethical or practical concerns about randomization
- Fail to comply with treatment
- Find a better treatment
- Move away, so lost to measurement
- Refuse to answer questionnaires
Human beings can be a little more challenging!

What if there are constraints on randomization?
- Some interventions can’t be assigned randomly
- Partial take-up or demand-driven interventions: randomly promote the program to some
- Participants make their own choices about adoption
- Perhaps there is contamination, for instance if some in the control group take up treatment

- Those who receive the marketing treatment are more likely to enroll
- But who got marketing was determined randomly, so it is not correlated with other observables/non-observables
- Compare average outcomes of two groups: promoted/not promoted
  - Effect of offering the encouragement (Intent-To-Treat, ITT)
  - Effect of the intervention on the complier population (Local Average Treatment Effect, LATE)
    - LATE = ITT / proportion of those who took it up because of the promotion (the difference in take-up between the two groups)

Impact: Average treatment effect on the treated
                     Assigned to treatment   Assigned to control   Difference
                     (treated)               (non-treated)
Proportion treated   100%                    0%                    100% (impact of assignment)
Mean outcome         103                     80                    23 (intent-to-treat estimate)
Average treatment on the treated: 23/100% = 23

Impact: Average treatment effect on compliers
                     Randomly encouraged   Not encouraged   Difference
Proportion treated   70%                   30%              40% (impact of encouragement)
Outcome              100                   92               8 (intent-to-treat estimate)
Local average treatment effect: 8/40% = 20
In each group, the treated are those who did take up the program and the non-treated are those who did not.
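
The two worked examples above (full compliance and random encouragement) follow the same calculation, sketched here with the figures from the slides. The function names are hypothetical, and the division by the take-up gap is the simple ratio shown on the slide, not a full instrumental-variables estimation with standard errors.

```python
def itt(mean_assigned, mean_not_assigned):
    """Intent-to-treat: difference in mean outcomes by random assignment."""
    return mean_assigned - mean_not_assigned


def late(itt_estimate, takeup_assigned, takeup_not_assigned):
    """Scale the ITT by the difference in take-up between the two groups."""
    takeup_gap = takeup_assigned - takeup_not_assigned
    if takeup_gap == 0:
        raise ValueError("Assignment did not change take-up; LATE is undefined.")
    return itt_estimate / takeup_gap


# Full compliance: 100% vs 0% treated, mean outcomes 103 vs 80.
full_compliance = late(itt(103, 80), 1.0, 0.0)

# Encouragement design: 70% vs 30% treated, outcomes 100 vs 92.
encouragement = late(itt(100, 92), 0.7, 0.3)

print(f"Full compliance: ITT = {itt(103, 80)}, effect on treated = {full_compliance:.1f}")
print(f"Encouragement: ITT = {itt(100, 92)}, LATE = {encouragement:.1f}")
```

With full compliance the take-up gap is 100%, so the ITT and the effect on the treated coincide; with partial take-up the ITT is diluted and has to be scaled up.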

Common pitfalls to avoid
- Calculating sample size incorrectly
  - Randomizing one district to treatment and one district to control, then calculating sample size from the number of people you interview
- Collecting data in treatment and control differently
- Counting those assigned to treatment who do not take up the program as control: don’t undo your randomization!

When is it really not possible?
- The treatment is already assigned and announced, with no possibility for expansion of treatment
- The program is over (retrospective)
- Universal take-up already
- Program is national and non-excludable
  - Freedom of the press, exchange rate policy (sometimes some components can be randomized)
- Sample size is too small to make it worth it

Thank You
