1 First steps in practice Daniel Mouqué Evaluation Unit DG REGIO
2 The story so far… Indicators are useful for management and accountability, but do not give impacts. For impacts, we need to estimate a counterfactual.
3 Notice that « classic » methods often imply counterfactuals: Indicators – before vs after; Indicators – with « treatment » vs without; Qualitative methods – expert opinion; Beneficiary surveys – beneficiary opinion; Macromodels – model includes a baseline. But all of these have strong assumptions, often implicit.
4 How to weaken the assumptions… … and improve the estimation of impacts: Comparison of similar assisted and non-assisted units (finding « twins »). There are various ways to do this - let's start with a simple example.
5 Training for long-term unemployed: Innovative training for those who have been out of work for >12 months. « Classic » evaluation: for those trained, pre-post comparison of employment status, income. What's wrong with this? So we combine with a beneficiary survey. Is this much better?
6 A simple counterfactual (random assignment): 10,000 candidates for the training; randomly assign 5,000 to training and 5,000 to traditional support. Compare employment status and earnings one year after training. What's useful about this? Can you see any potential problems?
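A minimal sketch of what random assignment could look like in practice, assuming simulated data. The employment probabilities (0.40 with training, 0.30 with traditional support) and the function names are made up for illustration; they are not results from any evaluation.

```python
import random
from statistics import mean

def randomly_assign(candidate_ids, n_treated, seed=0):
    """Split candidates at random into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(candidate_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_treated], shuffled[n_treated:]

def difference_in_means(outcomes, treated, control):
    """Estimated impact: mean outcome of the treated minus mean outcome of the controls."""
    return mean(outcomes[i] for i in treated) - mean(outcomes[i] for i in control)

# 10,000 candidates, 5,000 assigned to the training, 5,000 to traditional support.
candidates = range(10_000)
treated, control = randomly_assign(candidates, 5_000)

# Simulated outcome: 1 if employed one year after training, else 0.
# ASSUMPTION: the two probabilities below are invented for this sketch.
rng = random.Random(1)
treated_set = set(treated)
outcomes = {
    i: int(rng.random() < (0.40 if i in treated_set else 0.30))
    for i in candidates
}

impact = difference_in_means(outcomes, treated, control)
print(f"Estimated impact on employment rate: {impact:.3f}")
```

Because assignment is random, the two groups should differ only by chance, so the simple difference in means can be read as an estimate of the impact.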
7 Let's try again (« discontinuity design »): Offer the training to all. For evaluation, compare a subset of these with a similar, but non-eligible group: –Unemployed for 12-15 months (eligible) –Unemployed for 9-12 months (not eligible). What's better about this than the previous evaluation example? What's worse?
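The comparison on this slide can be sketched roughly as follows. This is a simplified illustration, assuming a 12-month cut-off and a 3-month window on each side; the records are invented, and the handling of people at exactly 12 months (treated here as not eligible, since the training is for >12 months) is an assumption.

```python
from statistics import mean

def discontinuity_estimate(records, cutoff=12, bandwidth=3):
    """Compare mean outcomes just above and just below the eligibility cut-off.

    records: iterable of (months_unemployed, outcome) pairs.
    Eligible window:     cutoff < months <= cutoff + bandwidth
    Non-eligible window: cutoff - bandwidth <= months <= cutoff
    """
    eligible = [y for m, y in records if cutoff < m <= cutoff + bandwidth]
    not_eligible = [y for m, y in records if cutoff - bandwidth <= m <= cutoff]
    return mean(eligible) - mean(not_eligible)

# Hypothetical records: (months unemployed, 1 if employed a year later else 0).
records = [
    (5, 0), (9.5, 0), (10, 1), (11, 0), (11.5, 0),
    (12.5, 1), (13, 1), (14, 0), (14.5, 1), (20, 1),
]

estimate = discontinuity_estimate(records)
print(estimate)  # 0.75 - 0.25 = 0.5 on this invented data
```

People far from the cut-off (5 or 20 months here) are left out, because the idea is that those just either side of the threshold are similar apart from eligibility.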
8 3rd time lucky (« pipeline »): This time we stagger the training over 2 years. 5,000 are randomly chosen to take the training this year, 5,000 next year. Next year's treatment group is this year's control group. What's good about this? What limitations can you see?
9 Some observations. Notice: This is not just one method, but a family of methods. Two families in fact - we'll come back to this. Different possibilities have different strengths & weaknesses, therefore different applications. Varies from simple to very complicated. We'll look at common features and requirements now (with Kai).
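A small sketch of the staggering, assuming 10,000 candidates as in the earlier example on the slides. The function name and the `roles_in_year_1` dictionary are hypothetical, used only to make the "who is the control group when" logic explicit.

```python
import random

def pipeline_cohorts(candidate_ids, n_first_year, seed=0):
    """Randomly choose who is trained this year; everyone else is trained next year."""
    rng = random.Random(seed)
    shuffled = list(candidate_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_first_year], shuffled[n_first_year:]

this_year, next_year = pipeline_cohorts(range(10_000), 5_000)

# During year 1 the cohort waiting for training acts as the control group.
# Outcomes must be compared before the second cohort starts its own training.
roles_in_year_1 = {"treatment": this_year, "control": next_year}
print(len(roles_in_year_1["treatment"]), len(roles_in_year_1["control"]))  # 5000 5000
```

Nobody is refused the training, but the comparison window is limited to the period before the second cohort is treated.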
10 What do we need? Kai Stryczynski Evaluation Unit DG REGIO
11 The methods require: Large « n », ie a large number of similar units (to avoid random differences). Good data for treated and non-treated units: –Basic data (who are the beneficiaries?) –Target variables (what is policy trying to change?) –Descriptive variables (eg to help us find matches) –Ability to match the various datasets
12 Sectoral applicability. Good candidates (large « n ») include: Enterprise support (including R&D); Labour market and training measures; Other support to individuals (eg social). But… only where good data exist. Bad candidates (small « n ») include: Large infrastructure (transport, waste water etc); Networks (eg regional innovation systems).
13 Rule of thumb: < 50% of cases applicable, of which < 50% have enough data. And even then, be selective. It's a powerful learning tool, but can be hard work & expensive.
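Reading "of which" as multiplicative, the rule of thumb implies that fewer than a quarter of all cases are candidates. The symbols below are introduced here for illustration only: $s_{\text{applicable}}$ is the share of cases where the methods apply, and $s_{\text{data}}$ is the share of those with enough data.

```latex
\[
s_{\text{applicable}} < 0.5, \quad s_{\text{data}} < 0.5
\quad\Longrightarrow\quad
s_{\text{applicable}} \times s_{\text{data}} < 0.25
\]
```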
14 A pragmatic strategy: 2-pronged approach (monitor all, evaluate a selection). CIE (counterfactual impact evaluation) where we can, classic methods where we can't (survey better than nothing). Mix methods (triangulate, qualitative to explain CIE results). Be honest and humble about what we know... and don't know. Use working hypotheses, build picture over time.
15 Let's get started Daniel Mouqué Evaluation Unit DG REGIO
16 The options: There are many options… but two broad families of counterfactual impact evaluation.
17 Experimental methods | Quasi-experimental methods
Eg random assignment, pipeline | Eg Alberto's difference in difference, my discontinuity
From the outset, some form of random assignment – evaluation drives selection process | Selection process as normal – does not interfere with policy process
Must be installed from the outset of the measure | Can be conducted ex post (although earlier better, for data collection)
Weakest assumptions, best estimate of impact | Relatively weak assumptions, can usually be considered a good estimate of impact
18 A « rule of thumb »: Randomised/experimental methods most likely to be useful for: –Pilot projects –Different treatment options (especially genuine policy choices, such as grants vs loans). Quasi-experimental – more generally applicable. However, randomised is simpler, so a good introduction. (Quasi-experimental methods in depth tomorrow.)
19 New friends part 1: Experimental (« randomised ») methods. Some experimental/randomised options for your exercises in the group work: Random assignment; Pipeline (delaying treatment for some); Random encouragement. Tip: most costly (mess with selection process), but most reliable.
20 New friends part 2: Quasi-experimental methods. You don't need to know all these yet (tomorrow will treat them in depth). Intuition: treat as usual, compare with similar, but not quite comparable, non-treated units. Difference-in-difference; Discontinuity design (comparing « just qualified for treatment » with « just missed it »).
21 In your group work, we want you to start thinking: What are policy/impact questions in my field(s)? Can I randomise from the beginning, to get an insight into these results? Random or not (and often the answer will be not!), can I get outcome data for similar non-treated units?
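As a first intuition for difference-in-difference, here is a minimal sketch. The function name and the four employment rates are invented for illustration; the method is covered properly later in the course.

```python
def difference_in_difference(treated_before, treated_after,
                             control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical employment rates before and after the measure.
did = difference_in_difference(
    treated_before=0.20, treated_after=0.45,   # treated group: +0.25
    control_before=0.22, control_after=0.35,   # similar non-treated group: +0.13
)
print(f"{did:.2f}")  # 0.12
```

The comparison group's change stands in for what would have happened to the treated group anyway, which is only credible if the two groups would have followed similar trends without the measure.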
22 To clarify, a real example (from enterprise support)
23 The set-up: Eastern Germany. Investment and R&D grants to firms. Do they really increase investment and employment? Could not randomise (too late, too political). Clever matching procedures (we'll tell you more later in the course) to compare similar assisted/non-assisted firms.
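The slide does not say which matching procedure the study used. Purely as an illustration of the idea of comparing « twins », here is a sketch of simple nearest-neighbour matching on a single descriptive variable (say, firm size). The data and names are hypothetical.

```python
from statistics import mean

def nearest_neighbour_effect(assisted, non_assisted):
    """Match each assisted firm to its closest non-assisted firm and average the gaps.

    assisted, non_assisted: lists of (descriptive_variable, outcome) pairs.
    Matching is with replacement: one non-assisted firm can serve several matches.
    """
    gaps = []
    for x_assisted, y_assisted in assisted:
        _, y_twin = min(non_assisted, key=lambda firm: abs(firm[0] - x_assisted))
        gaps.append(y_assisted - y_twin)
    return mean(gaps)

# Hypothetical firms: (number of employees, investment per employee).
assisted = [(10, 30), (50, 80), (200, 255)]
non_assisted = [(12, 25), (45, 70), (190, 240), (1000, 900)]

effect = nearest_neighbour_effect(assisted, non_assisted)
print(effect)  # gaps are 5, 10 and 15, so the mean is 10
```

Real applications match on many variables at once, which is why the slides stress large « n » and good descriptive data.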
24 The results: Investment grants of 8k/employee led to estimated extra investment of 11-12k. Same grants led to an extra 25-30,000 jobs. R&D grants of 8k/employee led to 8k extra investment.
25 What does this tell us? This gives comfort to the views: Enterprise and R&D grants work in lagging regions (at the very least, they generate private investment). Grants have a bigger effect on productivity than on jobs. Gross jobs - especially jobs safeguarded - overstate the case.
26 What does this not tell us? We still do not know for certain: If the same pattern would hold outside E. Germany (specific situation, specific selection process). If investment will translate into long-term growth and R&D (but we have weakened the assumption). If other instruments are better than grants. Crowding out in other enterprises. How to cure cancer (astonishingly, 1 study did not crack all the secrets of the universe). But we know more than before, and this is not the last evaluation we will ever do.
27 Potential benefits - motivation for the coming days: Learning what works, by how much (typically). Learning what instrument is appropriate in a given situation (eg grants or advice to enterprise). Learning on whom to target assistance (« stratification », eg target training measure on the group most likely to benefit). Building up a picture over time.