Published by Morgan Stevenson. Modified over 2 years ago.

1
Propensity Score Matching and the EMA pilot evaluation
Lorraine Dearden
IoE and Institute for Fiscal Studies
RMP Conference, 22nd November 2007

2
The Evaluation Problem
The question we want to answer is:
–What is the effect of some treatment ($D_i = 1$) on some outcome of interest ($Y_{1i}$), compared to the outcome ($Y_{0i}$) if the treatment had not taken place ($D_i = 0$)?
The problem is that it is impossible to observe both outcomes for the same individual, so the true causal effect cannot be observed directly

3
How can we solve this problem?
Randomised experiment
–Randomly assign people to a treatment group and a control group
–If the groups are large enough, the distribution of all pre-treatment characteristics in the two groups should be identical, so any difference in outcomes can be attributed to the treatment
–Not generally available
–Not always a solution
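
The logic of random assignment can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the data-generating process, the true effect of 2.0 and the function name `naive_difference` are hypothetical and do not come from the EMA evaluation. It compares the raw treated-minus-control difference when treatment is assigned by a coin flip with the same difference when people with a high pre-treatment characteristic select into treatment.

```python
import random


def naive_difference(randomised, n=20000, true_effect=2.0, seed=0):
    """Raw difference in mean outcomes between treated and non-treated.

    Hypothetical data-generating process: the outcome depends on one
    pre-treatment characteristic x plus noise, and treatment adds true_effect.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)               # pre-treatment characteristic
        y0 = 1.0 + x + rng.gauss(0.0, 1.0)    # outcome without treatment
        y1 = y0 + true_effect                 # outcome with treatment
        if randomised:
            d = rng.random() < 0.5            # coin flip, unrelated to x
        else:
            d = x > 0                         # selection on x
        if d:
            treated.append(y1)                # only y1 is observed for the treated
        else:
            control.append(y0)                # only y0 is observed for the rest
    return sum(treated) / len(treated) - sum(control) / len(control)


rct_estimate = naive_difference(randomised=True)
selected_estimate = naive_difference(randomised=False)
print(round(rct_estimate, 2), round(selected_estimate, 2))
```

Under random assignment the raw difference should sit close to the assumed true effect. Under selection on x it is inflated, because the two groups differ before treatment.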

4
Propensity Score Matching
Instead, we have to rely on non-experimental approaches
Propensity score matching is one such method, and it is gaining popularity because of its simplicity
It is crucial, however, to understand the assumptions underlying the approach (and all approaches)
Again, NOT always appropriate
–may need to rely on another method, e.g. instrumental variables or a control function

5
Assumptions
Need a treatment group and some type of appropriate non-treated group from which you can select a control group
–Finding an appropriate and convincing control group is often the most difficult evaluation task
Assume ALL relevant differences between the groups pre-treatment can be captured by observable characteristics in your data (X)
–Having high quality and extensive pre-treatment observables is crucial!
–This is the Conditional Independence Assumption (CIA)
Common support – return to this

6
What are we trying to measure?
Average treatment effect for the population (ATE)
Average treatment effect on the treated (ATT)
Average treatment effect on the non-treated (ATNT)
Usually interested in ATT: $E(Y_1 - Y_0 \mid D=1) = E(Y_1 \mid D=1) - E(Y_0 \mid D=1)$
–OLS: ATT = ATE = ATNT
–IV: LATE
–Matching and control function: ATE, ATT & ATNT
How can we find $E(Y_0 \mid D=1)$?
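
As a worked sketch of how these quantities relate, the block below uses the slide's notation ($Y_1$, $Y_0$, $D$, $X$). The label "selection bias" is added here for illustration. The first line is an identity. The second shows why a raw treated-versus-non-treated comparison does not give the ATT. The third shows how the conditional independence assumption lets the non-treated stand in for the missing counterfactual.

```latex
\begin{align*}
\text{ATE} &= \Pr(D=1)\,\text{ATT} + \Pr(D=0)\,\text{ATNT} \\
E(Y_1 \mid D=1) - E(Y_0 \mid D=0)
  &= \text{ATT} + \underbrace{E(Y_0 \mid D=1) - E(Y_0 \mid D=0)}_{\text{selection bias}} \\
\text{Under CIA:}\quad E(Y_0 \mid D=1)
  &= E_{X \mid D=1}\bigl[\, E(Y_0 \mid D=0,\, X) \,\bigr]
\end{align*}
```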

7
What is treatment?
The most robust design is Intention to Treat (ITT) analysis: the treatment group is all individuals who could have taken up the program, whether they did or not
Another approach is the receipt-of-treatment approach, but here it is sometimes much more difficult to find an appropriate control group

8
Matching
Involves selecting from the non-treated pool a control group in which the distribution of observed variables is as similar as possible to the distribution in the treated group
There are a number of ways of doing this, but they almost always involve calculating the propensity score $p_i(x) \equiv \Pr\{D = 1 \mid X = x\}$

9
The propensity score
The propensity score is the probability of being in the treatment group given that you have characteristics X=x
How do you estimate it?
Use parametric methods (e.g. logit or probit) to estimate the probability of being in the treatment group for all individuals in the treatment and non-treatment groups
Rather than matching on the basis of ALL Xs, can match on the basis of this propensity score (Rosenbaum and Rubin (1983))
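
A minimal sketch of the logit step, using only the Python standard library so that it is self-contained. The function names (`fit_logit`, `propensity_scores`), the plain gradient-ascent fitting routine and the simulated data are all assumptions made for illustration. In practice one would normally use a statistics package's logit or probit routine.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, d, steps=500, lr=1.0):
    """Fit Pr(D=1 | X=x) = sigmoid(b0 + b.x) by gradient ascent
    on the mean log-likelihood. Returns [b0, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = di - p                      # score contribution of this person
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted probability of treatment for every individual."""
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            for xi in X]


# Hypothetical data: one observable x; treatment is more likely when x is high.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0)] for _ in range(500)]
d = [1 if rng.random() < sigmoid(-0.5 + 1.0 * x[0]) else 0 for x in X]

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)   # one score per person, treated or not
```

The scores are computed for treated and non-treated individuals alike, since matching compares the two groups on this single number.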

10
How do we match?
Nearest neighbour matching
–for each person in the treatment group, choose the non-treated individual(s) with the closest propensity score
–can do this with (most common) or without replacement
–not very efficient, as it discards a lot of information about the control group
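
The nearest neighbour rule can be sketched as follows, assuming propensity scores have already been estimated. The function name `nearest_neighbour_att`, the one-to-one matching and the toy numbers are hypothetical.

```python
def nearest_neighbour_att(treated, controls, replacement=True):
    """One-to-one nearest neighbour matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Returns the mean treated-minus-matched-control outcome gap.
    """
    pool = list(controls)
    gaps = []
    for p_t, y_t in treated:
        if not pool:
            break                      # without replacement the pool can run dry
        # index of the control whose score is closest to this treated person
        j = min(range(len(pool)), key=lambda i: abs(pool[i][0] - p_t))
        gaps.append(y_t - pool[j][1])
        if not replacement:
            pool.pop(j)                # a used control cannot be matched again
    return sum(gaps) / len(gaps)


# Toy data (made up): two treated people with similar scores.
treated = [(0.80, 10.0), (0.78, 9.0)]
controls = [(0.79, 7.0), (0.60, 6.0), (0.20, 1.0)]

att_with = nearest_neighbour_att(treated, controls, replacement=True)
att_without = nearest_neighbour_att(treated, controls, replacement=False)
```

With replacement, both treated people are matched to the control at 0.79. Without replacement, the second treated person has to settle for the more distant control at 0.60. In both cases the control at 0.20 is never used, which is the information loss noted on the slide.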

11
Kernel based matching
–each person in the treatment group is matched to a weighted sum of individuals who have similar propensity scores, with the greatest weight given to people with closer scores
–Some kernel based matching methods use ALL people in the non-treated group (e.g. Gaussian kernel), whereas others only use people whose propensity score lies within a user-specified bandwidth (e.g. Epanechnikov)
–Choice of bandwidth involves a trade-off between bias and precision
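
A sketch of kernel weighting, again assuming scores are already estimated. The function names and toy data are hypothetical. Kernel normalising constants are irrelevant here because the weights are rescaled to sum to one for each treated person.

```python
import math


def gaussian(u):
    """Positive for every u, so every non-treated person gets some weight."""
    return math.exp(-0.5 * u * u)


def epanechnikov(u):
    """Zero outside the bandwidth, so distant non-treated people are ignored."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kernel_att(treated, controls, bandwidth, kernel):
    """Match each treated person to a kernel-weighted average of controls.

    treated, controls: lists of (propensity_score, outcome) pairs.
    """
    gaps = []
    for p_t, y_t in treated:
        weights = [kernel((p_c - p_t) / bandwidth) for p_c, _ in controls]
        total = sum(weights)
        if total == 0.0:
            continue                   # nobody within the bandwidth: unmatched
        y0_hat = sum(w * y_c for w, (_, y_c) in zip(weights, controls)) / total
        gaps.append(y_t - y0_hat)
    return sum(gaps) / len(gaps)


# Toy data (made up): one close control and one distant control.
treated = [(0.5, 10.0)]
controls = [(0.5, 6.0), (0.9, 0.0)]

att_epan = kernel_att(treated, controls, bandwidth=0.1, kernel=epanechnikov)
att_gauss = kernel_att(treated, controls, bandwidth=0.1, kernel=gaussian)
```

With this bandwidth the Epanechnikov kernel drops the distant control entirely, while the Gaussian kernel still gives it a very small weight. Widening the bandwidth brings in more distant controls: the estimate becomes more precise but risks more bias.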

12
Other methods
Radius matching
Caliper matching
Mahalanobis matching
Local linear regression matching
Spline matching…

13
Imposing Common Support
For matching to be valid we need to observe participants and non-participants with the same range of characteristics
–i.e. for all characteristics X there are both treated and non-treated individuals
If this cannot be achieved:
–treated units whose p is larger than the largest p in the non-treated pool are left unmatched
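
The rule in the last bullet can be sketched in a few lines. The function name `impose_common_support` and the example scores are hypothetical, and only the upper-bound rule stated on the slide is implemented.

```python
def impose_common_support(treated_scores, control_scores):
    """Split treated units into those kept and those left unmatched.

    A treated unit is left unmatched when its propensity score exceeds
    the largest score observed in the non-treated pool.
    """
    upper = max(control_scores)
    kept = [p for p in treated_scores if p <= upper]
    unmatched = [p for p in treated_scores if p > upper]
    return kept, unmatched


# Made-up scores: the treated unit at 0.95 has no comparable non-treated unit.
kept, unmatched = impose_common_support([0.30, 0.70, 0.95], [0.10, 0.40, 0.80])
```

Dropping units changes the population the estimate refers to, so it is worth reporting how many treated units are left unmatched.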

14
How do we get standard errors?
Asymptotics of propensity score matching are hard/impossible to define
Generally need to bootstrap standard errors
Take a random draw from your sample with replacement and re-estimate the effect on it
Repeat this 500 to 1000 times
The standard deviation of these estimates gives you your standard error
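
The bootstrap recipe above can be sketched as below. The function names and the simulated sample are hypothetical. To keep the example short, the estimator is a simple difference in means. In a matching application the estimator passed in would re-estimate the propensity score and redo the matching on every resample.

```python
import random
import statistics


def bootstrap_se(sample, estimator, reps=500, seed=0):
    """Standard deviation of the estimator across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(reps):
        # draw n observations from the sample, with replacement
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


def difference_in_means(rows):
    """Stand-in estimator: rows are (d, y) pairs with d in {0, 1}."""
    treated = [y for d, y in rows if d == 1]
    control = [y for d, y in rows if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Made-up sample: 200 treated and 200 non-treated outcomes.
rng = random.Random(42)
rows = [(1, rng.gauss(1.0, 1.0)) for _ in range(200)]
rows += [(0, rng.gauss(0.0, 1.0)) for _ in range(200)]

se = bootstrap_se(rows, difference_in_means, reps=500)
```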

15
What was the EMA pilot?
EMA pilots involved payment of up to £40 per week for year olds who remained in full-time education
4 different variants tested:
V1 – up to £30 per week, £50 retention and achievement bonus
V2 – V1 but up to £40 per week
V3 – V1 but paid to mother
V4 – V1 but more generous bonuses

16
Justifications for intervention
Low levels of participation in post-16 education among low income families
Presence of liquidity constraints?
–need evidence on the returns to education
–Card (2000), Cameron & Heckman (2001) suggest that these may not be that important
–Meghir & Palme (1999) find evidence of liquidity constraints using Swedish data

17
Design of the evaluation
Interviews with young people and parents in 10 EMA pilot areas and 11 control areas
Information collected both among those income-eligible and income-ineligible for the EMA
First survey involved young people who completed Year 11 in 1999 (cohort 1)
Parental questionnaire only in initial survey
Cohort 1 followed up 3 times

18
The data
Questionnaires have detailed information on:
–all components of family income
–household composition
–GCSE results
–mothers' and fathers' education, occupation and work history
–early childhood circumstances
–current activities of young people

19
Matching approach
Involves taking all eligible individuals in the pilot areas and matching them with a weighted sum of individuals who look like them in control areas
Difference in full-time education outcomes in pilot and control areas in this matched sample is the estimate of the EMA effect (ATT)
Crucial assumption is that we observe everything that determines education participation

20
How do we do this?
Don't match on all Xs, but instead match on the propensity score (Rosenbaum and Rubin, 1983)
Propensity score is just the predicted probability of being in a pilot area given all the observables in our data
Use kernel-based matching (Heckman, Ichimura & Todd, 1998)
We do this matching for each sub-group of interest

21
Variables we match on:
Family background
–household composition, housing status, ethnicity, early childhood characteristics, older siblings' education and parents' age, education, work status and occupation
Family income
–current family income, whether on means-tested benefits
Ability (GCSE results)
School variables
Indicators of ward level deprivation

22
Results Y12: urban men
Note: Income eligibles only

23
Results Y12: urban women
Note: Income eligibles only

24
Results Y13:
Note: Income eligibles only

25
Results by Eligibility Groups
In Year 12, impact is concentrated on those who are fully eligible ( % pts)
–No significant effect for boys or girls on taper
–No effect on ineligibles
In Year 13, impact on both groups
–EMA impacts significantly on retention for those on the taper

26
Does it matter who EMA is paid to?
No difference if we do not distinguish by eligibility
For the variant where EMA is paid to the child, impact is concentrated on those fully eligible
For the variant where EMA is paid to the mother, there is an impact on those who are fully and partially eligible

27
Credit Constraints?
Follow the consumption literature (see Zeldes (1989)) and split the sample by assets, the idea being that those with assets are not liquidity constrained.
–Compare results for home-owners and non-home-owners
The key assumption here is that house ownership in itself does not lead to different responses to financial incentives, other than because it implies different access to funds.

28
Results
Significant impact for non-home-owners of 9.1 percentage points
Insignificant impact for home-owners of 3.8 percentage points
But the difference of 5.3 percentage points is not significant at conventional levels (p-value 12%)
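
As a back-of-envelope check on these figures, the sketch below recomputes the difference and backs out the standard error that a 12% p-value would imply. Two assumptions here are not stated on the slide: that the p-value is two-sided, and that the test statistic is approximately standard normal. The implied standard error is therefore only indicative.

```python
from statistics import NormalDist

impact_non_owners = 9.1    # percentage points, from the slide
impact_owners = 3.8        # percentage points, from the slide
p_value = 0.12             # from the slide; ASSUMED two-sided here

difference = impact_non_owners - impact_owners

# ASSUMPTION: the test statistic is approximately standard normal.
z = NormalDist().inv_cdf(1.0 - p_value / 2.0)   # |z| consistent with p = 0.12
implied_se = difference / z                      # standard error of the difference

print(round(difference, 1), round(z, 2), round(implied_se, 1))
```

Under these assumptions the |z| sits below the conventional 5% critical value of about 1.96, which is why the difference is not significant at conventional levels even though it is sizeable.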

29
Conclusions
EMA effect around 4.5 percentage points
Plays a role in reducing gender differences in stay-on rates, particularly retention in Year 13
Important to control for local area effects
–matching on ward level data important

30
Other conclusions
More effective paying to child rather than parent for those fully eligible
More effective paying to mother for those who are partially eligible
Increase drawn from both work and NEET groups
Some evidence it may be alleviating credit constraints

31
What else can you do with Matching?
What is the policy question you are interested in?
Is ATT the appropriate measure?
In returns to schooling evaluation we are much more interested in ATNT
What is treatment – ITT versus receipt of treatment
–Take-up is usually an important policy implication, so it is usually inappropriate (& difficult) to compare actual participants with an appropriate control group, but sometimes there is no choice!
