# Selection-on-observables methods (matching) Nicolas STUDER (DREES)


Contents
- Reminder
- Gerfin, Lechner, Steiger (2005)
- Sianesi (2004)
- Conclusions

Reminder - Evaluation
- Evaluation = missing data problem (the counterfactual is never observed)
- In practice, identify a control group of individuals who did not participate in the program and who would have exhibited the same results as the participants had they participated (same potential effect)
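The missing-data view can be written compactly. The lines below are a sketch in standard potential-outcomes notation; the symbols $Y_{1i}$, $Y_{0i}$ and $T_i$ follow the usual convention, and the choice of the average effect on the treated (ATT) as the target parameter is an assumption, not something stated on the slide.

```latex
\begin{aligned}
Y_i &= T_i \, Y_{1i} + (1 - T_i) \, Y_{0i} \\
\text{ATT} &= E[Y_1 - Y_0 \mid T = 1]
            = E[Y_1 \mid T = 1] - \underbrace{E[Y_0 \mid T = 1]}_{\text{never observed}}
\end{aligned}
```

The control group is there to supply an estimate of the last term.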

Reminder - Matching on observables
- Rubin causal model (no externalities, no general equilibrium effects)
- For every « treated » individual, look for a non-treated one with the same (or similar) characteristics
- The causal effect is identified if the CIA (conditional independence assumption) holds: the untreated outcome of a treated individual $i$ can be proxied by that of a matched non-treated individual $j$, i.e. $E[Y_{0i} \mid T_i = 1, X_i = x] = E[Y_{0j} \mid T_j = 0, X_j = x]$
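As an illustration of the matching idea, here is a minimal sketch of exact matching on discrete covariate cells. The function name `exact_matching_att`, the record layout `(cell, treated_flag, outcome)` and the toy numbers are all hypothetical; none of this is taken from the papers discussed.

```python
from collections import defaultdict

def exact_matching_att(records):
    """Average effect on the treated by exact matching.

    records: list of (x, t, y) where x is a hashable covariate cell,
    t is 1 for treated and 0 for non-treated, y is the outcome.
    """
    # Collect control outcomes by covariate cell.
    controls = defaultdict(list)
    for x, t, y in records:
        if t == 0:
            controls[x].append(y)

    effects = []
    for x, t, y in records:
        # Treated units without any exact match are dropped.
        if t == 1 and controls[x]:
            y0_hat = sum(controls[x]) / len(controls[x])
            effects.append(y - y0_hat)
    return sum(effects) / len(effects) if effects else float("nan")

# Hypothetical toy data: cell = (skill level, age group).
toy = [
    (("low", "young"), 1, 10.0),
    (("low", "young"), 0, 7.0),
    (("low", "young"), 0, 9.0),
    (("high", "old"), 1, 20.0),
    (("high", "old"), 0, 19.0),
    (("high", "young"), 1, 15.0),  # no control in this cell: dropped
]
att = exact_matching_att(toy)  # (10 - 8 + 20 - 19) / 2 = 1.5
```

The dropped treated unit shows why exact matching degrades quickly as the number of cells grows.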

Reminder - Propensity score matching
- The CIA requires a huge number of conditioning variables to hold; exact matching then becomes very poor and the estimator does not converge
- The score $s(X) = \Pr(T = 1 \mid X)$ reduces the dimensionality: $s_i = s_j$ is enough for the CIA to hold
- « Balancing score »: treated and non-treated individuals with the same score should have similar characteristics
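A propensity score is usually estimated with a binary choice model. The sketch below fits a one-covariate logit by plain gradient ascent using only the standard library; the logit form, the simulated data and the names `fit_logit` and `score` are assumptions made for illustration (the papers discussed use richer models).

```python
import math
import random

def fit_logit(xs, ts, steps=500, lr=0.5):
    """Fit P(T=1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent
    on the average log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += t - p
            grad_b += (t - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def score(x, a, b):
    """Estimated propensity score s(x)."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Simulated selection (assumption): participation is more likely when x is high.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(500)]
ts = [1 if random.random() < 1.0 / (1.0 + math.exp(-x)) else 0 for x in xs]

a_hat, b_hat = fit_logit(xs, ts)
scores = [score(x, a_hat, b_hat) for x in xs]
```

Matching then proceeds on the single number in `scores` instead of on the full covariate vector.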

Does subsidized temporary employment get the unemployed back to work? (Gerfin, Lechner, Steiger, 2005)
- 3 types of programs in Switzerland:
  - training (courses)
  - subsidized temporary job (TEMP)
  - job in non-profit organizations (EP)
- Programs take place simultaneously
- Compare the programs' effects on:
  - « good » reemployment (>3 continuous months, >90% of last earnings) at date t
  - earnings at date t (0 if not employed)
  - months of unemployment in the following year

Public policy context
- A number of active labour market policy instruments exist in different countries
- France: PPE, allègements de charges (payroll tax reductions), emplois jeunes, emplois tremplins
- Rationale: increase human capital or fight against its depreciation (Lazarsfeld et al., 1932), show one's motivation, testing
- Stigmatization, creation of a parallel labour market?

Method and data
- « Propensity score matching »
- Multinomial probit (Imbens, 2000): EP, TEMP or no program
- Mahalanobis distance, only one « match » per treated individual, but the same observation may serve as the « match » for several
- Administrative data (social security): history over the last 10 years and follow-up over 24 months
- Sample = unemployed for less than a year on 31 December 1997, aged 25-55, first program in 1998
- 3 proxies for unobservables:
  - motivation = benefit sanctions
  - abilities = last earnings
  - personal appearance = counsellor's (placement officer's) subjective evaluation
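To make the matching rule concrete, here is a minimal sketch of single nearest-neighbour matching with replacement under a Mahalanobis distance, restricted to two covariates so that the covariance matrix can be inverted by hand. The function names and the toy covariates (think of two standardised characteristics) are hypothetical; this is not the authors' code.

```python
def mahalanobis_dist2(points):
    """Return a squared Mahalanobis distance function based on the
    sample covariance of the two-dimensional `points`."""
    n = len(points)
    m0 = sum(p[0] for p in points) / n
    m1 = sum(p[1] for p in points) / n
    s00 = sum((p[0] - m0) ** 2 for p in points) / (n - 1)
    s11 = sum((p[1] - m1) ** 2 for p in points) / (n - 1)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in points) / (n - 1)
    det = s00 * s11 - s01 ** 2
    # Elements of the inverse of the 2x2 covariance matrix.
    i00, i11, i01 = s11 / det, s00 / det, -s01 / det

    def dist2(u, v):
        d0, d1 = u[0] - v[0], u[1] - v[1]
        return d0 * d0 * i00 + 2.0 * d0 * d1 * i01 + d1 * d1 * i11

    return dist2

def match_with_replacement(treated, controls):
    """For each treated unit, index of the closest control.
    A control may be picked several times."""
    dist2 = mahalanobis_dist2(treated + controls)
    return [
        min(range(len(controls)), key=lambda j: dist2(t, controls[j]))
        for t in treated
    ]

# Hypothetical standardised covariates.
treated = [(0.0, 0.0), (0.1, -0.1), (4.0, 0.0), (0.0, 4.0)]
controls = [(0.2, 0.1), (3.9, 0.3), (0.1, 3.8), (4.0, 4.0)]
matches = match_with_replacement(treated, controls)  # [0, 0, 1, 2]
```

Control 0 is used twice and control 3 never, which is the sense in which the same observation may be the « match » of several treated individuals.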

Descriptive statistics

Results – Which program is the best?

Heterogeneity - Skills
- EP may be bad for those with high skills
- No long-term effect for EP and TEMP
- For those with low skills, TEMP has a positive effect compared to EP and NOTHING (no program)

Heterogeneity – Unemployment duration
- One expects a bigger effect if unemployment duration is already long
- True for both programs
- Stronger « lock-in » if < 180 days
- No evidence of an EP stigma, positive signalling for TEMP

Discussion – Internal validity (1)
- Conditional on the CIA
- The CIA needs a lot of control variables to hold
- Here two different selection processes:
  - EP is based on the counsellor's decision
  - the unemployed need to find a TEMP job themselves
- Counsellor's evaluations may be collinear with observable characteristics
- Selection on unobservables (instrumenting the treatment) would be more suitable

Discussion – Internal validity (2)
- No standard errors are reported; they must be estimated by bootstrapping
- Difference in group sizes = small groups are over-weighted
- No robustness checks, especially for the specification of the propensity score
- « n nearest neighbors » and « kernel » approaches are more robust
- Heckman's specification test (1989) of the propensity score = use history
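The bootstrap mentioned above can be sketched as follows: resample individuals with replacement, redo the matching on each draw, and take the standard deviation of the resulting estimates. Everything here is illustrative: the simulated data, the true effect of 2.0, the scalar score and the names `att_nn` and `bootstrap_se` are assumptions. Note also that the naive bootstrap is known to be unreliable for nearest-neighbour matching with a fixed number of matches, so this is a sketch of the mechanics rather than a recommendation.

```python
import random
import statistics

def att_nn(sample):
    """ATT by one-nearest-neighbour matching (with replacement) on a scalar score.
    sample: list of (score, treated_flag, outcome)."""
    treated = [(s, y) for s, t, y in sample if t == 1]
    controls = [(s, y) for s, t, y in sample if t == 0]
    if not treated or not controls:
        return None
    diffs = []
    for s, y in treated:
        _, y0 = min(controls, key=lambda c: abs(c[0] - s))
        diffs.append(y - y0)
    return sum(diffs) / len(diffs)

def bootstrap_se(sample, n_boot=100, seed=0):
    """Standard deviation of the estimator over bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        draw = [rng.choice(sample) for _ in sample]
        est = att_nn(draw)
        if est is not None:  # skip degenerate draws with an empty group
            estimates.append(est)
    return statistics.stdev(estimates)

# Simulated data (assumption): selection and outcome both depend on the score.
rng = random.Random(1)
data = []
for _ in range(200):
    s = rng.random()
    t = 1 if rng.random() < s else 0
    y = 5.0 * s + 2.0 * t + rng.gauss(0.0, 1.0)
    data.append((s, t, y))

att_hat = att_nn(data)
se_hat = bootstrap_se(data)
```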

Discussion – External validity
- Matching is only possible on the common support (small loss here: only 3%)
- Bigger restriction on the population (20%) for homogenisation purposes
- Results apply only to individuals aged 25-55, without other occupations, unemployed for the first time
- Swiss context: low unemployment rate = lower competition on the labour market
- General equilibrium effects = negative externalities on the non-treated because of competition and stigmatisation
- One could look at how the program's effect varies with the number of places available in the district
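The common support restriction can be illustrated with a simple min-max rule: keep only treated individuals whose score lies in the range where both groups are observed. The rule itself, the function name `common_support` and the scores below are assumptions for illustration; the slide does not say which trimming rule the authors used.

```python
def common_support(treated_scores, control_scores):
    """Min-max common support: overlap of the two score ranges.
    Returns the bounds, the treated scores kept, and the share of treated lost."""
    lo = max(min(treated_scores), min(control_scores))
    hi = min(max(treated_scores), max(control_scores))
    kept = [s for s in treated_scores if lo <= s <= hi]
    share_lost = 1.0 - len(kept) / len(treated_scores)
    return (lo, hi), kept, share_lost

# Hypothetical propensity scores.
treated_scores = [0.15, 0.30, 0.45, 0.60, 0.92]
control_scores = [0.05, 0.20, 0.35, 0.50, 0.80]
bounds, kept, share_lost = common_support(treated_scores, control_scores)
# bounds == (0.15, 0.80); the treated unit at 0.92 has no comparable control.
```

Here one treated unit in five is lost, and the estimate only speaks for the remaining four, which is the external validity cost noted above.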

Comparison with randomized controlled trials (RCT)
- Two non-parametric (flexible) methods
- Internal validity:
  - RCT = « gold standard » if the protocol is strictly enforced, in spite of John Henry and Hawthorne effects
  - Matching on observables = CIA, needs lots of data (different points in time), dependent on the score's specification, bias
  - Attrition, externalities and general equilibrium effects are a problem for both methods
- External validity:
  - Both methods provide a local estimator
  - Often a larger sample with matching, in spite of the common support restriction
  - But the CIA will not hold with a very heterogeneous population

Doing better (?): Sianesi (2004)
- Unemployment duration at entry into the program is taken into account
- Important because participation renews entitlement to benefits
- Compare participating at T with not participating for t <= T: modelling of sequential choices
- Data make it possible to follow individuals over 6 years, with a survey on factors influencing choices and data on the local labour market situation
- Re-weighting within the common support
- Robustness checks concerning attrition and misclassification problems

Context
- Activation policy in Sweden during the 1990s
- In addition to placement, the unemployed can take part in training and « motivation » activities, which are considered as jobs and thus renew entitlement to benefits
- « Generous » unemployment benefits: up to 80% of last wage for 60 weeks (if employed more than 5 months during the last 12) + possibility of 30 additional weeks (KAS)
- Programs are considered as a whole, « treatment » = date of entry into the first program during the first unemployment period
- Sample of individuals who became unemployed in 1994 (recession peak)

Factors influencing choices
- Subjective probability of finding a job (Harckman, 2000)
- Depends on unemployment duration, part-time occupations, sociodemographic characteristics (age, gender, nationality), human capital
- Data available on all of these
- Counsellor's evaluation of appearance and motivation
- CIA: myopia conditional on observables, so that one controls for the characteristics of the last job and the month of entry into employment

Results (1)

Results (2)

« Managing » attrition
- Results show that attrition is differential
- If the misclassification rate (« lost » individuals who in fact found a job) is 50% (Bring and Carling, 2000), the effect would be halved
- 2 alternatives:
  - considering each individual with misclassification probability >= u as employed
  - counting an individual with probability $p_i$ as $1/p_i$ of an employed one
- Assumes Prob(employed|lost) is equal among treated and non-treated; in practice one looks at best and worst cases
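The best-case and worst-case logic can be sketched with simple counts. The group sizes below are invented for illustration and the function names are hypothetical; the point is only to show how the assumed share of « lost » individuals who found a job moves the estimated effect.

```python
def employment_rate(employed, lost, n, share_lost_employed):
    """Employment rate when a given share of the 'lost' is counted as employed."""
    return (employed + share_lost_employed * lost) / n

def effect(treated, control, share_t, share_c):
    """Treated minus control employment rate under assumed shares of
    'lost' individuals who found a job in each group."""
    rate_t = employment_rate(treated["employed"], treated["lost"], treated["n"], share_t)
    rate_c = employment_rate(control["employed"], control["lost"], control["n"], share_c)
    return rate_t - rate_c

# Hypothetical counts with differential attrition (more 'lost' among controls).
treated = {"employed": 40, "lost": 10, "n": 100}
control = {"employed": 30, "lost": 20, "n": 100}

naive = effect(treated, control, 0.0, 0.0)       # lost counted as not employed
equal_half = effect(treated, control, 0.5, 0.5)  # same 50% rate in both groups
worst = effect(treated, control, 0.0, 1.0)       # worst case for the program
best = effect(treated, control, 1.0, 0.0)        # best case for the program
```

With these made-up numbers the naive effect of 0.10 falls to 0.05 under an equal 50% rate, and the bounds run from -0.10 to 0.20, so the sign is not pinned down without an assumption on the « lost ».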

A disincentive? (1)
- The fact that program participation renews entitlement to benefits created opportunistic behaviour
- The effect on employment is not significant for individuals who enter the program after 15 months of unemployment

A disincentive? (2)
- Heterogeneous effect between entitled and non-entitled
- « Compensation cycle »: the fact that an entitled individual enters a programme after 15 months increases his or her probability of entering a programme 14 months later

Conclusion (1) Activation policies
- Lock-in effect in the short term
- Subsidized private sector jobs are more efficient, especially for those with low qualifications
- The effect is stronger on the long-term unemployed
- In Sweden, positive effect in the short term on participation in other programs, and on employment in the long run
- No effect on individuals who are at the end of their entitlement period = evidence of opportunistic behaviour

Conclusion (2) Selection-on-observables methods
- Propensity score matching is almost a « must » for the CIA to hold and the estimator to converge
- CIA credibility depends on the selection process and data richness
- First-differencing controls for individual (fixed) effects and improves the results
- Reweighting and common support are important sources of bias
- Specification of the score is important (Smith and Todd, 2005), « kernel » most robust
- Attrition => best and worst cases
- Selection on unobservables: need to specify the joint distribution of treatment and potential outcomes
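For reference, the « kernel » approach replaces the single nearest neighbour by a weighted average of all controls, with weights that decline with the distance in score. The sketch below uses a Gaussian kernel and a bandwidth of 0.1; both choices, the function names and the toy data are assumptions made for illustration.

```python
import math

def kernel_counterfactual(score, controls, bandwidth=0.1):
    """Kernel-weighted mean of control outcomes around `score`.
    controls: list of (score, outcome). Gaussian kernel (an assumption)."""
    weights = [math.exp(-0.5 * ((s - score) / bandwidth) ** 2) for s, _ in controls]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, controls)) / total

def att_kernel(treated, controls, bandwidth=0.1):
    """Average over treated units of outcome minus kernel counterfactual."""
    diffs = [y - kernel_counterfactual(s, controls, bandwidth) for s, y in treated]
    return sum(diffs) / len(diffs)

# Hypothetical (score, outcome) pairs.
controls = [(0.2, 1.0), (0.4, 2.0), (0.6, 3.0)]
treated = [(0.4, 4.0)]
att = att_kernel(treated, controls)
```

Because every control contributes, the estimate is less sensitive to the presence or absence of one particular neighbour, at the price of having to choose a bandwidth.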

Essays
- Huber, Lechner, Wunsch and Walter, 2009, « Do German welfare-to-work programmes reduce welfare and increase work », IZA Discussion Paper No. 4090
- Blundell, Dearden, Sianesi, 2003, « Evaluating the impact of education on earnings in the UK: Results from the NCDS », IFS, WP03/20
- Dearden, Emmerson, Frayne, Meghir, 2005, « Education Subsidies and School Drop-Out Rates », IFS, WP05/11