SREE workshop, march 2010. sean f. reardon. using instrumental variables in education research.


1 SREE workshop, march 2010. sean f. reardon. using instrumental variables in education research

2 outline
- a little background on the potential outcomes framework
- what is an instrumental variable? and what’s it good for?
- assumptions needed for instrumental variables
- practical methods of estimating IV models
- sources of bias in IV models
- additional topics
© 2010 by sean f. reardon. all rights reserved.

3 potential outcomes framework

4 a stylized example
- what is the effect of receiving tutoring in math on student math achievement?
- some made-up data for illustration:


6 Definition of an “effect”
- The effect, $\delta_i$, [on some outcome Y] [for some unit i] [of some treatment condition t relative to some other condition c] is defined as the difference between the value of Y that would be observed if unit i were exposed to treatment t and the value of Y that would be observed if unit i were exposed to treatment c.
- More formally, we define the effect of t relative to c on Y for unit i as: $\delta_i = Y_i^t - Y_i^c$
- We define the average effect of t relative to c in a population P as: $\delta = E[\delta_i \mid i \in P] = E[Y_i^t - Y_i^c \mid i \in P]$

7 The “Fundamental Problem of Causal Inference” (Holland, 1986)
- Although both $Y_i^t$ and $Y_i^c$ (the values of Y that unit i would have under t and under c) are defined in principle, it is impossible to observe both of them for the same unit (because any given unit can be exposed to only one of t or c).
- Thus, the causal effect cannot be observed.
- The problem of causal inference is thus a problem of missing data. The outcome $Y_i$ under its “counterfactual” condition is never observed.
- How can we construct unbiased estimates of the average potential outcomes $E[Y_i^t]$ and $E[Y_i^c]$ under the counterfactual conditions?
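The missing-data view of causal inference can be illustrated with a small simulation. This is only a sketch on invented data: the score distributions, the effect sizes, and all variable names are assumptions made for illustration, not values from the workshop. Because the data are simulated, both potential outcomes are known here, which is never the case with real data.

```python
import random

random.seed(0)

# Hypothetical population: each student has two potential outcomes,
# y_c (score without tutoring) and y_t (score with tutoring).
n = 10_000
y_c = [random.gauss(50, 10) for _ in range(n)]
effects = [random.gauss(10, 5) for _ in range(n)]  # unit-level effects, unobservable in real data
y_t = [c + d for c, d in zip(y_c, effects)]

true_ate = sum(effects) / n

# Randomly assign treatment; only one potential outcome is observed per student.
treated = [random.random() < 0.5 for _ in range(n)]
observed = [yt if t else yc for yt, yc, t in zip(y_t, y_c, treated)]


def mean(xs):
    return sum(xs) / len(xs)


mean_treated = mean([y for y, t in zip(observed, treated) if t])
mean_control = mean([y for y, t in zip(observed, treated) if not t])
estimated_ate = mean_treated - mean_control

print(f"true average effect:       {true_ate:.2f}")
print(f"difference in group means: {estimated_ate:.2f}")
```

With random assignment and full compliance, the difference in observed group means should land close to the true average effect, even though no individual effect is ever observed.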


10 What if we can’t conduct an RCT?
- If we can randomize students to receive either tutoring or no tutoring, and ensure that every student complies with his or her assigned treatment status, the randomization will allow us to estimate the effect of tutoring very easily.
- but what if students don’t comply with their treatment assignment?
  - some assigned to tutoring don’t go to tutoring
  - some assigned to no tutoring get tutored anyway
- this means tutoring is no longer randomly assigned: at least some of the variation in treatment status is potentially endogenous
- so a comparison of those assigned to tutoring and no tutoring won’t give us an estimate of the effect of tutoring (but only the effect of being assigned to tutoring)
- this is one case where instrumental variables are useful

11 instrumental variables models

12 What is an instrumental variable?
- an instrumental variable is an exogenous factor that causes some (though not necessarily all) of the variation in treatment status
- we use it to identify the portion of variation in treatment that is exogenous, and then rely only on that exogenous variation to estimate the effect of treatment

13 A general structural model
T: treatment status
Y: outcome measure
X: observed confounders
U: unobserved confounders
W: observed ignorable causes of Y
$\varepsilon_Y$: unobserved ignorable causes of Y
$\varepsilon_T$: unobserved ignorable causes of T
Z: instrument (observed ignorable cause of T)
[Figure: path diagram relating Z, X, U, W, $\varepsilon_T$, $\varepsilon_Y$, T, and Y]

14 Relating treatments and outcomes
- we would like to estimate the effect of T on Y
- this involves seeing how T and Y are related
- but to infer a causal relationship from the covariance of T and Y, we need to understand the source of variation in T
  - why do some people get different types/degrees of the treatment?
[Figure: path diagram with T and Y]

15 Relating treatments and outcomes
- variation in T may be caused by factors unrelated to the outcome Y
  - these may be observed (Z)
  - or unobserved ($\varepsilon_T$)
- if the only variation in T comes from factors unrelated to Y, then T is as good as randomly assigned, so getting a causal estimate is easy
[Figure: path diagram with Z, $\varepsilon_T$, T, and Y]

16 Relating treatments and outcomes
- variation in T may be caused, in part, by observed factors that are related to the outcome Y
  - observed confounders (X)
- as long as there is some variation in T that is caused by some (not necessarily observable) ignorable cause (Z or $\varepsilon_T$), we can still easily get an estimate of the effect of T
  - statistically control for X (compute relationship between T and Y, conditional on X)
[Figure: path diagram with X, $\varepsilon_T$, T, and Y]

17 Relating treatments and outcomes
- variation in T may be caused, in part, by observed and unobserved factors that are related to the outcome Y
  - observed confounders (X)
  - unobserved confounders (U)
  - reverse causality (Y affects T)
- here, we cannot get an unbiased estimate of the effect of T
  - statistical control can’t adjust for U
  - the ignorable cause ($\varepsilon_T$) is not observed
[Figure: path diagram with X, U, $\varepsilon_T$, T, and Y]

18 Relating treatments and outcomes
- if we cannot observe all the confounders (or if Y affects T), then we need some observed factor that affects T but does not otherwise affect Y
- this (Z) is called an instrument (or instrumental variable).
- because the part of the variation in T that is induced by Z is ignorable (as good as random), we can use this part of the variation in T to identify the effect of T on Y
[Figure: path diagram with Z, X, U, $\varepsilon_T$, T, and Y]

19 Tutoring example, revisited
- the observed data are not sufficient to estimate the average effect of tutoring
- what if we can’t do an experiment, or if we do an experiment and not everyone complies?

20 tutoring voucher as an instrument
- randomly assign eligible students to receive either a voucher allowing them to receive free tutoring (Z=1) or no voucher (Z=0).
- observe whether students attend tutoring (T=1) or not (T=0).
  - note: this choice is not random; students may choose tutoring or not, regardless of voucher status (so $T_i \neq Z_i$ is possible).
- observe later achievement (Y)
- we want to estimate the effect of T (tutoring vs no tutoring) on Y (achievement).

21 Four subpopulations (Angrist, Imbens, & Rubin, 1996)
- compliers: those who would comply with treatment assignment (those for whom $T_i = Z_i$)
- non-compliers
  - always-takers: those who would always receive the treatment, regardless of assignment (those for whom $T_i = 1$)
  - never-takers: those who would never receive the treatment, regardless of assignment (those for whom $T_i = 0$)
  - defiers: those who would always do the opposite of treatment assignment (those for whom $T_i = 1 - Z_i$)

22-26 Observed Outcomes
- N=100, 50% receive vouchers, but not all comply with assignment (only 60% comply):

Offered voucher   Not tutored   Tutored   Proportion tutored
No                     45           5            .10
Yes                    15          35            .70
Total                  60          40            .40

- offered no voucher, not tutored (45): might be compliers or never-takers
- offered voucher, not tutored (15): might be defiers or never-takers
- offered no voucher, tutored (5): might be defiers or always-takers
- offered voucher, tutored (35): might be compliers or always-takers

27 estimating the proportion of compliers
- assume there are no defiers
- then everyone with Z=1, T=0 is a never-taker (15 of 50 (30%) with Z=1 in our example)
- there should be the same proportion (30%) of never-takers among those with Z=0, because Z is random
- the same logic implies that 10% of the population are always-takers
- thus, 60% (100% - 30% - 10%) are compliers
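The arithmetic on this slide can be reproduced directly from the four cell counts in the observed-outcomes table. The dictionary layout and variable names below are illustrative choices, not part of the original slides.

```python
# Counts from the observed-outcomes table: counts[(z, t)] = number of students
# with voucher status z (0/1) and tutoring status t (0/1).
counts = {(0, 0): 45, (0, 1): 5, (1, 0): 15, (1, 1): 35}

n_z0 = counts[(0, 0)] + counts[(0, 1)]  # students offered no voucher
n_z1 = counts[(1, 0)] + counts[(1, 1)]  # students offered a voucher

# Assuming no defiers:
never_takers = counts[(1, 0)] / n_z1   # offered a voucher, still untutored
always_takers = counts[(0, 1)] / n_z0  # no voucher, tutored anyway
compliers = 1 - never_takers - always_takers

print(f"never-takers:  {never_takers:.2f}")
print(f"always-takers: {always_takers:.2f}")
print(f"compliers:     {compliers:.2f}")
```

The complier share is obtained only by subtraction; nothing in the table identifies which individual students are compliers.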

28 Estimating the proportion of compliers
- we can also estimate this by regressing the treatment variable on the instrument:
  tutor = G0 + G1*voucher + e
  tutor = 0.10 + 0.60*voucher
- Thus, the average effect of being assigned a voucher on tutoring status is +0.60, meaning that the average student’s probability of receiving tutoring increases by 0.60 if assigned a voucher (which means that 60% of the students comply with the voucher assignment).
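As a check on the regression above, the individual-level data can be rebuilt from the cell counts (45, 5, 15, 35) and the first-stage regression fit by hand. The helper `ols` is a hypothetical name for a plain one-regressor least-squares routine written for this sketch.

```python
# Rebuild one (voucher, tutored) row per student from the cell counts.
counts = {(0, 0): 45, (0, 1): 5, (1, 0): 15, (1, 1): 35}
rows = [(z, t) for (z, t), k in counts.items() for _ in range(k)]
z = [r[0] for r in rows]
t = [r[1] for r in rows]


def ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


g0, g1 = ols(z, t)
print(f"tutor = {g0:.2f} + {g1:.2f}*voucher")
```

With a binary instrument, the slope is simply the difference in tutoring rates between the two voucher groups (.70 - .10).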

29-31 Observed Outcomes
- Estimated effect of the voucher offer on test scores = 56.6 - 50.5 = +6.1

Average test scores:
Offered voucher   Not tutored   Tutored   Total
No                   48.3         70.0     50.5
Yes                  44.9         61.6     56.6
Total                47.5         62.6     53.5

- offered no voucher, not tutored (48.3): average outcome among untutored compliers and never-takers
- offered voucher, tutored (61.6): average outcome among tutored compliers and always-takers
- here we’re assuming no defiers (later we will see why this is necessary)

32 OLS estimates
- OLS yields: test = 47.5 + 15.1*(tutored)
- the estimated effect of tutoring is +15.1 points
- but we should worry about whether this is biased, because some students chose whether to get tutoring or not.
- the tutored group includes compliers and always-takers; the control group includes compliers and never-takers; so they are not equivalent groups

33 The Wald IV estimator
- if we are willing to assume that the voucher offer had no effect on the outcome of the non-compliers (because it did not alter their treatment status and does not affect their outcome in any other way), then we can estimate the effect of tutoring like this:
- The average effect of the voucher in the population is estimated to be +6.1
- but only 60% of students’ decisions about whether to get tutoring were affected by the voucher offer (only 60% of the sample are compliers)

34 Wald estimator
- average effect in population = (average effect on compliers x proportion who are compliers) + (average effect on non-compliers x proportion who are non-compliers)
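The decomposition can be written compactly as follows. The symbols are hypothetical, since the original notation did not survive in this transcript: $\beta$ stands for the average effect of the voucher offer on Y in the population, $\beta_C$ and $\beta_{NC}$ for the average effect of the offer among compliers and non-compliers, and $\pi_C$, $\pi_{NC}$ for the two population shares.

```latex
% Hypothetical notation (not recovered from the slides):
%   \beta       average effect of the offer on Y in the population
%   \beta_C     average effect of the offer among compliers, share \pi_C
%   \beta_{NC}  average effect of the offer among non-compliers, share \pi_{NC} = 1 - \pi_C
\begin{align*}
\beta &= \beta_C \,\pi_C + \beta_{NC}\,\pi_{NC} \\
      &= \beta_C \,\pi_C
         \qquad \text{(if the offer has no effect on non-compliers, } \beta_{NC} = 0\text{)} \\
\Rightarrow\quad \beta_C &= \frac{\beta}{\pi_C}
\end{align*}
```

For compliers the offer switches tutoring from 0 to 1, so the effect of the offer on a complier is the effect of tutoring on that complier.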

35 Wald estimator
- if the average effect on non-compliers is zero, this says that the average effect of the treatment among the compliers equals the average effect in the population divided by the proportion of the population who are compliers
- thus, the average effect among the compliers is +6.1/.60 = +10.1
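A minimal sketch of the Wald calculation, starting from the cell counts and the rounded cell means shown in the observed-outcomes tables. The function names are illustrative. Because the inputs are rounded to one decimal, the ratio computed here is about 10.2, which may differ in the last digit from the +10.1 reported on the slide.

```python
# Cell counts and rounded cell means from the made-up data: key = (voucher, tutored)
counts = {(0, 0): 45, (0, 1): 5, (1, 0): 15, (1, 1): 35}
mean_score = {(0, 0): 48.3, (0, 1): 70.0, (1, 0): 44.9, (1, 1): 61.6}


def group_mean_score(z):
    """Average test score among students with voucher status z."""
    n = counts[(z, 0)] + counts[(z, 1)]
    return sum(counts[(z, t)] * mean_score[(z, t)] for t in (0, 1)) / n


def share_tutored(z):
    """Proportion tutored among students with voucher status z."""
    return counts[(z, 1)] / (counts[(z, 0)] + counts[(z, 1)])


effect_on_score = group_mean_score(1) - group_mean_score(0)  # effect of the offer on Y
effect_on_tutoring = share_tutored(1) - share_tutored(0)     # effect of the offer on T
wald = effect_on_score / effect_on_tutoring

print(f"{effect_on_score:.2f} / {effect_on_tutoring:.2f} = {wald:.2f}")
```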

36 What have we learned?
- An instrumental variable allows us to estimate the average effect of the treatment among those whose treatment status is affected by the instrument (“compliers”)
  - called the “local average treatment effect” (LATE)
  - note that we can’t identify who the compliers are
- We can’t estimate the average treatment effect in the population, because we can’t estimate the effect among non-compliers
  - because the instrument doesn’t affect their treatment status, there is no exogenous variation in their treatment status that we can use.

37 What assumptions have we made?
- the instrument only affects the outcome through its impact on the treatment (this is called the exclusion restriction)
- the instrument is ignorably (randomly) assigned
  - this allows us to estimate the effect of the instrument on the outcome and on the treatment
- the instrument affects the treatment for at least some people
  - otherwise there are no compliers
- there are no defiers

38 more general IV models

39 what if treatment is not binary?
- above we assumed the treatment (tutoring) was binary
- but not all treatments are binary
  - we could offer vouchers of different amounts
  - students could receive different amounts of tutoring
- as a result, compliance may take on many values
  - for some students, the amount of tutoring received may be strongly affected by the instrument; for others, it may be weakly affected or not at all affected.

40 a more general model of the IV estimator
- for a given individual i, $\beta_i$ is the effect of Z on Y
- this effect may vary across individuals
- we would like to estimate the average effect

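41 1. exclusion restriction
- if the only way that Z affects Y is through its effect on T, then we have $\beta_i = \gamma_i \delta_i$, where $\beta_i$ is the effect of Z on Y, $\gamma_i$ is the effect of Z on T, and $\delta_i$ is the effect of T on Y for individual i.
- or, put differently, $\delta_i = \beta_i / \gamma_i$ (when $\gamma_i \neq 0$)
- the assumption that the only way that Z affects Y is through its effect on T is called the exclusion restriction.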

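42 2. zero compliance-effect covariance
- we can write the average effect of Z on Y as $E[\beta_i] = E[\gamma_i \delta_i] = E[\gamma_i]\,E[\delta_i] + \mathrm{Cov}(\gamma_i, \delta_i)$
- if we assume $\mathrm{Cov}(\gamma_i, \delta_i) = 0$, then we have $E[\beta_i] = E[\gamma_i]\,E[\delta_i]$
- the assumption that $\mathrm{Cov}(\gamma_i, \delta_i) = 0$ is called the zero compliance-effect covariance assumption.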

43 3. instrument relevance
- as long as $E[\gamma_i] \neq 0$, we can rewrite the above as $E[\delta_i] = E[\beta_i] / E[\gamma_i]$
- the assumption that $E[\gamma_i] \neq 0$ is sometimes called the instrument relevance assumption; or sometimes just referred to as the assumption that the instrument affects the treatment.
- if $E[\gamma_i]$ is small (close to zero), we say that the instrument is a weak instrument.

44 4. the instrument is ignorably assigned
- if the above three assumptions are met, we have $E[\delta_i] = E[\beta_i] / E[\gamma_i]$
- if Z is ignorably assigned, then we can easily estimate both $E[\beta_i]$ (the average effect of Z on Y) and $E[\gamma_i]$ (the average effect of Z on T).
- the assumption of ignorable assignment thus makes estimation of the effect of T on Y possible.
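The role of the first three assumptions can be checked numerically. The sketch below draws hypothetical person-specific effects (here called `gamma` for the effect of Z on T and `delta` for the effect of T on Y; the distributions are invented), builds the effect of Z on Y as their product (exclusion restriction), and compares the ratio of averages with the average effect of T on Y. It treats the person-specific effects as known, so it does not illustrate the estimation step that ignorable assignment makes possible.

```python
import random

random.seed(1)
n = 100_000

# Hypothetical person-specific effects, drawn independently so that the
# compliance-effect covariance is zero in the population.
gamma = [random.uniform(0.0, 1.0) for _ in range(n)]  # effect of Z on T
delta = [random.gauss(10.0, 4.0) for _ in range(n)]   # effect of T on Y
# Exclusion restriction: Z affects Y only through T.
beta = [g * d for g, d in zip(gamma, delta)]          # effect of Z on Y


def mean(xs):
    return sum(xs) / len(xs)


iv_ratio = mean(beta) / mean(gamma)
print(f"average effect of Z on Y / average effect of Z on T = {iv_ratio:.2f}")
print(f"average effect of T on Y                            = {mean(delta):.2f}")
```

If `delta` were instead generated so that it rises with `gamma`, the two printed values would diverge, which is the zero compliance-effect covariance assumption at work.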

45 what do these assumptions mean?
- exclusion restriction: the offer of a tutoring voucher does not affect students’ achievement except by affecting the amount of tutoring they receive
- zero compliance-effect covariance: there is no correlation between how strongly a voucher offer affects the amount of tutoring a student gets and how effective tutoring is for that student

46 what do these assumptions mean?
- instrument relevance: the offer of a voucher has some effect, on average, on the amount of tutoring students receive (at least one student is affected by the offer).
- ignorable assignment of the instrument: the voucher offer is randomly assigned (this would be violated, for example, if the principal gave vouchers to students she deemed most in need of tutoring).

47 some examples
- NYC voucher experiment (Howell et al., 2002; Krueger & Zhu, 2004)
- Effect of schooling on wages, using quarter of birth as instrument (Angrist & Krueger, 1991)
- Effect of teacher absence on student achievement, using snowfall as instrument (Miller, Murnane & Willett, 2007)
- Effects of segregation on educational attainment and wages, using railroads as an instrument (Ananat, 2007)

48 estimating IV models

49 estimating IV models in practice
- in practice, we don’t usually compute the effect of Z on Y and Z on T and divide them
  - because we may need more complex models (if we want to include other covariates in the model, for example)
  - because we need to compute standard errors
- the most common method of estimating IV models is two-stage least squares (TSLS or 2SLS).

50-53 Three relevant equations
- 1: the “reduced form” equation, in which $\beta_i$ is the person-specific effect of Z on Y.
- 2: the “first stage” equation, in which $\gamma_i$ is the person-specific effect of Z on T.
- but the equation we really are interested in is
- 3: the “second stage” equation, in which $\delta_i$ is the person-specific effect of T on Y.
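The equations themselves did not survive in this transcript. One plausible way to write them, offered only as a sketch, uses person-specific linear models; the intercepts $a_i$, $b_i$, $c_i$ are hypothetical names introduced here, and the slopes are the person-specific effects of Z on Y, Z on T, and T on Y.

```latex
% Hypothetical person-specific intercepts a_i, b_i, c_i (not recovered from the slides)
\begin{align*}
Y_i &= a_i + \beta_i Z_i   && \text{(1: reduced form)} \\
T_i &= b_i + \gamma_i Z_i  && \text{(2: first stage)} \\
Y_i &= c_i + \delta_i T_i  && \text{(3: second stage)}
\end{align*}
% Substituting (2) into (3), which is valid when Z affects Y only through T:
\begin{align*}
Y_i = c_i + \delta_i\,(b_i + \gamma_i Z_i)
    = (c_i + \delta_i b_i) + \gamma_i \delta_i Z_i
\quad\Rightarrow\quad \beta_i = \gamma_i \delta_i
\end{align*}
```

Written this way, the reduced-form coefficient is the product of the first-stage and second-stage coefficients, which is why dividing the effect of Z on Y by the effect of Z on T recovers the effect of T on Y.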

54-56 two-stage least squares
- fit the first-stage equation (estimate the effect of Z on T); compute fitted values of T
- fit the second-stage equation, using predicted values of T in place of observed values of T

57 two-stage least squares
- because the predicted values of T from the first-stage equation include only the variation in T that is caused by the instrument, the estimated coefficient from the second-stage equation will be unbiased (as long as the 4 IV assumptions are met).
- if you do this by hand, you’ll get the wrong standard errors; statistical software usually has built-in routines (e.g., -ivregress- command in Stata) to compute correct standard errors.
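The two stages can be carried out by hand on simulated data to see the point estimate (not the standard errors). Everything below is an illustrative assumption: the data-generating process, the true effect of 10 points, and the helper `ols`. The unobserved variable `u` makes tutoring endogenous, so a naive regression of Y on T is biased, while the two-stage estimate should land near the true effect and equal the Wald ratio when there is a single binary instrument.

```python
import random

random.seed(2)
n = 20_000
true_effect = 10.0  # hypothetical effect of tutoring on test scores

z, t, y = [], [], []
for _ in range(n):
    zi = random.randint(0, 1)  # randomly assigned voucher
    u = random.gauss(0, 1)     # unobserved confounder
    # Tutoring depends on the voucher and on u, so T is endogenous.
    ti = 1 if (1.2 * zi + 0.5 * u + random.gauss(0, 1)) > 0.5 else 0
    yi = 50 + true_effect * ti + 5 * u + random.gauss(0, 5)
    z.append(zi)
    t.append(ti)
    y.append(yi)


def ols(x, yv):
    """Return (intercept, slope) from a least-squares regression of yv on x."""
    mx, my = sum(x) / len(x), sum(yv) / len(yv)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, yv))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


_, b_ols = ols(t, y)                 # naive OLS of Y on T (confounded by u)

g0, g1 = ols(z, t)                   # stage 1: regress T on Z
t_hat = [g0 + g1 * zi for zi in z]   # fitted values of T
_, b_2sls = ols(t_hat, y)            # stage 2: regress Y on fitted T

_, reduced_form = ols(z, y)          # effect of Z on Y
wald = reduced_form / g1             # ratio of reduced form to first stage

print(f"OLS: {b_ols:.2f}   2SLS: {b_2sls:.2f}   Wald: {wald:.2f}")
```

The second-stage residuals here are computed from the fitted values rather than the observed T, which is why standard errors taken from a hand-run second stage are wrong and a dedicated IV routine is preferable.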

58 Effects of attending charter school
- we can’t randomize students to charter or traditional public schools
- Abdulkadiroglu et al. (2009) examine students who apply to oversubscribed charter schools, whose admission is determined by lottery (randomization)
  - instrument is winning the lottery
  - treatment is # of years in a charter school

59 example: effect of charter schooling
[Table columns: first stage (compliance); reduced form (effect of winning lottery on ach.); 2sls (effect of a year in charter)]

60 are the IV assumptions valid in this study?
- exclusion restriction?
- zero compliance-effect covariance?
- instrument relevance?
- ignorable assignment?

61 sources of bias in IV models

62 sources of bias in IV
- failure of exclusion restriction assumption
- failure of ignorability assumption
- failure of zero compliance-effect covariance assumption
- finite sample bias
- weak instruments cause 3 problems:
  - exacerbate bias due to failure of assumptions (exclusion restriction, ignorability, zero covariance)
  - exacerbate finite sample bias
  - lead to incorrect estimation of standard errors when using two-stage least squares

63 failure of the exclusion restriction
- recall that the exclusion restriction says that the only way that Z affects Y is through its effect on T.
- as a result, we can write $\beta_i = \gamma_i \delta_i$ (the effect of Z on Y equals the effect of Z on T times the effect of T on Y)

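64 failure of the exclusion restriction
- if the exclusion restriction is violated, then there is some other path through which Z affects Y
- as a result, the effect of Z on Y is the effect through T ($\gamma_i \delta_i$, the effect of Z on T times the effect of T on Y) plus the effect of Z on Y through the other path
[Figure: path diagram with $Z_i$ → $T_i$ (effect $\gamma_i$), $T_i$ → $Y_i$ (effect $\delta_i$), and a direct path $Z_i$ → $Y_i$]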

65-66 failure of the zero covariance assumption
- averaging the above in the population, the average effect of Z on Y is the average of the effect through T plus the average of the effect through the other path
- now, dividing through by $E[\gamma_i]$ (the average effect of Z on T), we get the average effect of T on Y plus two bias terms:
  - bias due to failure of the exclusion restriction
  - bias due to failure of the zero compliance-effect covariance assumption
- so the IV estimator (the ratio of the average effect of Z on Y to the average effect of Z on T) will be biased
- if $E[\gamma_i]$ is small, the biases will be larger
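The two bias terms can be made explicit with a short derivation. The symbol $\lambda_i$ is hypothetical (the slide's own symbol for the direct effect did not survive); it stands for the part of the effect of Z on Y that does not operate through T. As elsewhere, $\beta_i$, $\gamma_i$, and $\delta_i$ denote the person-specific effects of Z on Y, Z on T, and T on Y.

```latex
% \lambda_i is a hypothetical symbol for the effect of Z on Y
% that does not operate through T.
\begin{align*}
\beta_i &= \gamma_i \delta_i + \lambda_i \\
E[\beta_i] &= E[\gamma_i]\,E[\delta_i] + \mathrm{Cov}(\gamma_i, \delta_i) + E[\lambda_i] \\
\frac{E[\beta_i]}{E[\gamma_i]} &= E[\delta_i]
  + \underbrace{\frac{E[\lambda_i]}{E[\gamma_i]}}_{\text{exclusion restriction fails}}
  + \underbrace{\frac{\mathrm{Cov}(\gamma_i, \delta_i)}{E[\gamma_i]}}_{\text{zero covariance fails}}
\end{align*}
```

Both bias terms have $E[\gamma_i]$ in the denominator, which is why a weak instrument magnifies them.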

67 failure of the zero covariance assumption
- if all the assumptions except the zero compliance-effect covariance assumption are met, we have $\frac{E[\beta_i]}{E[\gamma_i]} = \frac{E[\gamma_i \delta_i]}{E[\gamma_i]} = E\left[\frac{\gamma_i}{E[\gamma_i]}\,\delta_i\right]$, where $\beta_i$, $\gamma_i$, and $\delta_i$ are the person-specific effects of Z on Y, Z on T, and T on Y
- so the IV model will estimate the compliance-weighted average treatment effect (CWATE).
- if T is binary and there are no defiers, this will be the same as the average effect among the compliers (LATE), because non-compliers will get 0 weight.

68 failure of the ignorability assumption
- if the instrument is not ignorably assigned, then we cannot obtain unbiased estimates of the effect of Z on Y or of the effect of Z on T.
- Thus, the ratio of the two may be biased.

69 weak instruments
- weak instruments do not, strictly speaking, violate any of the IV assumptions, but they do exacerbate the bias from failures of the other assumptions
- rule of thumb: an instrument is weak if the F-statistic on the instrument(s) from the first-stage equation is <10.
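For a single instrument, the first-stage F-statistic is the square of the t-statistic on the instrument. The sketch below computes it for the tutoring-voucher example, rebuilding the data from the cell counts (45, 5, 15, 35). It assumes the conventional homoskedastic OLS standard error; the variable names are illustrative.

```python
# Rebuild one (voucher, tutored) row per student from the cell counts.
counts = {(0, 0): 45, (0, 1): 5, (1, 0): 15, (1, 1): 35}
rows = [(z, t) for (z, t), k in counts.items() for _ in range(k)]
z = [r[0] for r in rows]
t = [r[1] for r in rows]
n = len(rows)

# First-stage regression of tutoring on the voucher offer.
mz, mt = sum(z) / n, sum(t) / n
sxx = sum((zi - mz) ** 2 for zi in z)
sxy = sum((zi - mz) * (ti - mt) for zi, ti in zip(z, t))
slope = sxy / sxx
intercept = mt - slope * mz

# Conventional (homoskedastic) standard error of the slope.
rss = sum((ti - (intercept + slope * zi)) ** 2 for zi, ti in zip(z, t))
se_slope = (rss / (n - 2) / sxx) ** 0.5

f_stat = (slope / se_slope) ** 2  # with one instrument, F equals t squared
print(f"first-stage F = {f_stat:.1f}")
```

In this made-up example the F-statistic is well above the rule-of-thumb threshold of 10, so the voucher would not be flagged as a weak instrument.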

70 weak instruments and bias in the IV estimator
- weak instruments cause 3 problems with the IV estimator:
  - exacerbate bias due to failure of the exclusion restriction, ignorability, and monotonicity
  - exacerbate finite sample bias
  - lead to incorrect estimation of standard errors when using two-stage least squares
- finite sample bias
  - even if the 4 IV assumptions are met, IV estimation is biased unless using an infinite sample
  - most pronounced with weak instruments and small samples

71 additional uses

72 mediation models
- suppose we randomly assign a treatment (e.g., teacher professional development) that we think will affect student learning by affecting instructional practice
- we can treat the PD as an instrument, and the mediator (instructional practice) as the ‘treatment’ and use IV to estimate the effect of instructional practice (which can’t be randomized) on learning
- but worry about exclusion restriction (are there other ways that the PD could affect learning?)

73 multiple mediator models
- suppose we randomize students to 3 treatment conditions.
- two first stage equations
- second stage equation

74 IV to correct for measurement error
- suppose we want to estimate the effect of cognitive skill on wages
- if cognitive skill is measured with error by ACH, OLS will give a biased estimate of the effect of skill on wages.
- if we have a second test of skills, we can use one test as an instrument for the second test, and then use the predicted value of the second test in the wage equation.
- called “errors-in-variables” (EIV) model.
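The measurement-error use of IV can be sketched with simulated data. All quantities are invented for illustration: true skill has variance 1, each test equals skill plus independent noise of variance 1, and the true effect of skill on wages is set to 2. Under those assumptions OLS on one noisy test is attenuated toward about half the true effect, while instrumenting that test with the other should recover something close to 2.

```python
import random

random.seed(3)
n = 50_000
true_effect = 2.0  # hypothetical effect of cognitive skill on wages

skill = [random.gauss(0, 1) for _ in range(n)]
test1 = [s + random.gauss(0, 1) for s in skill]  # noisy measure used as the instrument
test2 = [s + random.gauss(0, 1) for s in skill]  # noisy measure used in the wage equation
wage = [true_effect * s + random.gauss(0, 1) for s in skill]


def ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


_, b_ols = ols(test2, wage)  # attenuated by measurement error in test2

a0, a1 = ols(test1, test2)   # first stage: predict test2 from test1
test2_hat = [a0 + a1 * x for x in test1]
_, b_iv = ols(test2_hat, wage)  # second stage: wages on predicted test2

print(f"OLS: {b_ols:.2f}   IV: {b_iv:.2f}   true effect: {true_effect:.2f}")
```

The approach relies on the two tests having measurement errors that are independent of each other and of wages; correlated errors would violate the exclusion restriction.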

