Andrew Ryan, Ph.D. Associate Professor of Health Management and Policy

1 Everything you wanted to know about difference-in-differences but were afraid to ask
Andrew Ryan, Ph.D. Associate Professor of Health Management and Policy University of Michigan February 3, 2017

2 Colleagues on this journey
James F. Burgess Jr., PhD VA Boston Health Care System Boston University School of Public Health Justin B. Dimick, MD, MPH Center for Healthcare Outcomes and Policy, University of Michigan Ariel Linden, Dr.PH Health Management and Policy, Division of General Medicine, University of Michigan Evangelos Kontopantelis, PhD University of Manchester, UK

3 I’ll provide an overview of difference-in-differences estimation along with some common extensions I’ll describe the intuition and basic application Discuss some variations to the basic design and challenges implementing the estimator Finish with a guide for implementation

4 Difference-in-differences is a common analysis strategy to evaluate the effect of policies
DID=(B2-B1) – (A2-A1) (Source: Dimick and Ryan 2014)
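The formula on this slide can be checked with a toy calculation. A minimal sketch using hypothetical group means (the numbers are illustrative, not from the talk):

```python
# Hypothetical group means (illustrative only): outcome is a mortality rate (%)
# A = comparison group, B = treatment group; 1 = pre-period, 2 = post-period
A1, A2 = 5.0, 4.6   # comparison improves by -0.4 (the secular trend)
B1, B2 = 6.0, 4.9   # treatment improves by -1.1

did = (B2 - B1) - (A2 - A1)
print(round(did, 2))  # -0.7: the estimated policy effect net of the secular trend
```

The point of the subtraction: the comparison group's change (-0.4) stands in for what would have happened to the treatment group without the policy.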

5 DID=(B2-B1) – (A2-A1) (Source: Dimick and Ryan 2014)

6 Question: Why use difference-in-differences?
Why not just estimate pre-post differences for a group exposed to an intervention?

7 Difference-in-differences analysis has two key ingredients
Data must exist on a study outcome: Before and after an intervention occurs For groups that were exposed and not exposed to the intervention

8 Difference-in-differences estimation relies on the parallel trends assumption
Treatment and comparison groups may have different levels of the outcome but trends in pre-treatment outcomes should be the same Absent treatment, outcomes for the treatment and comparison groups are expected to change at the same rate Can be partially tested by evaluating differences in pre-intervention trends (Source: Layton and Ryan 2015)

9 Difference-in-differences estimation also relies on the common shocks assumption
Events in the post-intervention period that are unrelated to the intervention will have the same effect on the treatment and comparison groups Can’t really be tested (Source: Ryan and Dimick 2017)

10 Difference-in-difference estimation can be performed using regression
For hospital j at time t: Yjt = b0 + b1 Treatj + b2 Postt + α (Treatj ∙ Postt)+ ejt Treat is a dummy variable indicating that a unit is in the treated group (versus comparison) Post is a dummy variable indicating that the observation occurred in the post-intervention period The parameter estimate for the interaction between treat and post (α) is the difference-in-differences estimator In treatment effects lingo, the DID estimate is typically thought of as the average treatment effect on the treated (ATT)
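The regression above can be sketched on simulated data. This is an illustration, not code from the talk: it builds the Treat, Post, and interaction columns by hand, fits OLS, and recovers the DID estimate α as the interaction coefficient:

```python
import numpy as np

# Illustrative simulation (not from the talk): generate hospital-period data
# with a known treatment effect, then recover it as the interaction coefficient.
rng = np.random.default_rng(0)
n = 2000
treat = rng.integers(0, 2, n).astype(float)   # Treat_j: 1 if in treated group
post = rng.integers(0, 2, n).astype(float)    # Post_t: 1 if post-intervention
alpha_true = -0.7                             # assumed true effect (ATT)
y = (5.0 + 1.0 * treat - 0.4 * post
     + alpha_true * treat * post + rng.normal(0, 0.5, n))

# design matrix: intercept, Treat, Post, Treat*Post
X = np.column_stack([np.ones(n), treat, post, treat * post])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(b[3], 2))  # the DID estimate alpha, close to -0.7
```

In practice you would use a regression package rather than a raw design matrix, but the interaction coefficient is the same either way.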

11 Why is difference-in-differences so popular?
It works! For my money, it’s the best quasi-experimental design It is intuitive Strong face validity It is (relatively) easy to implement

12 The credibility of difference-in-differences and its ease of implementation have led to an explosion in popularity (Source: Ryan et al. 2015)

13 Variations on the basic DID design
Source: Wikipedia.com Goldberg Variations

14 Difference-in-differences can be conducted at different levels of analysis
Estimation at higher level For hospital j at time t: Yjt = b0 + b1 Treatj + b2 Postt + α (Treatj ∙ Postt)+ ejt Estimation at lower level For patient i, in hospital j, at time t: Yijt = b0 + b1 Treatj + b2 Postt + b3 Xijt + α (Treatj ∙ Postt)+ eijt Where X is a vector of patient-level controls

15 You may or may not want to control for other factors in a DID model (1)
If conducted at a lower level (e.g. patient level): It’s important to control for confounders (e.g. severity) Also important that the same higher-level units are in the sample for the entire study period Want to avoid confounding from compositional effects

16 You may or may not want to control for other factors in a DID model (2)
If conducted at a high level, you may want to control for time-varying factors More important if analysis is conducted at a geographic level, with high potential for confounding by other factors

17 You probably don’t want to control for time trends in a DID
For hospital j at time t: Yjt = b0 + b1 Treatj + b2 Postt + b3 Timet + b4 (Timet ∙ Treatj) + α (Treatj ∙ Postt) + ejt Where Time is a linear time trend

18 You can use DID to estimate treatment effects for different post-intervention periods
For hospital j at time t: Yjt = b0 + b1 Treatj + b2 Post1t + b3 Post2t + … + bn Postnt + α1 (Treatj ∙ Post1t) + α2 (Treatj ∙ Post2t) + … + αn (Treatj ∙ Postnt) + ejt Where Post1 … Postn are dummies for each post-intervention period Then take dy/dTreat at different post-intervention periods Allows you to estimate effects in first period following intervention, second period, etc.
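Because the design is saturated in group and period dummies, the period-specific effects above can be illustrated directly with cell means (hypothetical numbers): the effect in each post period is that period's treatment-group change minus the comparison-group change:

```python
# Sketch with hypothetical cell means (not from the talk).
# 'pre' pools the pre-intervention periods; post1/post2 are successive
# post-intervention periods.
means = {
    ("treat", "pre"): 6.0, ("treat", "post1"): 5.2, ("treat", "post2"): 4.8,
    ("comp",  "pre"): 5.0, ("comp",  "post1"): 4.8, ("comp",  "post2"): 4.7,
}

def did(period):
    """DID effect in a given post period versus the pooled pre period."""
    d_treat = means[("treat", period)] - means[("treat", "pre")]
    d_comp = means[("comp", period)] - means[("comp", "pre")]
    return d_treat - d_comp

print(round(did("post1"), 2))  # -0.6: effect in the first post period
print(round(did("post2"), 2))  # -0.9: effect grows in the second period
```

These are the same numbers the α1, α2, … coefficients would deliver from the regression with period-specific interactions.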

19 You can use DID to estimate treatment effects for different sub-groups
For hospital j at time t: Yjt = b0 + b1 Treatj + b2 Postt + b3 Teachj + b4 (Teachj ∙ Postt) + α1 (Treatj ∙ Postt) + α2 (Treatj ∙ Teachj ∙ Postt)+ ejt α2 is a test whether the treatment effect varies for teaching hospitals

20 Choosing the comparison group is a major specification issue in DID
Options include: All untreated units Untreated units identified to be similar “We selected as controls neighboring states without major Medicaid expansions that were closest in population and demographic characteristics to the three states with Medicaid expansions” (Sommers et al. 2012) A matched comparison group Matching based on observable characteristics Matching based on levels of the pre-intervention outcomes Synthetic comparison / generalized synthetic comparison Sensitivity analysis with multiple comparison groups
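One option above, matching on levels of the pre-intervention outcome, can be sketched as 1-nearest-neighbor matching without replacement. Hospital IDs and values here are hypothetical:

```python
# Sketch: match each treated unit to the untreated unit with the closest
# pre-intervention outcome level (hypothetical data; 1-NN without replacement).
treated_pre = {"T1": 5.1, "T2": 6.4, "T3": 4.0}
untreated_pre = {"C1": 6.3, "C2": 4.2, "C3": 5.0, "C4": 7.5, "C5": 3.9}

matches, available = {}, dict(untreated_pre)
for t, y_pre in sorted(treated_pre.items()):
    best = min(available, key=lambda c: abs(available[c] - y_pre))
    matches[t] = best
    del available[best]          # without replacement: each control used once

print(matches)  # {'T1': 'C3', 'T2': 'C1', 'T3': 'C5'}
```

Real applications typically match on several pre-period observations (or a propensity score) rather than a single level, but the mechanics are the same.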

21 Example of matching based on pre-intervention outcomes
(Source: Ryan and Dimick 2017)

22 Simulation evidence suggests that matching can improve the accuracy of DID estimates
(Source: Ryan and Linden 2017)

23 Dealing with cases where the intervention doesn’t start at the same time for all treated units (1)
Option 1: Have treated units stay in the comparison group until the time of treatment: For hospital j at time t: Yjt = b0 + b1 Yeart + α Treatjt + uj + ejt Year is a vector of year dummies Treat is indexed to j and t, “switches on” for treatment group when intervention starts u is a vector of hospital fixed effects that accounts for pre-existing differences between treatment and comparison group
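Option 1 can be sketched on simulated data (an assumed setup, not code from the talk): hospital dummies stand in for the fixed effects uj, year dummies give Yeart, and Treatjt switches on at each unit's own adoption year:

```python
import numpy as np

# Sketch of Option 1 (illustrative setup): staggered adoption with unit and
# year fixed effects; adoption years and effect size are assumed.
hospitals = range(6)
years = range(2010, 2015)
adopt = {0: 2012, 1: 2013, 2: None, 3: None, 4: 2012, 5: None}  # None = never treated
alpha_true = -0.5  # assumed treatment effect

rows, y = [], []
for j in hospitals:
    for t in years:
        treat_jt = 1 if adopt[j] is not None and t >= adopt[j] else 0
        # outcome = hospital effect + year effect + treatment effect (no noise)
        y.append(0.3 * j + 0.1 * (t - 2010) + alpha_true * treat_jt)
        # columns: 6 hospital dummies, 4 year dummies (2011-2014), Treat_jt
        row = [1.0 if j == h else 0.0 for h in hospitals]
        row += [1.0 if t == yr else 0.0 for yr in range(2011, 2015)]
        row.append(float(treat_jt))
        rows.append(row)

X, y = np.array(rows), np.array(y)
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(b[-1], 2))  # recovers alpha = -0.5 (exact here, since no noise)
```

The never-treated hospitals, plus the pre-adoption periods of the treated ones, identify the year effects; the fixed effects absorb level differences across hospitals.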

24 Dealing with cases where the intervention doesn’t start at the same time for all treated units (2)
Option 2: Normalize time based on the exposure period of different treatment units Assume: Cohort A: treated in 2012 Cohort B: treated in 2013 Issues: It’s important to “cut” the data so that all cohorts have the same number of pre and post observations Data loss Comparison group isn’t treated, so you have to arbitrarily assign a date of treatment to them Time from treatment effects can be confounded by secular trends
          2010  2011  2012  2013  2014
Cohort A  T-2   T-1   T     T+1   T+2
Cohort B  T-3   T-2   T-1   T     T+1

25 Example of normalizing time to the start of an intervention
(Source: Sommers et al. 2012)

26 Another example of normalizing time to the start of the intervention
(Source: Chen, Ryan, Thumma, and Dimick 2017)

27 Dealing with cases where the intervention doesn’t start at the same time for all treated units (3)
Option 3: estimate separate DID models for each treated cohort Model 1: includes treated cohort A and all controls Model 2: includes treated cohort B and all controls This approach was taken by McWilliams et al (2015) in ACO evaluation Issue: doesn’t allow for: overall estimate of the program easy statistical test of effect of cohort A versus B

28 You need to account for non-independence of errors in DID
(Source: Bertand et al. 2004)

29 Our simulation work confirmed high rates of false rejection when using conventional (i.i.d.) standard errors in DID (Source: Ryan et al. 2015)

30 What should be done to get the right p-values?
Options Cluster se’s at highest level (e.g. hospitals, not physicians) Doesn’t work that well with small numbers of units Aggregate data series to pre-post E.g. 3 pre-periods and 3 post-periods could be collapsed to one pre and one post period Use non-parametric randomization / permutation tests I like this approach, but it is effortful to code up, is computationally intensive, and reviewers typically don’t care about it
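The randomization / permutation approach in the last option can be sketched as follows, with hypothetical hospital-level data: treatment labels are shuffled across hospitals (the highest level of assignment) and the DID estimate is recomputed each time:

```python
import random

# Sketch of a non-parametric permutation test for a DID estimate
# (hypothetical hospital-level data; labels are permuted at the hospital level).
random.seed(1)
n_treat, n_comp = 10, 10
# each hospital: (pre-period mean, post-period mean) of the outcome
hospitals = [(6.0 - 0.1 * i, 5.0 - 0.1 * i) for i in range(n_treat)] \
          + [(5.5 - 0.1 * i, 5.3 - 0.1 * i) for i in range(n_comp)]
labels = [1] * n_treat + [0] * n_comp

def did(labels):
    """Mean pre-to-post change in treated minus comparison hospitals."""
    chg = [post - pre for pre, post in hospitals]
    t = [c for c, l in zip(chg, labels) if l == 1]
    c = [c for c, l in zip(chg, labels) if l == 0]
    return sum(t) / len(t) - sum(c) / len(c)

observed = did(labels)  # treated change -1.0 vs comparison -0.2, about -0.8

# permutation distribution: shuffle which hospitals are "treated"
perm = []
for _ in range(2000):
    shuffled = labels[:]
    random.shuffle(shuffled)
    perm.append(did(shuffled))

# two-sided p-value: share of permuted DIDs at least as extreme as observed
p = sum(abs(v) >= abs(observed) for v in perm) / len(perm)
print(round(observed, 2), "p =", p)
```

Because inference comes from re-randomizing labels rather than from an error-covariance formula, this sidesteps the serial-correlation problem that inflates conventional standard errors.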

31 Dealing with violations to parallel trends: What if we have this?
[Figure: outcome over time for treatment and comparison groups, with the start of the intervention marked]

32 [Figure: the same plot, highlighting the pre-intervention difference between the treatment and comparison groups]

33 [Figure: the same plot, again showing the pre-intervention difference]

34 [Figure: the same plot, adding the DID counterfactual for the treatment group]

35 [Figure: the same plot, adding an ITSA counterfactual]

36 [Figure: the same plot, labeled as the interrupted time-series counterfactual]

37 [Figure: the same plot, adding a multi-group ITSA counterfactual]

38 [Figure: outcome over time, adding a matched comparison group alongside the original comparison group]

39 [Figure: the same plot, adding the matched counterfactual]

40 DID versus other designs: how to choose (1)
Pre-post: For hospital j at time t: Yjt = b0 + α Postt + ejt | treat=1 Just don’t do it

41 DID versus other designs: how to choose (2)
Interrupted time series estimator: For hospital j at time t: (1) Yjt = b0 + b1 Timet + α (Timet ∙ Postt) + ejt | treat=1 (2) Yjt = b0 + b1 Timet + α1 Postt + α2 (Timet ∙ Postt) + ejt | treat=1 (3) Yjt = b0 + b1 Timet + b2 Treatj + b3 (Timet ∙ Treatj) + b4 (Timet ∙ Postt) + α (Timet ∙ Treatj ∙ Postt) + ejt Model 3 is used a lot by the Epstein, Jha, Orav team
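Model 2 above can be sketched on a simulated single-group series. The data are illustrative, with noise omitted so the coefficients are recovered exactly:

```python
import numpy as np

# Sketch of single-group ITSA model (2): a level change and a slope change
# at the intervention (simulated series; parameters are assumed).
time = np.arange(12, dtype=float)     # periods 0-11
post = (time >= 6).astype(float)      # intervention takes effect at period 6
# assumed truth: baseline trend +0.2/period, level change -1.0, slope change -0.3
y = 5.0 + 0.2 * time - 1.0 * post - 0.3 * time * post

# columns: intercept, Time, Post, Time*Post (matching model 2 on the slide)
X = np.column_stack([np.ones_like(time), time, post, time * post])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.round(b, 2))  # intercept 5.0, trend 0.2, level change -1.0, slope change -0.3
```

α1 (the Post coefficient) is the jump in level at the intervention and α2 (the interaction) is the change in trend; the counterfactual is the pre-period line extended forward.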

42 Example of interrupted time series model 3
(Source: Zuckerman et al. 2015)

43 Some reasons to not use interrupted time series models
You have to make the assumption that pre-intervention trends will continue Models don’t incorporate non-linearity very well This is particularly problematic as performance approaches a natural limit

44 Some reasons to use interrupted time series models
There may be no other comparison group If there is no group that is not exposed to treatment, an option is to find a condition that is not exposed to treatment E.g. A program targets AMI, heart failure, and pneumonia, but not gastrointestinal hemorrhage or hip fracture Non-targeted conditions can serve as the comparison group There may be strong evidence of spillovers to all non-affected groups ITSA is still better than pre-post Personal opinion: if you have pre and post data for treatment and comparison group, I strongly prefer DID to ITSA-style models Assumptions of DID are less restrictive, results are easier to interpret

45 Presenting the results from DID models (1)
(Source: Dimick, Thumma, Ryan et al. 2013)

46 Presenting the results from DID models (2)
(Source: Sommers et al. 2012)

47 Presenting the results from DID models (3)
(Source: Layton and Ryan 2015)

48 Presenting the results from DID models (4)
(Source: McWilliams et al. 2016)

49 We developed a difference-in-differences checklist
(Source: Ryan et al. 2015)

50 DID checklist continued
(Source: Ryan et al. 2015)

51 Example of policy spillover
(Source: Zuckerman et al. 2015)

52 Additional readings
M. Bertrand, E. Duflo, S. Mullainathan. How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics 2004;119(1):249-275.
J.D. Angrist, J.S. Pischke. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press; 2009.
J.D. Angrist, J.S. Pischke. The credibility revolution in empirical economics: how better research design is taking the con out of econometrics. Journal of Economic Perspectives 2010;24(2):3-30.
G.W. Imbens, J.M. Wooldridge. Recent developments in the econometrics of program evaluation. Journal of Economic Literature 2009;47(1):5-86.
J.B. Dimick, A.M. Ryan. Methods for evaluating changes in health care policy: the difference-in-differences approach. JAMA 2014;312(22):2401-2402.
A.M. Ryan, J.F. Burgess Jr, J.B. Dimick. Why we should not be indifferent to specification choices for difference-in-differences. Health Services Research 2015;50(4).

