Presentation on theme: "A workshop introducing doubly robust estimation of treatment effects"— Presentation transcript:
1. A workshop introducing doubly robust estimation of treatment effects
Michele Jonsson Funk, PhD
UNC/GSK Center for Excellence in Pharmacoepidemiology
University of North Carolina at Chapel Hill
2. Conflict of Interest Statement
- Macro development funded by the Agency for Healthcare Research and Quality via a supplemental award to the UNC CERTs (U18 HS S1).
- Additional support from the UNC/GSK Center for Excellence in Pharmacoepidemiology and Public Health.
- No potential conflicts of interest with respect to this work.
3. Regression models assume that…
- The parametric form is correct.
  - Should we use logistic regression, or log-binomial?
- We have included correct predictors.
  - Should we really include age in this model?
- Those predictors have been specified correctly.
  - Should age be coded continuously or in 10-year categories? Is there an interaction with race? What about higher-order terms? Etc.
4. What if the model is wrong?
Lunceford & Davidian, Stat Med, 2004
- Omit a true confounder (extreme example)
- True relationships known (simulated data)
- Vary associations between
  - Risk factor – outcome
  - Confounder – exposure
5. ML outcome regression: false model
[Figure: % bias vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
6. Doubly robust (DR) estimator: false model for outcome regression
[Figure: % bias vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
7. ML outcome regression: true model
[Figure: CI coverage vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
8. DR: true models for propensity score & outcome regression
[Figure: CI coverage vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
9. ML outcome regression: false model
[Figure: CI coverage vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
10. DR: true model for propensity score & false model for outcome regression
[Figure: CI coverage vs. risk factor – outcome association. Lunceford & Davidian, Stat Med, 2004]
11. Doubly robust (DR) estimation from 30,000 feet
- Robins & colleagues recognized the doubly robust property in the mid-1990s
- Combines standardization (or reweighting) with regression
- Part of the family of methods that includes propensity scores and inverse probability weighting
12. Conceptual description
Doubly robust (DR) estimation uses two models:
- Propensity score model for the confounder – exposure (or treatment) relationship
- Outcome regression model for the confounder – outcome relationship, under each exposure condition
These two stages can use:
- different subsets of covariates, and
- different parametric forms.
If either model is correct (and there are no unmeasured confounders), then the DR estimate of the treatment effect is unbiased.
13. Two stages
[Diagram: risk factors (potential confounders), exposure (treatment) and outcome, linked by the propensity score model (1) and the outcome regression (2)]
14. Causal effect of interest
Comparing counterfactual scenarios:
- E(Y1): whole population treated (exposed) vs.
- E(Y0): whole population untreated (unexposed)
Average causal effect of treatment:
- E(Y1) – E(Y0): difference
- E(Y1) / E(Y0): ratio
In non-randomized studies, the unexposed may not fairly reflect what would have happened to the exposed had they been unexposed (confounding).
15. Doubly robust estimator
- Y: outcome
- Z: binary treatment (exposure)
- X: baseline covariates (confounders plus other prognostic factors)
- e(X,β): model for the true propensity score
- m0(X,α0) and m1(X,α1): regression models for the true relationship between covariates and the outcome within each stratum of treatment
- Causal effect of interest (ΔDR): difference in mean response if everyone in the population received treatment versus everyone receiving no treatment, ΔDR = E(Y1) - E(Y0)
Adapted from Davidian M, DR Presentation, 2007
16. Doubly robust estimator
E(Y1): average population response with treatment / exposure
Adapted from Davidian M, DR Presentation, 2007
17. Average population response with treatment (μ1,DR)
[Equation with two labelled parts: IPTW estimator; “augmentation”]
Adapted from Davidian M, DR Presentation, 2007
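For reference, the estimator described in Lunceford & Davidian (2004) can be written as follows in the notation of slide 15. Hats denote fitted values, the subscript i indexes the n subjects, and the underbraces mark the two parts named on slide 17. Treat this as a sketch of the standard form rather than a copy of the slide's own equation.

```latex
\hat{\mu}_{1,DR} = \frac{1}{n}\sum_{i=1}^{n}\left[
  \underbrace{\frac{Z_i Y_i}{e(X_i,\hat\beta)}}_{\text{IPTW estimator}}
  \;-\;
  \underbrace{\frac{Z_i - e(X_i,\hat\beta)}{e(X_i,\hat\beta)}\, m_1(X_i,\hat\alpha_1)}_{\text{augmentation}}
\right]

\hat{\mu}_{0,DR} = \frac{1}{n}\sum_{i=1}^{n}\left[
  \frac{(1-Z_i)\, Y_i}{1 - e(X_i,\hat\beta)}
  \;+\;
  \frac{Z_i - e(X_i,\hat\beta)}{1 - e(X_i,\hat\beta)}\, m_0(X_i,\hat\alpha_0)
\right]

\hat{\Delta}_{DR} = \hat{\mu}_{1,DR} - \hat{\mu}_{0,DR}
```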
18. True PS model; false regression model (I)
[Equation labels: propensity score model; regression model]
Adapted from Davidian M, DR Presentation, 2007
19. True PS model; false regression model (II)
Assuming no unmeasured confounders!
Adapted from Davidian M, DR Presentation, 2007
20. False PS model; true regression model (I)
[Equation labels: propensity score model; regression model]
Adapted from Davidian M, DR Presentation, 2007
21. False PS model; true regression model (II)
Assuming no unmeasured confounders!
Adapted from Davidian M, DR Presentation, 2007
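The argument behind slides 18 to 21 can be sketched as follows, using the notation of slide 15. This is a reconstruction of the standard large-sample argument, not the slides' own equations; β and α1 here stand for the limits of the fitted parameters.

```latex
% Large-sample limit of the DR estimator of E(Y_1), using Z Y = Z Y_1:
\hat{\mu}_{1,DR} \;\longrightarrow\;
E\!\left[\frac{Z Y}{e(X,\beta)} - \frac{Z - e(X,\beta)}{e(X,\beta)}\, m_1(X,\alpha_1)\right]
= E(Y_1) +
\underbrace{E\!\left[\frac{Z - e(X,\beta)}{e(X,\beta)}\,\bigl\{Y_1 - m_1(X,\alpha_1)\bigr\}\right]}_{\text{extra term}}

% True PS model, no unmeasured confounders:
%   E(Z \mid X, Y_1) = E(Z \mid X) = e(X,\beta)
%   so E\{Z - e(X,\beta) \mid X, Y_1\} = 0 and the extra term vanishes,
%   whatever m_1 is.

% True regression model, no unmeasured confounders:
%   E(Y_1 \mid X, Z) = E(Y_1 \mid X) = m_1(X,\alpha_1)
%   so E\{Y_1 - m_1(X,\alpha_1) \mid X, Z\} = 0 and the extra term vanishes,
%   whatever e is.
```

The same argument applies to the estimator of E(Y0), with 1 - Z, 1 - e(X,β) and m0 in place of Z, e(X,β) and m1.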
22. Overly simplified statistics
ΔDR = [E(Y1) + junk] - [E(Y0) + junk]
where junk = 0 if either the propensity score or the regression model is true, so that
ΔDR = E(Y1) - E(Y0)
Adapted from Davidian M, DR Presentation, 2007
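The "junk = 0" property can be checked numerically. The Python sketch below is illustrative only and is not the SAS macro: it simulates data with a single binary confounder and a true treatment effect of 1.0, fits each model as a stratum-specific mean, and treats a model that ignores the confounder as the "false" model. All function names and simulation settings are assumptions made for this example.

```python
import random

def simulate(n, seed=0):
    """Simulate rows (x, z, y): binary confounder, binary treatment, continuous outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        z = 1 if rng.random() < (0.8 if x else 0.2) else 0  # x drives treatment
        y = 1.0 * z + 2.0 * x + rng.gauss(0.0, 1.0)          # true effect of z is 1.0
        rows.append((x, z, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def fit_ps(rows, use_x=True):
    """Propensity score e(X) = P(Z=1 | X); ignoring X gives the 'false' model."""
    if not use_x:
        p = mean(z for _, z, _ in rows)
        return lambda x: p
    by_x = {v: mean(z for x, z, _ in rows if x == v) for v in (0, 1)}
    return lambda x: by_x[x]

def fit_outcome(rows, arm, use_x=True):
    """Outcome model m_arm(X) = E(Y | Z=arm, X); ignoring X gives the 'false' model."""
    if not use_x:
        m = mean(y for _, z, y in rows if z == arm)
        return lambda x: m
    by_x = {v: mean(y for x, z, y in rows if z == arm and x == v) for v in (0, 1)}
    return lambda x: by_x[x]

def dr_difference(rows, e, m1, m0):
    """Doubly robust (augmented IPW) estimate of E(Y1) - E(Y0)."""
    mu1 = mean(z * y / e(x) - (z - e(x)) / e(x) * m1(x) for x, z, y in rows)
    mu0 = mean((1 - z) * y / (1 - e(x)) + (z - e(x)) / (1 - e(x)) * m0(x)
               for x, z, y in rows)
    return mu1 - mu0

def run(rows, ps_ok, or_ok):
    """Fit both models (true or false versions) and return the DR estimate."""
    e = fit_ps(rows, use_x=ps_ok)
    m1 = fit_outcome(rows, 1, use_x=or_ok)
    m0 = fit_outcome(rows, 0, use_x=or_ok)
    return dr_difference(rows, e, m1, m0)

rows = simulate(20000, seed=1)
both_true = run(rows, True, True)
false_or = run(rows, True, False)     # true PS model, false outcome model
false_ps = run(rows, False, True)     # false PS model, true outcome model
both_false = run(rows, False, False)  # no protection when both are wrong
print(both_true, false_or, false_ps, both_false)
```

Under these settings the first three estimates should land near the true effect of 1.0, while the last one collapses to the crude difference in means, which is about 2.2 in expectation here.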
23. Standard errors
- Option 1: Sandwich estimator
- Option 2: Bootstrap
Adapted from Davidian M, DR Presentation, 2007
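Both options can be sketched in a few lines of Python. The sandwich-type estimate below takes the per-subject terms of the DR estimator, centres them at the estimate, and uses sqrt(sum of squares) / n, which is the form described in Lunceford & Davidian (2004); whether the SAS macro computes exactly this is an assumption. The simulated data, the stratum-mean models and all function names are likewise illustrative.

```python
import math
import random

def simulate(n, seed=0):
    """Rows (x, z, y): binary confounder, binary treatment, continuous outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        z = 1 if rng.random() < (0.8 if x else 0.2) else 0
        y = 1.0 * z + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, z, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def dr_terms(rows):
    """Per-subject DR terms, with e, m1 and m0 fitted as stratum-specific means."""
    e = {v: mean(z for x, z, _ in rows if x == v) for v in (0, 1)}
    m1 = {v: mean(y for x, z, y in rows if x == v and z == 1) for v in (0, 1)}
    m0 = {v: mean(y for x, z, y in rows if x == v and z == 0) for v in (0, 1)}
    terms = []
    for x, z, y in rows:
        t1 = z * y / e[x] - (z - e[x]) / e[x] * m1[x]
        t0 = (1 - z) * y / (1 - e[x]) + (z - e[x]) / (1 - e[x]) * m0[x]
        terms.append(t1 - t0)
    return terms

def dr_estimate(rows):
    """DR estimate of E(Y1) - E(Y0): the average of the per-subject terms."""
    return mean(dr_terms(rows))

def sandwich_se(rows):
    """Option 1: sandwich-type SE from the centred per-subject terms."""
    terms = dr_terms(rows)
    n = len(terms)
    delta = sum(terms) / n
    return math.sqrt(sum((t - delta) ** 2 for t in terms)) / n

def bootstrap_se(rows, n_boot=200, seed=0):
    """Option 2: resample subjects with replacement and refit both models each time."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        estimates.append(dr_estimate(sample))
    centre = sum(estimates) / n_boot
    return math.sqrt(sum((b - centre) ** 2 for b in estimates) / (n_boot - 1))

rows = simulate(2000, seed=2)
se_sandwich = sandwich_se(rows)
se_boot = bootstrap_se(rows, n_boot=200, seed=3)
print(se_sandwich, se_boot)
```

With a moderately large sample the two standard errors should be close to each other. The bootstrap refits both models in every resample, so it costs more but does not rely on a large-sample approximation.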
24. Simulation findings
Bang & Robins 2005
- N=500, 1000 iterations
- False propensity score model
  - 1 of 4 true predictors of tx
  - 1 ‘noise’ variable, independent of tx
- False outcome regression model
  - Omit one risk factor, a higher-order term and an interaction term
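25. Bias under false models
[Table, partially recovered. Columns: analysis method; true model(s); false model (PS, OR, both). Recovered values: -0.01, 0.86, 0.00, -1.56; DR: -0.09, 0.92]
H Bang & JM Robins, Biometrics (2005).
26. Variance under false models
[Table, partially recovered. Columns: analysis; true model; false model (PS, OR, both). Recovered values: 0.21, 0.15, 0.07; DR: 0.09, 0.08, 0.28]
H Bang & JM Robins, Biometrics (2005).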
27. Recapping L&D simulations
- Compare performance of propensity score analysis, IPW, outcome regression (OR) and DR
- Omit a true confounder (extreme example)
- True relationships known (simulated data)
- Vary associations between
  - Risk factor – outcome
  - Confounder – exposure
- Vary sample size
28. If all models are true…
- Bias <3% for all methods except for PS analysis using strata (due to residual confounding)
- Variance similar in general
  - VarOR < VarDR (slightly) if confounder-exp relationship is strong
  - VarDR < VarIPW
- If the OR model is right, it is the most efficient. But we have no way of knowing whether or not it’s right.
Lunceford & Davidian, Stat Med, 2004
29. If outcome regression model is false…
- Bias
  - DR always <1%; OR biased by 10-20% in most scenarios
- Efficiency
  - DR nearly as efficient as the correct model except when the conf-exp relationship is strong
  - DR always more efficient than IPW
- Confidence interval coverage
  - DR coverage nominal
  - ML coverage poor
- Adding risk factors to PS model improves precision
- If both are nearly right (only a little wrong), bias is small
Lunceford & Davidian, Stat Med, 2004
30. Discussion
If the method offers some protection against model misspecification, why isn’t it being used by pharmacoepidemiologists?
31. SAS macro for DR estimation
Objectives:
- Facilitate wider use of DR estimation
- Improve performance by implementing sandwich estimator for SEs
- Enhance usability by following SAS conventions
- Provide user with relevant diagnostic details
32. SAS macro for doubly robust estimation
- SAS macro, including documentation
- Dataset for sample analyses (1.7MB, optional)
33. Running the DR macro
By design, the DR macro uses common SAS® syntax for specifying the source dataset, variables for modeling, and additional options:
    %dr(%str(options data=SAS-data-set descending;
        wtmodel exposure = x y z / method=dr dist=bin showcurves;
        model outcome = x y z / dist=n;
    ));
37. DR macro: output
- Propensity score (wtmodel) results
- Descriptive statistics for weights
- Graph of propensity score curves by exposure status
- Reweighted regression model among the unexposed (dr0)
- Reweighted regression model among the exposed (dr1)
- Doubly robust estimate and standard error
38. DR macro: output
Obs totalobs usedobs dr0 dr1 deltadr se
- totalobs: n in dataset
- usedobs: n used in the analysis; usedobs<totalobs due to missing data or use of common support option
- dr0: average response had all been unexposed, adjusted for risk factors
- dr1: average response had all been exposed, adjusted for risk factors
- deltadr: dr1 – dr0; difference in mean response for continuous outcome; risk difference for dichotomous outcome
- se: SE of deltaDR
39. Example analysis: CVD
- Outcomes
  - Continuous: CVD score (i.e. LDL)
  - Binary: acute MI
- Exposure (treatment): statin use (yes/no)
  - 50% of population exposed
- 10 covariates (5 continuous, 5 binary)
- Data are simulated, so true relationships among exposure, covariates & outcome are known
50. Caveats
- SEs conservative when sample size is small; bootstrapping may be used in this case to get more appropriate SEs
- Macro only provides difference estimates (not RR or OR) for now
- Exposure must be dichotomous; outcome must be continuous or dichotomous (time-to-event analysis not supported)
- Some SAS conventions not recognized within the macro code
  - where and class statements not recognized
  - interaction terms and higher-order polynomials must be created in a prior data step
51. Practical considerations
How to choose which covariates to include?
- Good question.
- Based on simulations from PS literature
  - Include all risk factors for outcome
  - May omit predictors of tx that do not affect outcome
52. Practical considerations
What to do with estimates from various models that differ?
Effect estimates       Result   %bias   SE
Crude                   1.90    ?       0.089
Maximum likelihood     -1.09            0.023
Propensity score       -1.50            0.050
Doubly robust          -1.12            0.024
53. Practical considerations
What sort of diagnostics should be checked?
- Potentially influential obs with extreme PS values
  - ‘common_support’ option in SAS macro
- Distribution of PS scores stratified by treatment / exposure group
  - ‘showcurves’ option in SAS macro
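Both diagnostics are easy to approximate outside the macro. The Python sketch below flags observations whose propensity score falls outside the range observed in both exposure groups, and tabulates PS values by group as a crude stand-in for plotted curves. The min/max definition of common support is one common convention and is an assumption here; how the macro's ‘common_support’ option defines it is not stated in these slides. Function names and the toy data are illustrative.

```python
def common_support_bounds(ps, treated):
    """Return (low, high): the PS range that is observed in BOTH exposure groups."""
    ps1 = [p for p, z in zip(ps, treated) if z == 1]
    ps0 = [p for p, z in zip(ps, treated) if z == 0]
    return max(min(ps1), min(ps0)), min(max(ps1), max(ps0))

def flag_off_support(ps, treated):
    """True for observations whose PS lies outside the common support range."""
    low, high = common_support_bounds(ps, treated)
    return [not (low <= p <= high) for p in ps]

def histogram_by_group(ps, treated, n_bins=5):
    """Counts of PS values per equal-width bin on [0, 1], separately for each group."""
    counts = {0: [0] * n_bins, 1: [0] * n_bins}
    for p, z in zip(ps, treated):
        b = min(int(p * n_bins), n_bins - 1)  # PS of exactly 1.0 goes in the last bin
        counts[z][b] += 1
    return counts

# Toy propensity scores: four unexposed subjects, then four exposed subjects.
ps = [0.05, 0.20, 0.35, 0.50, 0.30, 0.55, 0.80, 0.95]
treated = [0, 0, 0, 0, 1, 1, 1, 1]

low, high = common_support_bounds(ps, treated)
off = flag_off_support(ps, treated)
hist = histogram_by_group(ps, treated)
print(low, high)
print(off)
print(hist)
```

In the toy data the two groups overlap only between 0.30 and 0.50, so five of the eight observations are flagged. Little overlap of this kind signals extreme weights and estimates that lean heavily on the outcome model.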
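54. Checking PS distribution
[Figure: propensity score distributions for strata Tx=0 and Tx=1]
55. Checking PS distribution
[Figure: propensity score distributions for strata Tx=0 and Tx=1]
56. Checking PS distribution
[Figure: propensity score distributions for strata Tx=0 and Tx=1]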
57. Limitations
- DR estimation is not a panacea for unmeasured confounding. Recall: ‘junk’ only reduces to 0 under the assumption of no unmeasured confounders
- One of the models must be correct for the estimator to be unbiased
  - Bang & Robins suggest that it will be minimally biased if both models are nearly right…
- Standard errors tend to be slightly larger compared to a single correctly specified regression model
- Explaining DR estimation in your methods section could be interesting…
58. Applications
DR estimation is potentially valuable for comparative effectiveness studies, and in particular for head-to-head comparisons of treatment effectiveness or adverse events from observational data when RCTs can’t or won’t be done...
- for ethical reasons,
- for economic reasons,
- because outcomes are rare or occur late, or
- because faster analyses of possible sentinel events are needed
59. Extensions
- Missing data
- Longitudinal marginal structural models
- Incomplete follow-up in RCTs
- Goodness of fit test?
60. Summary
- Observational studies of treatment effects depend on statistical models to disentangle causal effects from confounding
- We can never be certain that the statistical model we have chosen is correct
- The DR estimate is unbiased if at least one of the two component models is right, and therefore provides some protection against model misspecification
- The ‘price’ of double robustness is slightly larger standard errors than a single correctly specified regression model
- Assumption of no unmeasured confounders required
61. References
Bang, H. and Robins, J. M. (2005). Doubly-robust estimation in missing data and causal inference models. Biometrics 61, 962–973.
Lunceford, J. K. and Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine 23, 2937–2960.
Robins, J. M. (2000). Robust estimation in sequentially ignorable missing data and causal inference models. Proceedings of the American Statistical Association Section on Bayesian Statistical Science, 6–10.
Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association 89, 846–866.
Rotnitzky, A., Robins, J. M., and Scharfstein, D. O. (1998). Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association 93, 1321–1339.
Scharfstein, D. O., Rotnitzky, A., and Robins, J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94, 1096–1120 (with Rejoinder, 1135–1146).
Van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. New York: Springer-Verlag.
62. Acknowledgements
Collaborators on the development of the SAS macro:
- Chris Wiesen, PhD, Odum Institute for Research in Social Science, University of North Carolina, Chapel Hill, NC
- Daniel Westreich, MSPH, Department of Epidemiology, University of North Carolina, Chapel Hill, NC
- Marie Davidian, PhD, Department of Statistics, North Carolina State University, Raleigh, NC
63. Acknowledgements (II)
- Agency for Healthcare Research and Quality Supplemental Award to the UNC CERTs (U18 HS S1)
- UNC/GSK Center for Excellence in Pharmacoepidemiology and Public Health
- Kevin Anstrom, Lesley Curtis, Brad Hammill, and Rex Edwards from the Duke CERTs team for valuable feedback on the alpha version.
- Thanks to students from UNC’s EPID 369/730, a causal modeling course, for valuable feedback on the beta version.
- Presented in memory of Harry Guess, MD, PhD, who co-authored the initial proposal to develop a SAS macro for doubly robust estimation.
64. Contact Information
Michele Jonsson Funk, PhD
Research Assistant Professor
Department of Epidemiology
University of North Carolina
Chapel Hill, NC