Causal inference from observational studies: emulating a target trial

1 Causal inference from observational studies: emulating a target trial
1/26/2015. Miguel Hernán, Departments of Epidemiology and Biostatistics

2 The situation: We need to make decisions NOW
Treat with A or with B? Treat now or later? When to switch to C? A relevant randomized trial would, in principle, answer each comparative effectiveness and safety question (interference and scaling-up issues aside). Hernán - Target trial

3 But we rarely have randomized trials
They are often expensive, untimely, unethical, or impractical. And deferring decisions is not an option: no decision is a decision (“keep the status quo”). Question: What do we do?

4 Answer: We analyze observational data (pre-existing or collected for research)
Epidemiologic studies; electronic medical records; administrative claims databases; national registers; disease registries; other.

5 We analyze observational data
but only because we cannot conduct a randomized trial. Observational studies are not our preferred choice. For each observational study, we can imagine a hypothetical randomized trial that we would prefer to conduct, if only it were possible.

6 The Target Trial
An analysis of observational data (e.g., a large health care database) can be viewed as an attempt to emulate a hypothetical pragmatic randomized trial, as suggested more or less explicitly by many authors, including Cochran, Rubin, Feinstein, Dawid, Robins… (Hernán, Robins. Am J Epidemiol 2016). If the observational analysis succeeds at emulating the target trial, both studies would yield identical effect estimates except for random variability.

7 Procedure to answer clinical/policy questions
Step #1: Describe the protocol of the target trial. Step #2, Option A: Conduct the target trial. Step #2, Option B: Use observational data to explicitly emulate the target trial, and apply appropriate causal inference analytics to estimate the effects of interest.

8 Key elements of target trial protocol, and what the observational study needs to emulate
Key elements of target trial protocol: eligibility criteria; strategies assigned at start of follow-up; randomized assignment; start/end of follow-up; outcomes; causal contrast(s) of interest; analysis plan.
Observational study needs to emulate: eligibility criteria; strategies followed from start of follow-up; randomized assignment; start/end of follow-up; outcomes; causal contrast(s) of interest; analysis plan.

9 Example
Suppose we use observational data from a large health care claims database to emulate a target trial of hormone therapy and heart disease. First, we need to outline the protocol of the target trial.

10 Target trial: Hormone therapy and coronary heart disease
Protocol summary
Eligibility criteria: Postmenopausal women within 5 years of menopause between the years 2005 and 2010, with no history of cancer and no use of hormone therapy in the last 2 years.
Treatment strategies: (1) Initiate estrogen plus progestin hormone therapy at baseline and remain on it during the follow-up, unless deep vein thrombosis, pulmonary embolism, myocardial infarction, or cancer is diagnosed. (2) Refrain from taking hormone therapy during the follow-up.
Assignment procedures: Participants will be randomly assigned to either strategy at baseline, and will be aware of the strategy they have been assigned to.
Follow-up period: Starts at randomization and ends at diagnosis of coronary heart disease, death, loss to follow-up, or 5 years after baseline, whichever occurs earlier.
Outcome: Coronary heart disease diagnosed by a cardiologist.
Causal contrasts: Intention-to-treat effect, per-protocol effect.
Analysis plan: Intention-to-treat analysis, non-naïve per-protocol analysis (to be discussed).
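As an illustration only, the protocol components above can be written down as a small data structure, so that each one has to be stated explicitly before any emulation starts. The class name `TargetTrialProtocol`, its field names, and the helper `missing_components` are hypothetical choices for this sketch, not part of the talk.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class TargetTrialProtocol:
    """One field per protocol component in the summary (names are illustrative)."""
    eligibility_criteria: List[str]
    treatment_strategies: List[str]
    assignment_procedures: str
    follow_up_period: str
    outcome: str
    causal_contrasts: List[str]
    analysis_plan: List[str]


def missing_components(protocol: TargetTrialProtocol) -> List[str]:
    """Return the names of components that were left empty."""
    return [f.name for f in fields(protocol) if not getattr(protocol, f.name)]


# The hormone therapy protocol from the slide, paraphrased.
hormone_therapy_trial = TargetTrialProtocol(
    eligibility_criteria=[
        "postmenopausal, within 5 years of menopause, 2005-2010",
        "no history of cancer",
        "no hormone therapy use in the last 2 years",
    ],
    treatment_strategies=[
        "initiate estrogen plus progestin at baseline and remain on it",
        "refrain from hormone therapy during follow-up",
    ],
    assignment_procedures="random, unblinded assignment at baseline",
    follow_up_period="randomization until CHD, death, loss to follow-up, or 5 years",
    outcome="coronary heart disease diagnosed by a cardiologist",
    causal_contrasts=["intention-to-treat effect", "per-protocol effect"],
    analysis_plan=["intention-to-treat analysis", "non-naive per-protocol analysis"],
)

print(missing_components(hormone_therapy_trial))  # prints []
```

Leaving any field empty makes the gap visible before the data are touched, which is the purpose of Step #1.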

11 Procedure to answer clinical/policy questions
Step #1: Describe the protocol of the target trial. Step #2, Option A: Conduct the target trial. Step #2, Option B: Use observational data to explicitly emulate the target trial, and apply appropriate causal inference analytics to estimate the effects of interest.

12 Target trial emulation requires data experts
When observational data were not collected for research purposes, e.g., “coronary heart disease” may be recorded when a woman was diagnosed with it, or when her physician suspected it and ordered a diagnostic test. Must consult with knowledgeable data users: time-varying clinical workflows, idiosyncratic coding practices, software versions…

13 Besides expert knowledge of the data
Validation studies to quantify data accuracy. Internal consistency checks to detect problems. Cross-dataset comparisons to better understand coding differences. Let’s say we have consulted with experts and done the above before attempting to emulate the target trial.

14 Eligibility criteria Emulation
Apply the same criteria as the target trial to women who at baseline have been included in the database for at least 2 years. Potential problem: insufficient data to characterize individuals eligible for the target trial. Example: if the target trial required baseline screening to exclude prevalent cases, emulation may be hard if the database records the performance of a test (for billing purposes) but not its findings.

15 Treatment strategies Emulation
Eligible individuals are assigned to the strategy consistent with their baseline data. Strategy 1: women who start estrogen plus progestin therapy. Strategy 2: women who do not start hormone therapy. Excluded: women who start a different hormone therapy. The target trial is typically a pragmatic trial: observational data cannot be used to emulate trials with tight monitoring and enforcement of adherence to the study protocol, and cannot emulate a placebo-controlled trial, at most a trial with a “usual care” group.
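The classification rule on this slide can be sketched as a small function. The function name, argument names, and strategy labels are hypothetical; a real database would need its own definition of "starting" a therapy at baseline.

```python
from typing import Optional


def classify_strategy(started_estrogen_plus_progestin: bool,
                      started_other_hormone_therapy: bool) -> Optional[str]:
    """Map a woman's baseline data to the strategy it is consistent with.

    Returns None for women excluded from the emulation because they
    started a hormone therapy other than estrogen plus progestin.
    """
    if started_other_hormone_therapy:
        return None  # excluded: different hormone therapy
    if started_estrogen_plus_progestin:
        return "strategy 1: initiate estrogen plus progestin"
    return "strategy 2: no hormone therapy"


print(classify_strategy(True, False))
print(classify_strategy(False, False))
print(classify_strategy(False, True))
```

Note that the assignment uses baseline data only; what happens to treatment afterwards belongs to the analysis plan, not to the classification.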

16 Assignment procedures Emulation of blinding
Generally impossible: individuals in the dataset, and their health care workers, are usually aware of the treatment they receive. Observational data can only emulate target trials without blind assignment, which is standard for pragmatic trials and not a limitation if the goal is comparing real-world treatment strategies.

17 Randomized assignment Emulation of randomization
Generally requires adjustment for all confounding factors via matching, stratification or regression, standardization or inverse probability (IP) weighting, g-estimation… If there is insufficient information on baseline confounders, or we fail to identify them, then successful emulation of the target trial’s random assignment is not possible: confounding bias.
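To make one of the listed adjustment methods concrete, here is a minimal sketch of IP weighting with a single binary baseline confounder. The toy data are invented for illustration: treatment has no effect within either confounder stratum, but treated people are more often in the high-risk stratum. Real emulations would estimate the treatment probabilities with a model over many confounders.

```python
from collections import defaultdict


def crude_risks(records):
    """Unadjusted outcome risk among treated and untreated."""
    events, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for _, treated, outcome in records:
        events[treated] += outcome
        totals[treated] += 1
    return events[1] / totals[1], events[0] / totals[0]


def ip_weighted_risks(records):
    """IP-weighted risk among treated and untreated.

    records: list of (confounder, treated, outcome), treated/outcome in {0, 1}.
    Each person is weighted by 1 / P(received own treatment | confounder),
    with the probability estimated nonparametrically within each stratum.
    """
    n, n_treated = defaultdict(int), defaultdict(int)
    for confounder, treated, _ in records:
        n[confounder] += 1
        n_treated[confounder] += treated
    weighted_events = {0: 0.0, 1: 0.0}
    weighted_totals = {0: 0.0, 1: 0.0}
    for confounder, treated, outcome in records:
        p_treated = n_treated[confounder] / n[confounder]
        p_received = p_treated if treated == 1 else 1 - p_treated
        weight = 1 / p_received
        weighted_events[treated] += weight * outcome
        weighted_totals[treated] += weight
    return (weighted_events[1] / weighted_totals[1],
            weighted_events[0] / weighted_totals[0])


# Hypothetical data: (confounder, treated, outcome).
# Stratum 0 has risk 0.1 and stratum 1 has risk 0.5, regardless of treatment.
records = (
    [(0, 1, 1)] * 2 + [(0, 1, 0)] * 18 + [(0, 0, 1)] * 6 + [(0, 0, 0)] * 54
    + [(1, 1, 1)] * 30 + [(1, 1, 0)] * 30 + [(1, 0, 1)] * 10 + [(1, 0, 0)] * 10
)

print(crude_risks(records))        # crude comparison suggests harm
print(ip_weighted_risks(records))  # weighted risks are equal in both arms
```

The weighting only removes confounding by the variables used to build the weights, which is why insufficient information on confounders makes emulating randomization impossible.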

18 Start of follow-up When is time zero (baseline)?
In true trials: the time of eligibility and randomization. In emulated trials: the time of eligibility and treatment assignment. Failure to assign time zero correctly may lead to bias, e.g., immortal time bias, others to be discussed.
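A toy calculation may help show why time zero matters. In this sketch (all numbers invented), three people start treatment 0, 6, and 12 months after becoming eligible. If their follow-up is counted from eligibility rather than from treatment start, the months before treatment are "immortal": by construction, no outcome could have occurred in them, so the event rate in the treated group is artificially diluted.

```python
def incidence_rate(people, time_zero_at_treatment_start):
    """Events per person-month.

    people: list of (treatment_start_month, end_month, event), with months
    measured from the time each person met the eligibility criteria.
    """
    events = sum(event for _, _, event in people)
    person_months = sum(
        end - (start if time_zero_at_treatment_start else 0)
        for start, end, _ in people
    )
    return events / person_months


# Hypothetical treated group: (treatment start, end of follow-up, event).
treated = [(0, 24, 1), (6, 24, 0), (12, 24, 1)]

aligned = incidence_rate(treated, time_zero_at_treatment_start=True)
misaligned = incidence_rate(treated, time_zero_at_treatment_start=False)
immortal_months = sum(start for start, _, _ in treated)

print(aligned, misaligned, immortal_months)
```

With time zero aligned to treatment assignment the rate is 2 events over 54 person-months; counting from eligibility adds 18 immortal person-months and lowers the rate to 2 over 72.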

19 Outcome Emulation
Use the database to identify women with a diagnosis of coronary heart disease during the follow-up. Potential problem: observational data cannot generally be used to emulate a target trial with systematic and blind outcome ascertainment, except if outcome ascertainment cannot be affected by treatment history, e.g., if the outcome is mortality independently ascertained from a death registry.

20 The observational analysis needs to emulate
Eligibility criteria; treatment strategies randomly assigned at start of follow-up; start/end of follow-up; outcomes; causal contrast(s) of interest; analysis plan.

21 Analysis plan Identical for true and emulated trials
Except that no adjustment for baseline confounding is expected in the intention-to-treat analysis of true trials. Both true and emulated trials require adjustment for post-baseline confounding and selection bias, possibly using Robins’s g-methods, which means longitudinal data on treatment, confounders, and outcomes are required.

22 The target trial will be a compromise
between the ideal trial we would really like to conduct and the trial we may reasonably emulate using the available data. The drafting of the protocol of the target trial is typically an iterative process that requires detailed knowledge of the database.

23 Examples of trial emulation using Big Data
Classic cohort study: Nurses’ Health Study. Postmenopausal hormone therapy and coronary heart disease. Electronic medical records: THIN. Statins and coronary heart disease. Static strategies (treat vs. no treat). Clinical cohorts: HIV-CAUSAL Collaboration (not today). Antiretroviral therapy and mortality. Static and dynamic strategies (intervention depends on response to previous intervention).

24 EXAMPLE #1 Hormone therapy and heart disease
Question: What is the intention-to-treat effect of hormone therapy on the risk of coronary heart disease in postmenopausal women?

25 Answers (shocking discrepancy)
Observational studies: >30% lower risk in current users compared with never users, e.g., HR 0.68 in the Nurses’ Health Study (Grodstein et al. J Women’s Health 2006). Randomized trial: >20% higher risk in initiators compared with noninitiators, HR 1.24 in the Women’s Health Initiative (Manson et al. NEJM 2003).

26 The WHI randomized trial Manson et al, NEJM 2003
Double-blind. Placebo-controlled. Large: >16,000 U.S. women aged yrs. Randomly assigned to estrogen plus progestin therapy or placebo. Women followed approximately every year, like in many large observational studies. No intervention after baseline.

27 WHI: Effect estimates Intention-to-treat hazard ratio (95% CI) of CHD
Overall: (0.99, 1.53)
Years of follow-up: (1.06, 2.14); > (0.93, 1.83); > (0.41, 1.09)
Years since menopause: < (0.54, 1.44); (0.86, 1.80); > (1.14, 2.40)

28 Why did observational studies get it “wrong”?
Popular theory: residual confounding (insufficient adjustment for lifestyle and socioeconomic indicators). Corollary: causal inference from observational data is a hopeless undertaking. An alternative theory: observational and randomized studies asked different questions.

29 Randomized trial estimated the intention-to-treat effect
What is the CHD risk in women who initiate hormone therapy compared with women who do not? Design and analysis: women randomly assigned to initiation of hormone therapy or placebo. Analytic approach: compare risk between incident users and nonusers of hormone therapy.

30 Observational studies did not estimate intention-to-treat effect
What is the CHD risk in women who are currently taking hormone therapy compared with women who are not? Design and analysis: women are asked about therapy use. Analytic approach: compare risk between prevalent users and nonusers of hormone therapy (current users vs. never users).

31 “Current vs. never users” contrast is not clinically relevant
Consider a woman wondering whether to start hormone therapy. The current vs. never contrast does not provide the information she needs. Consider a woman wondering whether to stop hormone therapy.

32 What if we re-analyze the observational study…
… to compare the risk in incident users vs. nonusers? That is, what if we use the observational data to answer the same question as the randomized trial, i.e., estimate the observational analog of the intention-to-treat effect? Hernán et al. Biometrics 2005. Hernán et al. Epidemiology 2008.

33 Effect estimates (ITT hazard ratios): randomized vs. observational
Randomized: Women’s Health Initiative. Observational: Nurses’ Health Study.
Overall: (0.99, 1.53) (0.82, 1.34)
Years of follow-up: (1.06, 2.14) (0.92, 2.23) > (0.81, 1.41) (0.72, 1.16)
Years since menopause: < (0.54, 1.44) (0.63, 1.21) (0.86, 1.80) (0.85, 1.49) > (1.14, 2.40)

34 When same question is asked
No shocking observational-randomized discrepancies for ITT estimates, though wide CIs in both studies. What about the popular hypothesis? Any residual confounding? Probably, but insufficient to explain the original discrepancy.

35 Aside: Analysis can/should be extended in two ways
Causal contrast: estimate the per-protocol effect, rather than the intention-to-treat effect only. Effect measure: estimate survival (or cumulative risk) curves, rather than hazard ratios only.

36 ITT effect is problematic Hernán, Hernández-Díaz. Clinical Trials 2012
Depends on adherence patterns: substantial non-adherence in both the randomized trial and the observational study. Inappropriate for safety outcomes. Not patient-centered. We also estimated the per-protocol effect via IP weighting (more later): again no randomized-observational discrepancies. Toh et al. Ann Intern Med 2010.

37 End of aside

38 Conclusion: CER based on epidemiologic studies is possible
If high-quality observational data on treatment, outcome, and confounders are available, e.g., the Nurses’ Health Study. But most observational comparative effectiveness research (CER) relies on large databases (big, pre-existing data): health claims, electronic medical records. Can emulation of a target trial work in that setting?

39 EXAMPLE #2 Statins and coronary heart disease
Question: What is the effect of statin therapy on the risk of coronary heart disease? Extreme example of confounding.

40 Target trial: Statin therapy and coronary heart disease
Protocol summary
Eligibility criteria: Individuals aged 55–84 in the years with no prior history of CHD, stroke, peripheral vascular disease, heart failure, cancer, schizophrenia or dementia, no symptoms of subclinical CHD, and no use of statin therapy in the last 2 years.
Treatment strategies: (1) Initiate statin therapy at baseline and remain on it during the follow-up, unless contraindications arise. (2) Refrain from taking statin therapy during the follow-up.
Assignment procedures: Participants will be randomly assigned to either strategy at baseline, and will be aware of the strategy they have been assigned to.
Follow-up period: Starts at randomization and ends at diagnosis of coronary heart disease, death, loss to follow-up, or January 2007, whichever occurs earlier.
Outcome: Coronary heart disease diagnosed by a cardiologist.
Causal contrasts: Intention-to-treat effect, per-protocol effect.
Analysis plan: Intention-to-treat analysis, non-naïve per-protocol analysis.

41 Observational data The Health Improvement Network
THIN is a database of electronic medical records: 6.2 million individuals from 350 general practices in the UK (2009). For each individual: demographic and socioeconomic characteristics; symptoms, signs and diagnoses, referrals, laboratory test results; some lifestyle information; vital status and cause of death data.

42 Target trial emulation
Use observational data from THIN to emulate the components of the target trial. Eligible individuals are classified into Strategy 1 if they initiated statin treatment during the baseline month, or Strategy 2 if they did not. Baseline month January 2000: 3178 individuals met all the eligibility criteria, 18 were initiators, and 1 initiator developed CHD.

43 Sequence of target trials Emulation
Emulate a target trial starting each calendar month between January 2000 and November 2006: 83 target trials, each with a 1-month enrollment period. For each trial, follow-up starts at the trial-specific baseline and ends at diagnosis of CHD, death, loss to follow-up, or January 2007. Eligibility criteria are applied at each baseline.
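The count of 83 trials follows directly from enumerating the monthly baselines. A minimal sketch, where the helper name `monthly_baselines` is a hypothetical choice:

```python
def monthly_baselines(first, last):
    """List every (year, month) from first to last, inclusive."""
    year, month = first
    baselines = []
    while (year, month) <= last:
        baselines.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return baselines


baselines = monthly_baselines((2000, 1), (2006, 11))
print(len(baselines))  # prints 83
```

Each element would serve as the trial-specific baseline at which the eligibility criteria are re-applied.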

44 Sequence of target trials Emulation
individuals eligible for at least 1 trial. On average, each eligible individual participated in the emulation of 11 trials: many non-initiators in the January 2000 trial still met all eligibility criteria in February. All 18 initiators in January 2000 were ineligible in February 2000 because they received treatment during the washout period for the February 2000 trial, and so on. Danaei et al. Statistical Methods in Medical Research 2013.
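How one person contributes to several emulated trials can be sketched as follows. This is a simplified illustration, not the procedure of Danaei et al.: it assumes a person meets all other criteria throughout a window of months, and that once they initiate statins they never become eligible again (in reality eligibility could return after a 2-year treatment-free washout). Function and variable names are hypothetical.

```python
def person_trials(first_eligible, last_eligible, initiation_month=None):
    """Return (trial_month, is_initiator) for each trial a person enters.

    Months are integer indices (0 = January 2000). The person is assumed
    otherwise eligible from first_eligible to last_eligible, inclusive.
    """
    rows = []
    for trial_month in range(first_eligible, last_eligible + 1):
        if initiation_month is not None and trial_month > initiation_month:
            break  # prior statin use violates the washout criterion
        rows.append((trial_month, trial_month == initiation_month))
    return rows


# A non-initiator eligible for 11 consecutive months enters 11 trials.
non_initiator = person_trials(0, 10)
# An initiator in month 0 enters only the first trial, as an initiator.
initiator = person_trials(0, 10, initiation_month=0)
# Someone who initiates in month 3 is a non-initiator in trials 0 to 2.
late_initiator = person_trials(0, 10, initiation_month=3)

print(len(non_initiator), initiator, late_initiator)
```

Because the same person appears in many person-trial rows, the pooled analysis needs a variance estimator that accounts for within-person correlation.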

45 CONSORT flowchart of emulated trials

46 Adherence to treatment
Figure: probability of continuing initial treatment by months of follow-up, for non-initiators and initiators. Cf. 21% at 5 years on average in RCTs (~35% here).

47 Intention-to-treat effect Emulation
Compare CHD incidence in initiators vs. non-initiators at baseline of each emulated trial, regardless of their subsequent treatment. Pool data across emulated trials to obtain a more precise effect estimate. Use a robust variance because of within-subject correlation.

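48 Hazard ratio (95% CI) of CHD: THIN trials, 2000-2006
Unique persons: 74,806. Person-trials: 844,800.
Unique cases: 635 (intention-to-treat analysis); 488 (per-protocol analysis).
Cases: 6,335 (intention-to-treat analysis); 4,849 (per-protocol analysis).
Hazard ratio adjusted for age and sex: 1.29 (1.06, 1.56) (intention-to-treat analysis); 1.54 (1.09, 2.18) (per-protocol analysis).
Hazard ratio adjusted for all covariates: 0.89 (0.73, 1.09) (intention-to-treat analysis); 0.84 (0.54, 1.30) (per-protocol analysis).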
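As a quick consistency check on these counts (an addition for the reader, using only the numbers reported on this slide), the ratio of person-trials to unique persons should roughly match the earlier statement that each eligible individual participated in about 11 emulated trials:

```python
unique_persons = 74_806
person_trials = 844_800
itt_cases, itt_unique_cases = 6_335, 635

# Average number of emulated trials each person contributed to.
trials_per_person = person_trials / unique_persons
# Average number of person-trial rows in which each unique case is counted.
cases_per_unique_case = itt_cases / itt_unique_cases

print(round(trials_per_person, 1), round(cases_per_unique_case, 1))
```

Both ratios are close to 10 or 11, which shows how much the pooled dataset reuses the same individuals and why the confidence intervals rely on a robust variance.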

49 What if we had compared prevalent (not incident) users vs. nonusers?
Current users: HR 1.42 (1.16, 1.73). Persistent (1 yr) current users: HR 1.05. Persistent (2 yrs) current users: HR 0.77 (0.51, 1.18). We can get any result we want by changing the definition of current user! Confounding-selection bias tradeoff.

50 Mortality hazard ratio for statins in CHD secondary prevention studies
RCTs: 0.84 (0.77, 0.91). Observational studies: incident users: (0.65, 0.91); prevalent-incident mix: 0.70 (0.64, 0.78); prevalent users: (0.45, 0.66). Danaei et al. Am J Epidemiol 2012.

51 Other static comparisons: same analytic approach
Head-to-head comparisons. Example: lipophilic statins (atorvastatin, simvastatin) vs. other statins (Danaei et al. Diabetes Care 2013). Joint interventions. Example: statins plus antihypertensives vs. standard care (Danaei et al. J Clin Epidemiol 2016, in press).

52 Advantages of the target trial approach (I)
Provides ready access to the application of formal counterfactual theory and concepts to Big Data without the need for technical jargon. Organizing principle for causal inference methods which implicitly rely on counterfactual reasoning, e.g., new user design, active comparators, outcome controls.

54 Advantages of the target trial approach (II)
Facilitates the comparison of complex strategies that are sustained over time and may depend on a patient’s evolving characteristics. Dynamic treatment strategies: not “treat vs. no treat” but rather “when to treat, when to switch, when to monitor” depending on time-varying factors.

55 Advantages of the target trial approach (III)
Establishes a link between methods for the analysis and reporting of randomized trials and Big Data analytics: observational studies analyzed like randomized trials, and vice versa. Provides a scaffolding to organize discussions about which data are required/missing.

56 Advantages of the target trial approach (IV)
Naturally leads to analytic approaches that prevent apparent paradoxes and common biases: selection bias related to prevalent users; immortal time bias; birth weight paradox, obesity paradox; etc.

57 Advantages of the target trial approach (V)
Facilitates a systematic methodologic evaluation of observational studies: Which components of the target trial were we not able to mimic approximately? Which components of the target trial would be problematic even if we were able to conduct a truly randomized trial? An approach adopted by the Cochrane Collaboration Risk of Bias Tool for Nonrandomised Studies and the IOM Report on the Safety of Approved Drugs.

58 Advantages of the target trial approach (VI)
Helps understand why estimates differ across studies: assess the sensitivity of estimates to different design choices for the target trial; focus research efforts on “sensitive” choices.

59 Advantages of the target trial approach (last)
If we can influence how data are recorded, the target trial approach helps record them. If we are using data as they exist, the target trial approach guides the validation studies and the development and evolution of the Data Model. The target trial approach allows you to systematically articulate the tradeoffs that you are willing to accept regarding eligibility criteria, interventions, confounders, and outcomes.

60 A common misinterpretation
“You are saying that observational studies are as good as RCTs?” “This is a cohort study that tries to turn itself into a clinical trial. This involves a series of assumptions and manoeuvres which lack credibility.” (Anonymous JAMA reviewer, April 2014.) No, the point is not that observational studies can turn themselves into randomized experiments. They can’t.

61 The point is that we can do better
by using observational data to explicitly emulate randomized trials. The limitations of observational studies (e.g., confounding, mismeasurement) remain, but we do not compound them with additional problems.

62 Remember
Observational studies are what we do when we cannot conduct a randomized trial. In the absence of practical and ethical constraints, sane people will always prefer a randomized trial. There is often no alternative to observational studies, so we had better keep improving them, because people will keep using (Big and Small) observational data to guide their decisions.

