Modern Approach to Causal Inference


1 Modern Approach to Causal Inference
Brian C. Sauer, PhD MS SLC VA Career Development Awardee

2 About Me
SLC VA Career Development Awardee
PhD in Pharmacoepidemiology, College of Pharmacy, University of Florida
MS in Biomedical Informatics, University of Utah
Assistant Research Professor, Division of Epidemiology, Department of Internal Medicine

3 Acknowledgments
Mentors: Matthew Samore, MD; Tom Greene, PhD; Jonathan Nebeker, MD
Simulation: Chen Wang, MS (Statistics)
Primary references: Causal Inference book by Jamie Robins & Miguel Hernán; Modern Epidemiology, 3rd ed., Chapter 12 (Rothman, Greenland, Lash)

4 Outline
Causal inference and the counterfactual framework
Exchangeability and conditional exchangeability
Use of directed acyclic graphs (DAGs) to identify a minimal set of covariates to remove confounding

5 Key Learning Points
Understand the rationale for randomized controlled trials and for covariate selection in observational research.
Identify the minimal set of covariates needed to produce unbiased effect estimates.
Develop terminology and language to describe these ideas with precision.
Become familiar with the notation for causal inference, which is a barrier to this literature.

6 Counterfactual Framework
Neyman (1923): effects of point exposures in randomized experiments
Rubin (1974): effects of point exposures in randomized and observational studies (potential outcomes and the Rubin causal framework)
Robins (1986): effects of time-varying exposures in randomized and observational studies (counterfactuals)

7 Counterfactual Working Example
Zeus took the heart pill; 5 days later he died.
Had he not taken the heart pill, he would still have been alive on that 5th day (all other things being equal).
Did the pill cause Zeus's death?

8 Counterfactual Working Example
Hera didn't take the pill; 5 days later she was alive.
Had she taken the pill, she would still have been alive 5 days later.
Did the pill cause Hera's survival?

9 Gettysburg: A Novel of the Civil War
Newt Gingrich and William R. Forstchen
Historical fiction: imagines how the war would have ended had there been a Confederate victory at Gettysburg.

10 Notation for Actual Data
Y = 1 if the patient died, 0 otherwise: Yz = 1, Yh = 0
A = 1 if the patient was treated, 0 otherwise: Az = 1, Ah = 0

Pat ID | A | Y
Zeus   | 1 | 1
Hera   | 0 | 0

11 Notation for Ideal Data
Outcome under no treatment: Ya=0 = 1 if the subject would have died had he not taken the pill; Yz,a=0 = 0, Yh,a=0 = 0
Outcome under treatment: Ya=1 = 1 if the subject would have died had he taken the pill; Yz,a=1 = 1, Yh,a=1 = 0

Pat ID | A | Ya=0 | Ya=1
Zeus   | 1 | 0    | 1
Hera   | 0 | 0    | 0
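The notation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck; the Zeus and Hera values come from the slide's table.

```python
# Potential outcomes for the two-subject example, encoded as plain dicts.
# Y_a0 = outcome had the subject NOT taken the pill; Y_a1 = outcome had he/she taken it.
potential = {
    "Zeus": {"Y_a0": 0, "Y_a1": 1},  # would survive untreated, die treated
    "Hera": {"Y_a0": 0, "Y_a1": 0},  # survives either way
}

def individual_causal_effect(name):
    """Contrast of counterfactual outcomes: Y(a=1) - Y(a=0)."""
    p = potential[name]
    return p["Y_a1"] - p["Y_a0"]

print(individual_causal_effect("Zeus"))  # 1 -> the pill has a causal effect
print(individual_causal_effect("Hera"))  # 0 -> no individual causal effect
```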

12 Available Research Data Set
Only the counterfactual outcome corresponding to the treatment actually received is observed; the other is missing (?).

ID      | A | Y | Ya=0 | Ya=1
Zeus    | 1 | 1 | ?    | 1
Hera    | 0 | 0 | 0    | ?
(Apollo and Cyclope rows: values lost in transcription)

13 (Individual) Causal Effect
Formal definition of causal effects:
For Zeus: the pill has a causal effect because Yz,a=1 ≠ Yz,a=0
For Hera: the pill doesn't have a causal effect because Yh,a=1 = Yh,a=0
Individual causal effects are defined as a contrast of the values of counterfactual outcomes, but only one of those outcomes is observed for each subject: the one corresponding to the treatment value actually experienced. All other counterfactual outcomes remain unobserved. The unhappy conclusion is that, in general, individual causal effects cannot be identified because of missing data.

14 Average Causal Effects
Formal definition of average causal effects: in the population, exposure A has a causal effect on outcome Y if Pr[Ya=1=1] ≠ Pr[Ya=0=1].
The causal null hypothesis holds if Pr[Ya=1=1] = Pr[Ya=0=1], i.e., E[Ya=1] = E[Ya=0] for a binary outcome.
In general, identifying individual causal effects is hopeless, so we focus on aggregated causal effects: the average causal effect in a population of individuals.
To define causal effects we need three pieces of information: the outcome of interest Y, the actions a=1 and a=0 to be compared, and a well-characterized population of individuals whose counterfactual outcomes are to be compared.

15 Representation of the Causal Null
Causal effects can be measured on many scales:
Risk difference: Pr[Ya=1=1] − Pr[Ya=0=1] = 0
Risk ratio: Pr[Ya=1=1] ÷ Pr[Ya=0=1] = 1
Odds ratio, hazard ratio, etc.
The average causal effect is defined by a contrast involving two actions: receiving treatment A and not receiving treatment A.
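The three scales above can be written out directly; under the causal null each reduces to its null value. A small sketch (the value p = 0.5 is purely illustrative):

```python
# Effect measures as functions of the two counterfactual risks
# p1 = Pr[Y^{a=1}=1] and p0 = Pr[Y^{a=0}=1].
def risk_difference(p1, p0):
    return p1 - p0

def risk_ratio(p1, p0):
    return p1 / p0

def odds_ratio(p1, p0):
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

p = 0.5  # under the causal null the two counterfactual risks are equal
print(risk_difference(p, p), risk_ratio(p, p), odds_ratio(p, p))  # 0.0 1.0 1.0
```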

16 Average Causal Effects
[Table: counterfactual outcomes Ya=0 and Ya=1 for 20 subjects (Rheia, Kronos, Demeter, Hades, Hestia, Poseidon, Hera, Zeus, Artemis, Apollo, Leto, Ares, Athena, Hephaestus, Aphrodite, Cyclope, Persephone, Hermes, Hebe, Dionysus); individual 0/1 values lost in transcription]
No average causal effect:
Risk difference: Pr[Ya=1=1] − Pr[Ya=0=1] = 0
Risk ratio: Pr[Ya=1=1] / Pr[Ya=0=1] = 1
Are there individual causal effects?
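The slide's point, that the average causal effect can be null while individual effects exist, can be shown with made-up counterfactual outcomes (the table's own 0/1 values did not survive transcription, so these ten subjects are illustrative only):

```python
# Illustrative counterfactual outcomes for 10 subjects: the two columns
# have the same mean, but individual subjects differ between them.
y0 = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]  # Y^{a=0} for each subject
y1 = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1]  # Y^{a=1} for each subject

ace = sum(y1) / len(y1) - sum(y0) / len(y0)  # Pr[Y^{a=1}=1] - Pr[Y^{a=0}=1]
individual = [b - a for a, b in zip(y0, y1)]

print(ace)                              # 0.0  -> average causal null holds
print(any(d != 0 for d in individual))  # True -> individual effects exist
```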

17 Causal Effects Definition
Causal effects are calculated by contrasting counterfactual risks within the population. Counterfactual contrasts are by definition causal effects.

18 Independence
Independence: lack of association between variables; knowing the value of one variable provides no information about the value of another.
Dependence: the opposite of independence (synonym: correlation).

19 Associational Measures
Observed in the real world: association ≠ causation.
Pr[Y=1|A=1] = Pr[Y=1|A=0]: treatment A and outcome Y are independent.
Associational measures also quantify the strength of association (risk difference, risk ratio, OR, HR, etc.):
Risk difference: Pr[Y=1|A=1] − Pr[Y=1|A=0] = 0
Risk ratio: Pr[Y=1|A=1] ÷ Pr[Y=1|A=0] = 1

20 Causation vs. Association
The key conceptual difference:
A causal effect is a comparison of the same subjects under different actions (the counterfactual approach: everyone simultaneously treated and untreated); these are marginal effects.
Association is a comparison of different subjects under different conditions: effects conditional on the treatment-assignment group.

21 Causation vs. Association

22 Crucial Difference: "Association Is Not Causation"
Association: different risks in two disjoint subsets of the population determined by the subjects' actual exposure value. Pr[Y=1 | A=a] is the risk among subjects of the population who actually received exposure level a.
Causation: different risks in the entire population under two exposure values. Pr[Ya=1] is the risk in all subjects of the population had they received the counterfactual exposure level a.

23 Causation? Question: Under what conditions can associational measures be used to estimate causal effects?

24 Causation? Question: Under what conditions can associational measures be used to estimate causal effects? Answer: Ideal randomized experiments

25 Randomized Experiments
Randomization generates the missing counterfactual data structure: the missing counterfactual is missing completely at random (MCAR).
Because of this, causal effects can be consistently estimated in ideal RCTs despite the missing data.
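A small simulation makes the MCAR point concrete. Each subject carries both potential outcomes, and a coin flip reveals one of them; the associational risk difference then recovers the causal one. The risks (50% untreated, 30% treated) are assumed for illustration:

```python
import random
random.seed(0)

# Each subject has BOTH potential outcomes; randomization reveals one.
n = 200_000
y0 = [1 if random.random() < 0.5 else 0 for _ in range(n)]  # Y^{a=0}
y1 = [1 if random.random() < 0.3 else 0 for _ in range(n)]  # Y^{a=1}
a  = [1 if random.random() < 0.5 else 0 for _ in range(n)]  # coin-flip assignment

# Associational risks use only the observed half of each column (MCAR).
risk_treated   = sum(y1[i] for i in range(n) if a[i]) / sum(a)
risk_untreated = sum(y0[i] for i in range(n) if not a[i]) / (n - sum(a))

causal_rd = sum(y1) / n - sum(y0) / n      # uses the full counterfactual data
assoc_rd  = risk_treated - risk_untreated  # uses only observed data
print(round(causal_rd, 2), round(assoc_rd, 2))  # both close to -0.20
```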

26 Ideal Randomized Experiments
Exchangeability: the risk under the potential treatment value a among the treated equals the risk under the potential treatment value a among the untreated: Pr[Ya=1|A=1] = Pr[Ya=1|A=0].
Because these conditional risks are equal in all subsets defined by treatment status, they must also equal the marginal risk under treatment value a in the whole population.

27 Exchangeability
Pr[Ya=1|A=1] = Pr[Ya=1|A=0] = Pr[Ya=1]
Because the counterfactual risk under treatment value a is the same in both groups A=1 and A=0, the actual treatment A does not predict the counterfactual outcome Ya.
Exchangeability means that the counterfactual outcome and the actual treatment are independent: Ya ∐ A for all values of a.
When the treated and untreated are exchangeable, statisticians sometimes say the treatment is exogenous; exogeneity is a synonym for exchangeability.

28 Population of Interest

29 Key Issue
In the presence of exchangeability, the counterfactual risk under treatment in the white part of the population (in the slide's figure) would equal the counterfactual risk under treatment in the entire population.

30 RCT?
A = heart transplant; Y = death; L = prognostic factor (critical condition)
[Table: L, A, Y for the 20 gods; individual 0/1 values lost in transcription]
Counts:
13 of 20 (65%) gods were treated
9 of 12 (75%) with the prognostic factor (L=1) were treated
3 of 12 (25%) with the prognostic factor were not treated

31 RCT?
[Table: L, A, Y for the 20 gods; individual 0/1 values lost in transcription]
Two designs might have produced these data:
Design 1: randomly select 65% of the individuals in the population (13 of 20) and transplant a new heart into each selected individual.
Design 2: classify all individuals as in critical (L=1) or noncritical (L=0) condition, then randomly select 75% of those in critical condition (9 of 12) and 50% of those in noncritical condition (4 of 8) for treatment.
Design 1 is a marginally randomized experiment; Design 2 is a conditionally randomized experiment.

32 Conditionally RCT
The counterfactual mortality risk under each potential treatment value a is the same among the treated and the untreated, given that all were in critical condition at the time of treatment assignment: Pr[Ya=1|A=1, L=1] = Pr[Ya=1|A=0, L=1], i.e., Ya and A are independent given L=1 (Ya ∐ A | L=1 for all a).
Ya ∐ A | L=0 for all a also holds, so Ya ∐ A | L: conditional exchangeability holds for all values of L.

33 Conditionally RCT
Simply a combination of two marginally randomized experiments: one conducted in the subset of the population with L=0, the other in the subset with L=1.
The missing counterfactual values are not MCAR, but they are missing at random (MAR) conditional on the covariate L.
Randomization here does not generate marginal exchangeability, but it does generate conditional exchangeability.

34 Analysis of Randomized Trials
Question: How do you analyze a marginally randomized trial?
Answer:

35 Analysis of Randomized Trials
Question: How do you typically analyze a marginally randomized trial? Hint: what are the dependent and independent variables?
Answer:

36 Analysis of Randomized Trials
Question 1: How do you typically analyze a marginally randomized trial? Hint: what are the dependent and independent variables?
Answer: a crude (unadjusted) analysis with treatment and outcome.

37 Analysis of Randomized Trials
Question 2: How do you typically analyze a conditionally randomized trial?

38 Analysis of Randomized Trials
Question 2: How do you typically analyze a conditionally randomized trial?
Answer 2: Robins recommends standardization and inverse probability weighting (IPW). Stratification-type methods are common, but there are conditions where standardization ≠ stratification; review the Hernán & Robins text.
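IPW, the other method named above, weights each subject by the inverse of the probability of the treatment actually received given L. A sketch on a small made-up data set (the records and treatment probabilities are illustrative, not the slide's):

```python
# Illustrative records (l, a, y): 12 subjects with L=1 (9 treated),
# 8 subjects with L=0 (4 treated), mimicking a conditionally randomized design.
data = [(1, 1, 1)] * 6 + [(1, 1, 0)] * 3 + [(1, 0, 1)] * 2 + [(1, 0, 0)] * 1 \
     + [(0, 1, 1)] * 2 + [(0, 1, 0)] * 2 + [(0, 0, 1)] * 1 + [(0, 0, 0)] * 3

pr_A_given_L = {1: 9 / 12, 0: 4 / 8}  # Pr[A=1 | L=l], computed from the data above

def ipw_risk(a):
    """Estimate Pr[Y^a=1] by weighting each subject with A=a by 1/Pr[A=a|L]."""
    num = den = 0.0
    for l, a_i, y in data:
        if a_i != a:
            continue
        p = pr_A_given_L[l] if a == 1 else 1 - pr_A_given_L[l]
        w = 1.0 / p
        num += w * y
        den += w
    return num / den

print(ipw_risk(1), ipw_risk(0))  # 0.6 0.5
```

On data like these, the IPW estimate agrees exactly with standardization over L, as the g-methods literature leads one to expect.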

39 Summary
Randomization produces exchangeability: marginal exchangeability or conditional exchangeability.
Exchangeability allows us to use associational measures to estimate causal effects.
G-methods include standardization, marginal structural models (MSMs) with IPW, and g-estimation.

40 Observational Studies
The investigator has no control over treatment assignment (no randomization), so exchangeability cannot be achieved by design.
To estimate a causal contrast we must obtain valid observable substitute quantities for the desired counterfactual quantities.
If we don't have good substitutes, then we have a confounded relationship, i.e., the associational RR ≠ the causal RR.

41 Observational Studies
Conceptual justification: conceptualize observational studies as though they were conditionally randomized experiments; we assume that some components of the observational study happen by chance.

42 Identifiability Conditions
Consistency: treatment levels are not assigned by the researcher, but correspond to well-defined interventions.
Positivity: all conditional probabilities of treatment are greater than zero in every stratum of covariates.
Conditional exchangeability: the conditional probabilities of being assigned to a specific treatment are not chosen by the investigator, but can be calculated from the data on measured covariates.

43 Observational Studies?
In observational studies, exchangeability and conditional exchangeability cannot be achieved by design.
Question 1: How do we address conditional exchangeability in observational studies?

44 Observational Studies?
In observational studies, exchangeability and conditional exchangeability cannot be achieved by design.
Question 1: How do we address conditional exchangeability in observational studies?
Question 2: How should we pick covariates for our observational studies?

45 Big Picture
Covariates should be selected to produce conditional exchangeability.
Confounding must be removed to produce conditional exchangeability; a variable whose adjustment removes confounding is a confounder.
Adjusting for certain types of covariates can block paths, open paths, or do nothing.
We want to adjust for variables that block all backdoor paths between the treatment and the outcome, i.e., remove confounding.

46 Theory of Causal DAGs
Mathematically formalized by:
Pearl (1988, 1995, 2000)
Spirtes, Glymour, and Scheines (1993, 2000)

47 Directed Acyclic Graphs
DAGs are abstract mathematical objects that encode an investigator's a priori assumptions about the causal relations among the exposure, outcome, and covariates.
They represent both joint probability distributions and causal structures.

48 Value of DAGs
Support communication among researchers and clinicians.
Explicate our beliefs and background knowledge about causal structures.
Allow us to determine what needs to be measured to remove confounding.
Help us determine how bias can be induced.
Help us choose appropriate statistics.

49 Directed Acyclic Graphs (DAGs)
Directed edges (arrows) link nodes (variables); variables joined by an arrow are said to be adjacent or neighbors.
Acyclic because there are no arrows from descendants (effects) back to ancestors (causes).
Descendants of a variable X are the variables affected either directly or indirectly by X; ancestors of X are all the variables that affect X directly or indirectly.
Paths between two variables can be directed or undirected.

50 d-Separation Criteria
Rules linking the absence of open paths to statistical independencies; they describe the data distributions expected if the causal structure represented by the graph is correct.
Unconditional d-separation: a path is open (unblocked) if there is no collider on the path; a collider blocks a path.
d-connected: there is an open path between two variables.
If A and Y are d-separated in a causal graph, then the causal assumptions encoded by the graph imply that A and Y will be unassociated: when all paths between A and Y are closed, they are marginally independent.

51 Graphical Conditioning
Conditioning (adjusting) on a collider F on a path, or on any descendant of F, opens the path at F: U1 and U2 are marginally independent but conditionally associated (conditioning on F).
Conditioning on a non-collider C closes the path and removes C as a source of association between A and Y: A and Y are marginally associated but conditionally independent (conditioning on C).
These concepts link the causal structures depicted in a DAG to the statistical associations we expect in data generated from that causal structure.
Example: family income, mother's risk for diabetes, and mother having diabetes, when testing the effect of low income on diabetes.
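The collider rule above can be demonstrated by simulation: two independent causes U1 and U2 of a common effect F are unassociated marginally but become associated once we restrict to F = 1. The structure F = U1 OR U2 is an assumed toy mechanism, not the slide's example:

```python
import random
random.seed(1)

# U1 and U2 are independent fair coins; F is their common effect.
n = 100_000
u1 = [random.random() < 0.5 for _ in range(n)]
u2 = [random.random() < 0.5 for _ in range(n)]
f  = [a or b for a, b in zip(u1, u2)]  # collider: F = 1 if either cause fired

def pr_u1_given_u2(u2_val, subset=None):
    """Pr[U1=1 | U2=u2_val], optionally restricted to subset[i] being true."""
    idx = [i for i in range(n) if u2[i] == u2_val and (subset is None or subset[i])]
    return sum(u1[i] for i in idx) / len(idx)

# Marginally: both close to 0.5, so U1 and U2 are independent.
print(pr_u1_given_u2(True), pr_u1_given_u2(False))
# Conditioning on F=1 opens the path: among F=1, U2=0 forces U1=1.
print(pr_u1_given_u2(True, f), pr_u1_given_u2(False, f))  # about 0.5 vs exactly 1.0
```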

52 Rules Linking the Causal Assumptions of a DAG to Statistical Independencies
Causal Markov assumption: any variable X is independent of any other variable Y conditional on the direct causes of X, unless Y is an effect of X.
Faithfulness: positive and negative causal effects never perfectly offset one another; if X affects Y through two pathways, one positive and one negative, the net effect will never be exactly zero.
Negligible randomness: statistical associations, or their absence, are not attributable to random variation or chance (assume large samples).

53 Graphical vs. Statistical Criteria for Identifying Confounders
Statistical: a confounder must
be associated with the exposure under study in the source population;
be a risk factor for the outcome, though it need not actually cause the outcome;
not be affected by the exposure or the outcome.
Graphical: a confounder must
be a common cause;
lie on an unblocked backdoor path.

54 Theory of Causal DAGs
An association between an exposure and an outcome can be produced by three causal structures:
1. A cause-and-effect relationship (direct or indirect)
2. A common cause: confounding
3. A common effect: conditioning on common effects ("colliders") produces selection bias; an exposure and outcome that share a common effect will be conditionally associated if the associational measure is computed within levels of the common effect

55 Unified Theory of Bias
Bias can be reduced to, or explained by, three structures:
Reverse causation: in case-control studies the outcome precedes exposure measurement, or the outcome can affect the exposure; also measurement error and information bias.
Common cause: confounding, including confounding by indication.
Conditioning on common effects: colliders, selection bias, time-varying confounding.

56 Covariate Selection
With adequate background knowledge:
Confounder identification must be grounded in an understanding of the causal structure linking the variables being studied (treatment and disease).
Build a directed acyclic graph (DAG) to check whether the necessary criteria for confounding exist.
Condition on the minimal set of variables necessary to remove confounding.
With inadequate background knowledge:
Remove known instrumental variables, colliders, and intermediates (variables measured post-treatment).
Use automated selection procedures such as the high-dimensional propensity score (HDPS).

57 Confounding and Bias
Under-adjustment occurs when an open backdoor path was not closed.
Over-adjustment can occur from adjusting for instrumental variables, intermediate variables, colliders, or variables caused by the outcome.

58 Confounder
A common cause, i.e., a confounder L, distorts the effect of treatment A on disease Y.
Always adjust for confounders, unless the data set is small and the confounder has a strong association with treatment but a weak association with the outcome.
The goal is to produce conditional exchangeability.

59 Confounder Example
A = treatment (a=1: statin alone; a=0: niacin alone)
L = baseline cholesterol (l=1: LDL ≥ 160 mg/dL; l=0: LDL < 160 mg/dL)
Y = myocardial infarction (Y=1: yes; Y=0: no)
Reichenbach's principle of the common cause is relevant here.

60 Intermediate Variable
Adjusting for an intermediate variable I in a fixed-covariate model will remove the effect of treatment A on the disease/outcome Y.
In a fixed-covariate model we do not want to include variables influenced by A or Y.
A time-varying treatment model does handle time-varying confounders that are also intermediate variables.
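A simulation illustrates the warning above. In this assumed toy mechanism the treatment effect is fully mediated (A lowers I, and only I drives Y); the probabilities are invented for illustration. The crude contrast shows an effect, but stratifying on the intermediate I makes it vanish:

```python
import random
random.seed(2)

# Fully mediated toy mechanism: A -> I -> Y (probabilities are made up).
n = 100_000
a  = [random.random() < 0.5 for _ in range(n)]
i_ = [random.random() < (0.2 if a_k else 0.6) for a_k in a]    # Pr[I=1] depends on A
y  = [random.random() < (0.5 if i_k else 0.1) for i_k in i_]   # Pr[Y=1] depends only on I

def risk(treated, stratum=None):
    """Pr[Y=1 | A=treated], optionally within a stratum of I."""
    idx = [k for k in range(n) if a[k] == treated and (stratum is None or i_[k] == stratum)]
    return sum(y[k] for k in idx) / len(idx)

# Crude contrast shows the (mediated) effect of treatment...
print(risk(True) - risk(False))                 # close to -0.16
# ...but within strata of I the contrast is near zero: adjustment removed the effect.
print(risk(True, True) - risk(False, True))     # close to 0
print(risk(True, False) - risk(False, False))   # close to 0
```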

61 Intermediate Example
A = treatment (a=1: statin alone; a=0: niacin alone)
I = post-treatment cholesterol (i=1: LDL ≥ 160 mg/dL; i=0: LDL < 160 mg/dL)
Y = myocardial infarction (Y=1: yes; Y=0: no)
Other pathways: modulating the inflammatory response, preventing thrombus formation.

62 Collider
Adjusting for a collider can produce bias: conditioning on the common effect F without adjusting for U1 or U2 induces an association between U1 and U2, which will confound the association between A and Y.
The backdoor path through F is blocked until we condition on it: F meets the traditional statistical criteria for a confounder, but it is not a confounder and this is not a confounded study.

63 Collider
A = antidepressant use; Y = lung cancer
U1 = depression; U2 = smoking status; F = cardiovascular disease
The backdoor path is blocked: F meets the traditional criteria for a confounder, but it is not a confounder and this is not a confounded study.

64 Collider Example
Question: does low education increase the risk of type II diabetes (a causal null is assumed)? We measured the mother's diabetes status, but have no measures of family income during the subject's childhood or of maternal genes that increase diabetes risk.
Under the assumptions in the DAG, should we adjust for mother's diabetes status?
Assumptions: if poor during childhood, then poor as an adult; poverty is associated with both diabetes and low education.
Mother's diabetes status will be statistically associated with education (they share a common prior cause), so it meets the statistical criteria for a confounder.
But conditioning on mother's diabetes unblocks a blocked backdoor path and induces a spurious statistical association between low education and diabetes: it does not meet the graphical criteria for a confounder.
(Analogy: a basketball player selected for being tall or fast.)

65 Variables Associated with Treatment or Disease Only
Including variables associated with treatment (A) only can cause bias and imprecision.
Variables associated with disease but not treatment (risk factors) can be included in models; they are expected to decrease the variance of the treatment effect without increasing bias.
Including variables associated with disease also reduces the chance of missing important confounders.
(This argument relies on a high level of inductive reasoning.)

66 Reality Is Complicated
Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Medical Research Methodology. 2008;8:70.

67 Determining the Minimal Set of Variables
Produce a DAG and get clinical experts to agree on the underlying causal network.
Condition on (block) the variables that close the open backdoor paths; open backdoor paths are the source of confounding.
Pearl's 6-step approach for determining the minimal set of variables is illustrated by Shrier & Platt (Reducing bias through directed acyclic graphs, BMC Medical Research Methodology 2008;8:70).

68 Limitations of the DAG Approach
Subject-matter knowledge is often not good enough to draw a DAG that can be used to determine the minimal set of covariates needed to produce conditional exchangeability.
In large database studies with many providers, it is difficult to know all the factors that influence treatment decisions.

69 Insufficient Background Knowledge
Recommendations:
Propensity score (PS) approach: remove colliders and instruments (variables associated with treatment but not disease). In a large PS study we should include as many of the remaining variables as possible, focusing on variables that are a priori thought to be strongly causally related to the outcome (risk factors, confounders).
Outcome-model approach: use a change-in-estimate approach to select variables.
Since evidence on the best variable-selection approaches is limited, researchers should explore the sensitivity of their results to different variable-selection strategies, as well as to the removal and inclusion of variables that could be IVs or colliders.

70 Analysis of Observational Data Based on Counterfactuals
Fixed treatments: propensity scores, instrumental variables, IPW.
Time-varying treatments (sequential randomization): g-estimation, doubly robust estimators.

71 Simulation
We have developed simulations to understand and teach these concepts (poster at the CDA conference). If interested, please contact me.
