Summary: connecting the question to the analysis(es) Jay S. Kaufman, PhD McGill University, Montreal QC 26 February 2016 3:40 PM – 4:20 PM National Academy.


1 Summary: connecting the question to the analysis(es) Jay S. Kaufman, PhD McGill University, Montreal QC 26 February 2016 3:40 PM – 4:20 PM National Academy of Sciences 2101 Constitution Ave NW, Washington, DC 20418 USA

2 Causal inference is necessary for medical and public policy decision-making because we hope to optimize some outcome. Causal inference is about inherently unobservable things (i.e., the future under different scenarios). Because we can’t directly observe what we want to know, we model it. [Figure: good models versus bad models] From 1999 to 2009, the number of Americans who fell into a swimming pool and drowned each year is correlated with the number of films in which Nicolas Cage appeared that year. Shall we reduce the number of pool drownings by keeping Cage off the screen?

3 Statistical models are used to estimate relationships between variables in observational data sets. [Figure: plots of Y against X, annotated with β0 and β1] But it is mechanistic knowledge or structural assumptions that allow us to infer causal effects from these relationships (not statistical considerations).

4 Consider two different bivariate associations: 1) Relation between ecologic levels of TV viewing and ecologic rates of dental caries, by country. [Figure: frequency measure of dental caries plotted against hours of television viewing per capita] β = expected change in outcome per unit change in exposure.

5 Consider two different bivariate associations: 2) Relation between ecologic levels of refined sugar consumption and ecologic rates of dental caries, by country. [Figure: frequency measure of dental caries plotted against daily grams of refined sugar consumed per capita] β = expected change in outcome per unit change in exposure. Adjust for (condition on) level of socioeconomic development, and find that Pr(Y|SET[X=x]) is null in scenario #1, and non-null in scenario #2.
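
The contrast between the two scenarios can be illustrated with a small simulation. This is only a sketch under invented assumptions: the variable names, the linear data-generating model, and the coefficients (0.8 for sugar, 0.5 for development) are hypothetical and do not come from the slides. Development drives both TV viewing and sugar consumption, but only sugar affects caries.

```python
import random

def slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def adjusted_slope(x, y, z):
    """Slope of y on x after adjusting both for z (residualization)."""
    return slope(residuals(z, x), residuals(z, y))

# Hypothetical data-generating model (an assumption for illustration only).
rng = random.Random(0)
n = 20_000
dev = [rng.gauss(0, 1) for _ in range(n)]        # socioeconomic development
tv = [d + rng.gauss(0, 1) for d in dev]          # TV viewing: no effect on caries
sugar = [d + rng.gauss(0, 1) for d in dev]       # sugar: true effect 0.8
caries = [0.8 * s + 0.5 * d + rng.gauss(0, 1) for s, d in zip(sugar, dev)]

crude_tv = slope(tv, caries)
adj_tv = adjusted_slope(tv, caries, dev)
crude_sugar = slope(sugar, caries)
adj_sugar = adjusted_slope(sugar, caries, dev)

print("TV:    crude", round(crude_tv, 2), "adjusted", round(adj_tv, 2))
print("Sugar: crude", round(crude_sugar, 2), "adjusted", round(adj_sugar, 2))
```

In this toy setup both crude slopes are positive, but after conditioning on development the TV slope should fall to roughly zero while the sugar slope should stay near the simulated effect of 0.8.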

6 Read Pr(Y|SET[X=x]) as: Pr(Y|SET[X=x1]) versus Pr(Y|SET[X=x2]), where x1 and x2 are two levels at which you can intervene to set the exposure, and the contrast is usually a difference or ratio. Clearly, quantities intermediate between exposure and outcome are not "confounders"; they are just part of the mechanism through which the exposure has the effect that it has.

7 For example: X (cigarette tax) → Z (cigarette consumption) → Y (lung cancer mortality). The causal effect of manipulating cigarette tax is: Pr(Y|SET[X=$1]) versus Pr(Y|SET[X=$2]). If these are the only three variables relevant to this problem, this causal effect is estimated without bias by the contrast of the observed probabilities, Pr(Y|X=$1) versus Pr(Y|X=$2), NOT by adjusting for the intermediate Z. In fact, the adjusted effect would be null, which may be very far from the truth.
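
A minimal simulation of this three-variable chain shows why adjusting for the intermediate gives a null result. All probabilities below are invented for illustration (they are not from the slides): the tax level changes the chance of heavy consumption, and mortality depends on consumption only.

```python
import random

def simulate(n=200_000, seed=1):
    """Draw (x, z, y) rows from a hypothetical chain X -> Z -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.choice([1, 2])                                   # tax: $1 or $2
        z = 1 if rng.random() < (0.6 if x == 1 else 0.3) else 0  # heavy consumption
        y = 1 if rng.random() < (0.20 if z == 1 else 0.05) else 0  # death
        rows.append((x, z, y))
    return rows

def risk(rows, x, z=None):
    """Observed Pr(Y=1 | X=x), optionally also conditioning on Z=z."""
    selected = [r[2] for r in rows if r[0] == x and (z is None or r[1] == z)]
    return sum(selected) / len(selected)

rows = simulate()

# Unadjusted contrast: Pr(Y | X=$2) - Pr(Y | X=$1), the total effect of the tax.
total_effect = risk(rows, 2) - risk(rows, 1)

# Contrast within each level of the intermediate Z.
within_z = {z: risk(rows, 2, z) - risk(rows, 1, z) for z in (0, 1)}

print("unadjusted risk difference:", round(total_effect, 3))
print("risk difference within strata of Z:", {z: round(d, 3) for z, d in within_z.items()})
```

Under these assumed probabilities the unadjusted risk difference is clearly negative (higher tax, lower mortality), while the stratum-specific differences hover around zero, because conditioning on Z blocks the only path from X to Y.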

8 That’s exactly why we use this graphical language for encoding subject matter knowledge about causality. A way of communicating structural assumptions. Non-parametric. Cannot be deduced from the data.

9 Compare a graphical model with a typical parametric epidemiologic model, such as logistic regression. [Figure: causal diagram with nodes X, Z, Y, and U] The graphical model asserts only that Y = f(X, Z, ε_Y), and that X = f(U, ε_X) and Z = f(U, ε_Z). The logistic regression model makes MANY assertions, including the multiplicative interaction of X and Z, and the linearity of the ln(odds) of Y across all values of X and Z.

10 On the other hand, a graphical model can represent many structural relations that cannot be encoded in a typical statistical model. [Figure: causal diagram with nodes Z, X, and Y] The graphical model asserts that X = f(Z, ε_X). The logistic regression model cannot easily represent this constraint, even if it is known by the investigators to be true on subject matter grounds (e.g., Z = SEX, X = SMOKING).

11 Confounding. Confounding is a divergence between two kinds of conditional probability distributions of Y: the distribution given that we find X at the value x (estimable from the data), and the distribution given that we intervene to force X to take the value x. Confounding is the distinction between seeing and doing. [Figure: causal diagram with nodes Z, X, and Y]
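
The seeing-versus-doing distinction can be sketched in a toy simulation. The structure assumed here (Z affects both X and Y, and X affects Y) and every probability are hypothetical choices made for illustration. "Seeing" compares Y among units found at X=1 versus X=0; "doing" forces X to each value regardless of Z.

```python
import random

def draw(rng, set_x=None):
    """Draw one (x, z, y) unit; if set_x is given, X is forced to that value."""
    z = 1 if rng.random() < 0.5 else 0
    if set_x is None:
        # Observational regime: Z influences which exposure is received.
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
    else:
        # Interventional regime: X is set by fiat, ignoring Z.
        x = set_x
    y = 1 if rng.random() < 0.1 + 0.2 * x + 0.4 * z else 0
    return x, z, y

def mean(values):
    return sum(values) / len(values)

rng = random.Random(2)
n = 100_000

# Seeing: contrast of Y between units observed at X=1 and at X=0.
observed = [draw(rng) for _ in range(n)]
seeing = (mean([y for x, _, y in observed if x == 1])
          - mean([y for x, _, y in observed if x == 0]))

# Doing: contrast of Y when X is forced to 1 versus forced to 0.
doing = (mean([draw(rng, set_x=1)[2] for _ in range(n)])
         - mean([draw(rng, set_x=0)[2] for _ in range(n)]))

print("seeing (observed contrast):", round(seeing, 3))
print("doing (intervention contrast):", round(doing, 3))
```

With these assumed numbers the simulated effect of X on Y is a risk difference of 0.2, which the "doing" contrast should approximate, while the "seeing" contrast should come out substantially larger because units with Z=1 are both more often exposed and at higher risk.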

12 Identification: Can we express the Y|SET(X=x) quantity in terms of observables? Estimation: What is the actual numerical value of the contrast E(Y|SET(X=1)) – E(Y|SET(X=0))? [Figure: causal diagram with nodes Z, X, and Y]
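
As an illustration of what identification can look like, suppose (as an assumption, since the diagram itself is not reproduced in this transcript) that Z is a measured, discrete common cause of X and Y and that there are no other confounders. Under that assumption, the standard adjustment (standardization) formula expresses the SET quantity purely in terms of observables, and the estimation step then plugs in sample quantities:

```latex
\begin{align}
\Pr(Y = y \mid \mathrm{SET}[X = x])
  &= \sum_{z} \Pr(Y = y \mid X = x, Z = z)\,\Pr(Z = z) \\
E(Y \mid \mathrm{SET}[X = 1]) - E(Y \mid \mathrm{SET}[X = 0])
  &= \sum_{z} \bigl[ E(Y \mid X = 1, Z = z) - E(Y \mid X = 0, Z = z) \bigr] \Pr(Z = z)
\end{align}
```

If Z were unmeasured, or if other common causes existed, this expression would not be available and identification would have to rest on different structural assumptions.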

13 Most causal inference methods assume that you have no unmeasured confounders: regression, propensity scores, marginal structural models, and G-methods (SNMs, G-formula, etc.). “Quasi-experimental” methods use structural assumptions to achieve identification even in the presence of unmeasured confounding: instrumental variables, regression discontinuity, fixed effects, and differences in differences.
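
To make the second group concrete, here is a hedged sketch of one of the listed methods, instrumental variables, using the simple Wald (ratio) estimator. Everything in the data-generating model is a hypothetical assumption: U is an unmeasured confounder, W is a randomized binary instrument that affects Y only through X, and the simulated effect of X on Y is 2.0.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)
n = 100_000
w, x, y = [], [], []
for _ in range(n):
    u = rng.gauss(0, 1)                      # unmeasured confounder (never used below)
    wi = rng.choice([0, 1])                  # instrument: randomized, independent of U
    xi = 1.0 * wi + u + rng.gauss(0, 1)      # exposure depends on instrument and U
    yi = 2.0 * xi + 3.0 * u + rng.gauss(0, 1)  # outcome: simulated effect of X is 2.0
    w.append(wi)
    x.append(xi)
    y.append(yi)

# Crude regression of Y on X is biased because U is not adjusted for.
crude = ols_slope(x, y)

# Wald estimator: effect of W on Y divided by effect of W on X.
y1 = mean([yi for wi, yi in zip(w, y) if wi == 1])
y0 = mean([yi for wi, yi in zip(w, y) if wi == 0])
x1 = mean([xi for wi, xi in zip(w, x) if wi == 1])
x0 = mean([xi for wi, xi in zip(w, x) if wi == 0])
wald = (y1 - y0) / (x1 - x0)

print("crude slope:", round(crude, 2))
print("IV (Wald) estimate:", round(wald, 2))
```

The point of the sketch is that the Wald estimate should land near the simulated effect without ever using U, whereas the crude slope is pushed upward by the confounding. The price is the structural assumption that W influences Y only through X, which cannot be verified from the data.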

14 Some causal inference methods achieve identification based on extrapolation of a parametric model. Semi-parametric methods (e.g. propensity scores, IPTW, TMLE, etc) rely less on model form. Letting a computer pick the model reduces “wish bias”. Non-parametric methods (e.g. matching) require no model at all. Doubly robust methods require that at least one model be right, but not both. Computer intensive methods (e.g. bootstrapping) reduce reliance on distributional assumptions.

15 Summary: Models can be used to parameterize associations between treatment and response variables. Decision makers need to interpret associations causally, to predict the change in Y that will occur under specific interventions on X. The validity of this causal interpretation is threatened by both systematic and random errors. The systematic errors are all functions of causal structure which cannot be deduced from the data. Most of the “complex” methods described are only “complex” because of time.

