Mendelian randomization with invalid instruments: Egger regression and Weighted Median Approaches David Evans.

Slides:



Advertisements
Similar presentations
Statistics Review – Part II Topics: – Hypothesis Testing – Paired Tests – Tests of variability 1.
Advertisements

Correlation and regression
Chapter 12 Simple Regression
Chapter 13 Introduction to Linear Regression and Correlation Analysis
The Simple Regression Model
Chapter 14 Introduction to Linear Regression and Correlation Analysis
Correlation and Regression Analysis
Summary of Quantitative Analysis Neuman and Robson Ch. 11
Lecture 5 Correlation and Regression
Introduction to Linear Regression and Correlation Analysis
Correlation and Regression
Inference for regression - Simple linear regression
STATISTICS: BASICS Aswath Damodaran 1. 2 The role of statistics Aswath Damodaran 2  When you are given lots of data, and especially when that data is.
1 1 Slide Simple Linear Regression Coefficient of Determination Chapter 14 BA 303 – Spring 2011.
Introduction to Linear Regression
Chap 12-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 12 Introduction to Linear.
Meta-analysis and “statistical aggregation” Dave Thompson Dept. of Biostatistics and Epidemiology College of Public Health, OUHSC Learning to Practice.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
The Campbell Collaborationwww.campbellcollaboration.org C2 Training: May 9 – 10, 2011 Introduction to meta-analysis.
PCB 3043L - General Ecology Data Analysis. OUTLINE Organizing an ecological study Basic sampling terminology Statistical analysis of data –Why use statistics?
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Regression Analysis © 2007 Prentice Hall17-1. © 2007 Prentice Hall17-2 Chapter Outline 1) Correlations 2) Bivariate Regression 3) Statistics Associated.
Lecture 10: Correlation and Regression Model.
Correlation & Regression Analysis
Chapter 11: Categorical Data n Chi-square goodness of fit test allows us to examine a single distribution of a categorical variable in a population. n.
NURS 306, Nursing Research Lisa Broughton, MSN, RN, CCRN RESEARCH STATISTICS.
Methods of Presenting and Interpreting Information Class 9.
Statistics Made Simple
Chapter 13 Simple Linear Regression
Lecture #25 Tuesday, November 15, 2016 Textbook: 14.1 and 14.3
GS/PPAL Section N Research Methods and Information Systems
More Multiple Regression
ESTIMATION.
REGRESSION G&W p
Objectives (BPS chapter 23)
Regression Analysis: Statistical Inference
Topic 10 - Linear Regression
Correlation and Simple Linear Regression
Inference for Regression
Linear Regression and Correlation Analysis
Chapter 11: Simple Linear Regression
PCB 3043L - General Ecology Data Analysis.
Correlation – Regression
Lecture 4: Meta-analysis
Chapter 11 Simple Regression
Understanding Standards Event Higher Statistics Award
STEM Fair: Statistical Analysis
Elementary Statistics
POSC 202A: Lecture Lecture: Substantive Significance, Relationship between Variables 1.
Correlation and Simple Linear Regression
Introduction to Inferential Statistics
Correlation and Regression
Statistical Analysis Error Bars
Gerald Dyer, Jr., MPH October 20, 2016
More Multiple Regression
Correlation and Simple Linear Regression
More Multiple Regression
Statistics Made Simple
Tutorial 1: Misspecification
Hypothesis Tests for a Standard Deviation
Simple Linear Regression and Correlation
Product moment correlation
UNIT V CHISQUARE DISTRIBUTION
Introduction to Regression
S.M.JOSHI COLLEGE, HADAPSAR
Correlation and Simple Linear Regression
Correlation and Simple Linear Regression
Presentation transcript:

Mendelian randomization with invalid instruments: Egger regression and Weighted Median Approaches David Evans

What is the problem? Mendelian Randomization (MR) uses genetic variants to test for causal relationships between phenotypic exposures and disease-related outcomes Due to the proliferation of GWAS, it is increasingly common for MR analyses to use large numbers of genetic variants Increased power but greater potential for pleiotropy Pleiotropic variants affect biological pathways other than the exposure under investigation and therefore can lead to biased causal estimates and false positives under the null

Two Sample MR: Single Variants Single causal estimate from the ratio estimator / Wald method Wald = Beta-GY Beta-GX Causal estimate using Wald method:

Two Sample MR: Multiple Variants With multiple genetic variants, causal effect estimated using TSLS – weighted average of the ratio estimates calculated using each genetic variant in turn. If the genetic variants are uncorrelated, inverse-variance weighted estimated using summarized data can be calculated. In infite samples IVW = TSLS, but will differ slightly in finite samples Beta-hat = study specific IV estimates (YZ/XZ) Causal estimate using IVW from summarised data: (Approximates TSLS)

MR – with direct pleiotropy . Single variant Wald estimate: Estimate will be biased unless alphaj = 0; or the alphaj’s cancel out Multiple variant TSLS / IVW :

Egger Regression: Central concept In Mendelian Randomization when multiple genetic variants are being used as IVs, Egger regression can: Identify the presence of ‘directional’ pleiotropy (biasing the IV estimate) provide a less biased causal estimate (in the presence of pleiotropy)

InSIDE Assumption Relaxing MR’s assumptions . W Standard MR has 3 assumptions. InSIDE relaxes the third assumption W

Example: ALL INVALID INSTRUMENTS INSIDE ASSUMPTION SATISFIED SNP – outcome association This is equal to the coefficient from a weighted regression of the gene–outcome association estimates (^C j) on the gene–exposure association estimates (^cj) with the intercept constrained to zero, and weighted by the inverse of the precision of the IV–outcome coefficients (r2 Yj ). SNP – exposure association

InSIDE: Bias of ratio estimator SNP – outcome association Since bias is inversely proportional to SNP-exposure association, ratio estimates for stronger genetic variants (such as (i) ) will be on average closer to the true causal effect SNP – exposure association

Egger regression: Intercept not constrained to zero Egger’s test assesses whether the intercept term is significantly different from zero. This will occur if estimates from weaker genetic variants are more skewed towards higher/lower values compared with estimates from stronger genetic variants. The estimated values of the intercept can be interpreted as average pleiotropic effect across all genetic variants. An intercept term different from zero indicates overall directional pleiotropy (ie pleiotropy not balanced around the null) Egger’s test assesses whether the intercept term is significantly different from zero. The estimated values of the intercept can be interpreted as the average pleiotropic effect across all genetic variants. An intercept term different from zero indicates directional pleiotropy

Height and lung function SNP-Lung Function SNP-Height 180 variants for height Funnel plot suggests little asymmetry present (indicative of no directional pleiotropy); supported by concordance between egger test and IVW, and the lack of significant intercept SNP - Height Causal estimate IVW = 0.59 (95% CI: 0.50, 0.67 ) Egger = 0.58 (95% CI: 0.50, 0.67); intercept -0.001 p=0.5

BP and Coronary Disease FUNNEL PLOTS Some asymmetry is present in these funnel plots, consistent with directional pleiotropy Eg for SBP, 10 of the 13 weakest variants have IV estimates greater than the IVW causal effect Visual evidence for asymmetry

BP and Coronary Disease FUNNEL PLOTS Some asymmetry is present in these funnel plots, consistent with directional pleiotropy Eg for SBP, 10 of the 13 weakest variants have IV estimates greater than the IVW causal effect

BP and Coronary Disease Scatter Plots Intercepts deviate from zero but not strong statistical evidence for difference! Egger test for intercept p=0.2 Egger test for intercept p=0.054

BP and Coronary Disease FUNNEL PLOTS IVW= 0.054 logOR/mmHg p=4x10-6 Egger =0.015 logOR/mmHg p=0.6 IVW= 0.083 logOR/mmHg p=1x10-5 Egger =-0.024 logOR/mmHg p=0.7

Simple Median Method Order instrumental variables estimates and take the median Like all subsequent estimators it enjoys a 50% breakdown limit

Weighted Median Method ^ γj2 wj’ ___ ____ wj’ = wj = σ2Yj Σjwj’

Penalized Weighted Median Method Although the invalid IVs do not contribute directly to the median estimate, they do influence it in small samples Like all subsequent estimators it enjoys a 50% breakdown limit

Penalized Weighted Median Method One way of minimizing this problem is down-weighting the contribution to the analysis of genetic variants with heterogeneous ratio estimates Heterogeneity between estimates can be quantified by Cochrane’s Q statistic: The Q statistic has a chi-squared distribution on J – 1 degrees of freedom under the null hypothesis of no heterogeneity Each individual component of Q has a chi-square distribution with 1 df. Bowden proposes using a one sided upper P value (denoted qj): Q = ΣjQj ^ = Σjwj’(βj - β)2 wj* = w’j × min(1, 20qj)

Penalized Weighted Median Method

References Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Bowden J, Davey Smith G, Burgess S. Int J Epidemiol. 2015 Apr;44(2):512-25. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Bowden J, Davey Smith G, Haycock PC, Burgess S. Genet Epidemiol. 2016 May;40(4):304-14.