Meta-analysis for GWAS BST775 Fall 2013

Presentation transcript:

Meta-analysis for GWAS BST775 Fall 2013

DEMO

Replication criteria for a successful GWAS
- $P < 5 \times 10^{-8}$
- Replicate in independent cohorts
- Fine mapping

Statistical power of detection in GWAS for variants that explain 0.1–0.5% of the variation at a type I error rate of 5 ×

Figure 2. Plots of power (solid lines) and coverage (dotted line) for increasing sample sizes of cases and controls (x-axis). Spencer CCA, Su Z, Donnelly P, Marchini J (2009) Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip. PLoS Genet 5(5): e doi: /journal.pgen

Meta vs Mega: How to combine multiple GWAS for the same trait

Meta-analysis
- Conduct the analysis separately in each cohort
- Share only summary statistics for each variant
- Combine the evidence using standard meta-analysis methods

Mega-analysis
- Collect genotype and phenotype information from all participating cohorts
- Conduct a single analysis in the pooled sample, adjusting for cohort

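Why meta-analysis
- Achieves a larger sample size
- Does not require sharing private genotype/phenotype data
- Analysing the pooled sample gives no power gain over meta-analysis of summary statistics (Lin and Zeng 2010)
- Adjusts for heterogeneity among cohorts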

Popularity of meta-analysis
- Most genetic risk variants discovered in the past few years have come from large-scale meta-analyses.
- Several hundred GWAS meta-analyses have been published.
- As of June 15, 2012: 139 meta-analyses, each with n > 10,000
- Largest: n ≈ 500,000, for height (rumored)

Popularity of meta-analysis

Meta-analysis stages

Statistical approaches
- P-value
- Z-score
- Fixed effects
- Random effects
- Bayesian approaches
- Multivariate approaches

Fisher’s P-value

The simplest genome-wide association study (GWAS) meta-analysis approach is to combine P values using Fisher's method. The formula for the statistic is

$$X^2 = -2 \sum_{i=1}^{k} \ln P_i$$

where $P_i$ is the P value for the $i$th study, and $k$ is the number of studies in the meta-analysis. Under the null hypothesis, $X^2$ follows a $\chi^2$ distribution with $2k$ degrees of freedom.
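As a minimal sketch of Fisher's method, the Python function below computes $X^2 = -2 \sum_i \ln P_i$ for the $k$ study P values and its upper-tail probability under a $\chi^2$ distribution with $2k$ degrees of freedom. The function name `fisher_combined_p` is illustrative and not taken from the slides. It assumes independent studies and uses only the standard library, relying on the closed form of the $\chi^2$ tail for even degrees of freedom.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    Returns (x2, combined_p), where x2 = -2 * sum(ln P_i) and combined_p is
    the upper-tail probability of a chi-square with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one P value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")

    x2 = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!, so no external library is needed.
    half = x2 / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return x2, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical per-study P values for one SNP.
    x2, p = fisher_combined_p([0.01, 0.02, 0.20])
    print(f"X2 = {x2:.3f}, combined P = {p:.3g}")
```

Fisher's method uses only the P values, so it ignores both the direction of effect and the sample size of each study.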

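Stouffer’s Z-score

The Z-score meta-analysis can be implemented using the equation

$$Z = \frac{\sum_{i=1}^{k} w_i Z_i}{\sqrt{\sum_{i=1}^{k} w_i^2}}$$

where $w_i$ is the square root of the sample size of the $i$th study and

$$Z_i = \Phi^{-1}\left(1 - \frac{P_i}{2}\right) \cdot \operatorname{sign}(\Delta_i)$$

where $\Phi$ is the standard normal cumulative distribution function, $P_i$ is the two-sided P value of the $i$th study, and $\operatorname{sign}(\Delta_i)$ is its direction of effect.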
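The sample-size-weighted Z-score approach can be sketched in Python as follows. This is an illustration under stated assumptions, not the slides' own implementation: each study reports a two-sided P value and a direction of effect coded as +1 or -1, the weights are the square roots of the sample sizes, and the function name `stouffer_z` is hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stouffer_z(p_values, directions, sample_sizes):
    """Sample-size-weighted Z-score meta-analysis.

    p_values:     two-sided P values in (0, 1], one per study
    directions:   +1 or -1, direction of effect for the same reference allele
    sample_sizes: study sample sizes; the weight is sqrt(n_i)
    Returns (z, two_sided_p).
    """
    if not p_values or not (len(p_values) == len(directions) == len(sample_sizes)):
        raise ValueError("inputs must be non-empty and of equal length")

    num = 0.0
    den = 0.0
    for p, d, n in zip(p_values, directions, sample_sizes):
        # -inv_cdf(p/2) equals inv_cdf(1 - p/2) but keeps precision for tiny p.
        z_i = math.copysign(-_STD_NORMAL.inv_cdf(p / 2.0), d)
        w_i = math.sqrt(n)
        num += w_i * z_i
        den += w_i ** 2

    z = num / math.sqrt(den)
    p_meta = 2.0 * _STD_NORMAL.cdf(-abs(z))
    return z, p_meta


if __name__ == "__main__":
    # Hypothetical results for one SNP in three cohorts.
    z, p = stouffer_z([0.01, 0.03, 0.40], [+1, +1, -1], [5000, 3000, 1200])
    print(f"Z = {z:.3f}, P = {p:.3g}")
```

Unlike Fisher's method, this statistic rewards consistent directions of effect, so studies with opposite signs partly cancel.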

Fixed effects

For fixed effects models, inverse variance weighting is widely used. The weighted average of the effect sizes can be calculated as

$$\bar{\beta} = \frac{\sum_{i=1}^{k} w_i \beta_i}{\sum_{i=1}^{k} w_i}$$

where $\beta_i$ is the effect estimate from the $i$th study (for example, logarithm of odds ratio or β-coefficient from a logistic regression for a binary phenotype, or mean difference or standardized mean difference for a continuous phenotype), and $w_i$ is the reciprocal of the estimated variance of the effect estimate in the $i$th study.
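A minimal Python sketch of inverse variance weighting is given below. It takes per-study effect estimates and their standard errors, weights each study by $w_i = 1/\mathrm{SE}_i^2$, and returns the pooled effect $\sum_i w_i \beta_i / \sum_i w_i$. The pooled standard error $\sqrt{1/\sum_i w_i}$ and the two-sided normal test are the standard companions of this estimator, added here as an assumption since the slide gives only the weighted average. The function name `fixed_effects_meta` is illustrative.

```python
import math
from statistics import NormalDist


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed effects meta-analysis.

    betas:           per-study effect estimates (e.g. log odds ratios)
    standard_errors: per-study standard errors of those estimates
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")

    # w_i is the reciprocal of the estimated variance of the i-th estimate.
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum_w
    pooled_se = math.sqrt(1.0 / sum_w)

    z = pooled_beta / pooled_se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from three cohorts.
    beta, se, z, p = fixed_effects_meta([0.12, 0.08, 0.15], [0.04, 0.05, 0.07])
    print(f"beta = {beta:.4f}, SE = {se:.4f}, Z = {z:.2f}, P = {p:.3g}")
```

All effect estimates must refer to the same effect allele and the same scale before they are combined.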

Dealing with heterogeneity

Meta-analysis of multiple correlated variants

An approximate conditional approach, using summary data from a meta-analysis and LD correlations between SNPs estimated from a reference sample (here, a subset of the meta-analysis sample), was successfully applied in a meta-analysis for height and body mass index. This method identified 36 loci with multiple associated variants for height, adding 49 SNPs on top of the already known variants. In this approach, a genome-wide stepwise selection procedure selects SNPs on the basis of conditional P values and estimates the joint effects of all selected SNPs after the model has been optimized. The method assumes that the reference sample is from the same population as the samples from which the genotype–phenotype associations are estimated (that is, that the linkage disequilibrium correlation estimates in the reference sample are unbiased).

References

Evangelou E, Ioannidis JP: Meta-analysis methods for genome-wide association studies and beyond. Nat Rev Genet 2013, 14(6).

Yang J, et al: Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet 2012, 44(4).

Lin DY, Zeng D: Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. Genet Epidemiol 2010, 34(1):60-66.