Diabetes Genome Wide Association. Alessandra C Goulart, Ida Hatoum, Stalo Karageorgi, Mara Meyer. EPI293, January 2008, Harvard School of Public Health.

Presentation transcript:

Diabetes Genome Wide Association. Alessandra C Goulart, Ida Hatoum, Stalo Karageorgi, Mara Meyer. EPI293, January 2008, Harvard School of Public Health.

2 Background - Type 2 Diabetes Mellitus
- Disorder characterized by impaired glucose/insulin function
- Affects more than 170 million people worldwide

3 Background - Genetic Justification
- The explosion of diabetes, together with a rapidly decreasing age of onset, argues for an environmental rather than a genetic etiology
- Genetic justification:
  - Clustering in families
  - Leveling off of risk by BMI
  - Mouse data
- The pattern argues for a polygenic trait, which motivates a GWAS

4 Methods
- Three separate studies, all working collaboratively
- Different populations, different analyses

5 FUSION/Finrisk Population
- Genome-wide scan study (N=2,335, Finland)
- Population-based: 1,161 cases (T2D) and 1,174 controls
- Matched on age, sex, birth province

6 DGI Population
- Genome-wide scan study (N=2,931, Finland/Sweden)
- Cases (T2D) and controls drawn from population-based and family-based samples
- Population-based sample: matched on gender, age, BMI, place of origin
- Family-based sample: discordant siblings matched on age

7 WTCCC Population
- Genome-wide scan study (N=4,862, British/Irish)
- Cases (T2D): from a diabetes “repository”
- Controls: from a 1958 birth cohort
- No matching

8 Methods - General Outline
- All three studies start with study populations of between 2,335 and 4,862 subjects
- All three run genome-wide association scans, initially analyzing hundreds of thousands of SNPs, and reduce that number using predefined criteria
- All three studies then run second waves of replication or conduct replication studies in independent populations
- Findings are compared with previously published reports and across the three studies
  - Weighted meta-analysis
- Findings are fairly consistent between the three study populations, with many replicated associations

9 Population Stratification
All three studies investigated potential population stratification using:
- Cochran-Armitage tests
- Genomic control inflation factor (λ)
- Principal components analysis using EIGENSTRAT
- Adjustment for region/birthplace
  - Matching, choice of study population
- Replication in independent datasets
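As a rough illustration of the inflation factor listed above, λ is commonly computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). The sketch below assumes that definition; the function name and the example statistics are hypothetical and are not taken from any of the three studies.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda: median observed chi-square divided by the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical per-SNP test statistics, for illustration only.
example_stats = [0.1, 0.3, 0.4549364, 0.9, 2.5]
lam = genomic_inflation(example_stats)
print(round(lam, 3))
```

A value close to 1, such as the λ values of 1.026 to 1.08 reported on the later slides, suggests little inflation from stratification or genotyping artifacts.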

10 Methods - Platform
Genotyping platforms for GWAS:
- Affymetrix GeneChip Human Mapping 500K Array Set
  - Wellcome Trust Case Control Consortium (WTCCC), UK
  - Diabetes Genetics Initiative (DGI): both population-based (matched on gender, age, BMI and region of origin) and family-based samples
- Illumina HumanHap300 BeadChip
  - Finland-United States Investigation of Non-Insulin-Dependent Diabetes Mellitus Genetics (FUSION): 1,161 Finnish T2D cases and 1,174 normal glucose-tolerant controls from the FUSION and Finrisk 2002 studies (matched by province, sex and age)

11 Methods - FUSION
- FUSION analyzed 315,635 SNPs above a minor allele frequency (MAF) cutoff with a model that is additive on the log-odds scale
  - They observed an excess of low p-values (P < 10^-4), suggesting many common variants with modest effects (λ = 1.026)
  - Imputed >2 million SNPs using data from HapMap CEU to cover 89.1% of SNPs with MAF > 1%
- Compared stage 1 results with DGI and WTCCC to increase statistical power and select SNPs for stage 2
  - An association was “genome-wide significant” if p < 5 × 10^-8
- Stage 2 replication sample of 1,215 Finnish T2D cases and 1,258 Finnish NGT controls
  - 80 of 82 selected SNPs genotyped
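To make the idea of an additive test concrete, the sketch below implements the Cochran-Armitage trend test, which is the score test for a logistic model that is additive in the number of risk alleles. It is offered as a simplified stand-in, not as the FUSION analysis itself; the function name and the genotype counts are hypothetical.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases and controls hold counts for 0, 1 and 2 copies of the tested allele.
    Returns the 1-df chi-square statistic and its p-value.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    r_total = sum(cases)
    s_total = n - r_total
    sum_rt = sum(t * r for t, r in zip(weights, cases))
    sum_nt = sum(t * m for t, m in zip(weights, totals))
    sum_ntt = sum(t * t * m for t, m in zip(weights, totals))
    numerator = n * (n * sum_rt - r_total * sum_nt) ** 2
    denominator = r_total * s_total * (n * sum_ntt - sum_nt ** 2)
    chi2 = numerator / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p_value = trend_test(cases=[20, 50, 30], controls=[40, 45, 15])
print(round(chi2, 3), p_value < 5e-8)
```

With these made-up counts the association would be nominally significant but far from the genome-wide threshold of 5 × 10^-8 quoted on the slide.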

12 Methods - FUSION
- Stage 2 analysis selected SNPs based on:
  - FUSION genotyped and imputed SNPs from stage 1, using a prioritization algorithm that gave preference to genotyped SNPs
  - Combined analysis of GWA results from FUSION, DGI, and WTCCC
  - Previous T2D association results
- Joint analysis of Stage 1 + Stage 2
- All-data meta-analysis of FUSION, DGI, WTCCC and follow-up samples

13 Methods - DGI
- DGI analyzed 386,731 SNPs that passed strict quality control filters and developed 284,968 additional two-marker (haplotype) tests, for a total of 671,699 tests
  - Each SNP and haplotype was tested for association with T2D and each of 18 clinical traits
  - Population and family-based samples combined with a weighted meta-analysis
  - Quantitative traits assessed by linear or logistic regression
  - “Genome-wide significant” associations at p < 5 × 10^-8
- Three strategies to search for systematic bias:
  - P-value distribution in the population sample (λ = 1.05), principal components analysis, and independent genotyping of 114 SNPs with extreme p-values

14 Methods - DGI
- Observed an excess of low p-values
  - 1,000 permuted whole-genome analyses, with phenotype data randomized within matched case-control groups, to evaluate the significance of the excess of low p-values
  - Suggests many variants with modest effects, not a few variants with large effects
- Replication in an independent sample of 10,850 subjects from case-control samples of European ancestry (Sweden, USA, Poland) under the same model
  - Replication set of 107 SNPs selected on the basis of this study and comparisons with WTCCC and FUSION
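The permutation step described on this slide can be sketched as follows: case/control labels are shuffled only among subjects in the same matched group, the scan is rerun on each shuffled data set, and the observed statistic is compared with the permuted ones. This is a minimal sketch under those assumptions; the helper names, the add-one correction and the toy data are hypothetical, and the genome-wide scan itself is not shown.

```python
import random
from collections import defaultdict


def permute_within_groups(phenotypes, groups, rng):
    """Shuffle phenotype labels only among subjects in the same matched group."""
    members = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)
    permuted = list(phenotypes)
    for indices in members.values():
        labels = [phenotypes[i] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            permuted[i] = label
    return permuted


def empirical_p(observed, null_values):
    """Share of permutations at least as extreme as observed (add-one corrected)."""
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Toy data, for illustration only: 1 = case, 0 = control.
phenotypes = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["a", "a", "b", "b", "b", "c", "c", "c"]
shuffled = permute_within_groups(phenotypes, groups, random.Random(0))
print(shuffled)
```

In practice the statistic passed to `empirical_p` would be something like the number of SNPs below a p-value cutoff in each permuted scan.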

15 Methods - WTCCC
- Analyzed 393,453 autosomal SNPs with minor allele frequencies > 1% in both cases and controls and no extreme departure from Hardy-Weinberg equilibrium (HWE; extreme departure defined as P < 10^-4)
- Additional quality controls to find true associations included cluster-plot visualization and validation genotyping on a second platform
  - P-value distribution indicates no substantial confounding by population substructure or genotyping bias (λ = 1.08)
- The WTCCC group used 3 replication sets with an additional 3,757 cases and 5,346 controls from two other UK studies
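The two SNP filters named above (MAF > 1% and no extreme HWE departure) can be sketched with a simple chi-square goodness-of-fit test on genotype counts. This is an illustrative simplification: the study may well have used an exact test, the function names are hypothetical, and the code assumes both alleles are observed in the sample.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions; returns (chi2, p, maf)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # 1 degree of freedom
    return chi2, p_value, min(p, q)


def passes_qc(n_aa, n_ab, n_bb, maf_min=0.01, hwe_p_min=1e-4):
    """Keep a SNP if it is common enough and not far from HWE."""
    _, p_value, maf = hwe_chi2(n_aa, n_ab, n_bb)
    return maf > maf_min and p_value >= hwe_p_min


# Hypothetical genotype counts, for illustration only.
print(passes_qc(25, 50, 25))  # counts in exact HWE proportions
print(passes_qc(40, 20, 40))  # strong heterozygote deficit
```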

16 Methods - WTCCC
- First wave: selected 21 representative SNPs from the 30 SNPs in 9 distinct chromosomal regions with the most extreme p-values from the initial scan (p < 10^-5) to limit false discovery
- Second wave: relaxed the p-value threshold to detect modest associations (p~10 -2 to ) and found 5,367 SNPs
  - Prioritized SNPs by evidence of association in DGI and FUSION; presence of multiple, independent associations within the same locus; and biological candidacy, yielding 56 SNPs for analysis

17 Results - FUSION GWAS

18 Results - FUSION GWAS
[Figure legend: common in all 3 studies; common in 2 studies]

19 Results - FUSION GWAS
- 10 loci identified:
  - 5 new: near genes IGF2BP2, CDKAL1, CDKN2A/2B, intergenic region on ch. 11, FTO
  - 5 previously published: near PPARG, SLC30A8, HHEX, TCF7L2, KCNJ11
- All loci have biological plausibility, except the non-coding region on ch. 11, for which it is unknown
- FUSION study found:
  - Strong evidence for:
    - TCF7L2 (stage 1+2)
    - SLC30A8 (stage 1)
    - IGF2BP2 (stage 1)
    - Intergenic region on ch. 11 (stage 1)
  - Modest evidence for:
    - HHEX
    - CDKAL1 (stage 1)
    - CDKN2A/2B (stage 1+2)
    - FTO (stage 1+2)
  - Some evidence for:
    - PPARG (imputed)
    - KCNJ11 (imputed)

20 Results - FUSION GWAS
- Compared results to DGI and WTCCC scans
  - HHEX, CDKAL1 and FTO, which had modest evidence in FUSION, showed stronger evidence in the WTCCC scan
  - For SLC30A8, subsequent genotyping in other studies resulted in stronger evidence in the combined sample
- All SNPs or genes in this study overlap with the corresponding SNP/gene in at least one of the other studies, except the intergenic region on ch. 11
- Intergenic region on ch. 11
  - Includes 3 sets of spliced Expressed Sequence Tags
  - Nearby regions reported in another GWA study (Sladek 2007)

21 Meta-Analysis
- All-data meta-analysis of FUSION, DGI, WTCCC and follow-up samples
  - Weighted log ORs from each study by the inverse of the variance
  - Total sample size: 32,544 (increased 7-fold from FUSION alone)
  - Increased sample size gives more power to detect modest effects
- All 10 loci reached genome-wide significance in the meta-analysis (helping to confirm loci with only some evidence and emphasizing the importance of combining data)
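The inverse-variance weighting mentioned on this slide can be sketched as a fixed-effect meta-analysis: each study's log odds ratio is weighted by one over its squared standard error. The function name and the per-study numbers below are hypothetical and chosen only to show the mechanics; they are not results from FUSION, DGI or WTCCC.

```python
import math


def inverse_variance_meta(log_ors, std_errors):
    """Fixed-effect pooled log OR, its standard error, and a two-sided p-value."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return pooled, pooled_se, p_value


# Hypothetical per-study odds ratios and standard errors of the log OR.
log_ors = [math.log(1.20), math.log(1.15), math.log(1.10)]
std_errors = [0.05, 0.04, 0.06]
pooled, pooled_se, p_value = inverse_variance_meta(log_ors, std_errors)
print(round(math.exp(pooled), 3), round(pooled_se, 4), p_value)
```

Because the pooled standard error shrinks as studies are added, a modest effect that is only suggestive in each scan can reach genome-wide significance in the combined data, which is the point the slide makes.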

22 DGI GWA

23 Results - DGI GWAS
Confirmed T2D susceptibility variants
[Figure legend: common in all 3 studies; common in 2 studies]

24 DGI GWAS
- T2D was associated with both novel and previously published candidate genes
- Association with HHEX was confirmed in this GWA, in WTCCC/UKT2D and by other studies (Sladek 2007)
- Association with SLC30A8 was consistently confirmed by WTCCC/UKT2D and FUSION
- No evidence for association: LOC387761, EXT2-ALX4
- Additional loci: FLJ39370, PKN2

25 DGI GWAS
- Current WGA and collaborators: evidence for association with T2D risk was verified at 3 previously unknown loci (CDKN2B, IGF2BP2 and CDKAL1)
- 15 common variants for T2D and lipid levels were identified
- New T2D genes suggest a primary role for the pancreatic beta cell

26 Results - WTCCC GWA

27 Results - WTCCC GWA
Confirmed T2D susceptibility variants
[Figure legend: common in all 3 studies; common in 2 studies]

28 Results - WTCCC GWA
- In the WTCCC, the strongest association signals were found for SNPs in TCF7L2 (P=6.7x )
- From the first wave of SNPs, replication was found for SNPs in CDKAL1
  - ‘Compelling’ evidence across all studies (P~4.1 x )
  - SNPs map to a 90kb intron and may be involved in regulation of pancreatic beta cell function
- An association at FTO on chromosome 16 (rs ) was found to be mediated through a primary effect on adiposity
- Confirmed a previously reported association at HHEX
  - The HHEX signal is in an area of LD also containing genes encoding KIF11 and IDE, which have biological plausibility

29 Results - WTCCC GWA
- The second wave found modest associations with SNPs in CDKN2A/CDKN2B that replicated across the studies
  - CDKN2A is a known tumor suppressor and produces p16INK4a, which inhibits CDK4, a regulator of pancreatic beta cell replication
- SNPs from the promoter and first 2 exons of IGF2BP2 were replicated in WTCCC, DGI, and FUSION
  - Combined evidence was strong (P~8.6x ), with biological plausibility
- Independent genotyping of SLC30A8 (rs ) replicated previously reported findings (P = 7.0 × 10^-5 in all UK data)
  - The Affymetrix chip does not capture this locus

30 Results - WTCCC GWA
- This study identified several T2D susceptibility loci
- Confirmed previously reported loci, including:
  - TCF7L2: the largest association signal
  - FTO: the effect disappeared after adjustment for BMI
  - HHEX/IDE: strong replication, biological plausibility
- Three novel loci:
  - CDKAL1, IGF2BP2, and CDKN2A: replicated across the 3 studies in this analysis

31 Conclusions - Differences Across Studies
- Study populations
  - Location
  - Family-based vs. unrelated
  - Matching factors
  - Definition of diabetes
- Genotyping platforms
  - Illumina vs. Affymetrix
- Analysis plans
  - Individual tests
  - Haplotype analysis
  - Imputation methods
  - P-value criteria

32 Conclusions - Theoretical Considerations
- Agnostic/statistical vs. prior information/biological plausibility
- Relaxed vs. strict criteria
- Ability to replicate

33 Conclusions - Future Directions
- Non-coding regions may be important
- Many more variants are yet to be determined; larger studies are needed
- Resequencing and functional studies are necessary to determine causal variants
- Generalizability concerns
- Collaborative model will benefit science!