From Genome-Wide Association Studies to Medicine Florian Schmitzberger - CS 374 – 4/28/2009 Stanford University Biomedical Informatics

Topics
1. Altshuler et al. Genetic Mapping in Human Disease. Science 322, 881 (2008)
2. Zacho et al. Genetically Elevated C-Reactive Protein and Ischemic Vascular Disease. N Engl J Med 359, 18 (2008)
3. Jakobsdottir et al. Interpretation of Genetic Association Studies: Markers with Replicated Highly Significant Odds Ratios May Be Poor Classifiers. PLoS Genetics 5, 2 (2009)

Genome-wide association studies

Source: Hardy et al. Genomewide Association Studies and Human Disease. N Engl J Med 360, 17 (2009)

Human Genome Research Over Time Source: Altshuler et al. Genetic Mapping in Human Disease. Science 322, 881 (2008);

Linkage Analysis Source: genome.wellcome.ac.uk

Human Genome Research Over Time Information source: Altshuler et al. Genetic Mapping in Human Disease. Science 322, 881 (2008);

Initial Lessons
1. “Candidate gene” approach inadequate
2. Mutations that cause disease often change protein structure (example: hemoglobin subunit beta mutation in sickle-cell disease)
3. Loci often have many rare disease-causing alleles
4. 90% of sites of genetic variation are common variants in the population

Common disease – common variant (CDCV)
Common polymorphisms (minor allele frequency > 1%) contribute to susceptibility to disease. We can use GWAS to see how common variants contribute to disease, which gives us ideas on which positions to investigate.

Tag SNPs. Source: The International HapMap Consortium. The International HapMap Project. Nature.

GWAS – General Lessons Learned
1. GWAS work: ~two dozen reproducible associations at first, later >150
2. Effect sizes are modest for common variants
3. Power to detect associations has been low
4. Association studies have identified regions rather than causal genes
5. A single locus may contain more than one risk variant
6. A single locus may contain both common and rare variants
7. There is great variation between ethnic groups

Sample size required for P < 10^−8. Source: Altshuler et al.

GWAS – Common Diseases: Lessons Learned
1. The risk for loci already identified by GWAS is currently underestimated because of as-yet-unknown mutations.
2. Many more disease loci remain to be found (studies so far have had low statistical power).
3. Some loci will contain only rare variants (these won’t be found using common polymorphisms).

Disease Risk vs. Disease Mechanism
The primary value of genetic mapping is not risk prediction but gaining knowledge about mechanisms of disease.

GWAS: The Path Ahead
1. Increased sample sizes: with 1000 cases and 1000 controls, a 20% variant, and a 1.3 increase in risk, power is 1%; with 5000 cases and 5000 controls, power is 98%
2. Different ancestry groups
3. Find rare mutations in suspect loci (1000 Genomes Project)
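The jump in power from 1000 to 5000 cases and controls can be checked with a rough calculation. The sketch below is not the method used by Altshuler et al.; it is a plain normal approximation to a two-sided allelic test. It assumes that "20% variant" means a risk-allele frequency of 0.20 in controls, that "1.3 increase in risk" is an allelic odds ratio of 1.3, and that the significance threshold is the P < 10^−8 quoted on the sample-size slide. The function name `allelic_power` is made up for this illustration.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=1e-8):
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumptions (not from the slides): control_freq is the risk-allele
    frequency in controls and odds_ratio is a per-allele odds ratio.
    """
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    control_odds = control_freq / (1 - control_freq)
    case_odds = control_odds * odds_ratio
    case_freq = case_odds / (1 + case_odds)

    # Each person contributes two alleles, hence 2 * n in the variance terms.
    variance = (case_freq * (1 - case_freq) / (2 * n_cases)
                + control_freq * (1 - control_freq) / (2 * n_controls))
    z_effect = (case_freq - control_freq) / variance ** 0.5

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls beyond either critical value.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


power_1000 = allelic_power(1000, 1000, control_freq=0.20, odds_ratio=1.3)
power_5000 = allelic_power(5000, 5000, control_freq=0.20, odds_ratio=1.3)
print(f"1000 cases/controls: {power_1000:.1%}")
print(f"5000 cases/controls: {power_5000:.1%}")
```

Under these assumptions the two results should land near the 1% and 98% quoted on the slide, which shows how steeply power rises with sample size at such a strict threshold.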

Topic 2. Zacho et al. Genetically Elevated C-Reactive Protein and Ischemic Vascular Disease. N Engl J Med 359, 18 (2008)

C-Reactive Protein (CRP)
Elevated levels of CRP are associated with increased risk of ischemic heart disease and cerebrovascular disease.
Studies of >40,000 people, ~4,000 of them with disease
Followed for years
Measured levels of CRP
Genotyped for four CRP polymorphisms

Results [diagram]: CRP polymorphisms, increased CRP levels, increased likelihood of disease

Increased CRP levels lead to increased disease risk [figures from Zacho et al.]

Results [diagram]: CRP polymorphisms, increased CRP levels, increased likelihood of disease? [figures from Zacho et al.]

Possible issues with this study
CRP polymorphisms could lead to higher plasma levels of less active CRP (unlikely: the polymorphisms are not near the coding region)
Limitations of the four individual studies
Variability with race (only white participants studied)
Potential lack of statistical power

Conclusion
Genetic variants that lead to increased CRP levels do not lead to an increased risk of heart disease (or cerebrovascular disease). Increased CRP levels are therefore likely to be a marker of disease rather than a cause.
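The logic behind this conclusion can be illustrated with a toy simulation. Everything in the sketch below is invented for illustration and none of it comes from Zacho et al. It assumes a hidden disease process that raises both CRP and disease risk, and a genotype that raises CRP but has no effect on disease. In that setup CRP tracks disease risk in the population, yet carriers of the CRP-raising genotype are no more likely to be ill, which is the pattern that points to "marker, not cause".

```python
import math
import random


def simulate(n=50000, seed=0):
    """Simulate (genotype, crp, disease) rows under a 'marker, not cause' model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hidden = rng.gauss(0, 1)                   # unobserved disease process
        genotype = 1 if rng.random() < 0.3 else 0  # hypothetical CRP-raising variant
        # CRP is raised by the genotype AND by the hidden process.
        crp = 1.0 + 0.5 * genotype + hidden + rng.gauss(0, 0.5)
        # Disease risk depends only on the hidden process, never on CRP itself.
        p_disease = 1 / (1 + math.exp(2 - hidden))
        rows.append((genotype, crp, rng.random() < p_disease))
    return rows


def disease_rate(rows):
    return sum(1 for _, _, disease in rows if disease) / len(rows)


def mean_crp(rows):
    return sum(crp for _, crp, _ in rows) / len(rows)


rows = simulate()
median_crp = sorted(crp for _, crp, _ in rows)[len(rows) // 2]
high_crp = [r for r in rows if r[1] > median_crp]
low_crp = [r for r in rows if r[1] <= median_crp]
carriers = [r for r in rows if r[0] == 1]
non_carriers = [r for r in rows if r[0] == 0]

crp_shift = mean_crp(carriers) - mean_crp(non_carriers)
risk_by_crp = disease_rate(high_crp) - disease_rate(low_crp)
risk_by_genotype = disease_rate(carriers) - disease_rate(non_carriers)

print(f"CRP shift in carriers:          {crp_shift:+.2f}")
print(f"Risk difference, high vs low CRP: {risk_by_crp:+.3f}")
print(f"Risk difference, by genotype:     {risk_by_genotype:+.3f}")
```

If CRP were causal in this toy model, the genotype-based risk difference would be clearly positive as well; here it stays near zero while the CRP-based difference is large.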

Topic 3. Jakobsdottir et al. Interpretation of Genetic Association Studies: Markers with Replicated Highly Significant Odds Ratios May Be Poor Classifiers. PLoS Genetics 5, 2 (2009)

Statistical methods to evaluate markers in genetic testing
1. ROC (receiver operating characteristic) curves
2. Logistic regression

Genetic testing for the public Sources: 23andme.com decodeme.com navigenics.com

Classification-based statistics evaluate how well one can distinguish between cases and controls.

Diagnostic test versus disease status

                    Disease: YES       Disease: NO
Test positive       True Positive      False Positive
Test negative       False Negative     True Negative

Sensitivity = TP / (TP + FN): with this test, how many people that are actually ill will I catch?
Specificity = TN / (TN + FP): with this test, will I tell too many people they might be ill?
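As a small worked example of the two definitions, the sketch below computes sensitivity and specificity from the four cells of the table. The counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sensitivity(tp, fn):
    """Fraction of people with the disease that the test flags: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of people without the disease that the test clears: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts: 100 people with the disease, 900 without.
tp, fn = 80, 20    # ill people: test positive / test negative
fp, tn = 90, 810   # healthy people: test positive / test negative

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these counts the test catches 80 of the 100 ill people and wrongly alarms 90 of the 900 healthy ones.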

ROC curves
Important measure: area under the curve (AUC). Source: medcalc.be
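One way to make the AUC concrete is its rank interpretation: it equals the probability that a randomly chosen case gets a higher test score than a randomly chosen control, with ties counted as one half. The sketch below computes it by comparing every case with every control. The scores are hypothetical, and this brute-force approach is chosen for clarity rather than speed.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical test scores; higher means "more likely ill".
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.5, 0.3, 0.2]

area = auc(cases, controls)
print(f"AUC = {area:.4f}")
```

An AUC of 0.5 corresponds to a test that ranks cases and controls no better than chance, and 1.0 to perfect separation.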

Odds Ratios (risk analysis)
OR = (odds of an event occurring in one group) / (odds of an event occurring in the control group)
OR < 1: event less likely in first group
OR = 1: equal likelihood
OR > 1: event more likely in first group
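The definition can be sketched in a few lines. The counts are hypothetical, and the function and parameter names are chosen for this illustration only.

```python
def odds(events, non_events):
    """Odds of the event in one group: events / non-events."""
    return events / non_events


def odds_ratio(group_events, group_non_events, control_events, control_non_events):
    """Odds of the event in the first group divided by the odds in the control group."""
    return odds(group_events, group_non_events) / odds(control_events, control_non_events)


# Hypothetical counts: 30 of 100 people in the first group have the event,
# compared with 10 of 100 people in the control group.
ratio = odds_ratio(30, 70, 10, 90)
print(f"OR = {ratio:.2f}")  # above 1: event more likely in the first group
```

Swapping the two groups gives the reciprocal, so an OR below 1 for one group is the same finding as an OR above 1 for the other.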

Take-home message
“Strong association (low p-value) does not guarantee effective discrimination between cases and controls (classification). Excellent classification (high AUC) does not guarantee good prediction of actual risk” - Jakobsdottir et al.
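A short calculation shows how a sizeable odds ratio can coexist with poor classification. The sketch below treats a single risk variant as a yes/no test: carriers test positive, non-carriers test negative. The carrier frequency of 20% in controls and the odds ratio of 3 are hypothetical values, not figures from Jakobsdottir et al. For a single binary test the ROC curve has one interior point, so the AUC reduces to (sensitivity + specificity) / 2.

```python
def marker_performance(carrier_freq_controls, odds_ratio):
    """Sensitivity, specificity and AUC of 'carries the variant' used as a test."""
    # Carrier frequency in cases implied by the odds ratio.
    control_odds = carrier_freq_controls / (1 - carrier_freq_controls)
    case_odds = control_odds * odds_ratio
    carrier_freq_cases = case_odds / (1 + case_odds)

    sens = carrier_freq_cases          # cases who carry the variant
    spec = 1 - carrier_freq_controls   # controls who do not carry it
    # ROC of a binary test: (0, 0), (1 - spec, sens), (1, 1).
    area = (sens + spec) / 2
    return sens, spec, area


sens, spec, area = marker_performance(carrier_freq_controls=0.20, odds_ratio=3.0)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}, AUC = {area:.3f}")
```

Even with an odds ratio of 3, fewer than half of the cases carry the variant in this example, so the AUC stays only modestly above the 0.5 of a coin flip.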

Source: newscientist.com