Genomic Analysis: GWAS

Genetic Markers
  Genetic marker: a locus used to identify a chromosome or to locate other genes on a genetic map
  Many different types, including SNPs, VNTRs (microsatellites), RFLPs, etc.

Genetic Markers: SNPs
  Single Nucleotide Polymorphisms
  Polymorphic single bases
  Four possible states at any single base in a genome
  Usually only two are observed: ancestral and variant
  Advantages
    Low mutation rate (stable)
    High abundance (every 100-300 bases)
    Easy to type
  Disadvantages
    Rate heterogeneity
    Ascertainment bias
    Low information content

Genetic Markers: VNTRs
  Short tandem repeats, microsatellites, STRs
  Mononucleotide, dinucleotide, trinucleotide, etc.
  Allele lengths are variable ((TA)3, (TTAA)12, (AGT)33, etc.)
  Many possible variants in a population
  Most often occur in non-coding regions
  Advantages
    Low ascertainment bias
    Easy to identify
    Highly informative
  Disadvantages
    High mutation frequency
    Complex mutation behavior
    Difficult to automate genotyping

SNPs: Single nucleotide polymorphisms
  Responsible for 90% of all human genetic variation
  ~12,000,000 documented SNPs in the NCBI database
  Categorized as coding (in an exon) or noncoding (the majority)
  Coding SNPs can be synonymous or nonsynonymous
  Most SNPs are completely neutral
  Often used as markers for pinpointing disease-causing polymorphisms

Finding ‘phenotypic’ SNPs
  Many genes
    ~25,000 genes; many can be candidates, and many may contribute to a particular phenotype
  Many SNPs
    ~12,000,000 SNPs; the ability to predict functional SNPs is limited
  Methods to select candidate SNPs (narrow to broad):
    Only functional SNPs in a candidate gene
    Systematic screen of SNPs in a candidate gene
    Systematic screen of SNPs in an entire metabolic pathway
    Systematic screen for all coding changes (exome screening)

Genomic Medicine
  Exome sequencing
  Exome: the coding sequences of all annotated protein-coding genes; ~1% of the genome
  Accomplished via target-capture methods
  What’s the major potential drawback?

Genomic Medicine
  First application of exome sequencing to a syndrome with unknown cause
  Miller syndrome: thought to be recessive
    Suggests that affected individuals require two variants (one on each chromosome)
  Exomes of four individuals sequenced, including a pair of siblings
  Narrowed to a single gene, DHODH (dihydroorotate dehydrogenase), involved in the biosynthesis of pyrimidines
  All affected individuals were compound heterozygous for missense mutations
  All parents were carriers
  Ng et al. 2010, Nature Genetics 42, 30-35

Finding ‘responsible’ SNPs in an ocean of variation
  Many genes
    ~25,000 genes; many can be candidates, and many may contribute to a particular phenotype
  Many SNPs
    ~12,000,000 SNPs; the ability to predict functional SNPs is limited
  Methods to select candidate SNPs (narrow to broad):
    Only functional SNPs in a candidate gene
    Systematic screen of SNPs in a candidate gene
    Systematic screen of SNPs in an entire pathway
    Systematic screen for all coding changes
    Genome-wide screen (GWAS)

Introduction to genomic analysis
  A genome-wide association study (GWAS) is an approach that involves rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular phenotype.
  Once associations are identified, researchers can use them to develop better strategies to detect, treat and prevent the disease.
  The approach is particularly useful for finding genetic variations that contribute to common, complex diseases, such as asthma, cancer, diabetes, heart disease and mental illnesses.
  http://www.genome.gov/20019523

Potential of GWAS
  Whole-genome information, when combined with epidemiological, clinical and other phenotype data, offers the potential for:
    increased understanding of basic biological processes affecting human health
    improvement in the prediction of disease and in patient care
    the promise of personalized medicine

How to do GWAS
  What do you need?
    The human genome reference
    A map of human genetic variation
    A set of technologies that can quickly and accurately analyze whole-genome or partial (exome) samples for genetic variants
    This is typically accomplished using low-coverage genome (or exome) sequencing (4-20X)
  A typical GWAS is based on a case-control design in which SNPs are genotyped across a population
  The strength of association between each SNP and the disease in question is then calculated
  Results are usually visualized with a Manhattan plot, in which SNPs from each chromosome are plotted along with their association values
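The Manhattan plot described above can be illustrated with a minimal sketch of how its coordinates might be derived. It assumes each SNP already has a p-value from an association test. The SNP records and chromosome lengths below are hypothetical values invented for illustration, and `manhattan_points` is not a function from any particular GWAS package. The x-axis is the position along the concatenated chromosomes and the y-axis is -log10(p), so stronger associations appear as taller points.

```python
import math

# Hypothetical results: (chromosome, position in bp, p-value). Not real data.
results = [
    (1, 150_000, 0.20),
    (1, 900_000, 3e-9),
    (2, 50_000, 0.04),
    (2, 700_000, 1e-3),
]
# Hypothetical chromosome lengths in bp, used only to offset positions.
chrom_lengths = {1: 1_000_000, 2: 800_000}

def manhattan_points(results, chrom_lengths):
    """Return (x, y, chromosome) tuples: x is a genome-wide coordinate, y is -log10(p)."""
    offsets, running = {}, 0
    for chrom in sorted(chrom_lengths):
        offsets[chrom] = running          # where this chromosome starts on the x-axis
        running += chrom_lengths[chrom]
    return [(offsets[c] + pos, -math.log10(p), c) for c, pos, p in results]

points = manhattan_points(results, chrom_lengths)
for x, y, chrom in points:
    print(f"chr{chrom}\tx={x}\t-log10(p)={y:.2f}")
```

The resulting tuples can be passed to any plotting library. Coloring points by chromosome gives the familiar banded appearance.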

The basic idea
  The A allele is associated (4/14, 29%) with individuals exhibiting the disease phenotype
  [Figure: A and G alleles observed across individuals with and without the disease phenotype]

Age-related macular degeneration
  Study cohort: 2172 unrelated individuals of European descent, at least 60 years old
    1238 with AMD, 934 controls
  Each individual harbors two alleles
    2476 AMD alleles
    1868 non-AMD alleles
  Null hypothesis: alleles will be randomly distributed in the population, i.e. no association of any allele with AMD
  Alternative hypothesis: some allele will be positively associated with AMD

Age-related macular degeneration
  Single SNP identified by GWAS: rs1061170
  4344 alleles recovered, two variants (C/T)
  χ² test suggests association, p = 1.2 × 10^-62
  Allele          Cases with AMD   Controls   Total alleles
  C               1522             670        2192
  T               954              1198       2152
  Total alleles   2476             1868       4344
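The association test on this slide can be reproduced approximately with a short sketch using only the standard library. It assumes a plain Pearson chi-square test on the 2x2 allele table with one degree of freedom and no continuity correction. The slide does not say which variant of the test was used, so the result should agree with the quoted p-value in order of magnitude rather than digit for digit. The function name `allelic_chi_square` is illustrative.

```python
import math

def allelic_chi_square(case_a, control_a, case_b, control_b):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 allele table."""
    observed = [[case_a, control_a], [case_b, control_b]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_a + case_b, control_a + control_b]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if alleles were distributed independently of disease status.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Allele counts from the slide: C (cases, controls), then T (cases, controls).
chi2, p_value = allelic_chi_square(1522, 670, 954, 1198)
odds_ratio = (1522 * 1198) / (670 * 954)  # odds of carrying C in cases vs. controls
print(f"chi2 = {chi2:.1f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

The odds ratio is an addition not shown on the slide. It summarizes the size of the effect, while the p-value only indicates how unlikely the table would be under the null hypothesis.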

Age-related macular degeneration
  https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1%3A196690014-196690236&hgsid=572725431_LVSGMKr7pmucs4DrZ7CkYbaFVbni
  http://useast.ensembl.org/Homo_sapiens/Variation/Explore?r=1:196689607-196690607;v=rs1061170;vdb=variation;vf=762175

Complex traits vs. Mendelian traits
  [Figure: traits for which a molecular cause is known (2002)]
  Complex trait: any phenotype that does not exhibit classic Mendelian inheritance attributable to a single locus
    These traits may nevertheless exhibit familial tendencies
  Why would a trait be non-Mendelian?
    Codominance, incomplete dominance
    Multiple alleles
    Polygenic characteristics
    Environmental effects

GWAS in practice (2007)

GWAS in practice (2014)

Post-GWAS: Finding the causal locus
  GWAS is really just a starting point: it typically narrows the causal region down to a few million or a few hundred thousand bp
  SNPs occur every 100-300 bp
    If the locus is narrowed down to 500,000 bp, that’s ~2500 SNPs
  One way to proceed: identify genes in the region and determine plausibility
    Location of SNPs and function of the locus
    Relatively easy in some cases
    Not so much in others (regulatory function?)
  Functional annotation databases exist to classify genes according to their roles in the cell
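The SNP estimate above is simple arithmetic, sketched here so that the range implied by the 100-300 bp spacing is explicit. The slide's figure of ~2500 corresponds to an assumed average spacing of one SNP per 200 bp. The function name `expected_snp_range` is illustrative.

```python
def expected_snp_range(region_bp, min_spacing_bp=100, max_spacing_bp=300):
    """Return (low, high) expected SNP counts for a region, given one SNP per spacing."""
    low = region_bp // max_spacing_bp    # sparse case: one SNP every 300 bp
    high = region_bp // min_spacing_bp   # dense case: one SNP every 100 bp
    return low, high

low, high = expected_snp_range(500_000)
midpoint = 500_000 // 200  # the slide's ~2500 assumes one SNP per 200 bp
print(low, high, midpoint)
```

A 500,000 bp region therefore holds somewhere between about 1,700 and 5,000 candidate SNPs, which is why gene-level plausibility and functional annotation are needed to narrow the search.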

Association signals in the IL23R gene region on chromosome 1p31
  (A) Genomic locations of genes on chromosome 1p31 between 67,260,000 and 67,580,000 base pairs (Build 35).
  (B) The negative log10 association P-values (Cochran-Mantel-Haenszel chi-square test) from the combined Jewish and non-Jewish case-control cohorts are plotted for genotyped markers in the region.

GWAS is promising
  Many diseases and traits are influenced by genetic factors
    i.e., they are caused by sequence variants in the genome
  Over 12 million SNPs are known in the genome
    i.e., some SNPs will be directly or indirectly associated with causal variants
  The cost of SNP genotyping has been reduced
    i.e., it is affordable to genotype a large number of SNPs in the genome
  Large numbers of cases and controls are available
    i.e., there is statistical power to detect variants with modest effect

GWAS is challenging
  Many diseases and traits are influenced by genetic factors
    But probably due to multiple modest risk variants
    They confer a stronger risk when they interact
    Truly associated SNPs are not necessarily highly significant
  Too many SNPs are evaluated
    False positives due to multiple tests
  Single studies tend to be underpowered
    False negatives
  Considerable heterogeneity among studies
    Phenotypic and genetic heterogeneity
    False positives due to population stratification
  Xu, 2007
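The multiple-testing problem listed above is often handled by tightening the significance threshold. The sketch below uses a Bonferroni correction as one common option; the slides do not say which correction the cited studies applied. The number of tests (one million SNPs) is an assumption chosen for illustration. The p-value for `rs1061170` is the one quoted earlier in the deck, while `snp_b` and `snp_c` are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

n_snps = 1_000_000  # assumed number of SNPs tested; illustrative only
threshold = bonferroni_threshold(0.05, n_snps)

p_values = {
    "rs1061170": 1.2e-62,  # p-value quoted on the AMD slide
    "snp_b": 3e-6,         # hypothetical
    "snp_c": 0.01,         # hypothetical
}
significant = [name for name, p in p_values.items() if p < threshold]
print(f"threshold = {threshold:.1e}, significant = {significant}")
```

Under these assumptions only the first SNP passes. A p-value of 3e-6 would look convincing in a single test but is expected by chance several times when a million SNPs are tested.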

Components of a GWAS (simple)

Components of a GWAS (not so simple)