Gene Hunting: Design and statistics


Population-based Association Design: Qualitative Phenotype
Table: genotype (AA, AC, CC) by diagnosis (Schiz, Not Schiz).
Do χ² test for association.
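The χ² test on a genotype-by-diagnosis table can be sketched as follows. The counts are invented for illustration (the slide's own counts are not in the transcript) and the function name is hypothetical. A 2 × 3 table gives 2 degrees of freedom, for which the χ² tail probability is exactly exp(−χ²/2), so the sketch needs no statistics library.

```python
import math

def chi_square_2x3(cases, controls):
    """Pearson chi-square test of association for a 2 x 3 genotype table.

    cases, controls: counts of the AA, AC and CC genotypes in each group.
    Returns (chi-square statistic, p-value) with 2 degrees of freedom.
    """
    table = [list(cases), list(controls)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if genotype and diagnosis were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2)
    return stat, p_value

# Hypothetical counts, NOT taken from the slide.
schiz = (40, 90, 70)       # AA, AC, CC among cases
not_schiz = (60, 100, 40)  # AA, AC, CC among controls

stat, p = chi_square_2x3(schiz, not_schiz)
print(f"chi-square = {stat:.2f}, df = 2, p = {p:.4f}")
```

A small p-value here would suggest that genotype frequencies differ between the two diagnostic groups at this one SNP.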

Population-based Association Design: Quantitative Phenotype
[Plot: phenotype (y-axis) against number of C alleles (x-axis): 0 (AA), 1 (AC), 2 (CC)]
Compute the correlation (or regression slope).
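For a quantitative phenotype, the correlation and regression slope on the number of C alleles can be computed directly. This is a minimal sketch: the data and the function name are made up for illustration, and a real analysis would also test the slope for significance and adjust for covariates.

```python
import math

def allele_dosage_regression(dosages, phenotypes):
    """Regress phenotype on number of C alleles (0, 1 or 2).

    Returns (slope, intercept, Pearson correlation).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))

    slope = sxy / sxx                      # change in phenotype per extra C allele
    intercept = mean_y - slope * mean_x    # predicted phenotype for AA (0 C alleles)
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical data: two people per genotype (AA, AC, CC).
c_alleles = [0, 0, 1, 1, 2, 2]
phenotype = [10, 12, 13, 15, 16, 18]

slope, intercept, r = allele_dosage_regression(c_alleles, phenotype)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.3f}")
```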

GWAS: Genome-wide Association Study
DNA arrays with 1,000s of SNPs scattered throughout the genome. (Current chips have several million different SNPs.)
Select the SNPs so that they cover ALL the genome using haplotype blocks. (Some DNA chips oversample SNPs in protein-coding regions.)
Genotype patients and controls on all the SNPs (or genotype a random sample of the population).
Find the SNPs that differentiate patients from controls (or have a significant correlation with a quantitative phenotype).
Problem: number of statistical tests.
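The size of the multiple-testing problem can be illustrated with a little arithmetic. The sketch below assumes one million independent tests (an assumption for illustration, not a figure from the slides) and uses a Bonferroni correction, one common way of handling the problem. Under that assumption the corrected threshold works out to 5 × 10⁻⁸, the value conventionally used for genome-wide significance.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if every null hypothesis is true."""
    return alpha * n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

alpha = 0.05
n_snps = 1_000_000  # assumed number of independent tests

print("Expected false positives at p < .05:", expected_false_positives(alpha, n_snps))
print("Bonferroni per-SNP threshold:", bonferroni_threshold(alpha, n_snps))
```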

GWAS results as of 2012 From http://www.genome.gov/multimedia/illustrations/GWAS_2012-12.pdf

GWAS and Quantitative Phenotype: Height (Weedon et al, 2007) Note: Effect size = c. 0.2 inches, length of a housefly

Problems with GWAS
(1) Expensive.
(2) Large number of statistical tests.
(3) Need very, very large samples (10,000 or more).

Results from GWAS
(1) Good success in medicine.
(2) More limited success for psychiatric disorders (but things are improving).
(3) Success for normal behavioral traits (personality, IQ) just starting.
(4) Genetics of behavior is hyper-polygenic: many, many, many genes.
(5) Predictive power is poor but getting better.
(6) Pointing to biological mechanisms.

Used to be hard to find genes From The Consortium on Tobacco and Genetics (2010)

But things are changing … Manhattan plot for IQ From: Coleman et al. (2018) Molecular Psychiatry.

After GWAS
After detecting a “hit” what do you do?
Enrichment Analysis aka functional [enrichment] analysis aka gene set enrichment analysis (GSEA) aka pathway analysis
Conglomeration of different techniques aimed at uncovering the coding areas, function(s), tissue specificity, networks, pathways, etc. for the “hits” in a GWAS

First Question: Where is it?
Near a coding region:
  Exon
    Synonymous (same amino acid)
    Nonsynonymous (different amino acid)
  Intron
    Splice variant
    Enhancer
  Near Promoter
    Actively transcribed (H3K4me3)
Not near a coding region:
  Nearest coding region(s)
  Enhancer (eQTL)
eQTL = expression quantitative trait locus

If the “hit” is in or very close to a coding region (< 10% of all GWAS hits)
Exon (see next slide)
Intron
“Header” area (promoter; technically 5’ UTR)
“Trailer” area (technically, 3’ UTR)

Exon SNP
  Synonymous (same amino acid)
  Nonsynonymous (different amino acid)
    Missense (amino acid codon)
    Nonsense (chain-terminating codon)
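The three exon categories can be told apart by translating the reference and variant codons with the standard genetic code. The sketch below is illustrative only: the function name is hypothetical, codons are written as DNA on the coding strand, and a change from a stop codon to an amino acid is not given a category of its own.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_exon_snp(ref_codon, alt_codon):
    """Classify a single-codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid
    if alt_aa == "*":
        return "nonsense"     # new chain-terminating codon
    return "missense"         # different amino acid (simplification: also
                              # covers a stop codon changing to an amino acid)

print(classify_exon_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_exon_snp("GAG", "GTG"))  # Glu -> Val
print(classify_exon_snp("TGG", "TGA"))  # Trp -> stop
```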

Intron SNP
  Splice variant (influences the type(s) of mRNA)
  Enhancer (influences rate of transcription)

If the “hit” is not close to a coding region (c. 90% of all GWAS hits)
Nearest coding region
  Linear: nearest in base pairs
  Chromosome conformation: nearest in 3D
Regulatory role (eQTL): what coding region mRNAs does the “hit” influence?
  Expression in which tissue(s)
  Expression at which developmental stage(s)
eQTL = expression quantitative trait locus (contributes to variation in the amount of mRNA expressed)
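The "linear" definition of nearest coding region is simple to sketch: measure the base-pair distance from the SNP to each gene on the same chromosome and take the smallest. The gene names and coordinates below are hypothetical. As the slide notes, the linearly nearest gene need not be the one closest in 3D or the one the hit regulates.

```python
def nearest_gene(snp_pos, genes):
    """Find the gene nearest to a SNP in base pairs.

    genes: list of (name, start, end) tuples on the same chromosome as the SNP.
    Returns (gene name, distance in bp); distance is 0 if the SNP lies inside the gene.
    """
    def distance(gene):
        _, start, end = gene
        if start <= snp_pos <= end:
            return 0
        return min(abs(snp_pos - start), abs(snp_pos - end))

    best = min(genes, key=distance)
    return best[0], distance(best)

# Hypothetical gene annotations (name, start, end).
genes = [
    ("GENE_A", 1_000, 5_000),
    ("GENE_B", 20_000, 26_000),
    ("GENE_C", 40_000, 45_000),
]

print(nearest_gene(17_500, genes))  # intergenic SNP
print(nearest_gene(3_000, genes))   # SNP inside a gene
```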

Other Questions:
What tissues are the [nearest] coding region[s] expressed in?
Are histone markers nearby?
  H3K4me3: active promoter region
  H3K27ac: enhancer
What mRNAs, and how much mRNA, are associated with the region = eQTL (expression quantitative trait locus)
  E.g., does the amount of mRNA differ in patients and controls?
What other “hits” are also functionally related to this “hit” = Network analysis

IQ Genes From: Coleman et al. (2018) Molecular Psychiatry.

Polygenic Risk Score (PRS)
AKA Genomic Polygenic Score (GPS)
Use the top predictors in GWAS that predict the phenotype.
Always more loci than just the significant loci.
Validate in a new sample.
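A polygenic score is commonly computed as a weighted sum: each person's count of the effect allele at a SNP (0, 1 or 2) is multiplied by that SNP's GWAS effect size, and the products are added up. The sketch below assumes that form; the SNP identifiers, weights and genotypes are invented for illustration.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts.

    dosages: {snp_id: 0, 1 or 2} for one person in the validation sample.
    weights: {snp_id: effect size estimated in the discovery GWAS}.
    """
    return sum(weights[snp] * dosages[snp] for snp in weights)

# Hypothetical effect sizes from a discovery GWAS.
weights = {"snp_1": 0.10, "snp_2": -0.05, "snp_3": 0.02}

# Hypothetical genotypes for two people in an independent (new) sample.
person_a = {"snp_1": 2, "snp_2": 1, "snp_3": 0}
person_b = {"snp_1": 0, "snp_2": 2, "snp_3": 1}

print("Person A:", polygenic_score(person_a, weights))
print("Person B:", polygenic_score(person_b, weights))
```

Validation then amounts to checking how well these scores predict the phenotype in the new sample, for example with R².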

Polygenic Risk Scores for Education
Year   Phenotype                R²    Study
2013   Years of Education       .02   Rietveld et al.
2016                            .04   Okbay et al.
2017   Educational Attainment   .16   Selzam et al.