Computational personal genomics: selection, regulation, epigenomics, disease Manolis Kellis MIT Computer Science & Artificial Intelligence Laboratory Broad.

1 Computational personal genomics: selection, regulation, epigenomics, disease Manolis Kellis MIT Computer Science & Artificial Intelligence Laboratory Broad Institute of MIT and Harvard

2 Personal genomics today: 23 and We
Family inheritance: recombination breakpoints (me vs. my brother; my dad, Dad’s mom, Mom’s dad)
Human ancestry
Disease risk
Genomics: regions → mechanisms → drugs
Systems: genes → combinations → pathways

3 Goal: A systems-level understanding of genomes and gene regulation:
–The regulators: Transcription factors, microRNAs, sequence specificities
–The regions: enhancers, promoters, and their tissue-specificity
–The targets: TFs → targets, regulators → enhancers, enhancers → genes
–The grammars: Interplay of multiple TFs → prediction of gene expression
→ The parts list = Building blocks of gene regulatory networks
Understanding human variation and human disease:
CATGACTG / CATGCCTG: Disease-associated variant (SNP/CNV/…)
–Gene annotation (Coding, 5’/3’UTR, RNAs) → Evolutionary signatures
–Non-coding annotation → Chromatin signatures
–Roles in gene/chromatin regulation → Activator/repressor signatures
–Other evidence of function → Signatures of selection (sp/pop)
Challenge: from loci to mechanism, pathways, drug targets

4 Compare 29 mammals: Reveal constrained positions
–Reveal individual transcription factor binding sites
–Within motif instances reveal position-specific bias
–More species: motif consensus directly revealed
Example shown: NRSF motif
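
The idea of reading constraint off a multi-species alignment can be illustrated with a minimal sketch. It is not the method used for the 29-mammals analysis (which is not described in this transcript); it simply scores each alignment column by how many sequences agree with the most common base. The function names, the threshold, and the toy alignment are assumptions made for illustration.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of sequences that carry the most common character.

    `alignment` is a list of equal-length aligned strings (one per species).
    Gap characters, if present, are counted like any other character.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        scores.append(counts.most_common(1)[0][1] / len(alignment))
    return scores


def constrained_positions(alignment, threshold=1.0):
    """0-based column indices whose conservation score reaches the threshold."""
    scores = column_conservation(alignment)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Toy alignment (hypothetical data, four "species").
toy_alignment = [
    "CATGACTG",
    "CATGCCTG",
    "CATGACTG",
    "CTTGACTA",
]
print(constrained_positions(toy_alignment))
```

With more species in the alignment, a column that stays identical by chance becomes less likely, which is the intuition behind "more species: motif consensus directly revealed".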

5 Chromatin state dynamics across nine cell types
–Single annotation track for each cell type
–Summarize cell-type activity at a glance
–Can study 9-cell activity pattern across
Figure labels: Correlated activity; Predicted linking

6 Revisiting disease-associated variants
–Disease-associated SNPs enriched for enhancers in relevant cell types
–E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator
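
One common way to quantify "enriched for enhancers" is a one-sided hypergeometric test. The sketch below is an assumption about how such an enrichment could be computed, not the test used in the talk; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_snps, snps_in_enhancers,
                           disease_snps, disease_in_enhancers):
    """One-sided hypergeometric P(X >= disease_in_enhancers).

    Models drawing `disease_snps` SNPs without replacement from a background
    of `total_snps`, of which `snps_in_enhancers` overlap enhancers.
    """
    upper = min(disease_snps, snps_in_enhancers)
    denominator = comb(total_snps, disease_snps)
    tail = sum(
        comb(snps_in_enhancers, k)
        * comb(total_snps - snps_in_enhancers, disease_snps - k)
        for k in range(disease_in_enhancers, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 10 background SNPs, 5 of them in enhancers,
# 2 disease SNPs, both of which fall in enhancers.
print(hypergeom_enrichment_p(10, 5, 2, 2))
```

In practice the background set should be matched to the disease SNPs (for example by allele frequency and genotyping array), since the choice of background strongly affects the result.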

7 HaploReg: Automate search for any disease study (compbio.mit.edu/HaploReg)
Start with any list of SNPs or select a GWA study
–Mine publicly available ENCODE data for significant hits
–Hundreds of assays, dozens of cells, conservation, motifs
–Report significant overlaps and link to info/browser
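
The core operation behind this kind of lookup, intersecting a SNP list with annotated genomic regions, can be sketched as follows. This is not HaploReg's code or interface; the function name, the tuple layouts, and the region labels are hypothetical and chosen only to show the overlap step.

```python
def overlap_snps(snps, regions):
    """Map each SNP id to the labels of all regions that contain it.

    snps:    iterable of (snp_id, chrom, pos), 1-based positions.
    regions: iterable of (chrom, start, end, label), 1-based inclusive.
    SNPs with no overlapping region are left out of the result.
    """
    by_chrom = {}
    for chrom, start, end, label in regions:
        by_chrom.setdefault(chrom, []).append((start, end, label))

    hits = {}
    for snp_id, chrom, pos in snps:
        labels = [
            label
            for start, end, label in by_chrom.get(chrom, ())
            if start <= pos <= end
        ]
        if labels:
            hits[snp_id] = labels
    return hits


# Hypothetical annotations and SNPs.
toy_regions = [
    ("chr1", 100, 200, "enhancer_GM12878"),
    ("chr1", 150, 300, "conserved_element"),
    ("chr2", 50, 80, "promoter_HepG2"),
]
toy_snps = [("snpA", "chr1", 160), ("snpB", "chr1", 500), ("snpC", "chr2", 50)]
print(overlap_snps(toy_snps, toy_regions))
```

The linear scan per SNP keeps the sketch short; for genome-scale annotation sets a sorted index or interval tree would be the usual choice.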

8 Experimental dissection of regulatory motifs for 10,000s of human enhancers
54,000+ measurements (x2 cells, x2 replicates)

9 Example activator: conserved HNF4 motif match
–WT expression specific to HepG2
–Non-disruptive changes maintain expression
–Motif match disruptions reduce expression to background
–Random changes depend on effect to motif match
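
The distinction between non-disruptive and disruptive changes can be made concrete by scoring a sequence against a position weight matrix before and after a mutation. The sketch below uses a made-up three-position frequency matrix, not the HNF4 motif, and the pseudocount and uniform background are assumptions.

```python
from math import log2


def pwm_score(pfm, seq, background=0.25, pseudocount=0.01):
    """Log-odds score of `seq` against a position frequency matrix.

    pfm: list of dicts, one per motif position, mapping base -> frequency.
    A small pseudocount keeps unseen bases from producing log(0).
    """
    if len(pfm) != len(seq):
        raise ValueError("sequence length must equal motif length")
    score = 0.0
    for column, base in zip(pfm, seq):
        freq = (column.get(base, 0.0) + pseudocount) / (1.0 + 4 * pseudocount)
        score += log2(freq / background)
    return score


# Hypothetical 3-bp motif: position 2 tolerates C or T, positions 1 and 3 do not vary.
toy_pfm = [
    {"A": 1.0},
    {"C": 0.9, "T": 0.1},
    {"G": 1.0},
]

wild_type = pwm_score(toy_pfm, "ACG")
tolerated = pwm_score(toy_pfm, "ATG")   # change at the degenerate position
disrupted = pwm_score(toy_pfm, "TCG")   # change at an invariant position
print(wild_type, tolerated, disrupted)
```

A mutation at an invariant position lowers the score far more than one at a degenerate position, which mirrors the slide's observation that the effect of a random change depends on its effect on the motif match.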

10 Allele-specific chromatin marks: cis-vs-trans effects
–Maternal and paternal GM12878 genomes sequenced
–Map reads to phased genome, handle SNPs and indels
–Correlate activity changes with sequence differences
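
Once reads are assigned to the maternal or paternal haplotype, allelic imbalance at a site is often assessed with an exact binomial test against a 50/50 expectation. The sketch below is one such test under that assumption; it is not taken from the talk, and it ignores mapping bias, which real pipelines must correct for.

```python
from math import comb


def allelic_imbalance_p(maternal_reads, paternal_reads):
    """Two-sided exact binomial p-value for a 50/50 allelic ratio.

    Doubles the upper tail of the larger count and caps the result at 1.0,
    which is valid here because the null distribution is symmetric.
    """
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0
    k = max(maternal_reads, paternal_reads)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical read counts at one heterozygous site.
print(allelic_imbalance_p(10, 0))
print(allelic_imbalance_p(5, 5))
```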

11 Brain methylation in 750 Alzheimer patients/controls
–500,000 methylation probes, 750 individuals
–10+ years of cognitive evaluations, post-mortem brains
–93% of functional epigenomic variation is genotype driven!
–Global repression in 7,000 enhancers, brain-specific targets
Phil de Jager, Roadmap disease epigenomics; Brad Bernstein, REMC mapping
Diagram labels: Genome, Epigenome, meQTL, Phenotype, Epigenome, Classification, MWAS (panels 1, 2)

12 Global hyper-methylation in 1000s of AD-associated loci
Alzheimer’s-associated probes are hypermethylated
Plot labels: 480,000 probes, ranked by Alzheimer’s association P-value; Methylation; Top 7000 probes
Global effect across 1000s of probes
–Rank all probes by Alzheimer’s association
–7000 probes increase methylation (repressed)
–Enriched in brain-specific enhancers
–Near motifs of brain-specific regulators
→ Complex disease: genome-wide effects
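
The ranking step above can be sketched as: sort probes by association p-value, take the top N, and ask what fraction show increased methylation in cases. The field names, the sign convention (positive means higher methylation in Alzheimer's samples), and the toy values are all assumptions for illustration.

```python
def top_probe_summary(probes, top_n):
    """Rank probes by p-value and summarize the direction of the top hits.

    probes: list of dicts with keys "id", "p_value", "delta_methylation".
    Returns (ids of the top_n probes, fraction of those with delta > 0).
    """
    ranked = sorted(probes, key=lambda probe: probe["p_value"])
    top = ranked[:top_n]
    if not top:
        return [], 0.0
    hyper = sum(1 for probe in top if probe["delta_methylation"] > 0)
    return [probe["id"] for probe in top], hyper / len(top)


# Hypothetical probes.
toy_probes = [
    {"id": "p1", "p_value": 1e-8, "delta_methylation": 0.05},
    {"id": "p2", "p_value": 0.5, "delta_methylation": -0.02},
    {"id": "p3", "p_value": 1e-6, "delta_methylation": 0.03},
    {"id": "p4", "p_value": 1e-4, "delta_methylation": -0.01},
]
print(top_probe_summary(toy_probes, 2))
```

If the top-ranked probes were split evenly between gains and losses, this fraction would sit near 0.5; a strong skew toward gains is what a global hypermethylation effect looks like in this summary.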

13 Human constraint outside conserved regions
Non-conserved regions:
–ENCODE-active regions show reduced diversity
→ Lineage-specific constraint in biochemically-active regions
Conserved regions:
–Non-ENCODE regions show increased diversity
→ Loss of constraint in human when biochemically-inactive
Plot labels: Average diversity (heterozygosity); Aggregate over the genome; Active regions
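
Average heterozygosity for a set of sites can be computed from allele frequencies as the mean of 2p(1-p). The sketch below compares two hypothetical sets of biallelic sites; the frequencies are invented, and the slide's exact normalization (for example per base versus per variant site) is not stated in the transcript.

```python
def heterozygosity(alt_freq):
    """Expected heterozygosity 2p(1-p) at a biallelic site."""
    if not 0.0 <= alt_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * alt_freq * (1.0 - alt_freq)


def mean_heterozygosity(alt_freqs):
    """Average expected heterozygosity over a collection of sites."""
    freqs = list(alt_freqs)
    if not freqs:
        return 0.0
    return sum(heterozygosity(p) for p in freqs) / len(freqs)


# Hypothetical alternate-allele frequencies for two classes of sites.
active_sites = [0.0, 0.5, 0.0, 0.0]
inactive_sites = [0.5, 0.5, 0.0, 0.5]
print(mean_heterozygosity(active_sites), mean_heterozygosity(inactive_sites))
```

Lower average heterozygosity in one class of regions than in matched background regions is the kind of signal the slide reads as constraint.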

14 Covers computational challenges associated with personal genomics:
- genotype phasing and haplotype reconstruction → resolve mom/dad chromosomes
- exploiting linkage for variant imputation → co-inheritance patterns in human population
- ancestry painting for admixed genomes → result of human migration patterns
- predicting likely causal variants using functional genomics → from regions to mechanism
- comparative genomics annotation of coding/non-coding elements → gene regulation
- relating regulatory variation to gene expression or chromatin → quantitative trait loci
- measuring recent evolution and human selection → selective pressure shaped our genome
- using systems/network information to decipher weak contributions → combinatorics
- challenge of complex multi-genic traits: height, diabetes, Alzheimer's → 1000s of genes
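
As a small illustration of the first item, resolving which allele came from which parent, here is a sketch of trio-based phasing at a single site. It covers only the simplest case (genotypes of child, mother, and father known at one site); population-based phasing without parents needs statistical haplotype models and is not shown. The function name and genotype encoding are assumptions.

```python
def phase_child_site(child, mother, father):
    """Assign the child's two alleles to the maternal and paternal chromosome.

    Each genotype is a tuple of two alleles, e.g. ("A", "G").
    Returns (maternal_allele, paternal_allele), or None when the site is
    ambiguous or inconsistent with Mendelian inheritance.
    """
    first, second = child
    options = set()
    for maternal, paternal in ((first, second), (second, first)):
        if maternal in mother and paternal in father:
            options.add((maternal, paternal))
    if len(options) == 1:
        return options.pop()
    return None


# Hypothetical genotypes at one site.
print(phase_child_site(("A", "G"), ("A", "A"), ("G", "G")))
print(phase_child_site(("A", "G"), ("A", "G"), ("A", "G")))
```

Sites where all three individuals are heterozygous stay unresolved in this scheme, which is one reason linkage between neighboring sites is used in practice.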

15 Personal genomics tomorrow: Already 100,000s of complete genomes
Health, disease, quantitative traits:
– Genomics regions → disease mechanism, drug targets
– Protein-coding → cracking regulatory code, variation
– Single genes → systems, gene interactions, pathways
Human ancestry:
– Resolve all of human ancestral relationships
– Complete history of all migrations, selective events
– Resolve common inheritance vs. trait association
What’s missing is the computation
– New algorithms, machine learning, dimensionality reduction
– Individualized treatment from 1000s genes, genome
– Understand missing heritability
– Reveal co-evolution between genes/elements
– Correct for modulating effects in GWAS

16 Collaborators and Acknowledgements
Chromatin state dynamics
–Brad Bernstein, ENCODE consortium
Methylation in Alzheimer’s disease
–Phil de Jager, Brad Bernstein, Epigenome Roadmap
Mammalian comparative genomics
–Kerstin Lindblad-Toh, Eric Lander, 29 mammals consortium
Massively parallel enhancer reporter assays
–Tarjei Mikkelsen, Broad Institute
Funding
–NHGRI, NIH, NSF, Sloan Foundation

17 MIT Computational Biology group (Compbio.mit.edu)
Daniel Marbach, Mike Lin, Jason Ernst, Jessica Wu, Rachel Sealfon, Pouya Kheradpour, Manolis Kellis, Chris Bristow, Loyal Goff, Irwin Jungreis, Sushmita Roy, Luke Ward, Louisa DiStefano, Dave Hendrix, Angela Yen, Ben Holmes, Soheil Feizi, Mukul Bansal, Bob Altshuler, Stefan Washietl, Matt Eaton
Photo labels: Stata4, Stata3

