Manolis Kellis Broad Institute of MIT and Harvard

Presentation on theme: "Manolis Kellis Broad Institute of MIT and Harvard"— Presentation transcript:

1 Computational personal genomics: selection, regulation, epigenomics, disease
Manolis Kellis Broad Institute of MIT and Harvard MIT Computer Science & Artificial Intelligence Laboratory

2 Understanding human variation and human disease
Gene annotation (Coding, 5’/3’ UTR, RNAs)
Evolutionary signatures
Roles in gene/chromatin regulation → Activator/repressor signatures
CATGACTG CATGCCTG
Disease-associated variant (SNP/CNV/…)
Non-coding annotation → Chromatin signatures
Other evidence of function → Signatures of selection (sp/pop)
Challenge: from loci to mechanism, pathways, drug targets
Goal: A systems-level understanding of genomes and gene regulation:
The regulators: Transcription factors, microRNAs, sequence specificities
The regions: enhancers, promoters, and their tissue-specificity
The targets: TFs → targets, regulators → enhancers, enhancers → genes
The grammars: Interplay of multiple TFs → prediction of gene expression
→ The parts list = Building blocks of gene regulatory networks

3 Compare 29 mammals: Reveal constrained positions
NRSF motif
Reveal individual transcription factor binding sites
Within motif instances reveal position-specific bias
More species: motif consensus directly revealed

4 Chromatin state dynamics across nine cell types
Predicted linking
Key points to make: Chromatin states enabled us to study the dynamic nature of chromatin across many cell types. By distinguishing 15 different types of chromatin states, we could summarize all significant combinations of 81 different chromatin tracks and 2.4 billion reads in just nine chromatin annotation tracks, one for each cell type. For example, the same gene (WLS) is ‘poised’ in embryonic stem cells (ES), repressed in three other cell types (K562, blood, and liver), and active in the other five cell types. This allows us to define ‘vectors’ of activity for each region of the genome, based on the chromatin annotation in the nine cell types.
Correlated activity
Single annotation track for each cell type
Summarize cell-type activity at a glance
Can study 9-cell activity pattern across
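
The idea of per-region activity vectors and linking by correlated activity can be illustrated with a minimal sketch. The numeric encoding of states, the order of cell types, and the two candidate enhancers below are assumptions made for illustration; the slides do not specify how activity is scored or which correlation measure is used, so Pearson correlation is swapped in here as one plausible choice.

```python
from math import sqrt

# Hypothetical numeric encoding of chromatin-state labels (not from the slides).
ACTIVITY = {"active": 1.0, "poised": 0.5, "repressed": 0.0}


def activity_vector(states):
    """Turn per-cell-type state labels into a numeric activity vector."""
    return [ACTIVITY[s] for s in states]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# WLS pattern as described on the slide: poised in ES, repressed in three
# cell types, active in the other five (the order of cell types is assumed).
wls = activity_vector(["poised"] + ["repressed"] * 3 + ["active"] * 5)

# Two hypothetical candidate enhancers with made-up activity patterns.
enhancer_a = activity_vector(["repressed"] * 4 + ["active"] * 5)
enhancer_b = activity_vector(["active"] * 4 + ["repressed"] * 5)

corr_a = pearson(wls, enhancer_a)  # similar pattern: strongly positive
corr_b = pearson(wls, enhancer_b)  # opposite pattern: strongly negative
```

Under these assumptions, an enhancer whose nine-cell pattern tracks the gene's pattern would be a candidate for a predicted link, while an anti-correlated one would not.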

5 Revisiting disease-associated variants
Disease-associated SNPs enriched for enhancers in relevant cell types
E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator
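
One common way to quantify a statement like "disease SNPs are enriched for enhancers" is a hypergeometric tail test. The sketch below is an assumption about how such an enrichment could be scored, not the method used in the talk, and all counts are invented for illustration.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: all SNPs considered (background)
    successes:  background SNPs that fall in enhancers of the cell type
    draws:      disease-associated SNPs
    k:          disease-associated SNPs that fall in those enhancers
    """
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


# Hypothetical counts: 10,000 background SNPs, 500 of them in enhancers,
# 40 disease SNPs, 8 of which overlap enhancers (2 expected by chance).
p_enrichment = hypergeom_sf(8, 10_000, 500, 40)
```

A small `p_enrichment` would indicate more overlap than expected by chance under this simple background model; a real analysis would also need to account for linkage disequilibrium and array design.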

6 HaploReg: Automate search for any disease study (compbio.mit.edu/HaploReg)
Start with any list of SNPs or select a GWA study
Mine publicly available ENCODE data for significant hits
Hundreds of assays, dozens of cells, conservation, motifs
Report significant overlaps and link to info/browser

7 54000+ measurements (x2 cells, 2x repl)
Experimental dissection of regulatory motifs for 10,000s of human enhancers
54000+ measurements (x2 cells, 2x repl)

8 Example activator: conserved HNF4 motif match
WT expression specific to HepG2
Motif match disruptions reduce expression to background
Non-disruptive changes maintain expression
Random changes depend on their effect on the motif match
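
Whether a sequence change disrupts a motif match can be scored with a position weight matrix. The sketch below builds a toy log-odds matrix from made-up aligned sites; it is not the HNF4 motif, and the sites, pseudocount, and uniform background are assumptions. The two 8-mers are the example sequences shown on slide 2.

```python
from math import log2

# Hypothetical aligned binding sites, used only to build a toy matrix.
SITES = ["CATGACTG", "CATGACTG", "CATGACTC", "CATGACTG", "TATGACTG"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix: one dict of base -> score per position."""
    pwm = []
    for i in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(1 for s in sites if s[i] == base)
            freq = (count + pseudocount) / (len(sites) + 4 * pseudocount)
            column[base] = log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds scores for a sequence of motif length."""
    return sum(column[base] for column, base in zip(pwm, seq))


pwm = build_pwm(SITES)
wild_type = "CATGACTG"
disrupted = "CATGCCTG"  # single A>C change at the fifth position

# Negative delta: the variant matches the toy motif less well.
delta = score(pwm, disrupted) - score(pwm, wild_type)
```

In this toy setting a strongly negative `delta` corresponds to a "motif match disruption", while a change at a weakly constrained position would give a delta near zero.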

9 Allele-specific chromatin marks: cis-vs-trans effects
Maternal and paternal GM12878 genomes sequenced
Map reads to phased genome, handle SNPs and indels
Correlate activity changes with sequence differences
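
Once reads are assigned to the maternal or paternal haplotype, allelic imbalance at a site is often assessed with an exact binomial test against a 50/50 expectation. This is a minimal sketch of that generic test, assumed here for illustration; the slides do not state which statistic was used, and the read counts are invented.

```python
from math import comb


def allelic_imbalance_p(maternal_reads, paternal_reads):
    """Two-sided exact binomial p-value for a 50/50 allelic ratio."""
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0
    k = max(maternal_reads, paternal_reads)
    # The null distribution is symmetric at p = 0.5, so the two-sided
    # p-value is twice the upper tail, capped at 1.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


# Hypothetical read counts overlapping a heterozygous SNP in a chromatin peak.
balanced = allelic_imbalance_p(5, 5)
skewed = allelic_imbalance_p(10, 0)
```

A real pipeline would also correct for reference mapping bias and multiple testing, which this sketch leaves out.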

10 Brain methylation in 750 Alzheimer patients/controls
500,000 methylation probes, 750 individuals
Brad Bernstein: REMC mapping
Phil de Jager: Roadmap disease epigenomics
[Figure labels: Genome, Epigenome, meQTL, Phenotype, Classification, MWAS, 1, 2]
10+ years of cognitive evaluations, post-mortem brains
93% of functional epigenomic variation is genotype driven!
Global repression in 7,000 enhancers, brain-specific targets

11 Global hyper-methylation in 1000s of AD-associated loci
[Figure: methylation and P-value for 480,000 probes, ranked by Alzheimer’s association; top 7000 probes marked]
Alzheimer’s-associated probes are hypermethylated
Global effect across 1000s of probes
Rank all probes by Alzheimer’s association
7000 probes increase methylation (repressed)
Enriched in brain-specific enhancers
Near motifs of brain-specific regulators
Complex disease: genome-wide effects
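
The ranking step described above can be sketched as follows: sort probes by association p-value, take the top set, and ask what fraction shows increased methylation in cases. The field names, the toy probe records, and the sign convention for the methylation difference are assumptions for illustration only.

```python
def top_probes(probes, n_top):
    """Return the n_top probes with the smallest association p-values."""
    return sorted(probes, key=lambda p: p["p_value"])[:n_top]


def fraction_hypermethylated(probes):
    """Fraction of probes whose case-minus-control methylation is positive."""
    if not probes:
        return 0.0
    return sum(1 for p in probes if p["delta_methylation"] > 0) / len(probes)


# Hypothetical probe records (identifiers and values are made up).
probes = [
    {"id": "p1", "p_value": 1e-8, "delta_methylation": 0.05},
    {"id": "p2", "p_value": 0.5, "delta_methylation": -0.01},
    {"id": "p3", "p_value": 1e-6, "delta_methylation": 0.03},
    {"id": "p4", "p_value": 0.2, "delta_methylation": 0.00},
    {"id": "p5", "p_value": 1e-4, "delta_methylation": -0.02},
]

top = top_probes(probes, 3)
hyper_fraction = fraction_hypermethylated(top)
```

With real data, `n_top` would be the 7000 probes mentioned on the slide, and the resulting fraction could be compared with the fraction among all probes.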

12 Covers computational challenges associated with personal genomics:
- genotype phasing and haplotype reconstruction → resolve mom/dad chromosomes
- exploiting linkage for variant imputation → co-inheritance patterns in human population
- ancestry painting for admixed genomes → result of human migration patterns
- predicting likely causal variants using functional genomics → from regions to mechanism
- comparative genomics annotation of coding/non-coding elements → gene regulation
- relating regulatory variation to gene expression or chromatin → quantitative trait loci
- measuring recent evolution and human selection → selective pressure shaped our genome
- using systems/network information to decipher weak contributions → combinatorics
- challenge of complex multi-genic traits: height, diabetes, Alzheimer's → 1000s of genes

13 Personal genomics today: 23 and We
Recombination breakpoints
Family Inheritance
Me vs. my brother
My dad
Dad’s mom
Mom’s dad
Human ancestry
Disease risk
Genomics: Regions → mechanisms → drugs
Systems: genes → combinations → pathways

14 Personal genomics tomorrow: Already 100,000s of complete genomes
Health, disease, quantitative traits:
Genomics regions → disease mechanism, drug targets
Protein-coding → cracking regulatory code, variation
Single genes → systems, gene interactions, pathways
Human ancestry:
Resolve all of human ancestral relationships
Complete history of all migrations, selective events
Resolve common inheritance vs. trait association
What’s missing is the computation
New algorithms, machine learning, dimensionality reduction
Individualized treatment from 1000s genes, genome
Understand missing heritability
Reveal co-evolution between genes/elements
Correct for modulating effects in GWAS

15 Collaborators and Acknowledgements
Chromatin state dynamics: Brad Bernstein, ENCODE consortium
Methylation in Alzheimer’s disease: Phil de Jager, Brad Bernstein, Epigenome Roadmap
Mammalian comparative genomics: Kerstin Lindblad-Toh, Eric Lander, 29 mammals consortium
Massively parallel enhancer reporter assays: Tarjei Mikkelsen, Broad Institute
Funding: NHGRI, NIH, NSF, Sloan Foundation

16 MIT Computational Biology group Compbio.mit.edu
Mike Lin Ben Holmes Soheil Feizi Angela Yen Luke Ward Bob Altshuler Mukul Bansal Chris Bristow Stefan Washietl Pouya Kheradpour Matt Eaton Manolis Kellis Jason Ernst Irwin Jungreis Rachel Sealfon Jessica Wu Daniel Marbach Louisa DiStefano Dave Hendrix Loyal Goff Sushmita Roy Stata3 Stata4

17 Human constraint matches region activity
Active regions
Average diversity (heterozygosity)
Aggregate over the genome
Conserved regions: Non-ENCODE regions show increased diversity → Loss of constraint in human when biochemically-inactive
Non-conserved regions: ENCODE-active regions show reduced diversity → Lineage-specific constraint in biochemically-active regions
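
Average diversity per region class can be sketched with the standard expected heterozygosity of a biallelic site, 2p(1 - p), averaged over sites in each class. The class names and allele frequencies below are invented for illustration, and the talk may have used a different estimator.

```python
def heterozygosity(allele_freq):
    """Expected heterozygosity 2p(1 - p) of a biallelic site."""
    return 2.0 * allele_freq * (1.0 - allele_freq)


def mean_heterozygosity_by_class(sites):
    """Average heterozygosity per region class.

    sites: iterable of (region_class, allele_frequency) pairs.
    """
    totals = {}
    for region_class, freq in sites:
        total, count = totals.get(region_class, (0.0, 0))
        totals[region_class] = (total + heterozygosity(freq), count + 1)
    return {name: total / count for name, (total, count) in totals.items()}


# Hypothetical sites: (region class, derived-allele frequency).
sites = [
    ("conserved_encode_active", 0.01),
    ("conserved_encode_active", 0.02),
    ("conserved_non_encode", 0.10),
    ("conserved_non_encode", 0.30),
]

diversity = mean_heterozygosity_by_class(sites)
```

In this made-up example the non-ENCODE class has the higher mean diversity, which is the direction of effect the slide reports for conserved regions.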

