Jason Ernst Joint work with Pouya Kheradpour, Luke Ward


1 Chromatin state dynamics in nine human cell types elucidate regulators and disease-associated SNPs
Jason Ernst. Joint work with Pouya Kheradpour, Luke Ward, Brad Bernstein, and Manolis Kellis

2 Challenge: interpreting disease-associated variants
Figure labels: CATGACTG vs. CATGCCTG; Epigenomics; Disease variants
GWAS implicate thousands of non-coding loci associated with disease
Challenges in interpreting disease variants:
  Find the ‘true’ causative SNP among many candidates in LD
  Determine the type of function, especially outside protein-coding regions
  Reveal the relevant cell type of activity
  Link to upstream regulators and downstream targets
This talk: chromatin tools to address these challenges

3 Challenge of data integration in many marks/cells
DNA packaged into chromatin around histone proteins
Epigenomic information retains genome ‘state’ in differentiation and development
Two types: DNA methylation, histone marks (histone tail modifications)
Construct antibodies, pull down chromatin → ChIP-seq tracks
Dozens of chromatin tracks:
  Understand their function
  Reveal their combinations
  Annotate systematically
Common chromatin states:
  Explicitly model combinations
  Unsupervised approach, probabilistic model

4 From ‘chromatin marks’ to ‘chromatin states’
Learn de novo significant combinations of chromatin marks
Reveal functional elements, even without looking at sequence
Use for genome annotation
Use for studying regulation dynamics in different cell types
State classes shown: promoter states, transcribed states, active intergenic, repressed

5 ENCODE: Study nine marks in nine human cell lines
9 marks: H3K4me1, H3K4me2, H3K4me3, H3K27ac, H3K9ac, H3K27me3, H4K20me1, H3K36me3, CTCF (+WCE, +RNA)
9 human cell types:
  HUVEC: umbilical vein endothelial
  NHEK: keratinocytes
  GM12878: lymphoblastoid
  K562: myelogenous leukemia
  HepG2: liver carcinoma
  NHLF: normal human lung fibroblast
  HMEC: mammary epithelial cell
  HSMM: skeletal muscle myoblasts
  H1: embryonic
81 chromatin tracks (2^81 combinations)
x 15 chromatin states (for each cell type)

6 Chromatin states dynamics across nine cell types
Key points to make: Chromatin states enabled us to study the dynamic nature of chromatin across many cell types. By distinguishing 15 different types of chromatin states, we could summarize all significant combinations of 81 different chromatin tracks and 2.4 billion reads in just nine chromatin annotation tracks, one for each cell type. For example, the same gene (WLS) is ‘poised’ in embryonic stem (ES) cells, repressed in three other cell types (K562, blood, and liver), and active in the other five cell types. This allows us to define ‘vectors’ of activity for each region of the genome, based on the chromatin annotation in the nine cell types.
Single annotation track for each cell type
Summarize cell-type activity at a glance
Can study 9-cell activity pattern across
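The activity "vectors" mentioned in the notes can be pictured with a small sketch. This is only an illustration, not the talk's actual pipeline: the state label strings, the grouping of labels into "active" states, and the reading of "blood" and "liver" as GM12878 and HepG2 are all assumptions made here.

```python
# Cell types from the talk; the ordering here is arbitrary.
CELL_TYPES = ["H1", "K562", "GM12878", "HepG2", "HUVEC", "NHEK", "NHLF", "HMEC", "HSMM"]

# Assumption: a coarse grouping of state labels into "active" ones.
# These label strings are hypothetical, not the talk's 15 state names.
ACTIVE_STATES = {"active_promoter", "strong_enhancer", "transcribed"}


def activity_vector(state_by_cell):
    """Return a 0/1 list over CELL_TYPES (1 = region is in an active state)."""
    return [1 if state_by_cell[cell] in ACTIVE_STATES else 0 for cell in CELL_TYPES]


# Hypothetical state calls mirroring the WLS example: poised in H1,
# repressed in three cell types, active in the remaining five.
wls_like_region = {
    "H1": "poised_promoter",
    "K562": "repressed",
    "GM12878": "repressed",
    "HepG2": "repressed",
    "HUVEC": "active_promoter",
    "NHEK": "active_promoter",
    "NHLF": "active_promoter",
    "HMEC": "active_promoter",
    "HSMM": "active_promoter",
}

vector = activity_vector(wls_like_region)
print(dict(zip(CELL_TYPES, vector)))
```

Regions with similar vectors can then be grouped, which is the idea behind the multi-cell activity profiles on the following slides.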

7 Introducing multi-cell activity profiles
Figure: multi-cell activity profiles across HUVEC, NHEK, GM12878, K562, HepG2, NHLF, HMEC, HSMM, H1
Panels: gene expression; chromatin states; active TF motif enrichment; TF regulator expression; dip-aligned motif biases
Legend labels: ON/OFF; active enhancer/repressed; motif enrichment/motif depletion; TF on/TF off; motif aligned/flat profile

8 Linking Distal Regulatory Elements to Genes
Which gene(s) is this active enhancer in HMEC likely regulating?
Figure: HMEC state, H3K27ac signal, IRF6 expression, and C1orf107 expression across cell types
Compute correlations between gene expression levels and enhancer-associated histone modification signals
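A minimal sketch of the correlation step, assuming a plain Pearson correlation across the nine cell types (the slide does not say which correlation measure was used). The gene names and all numeric values below are made up for illustration and are not the values from the figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_candidate_genes(enhancer_signal, expression_by_gene):
    """Sort candidate genes by correlation with the enhancer signal, best first."""
    scores = {gene: pearson(enhancer_signal, expr) for gene, expr in expression_by_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical H3K27ac signal at one enhancer, one value per cell type.
enhancer_signal = [0.2, 0.1, 0.3, 0.2, 0.4, 1.1, 0.3, 3.8, 0.5]
# Hypothetical expression of two nearby genes in the same cell types.
expression_by_gene = {
    "gene_a": [0.1, 0.0, 0.2, 0.1, 0.5, 1.4, 0.2, 4.1, 0.6],
    "gene_b": [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 0.2, 1.0],
}

ranked = rank_candidate_genes(enhancer_signal, expression_by_gene)
for gene, r in ranked:
    print(f"{gene}: r = {r:.2f}")
```

Here `gene_a` tracks the enhancer signal and ranks first, so it would be the more plausible target under this simple scoring.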

9 Linking Distal Regulatory Elements to Genes
Which gene(s) is this active enhancer in HMEC likely regulating?
Figure: HMEC state, H3K27ac signal, IRF6 expression, and random gene expression; real vs. random correlations
Compare correlations between enhancer signal and gene expression in real and randomized data
Combine intensity signal from all marks: train a logistic regression classifier to discriminate real from random correlations, conditioned on state, TSS distance, and cell type
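The real-versus-random idea can be sketched with a one-feature logistic regression fitted by gradient descent. This is a simplified illustration under stated assumptions: it uses a single correlation value as the only feature and ignores the conditioning on state, TSS distance, and cell type described on the slide; the training values, learning rate, and epoch count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical correlations for real enhancer-gene pairs (label 1) ...
real_correlations = [0.9, 0.8, 0.7, 0.2, 0.85, 0.6]
# ... and for pairs with randomized gene expression (label 0).
random_correlations = [0.1, -0.3, 0.4, -0.1, 0.0, 0.3]

features = real_correlations + random_correlations
labels = [1] * len(real_correlations) + [0] * len(random_correlations)
weight, bias = fit_logistic(features, labels)


def link_probability(correlation):
    """Estimated probability that a pair with this correlation is a real link."""
    return sigmoid(weight * correlation + bias)


print(f"p(link | r=0.9)  = {link_probability(0.9):.2f}")
print(f"p(link | r=-0.3) = {link_probability(-0.3):.2f}")
```

In a fuller version each mark's signal would contribute its own feature, and separate models (or extra features) would handle the state, distance, and cell-type conditioning.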

10 Enhancer-gene links supported by eQTL-gene links
Validation rationale:
  Expression quantitative trait loci (eQTLs) provide independent SNP-to-gene links
  Do they agree with activity-based links?
Illustration (eQTL study, 15 kb): expression level of gene and sequence variant at distal position, per individual
  Indiv. 1: -0.5, A
  Indiv. 2: -1.5, A
  Indiv. 3: -1.8, A
  Indiv. 4: 3.1, C
  Indiv. 5: 1.1, A
  Indiv. 6: -1.8, A
  Indiv. 7: -1.4, A
  Indiv. 8: 3.2, C
  Indiv. 9: 4.4, C
Example: lymphoblastoid (GM) cells study
  Expression/genotype across 60 individuals (Montgomery et al., Nature 2010)
  120 eQTLs are eligible for enhancer-gene linking based on our datasets
  51 actually linked (43%) using predictions → 4-fold enrichment (10% expected by chance)
Independent validation of links. Relevance to disease datasets.
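The enrichment figure follows directly from the counts on the slide: 51 of 120 is 42.5% (shown as 43%), and dividing by the 10% chance expectation gives roughly 4-fold. The binomial tail probability at the end is an added illustration; the slide does not report a p-value or name a test, so treating the 120 eQTLs as independent trials is an assumption made here.

```python
import math

eligible_eqtls = 120      # from the slide
linked_eqtls = 51         # from the slide
expected_fraction = 0.10  # chance expectation from the slide

observed_fraction = linked_eqtls / eligible_eqtls
fold_enrichment = observed_fraction / expected_fraction

print(f"observed fraction: {observed_fraction:.3f}")
print(f"fold enrichment:   {fold_enrichment:.2f}")


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: independent trials with success probability 0.10.
tail_probability = binomial_upper_tail(linked_eqtls, eligible_eqtls, expected_fraction)
print(f"binomial tail probability: {tail_probability:.3g}")
```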

11 Coordinated activity reveals activators/repressors
Figure panels: enhancer activity; gene activity; predicted regulators; activity signatures for each TF
Key points to make: Using these correlations in activity enabled us to start piecing together enhancer regulatory networks, which were previously inaccessible, linking regulators to enhancers and enhancers to target genes. Putting it all together, we can (a) define 20 distinct profiles of activity (labeled A through T) across the nine cell types, (b) observe the expression patterns of associated genes, showing upward of 0.9 correlation with enhancer activity, (c) discover enriched regulatory motifs revealing candidate regulators, and (d) distinguish activators and repressors based on positive or negative correlations between motif enrichment in active regions and expression of the corresponding regulator.
[click-animate] For example, Oct4 is a predicted activator of enhancers active in embryonic stem (ES) cells. The motif is enriched in ES-specific enhancers (cluster A), and the Oct4 TF is expressed specifically in the same cell type.
[click-animate] Similarly, Ets is a predicted activator of cluster G, which is associated with activity in both GM and HUVEC, not in either one alone. This is important for the next slide, as we predict that a disruption of the Ets1 motif in patients with lupus erythematosus is responsible for disruption of the corresponding enhancer and dysregulation of the immunity gene HLA-DRB1 (Human Leukocyte Antigen) in the major histocompatibility locus.
Enhancer networks: Regulator → enhancer → target gene
Ex1: Oct4 predicted activator of embryonic stem (ES) cells
Ex2: Gfi1 repressor of K562/GM cells
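Point (d) in the notes, telling activators from repressors by the sign of a correlation, can be sketched as follows. The Pearson measure, the 0.5 cutoff, and every number below are assumptions chosen for illustration; the talk does not specify them.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def classify_regulator(motif_enrichment, tf_expression, threshold=0.5):
    """Label a TF by the sign of the correlation between its motif's enrichment
    in active enhancers and its own expression, per cell type.
    The threshold is a hypothetical cutoff."""
    r = pearson(motif_enrichment, tf_expression)
    if r >= threshold:
        return "activator", r
    if r <= -threshold:
        return "repressor", r
    return "unclear", r


# Hypothetical values, one per cell type (nine cell types).
# TF 1: motif enriched where the TF is expressed.
tf1_enrichment = [2.0, 0.1, -0.2, 0.0, 0.1, -0.1, 0.2, 0.0, 0.1]
tf1_expression = [3.5, 0.2, 0.1, 0.3, 0.0, 0.2, 0.1, 0.3, 0.2]
# TF 2: motif depleted from active enhancers where the TF is expressed.
tf2_enrichment = [0.1, -1.5, -1.2, 0.2, 0.1, 0.0, 0.3, 0.1, 0.2]
tf2_expression = [0.1, 3.0, 2.6, 0.2, 0.1, 0.3, 0.0, 0.2, 0.1]

tf1_label, tf1_r = classify_regulator(tf1_enrichment, tf1_expression)
tf2_label, tf2_r = classify_regulator(tf2_enrichment, tf2_expression)
print(tf1_label, round(tf1_r, 2))
print(tf2_label, round(tf2_r, 2))
```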

12 Causal motifs supported by dips & enhancer assays
Dip evidence of TF binding (nucleosome displacement)
Enhancer activity halved by single-motif disruption
→ Motifs bound by TF, contribute to enhancers

13 Revisiting disease-associated variants
Disease-associated SNPs enriched for enhancers in relevant cell types
E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator

14 SNPs from GWAS Enrich for Cell Type Specific Strong Enhancer Chromatin States in Biologically Relevant Cell Types
Columns: Cell Type; Title; Author/Journal; # SNPs in Strong enhancers; Total # SNPs; Fold; FDR
K562; Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium; Ganesh et al., Nat Genet 2009; 9; 35; 17; 0.02
HepG2; Biological, clinical and population relevance of 95 loci for blood lipids; Teslovich et al., Nature 2010; 13; 101; 11
GM12878; Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci; Stahl et al., Nat Genet 2010; 7; 29; 15; 0.03
Genome-wide meta-analyses identify three loci associated with primary biliary cirrhosis; Liu et al.; 4; 6; 41
Genome-wide association study in a Chinese Han population identifies nine new susceptibility loci for systemic lupus erythematosus; Han et al.; 18; 21
Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans; Kathiresan et al., Nat Genet 2008; 5; 24
Genome-wide association study of hematological and biochemical traits in a Japanese population; Kamatani et al.; 39; 12
A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium; Soranzo et al.; 28
Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer; Houlston et al.; 3; 66
Genome-wide association study identifies eight loci associated with blood pressure; Newton-Cheh et al.; 30; 0.04
Source: Ernst et al., Nature 2011

15 Ex1: Systemic lupus erythematosus SNP: Ets-1 motif
SNP in lymphoblastoid GM enhancer state
Disrupts Ets1 motif instance, predicted GM regulator
→ Model: Disease SNP abolishes GM-specific enhancer

16 Ets-1 is a predicted activator of GM enhancers
Figure panels: enhancer activity; gene activity; predicted regulators; activity signatures for each TF
Ets expression; Ets-1 motif enrichment in enhancers
→ Model: Ets-1 disruption would abolish enhancer state

17 Chromatin state dynamics: Contributions summary
Chromatin states capture mark combinations
  Reveal promoter/enhancer/insulator/transcribed regions
Chromatin states capture chromatin dynamics
  Single annotation track for each cell type
  Nine tracks instead of 2^81 combinations
Activity profiles capture correlated changes
  Gene expression vs. chromatin: Enhancer → Gene links
  Motifs vs. TF expression vs. chromatin: Activators/Repressors
Regulatory predictions validated: eQTLs/dips/luciferase
  eQTLs: links. Dips: binding. Luciferase assays: motif role
Interpret disease-associated variants
  Intergenic SNPs enriched for cell-type specific enhancers
  Mechanistic predictions reveal potential drug targets

18 Collaborators and Acknowledgements
MIT compbio group: Pouya Kheradpour, Lucas Ward, Manolis Kellis
MGH Pathology/HHMI: Tarjei Mikkelsen, Noam Shoresh, Charles B. Epstein, Xiaolan Zhang, Li Wang, Robyn Issner, Michael Coyne, Manching Ku, Timothy Durham, Bradley E. Bernstein
ENCODE consortium
Funding: NHGRI, NIH, NSF, HHMI, Sloan Foundation

