Epigenetics 12/05/07 Statisticians like data.

Presentation transcript:

Epigenetics 12/05/07. Statisticians like data. Don't emphasize method too much; it is not to your advantage. Don't exaggerate. Speak more clearly. In the next slide, explain epigenetics.

Epigenetic regulation is critical for cell differentiation Epithelial cell (right); liver cell (left)

Gene imprinting

More examples of epigenetic regulation

Epigenetic mechanisms DNA methylation Histone modification Nucleosome positions

DNA methylation Alberts et al. Molecular Biology of the Cell

Methylated genes are silenced

Probable mechanisms for DNA methylation-induced silencing: The DNA methylation marker directly interferes with TF binding. The DNA methylation marker is recognized by proteins that cause chromatin structure changes.

DNA in the nucleus is complexed with histones to form nucleosomes. Nucleosome is the fundamental repeating unit in chromatin. Figure scale labels: 1 bp (0.3 nm), 11 nm, 30 nm, 10,000 nm.
Notes: Mention linker DNA. Say that its length is variable. Keep it real short! Don't say everything in the figure.

Histone modification Acetyl Ubiquityl Methyl Phosphoryl Luger et al. Nature, (1997) Histone tails can be covalently modified in multiple ways at multiple sites Felsenfeld and Groudine, Nature, (2003)

How histone modification is inherited: Histone methylation marks may be inherited by local concentration. The exact mechanism of inheritance is unknown. Whether histone modification is inherited at all has not been proven.

Transcriptional regulation by chromatin: nucleosome positioning, histone modification. Figure labels: TF, TF target site.

DNA methylation, histone modification, chromatin. Figure labels: H3K9me3, HP1, H4K16ac.

Epigenetic reprogramming during development. Methylation marks are erased during cleavage. Methylation of the paternal genome is actively stripped within hours of fertilization. Methylation of the maternal genome is passively erased at a slower rate. De novo methylation occurs after implantation. Another round of demethylation during differentiation. DNA methylation is essential for development.

Epigenetic reprogramming can reverse tumorigenesis. Figure 1. Two-step cloning procedure to produce mice from cancer cells. Different tumor cells were used as donors for nuclear transfer into enucleated oocytes. Resultant blastocysts were explanted in culture to produce ES cell lines. The tumorigenic and differentiation potential of these ES cells was assayed in vitro by inducing teratomas in SCID mice (1), and in vivo by injecting cells into diploid (2) or tetraploid (3) blastocysts to generate chimeras and entirely ES-cell-derived mice, respectively. Hochedlinger et al. Genes & Dev, (2004)

Cancer and histone modification Chin, Nature (1998)

Cancer and chromatin. BRG1, the motor component of the SWI/SNF chromatin remodeling complex, is mutated in multiple cell lines (Wong et al. 2000): prostate DU145; lung A-427; prostate TSU-Pr-1; lung NCI-H1299; breast ALAB; pancreas Hs 700T … suggesting BRG1 may be a tumor suppressor protein.

Genomic-view of epigenetic regulation How to detect genome-wide patterns of epigenetic markers? How do epigenetic factors regulate genome-wide gene expression? How is the distribution of genome-wide epigenetic markers regulated?

Log (mononuc/genomic)
1. Tile microarray: 20 bp offset, 50-mers; Chr III + 233 promoters.
2. Hybridize mononucleosomal DNA vs naked genomic DNA.
3. Compute Log (mononuc/genomic).
Yuan et al., Science, (2005)
Notes: Green stuff doesn't have linker DNA. Resolution is 20 bp. Nucleosome signals span multiple probes. Midlog phase yeast; mononucleosomal DNA is purified by MNase. Don't say I didn't do experiments. We first filter out promoters containing highly repetitive sequences. Then ~100 promoters are randomly chosen. ~100 promoters correspond to cell-cycle genes. 5 or more contiguous probes with perfect matches. 30 contigs.
Q: How to filter repetitive sequences? A: Highly repetitive sequences are not tiled.
Q: What kind of arrays? A: Pat Brown arrays. Glass. 25,000 probes.
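The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base-2 logarithm, the pseudocount, the function names and the intensity values are all hypothetical and are not taken from Yuan et al. Only the 50-mer probe length and the 20 bp offset come from the slide.

```python
import math

def probe_starts(region_length, probe_length=50, offset=20):
    """Start coordinates of probes tiled every `offset` bp across a region."""
    return list(range(0, region_length - probe_length + 1, offset))

def log2_ratios(mononuc, genomic, pseudocount=1.0):
    """Return log2(mononucleosomal / genomic) for each probe.

    The pseudocount (an assumption, not from the slide) avoids division
    by zero and log of zero for empty probes.
    """
    if len(mononuc) != len(genomic):
        raise ValueError("channels must have the same number of probes")
    return [math.log2((m + pseudocount) / (g + pseudocount))
            for m, g in zip(mononuc, genomic)]

# Hypothetical intensities for six adjacent probes (not real data).
starts = probe_starts(150)
mono = [1200.0, 1500.0, 300.0, 250.0, 1400.0, 1300.0]
geno = [600.0, 700.0, 650.0, 600.0, 680.0, 640.0]
ratios = log2_ratios(mono, geno)

for start, ratio in zip(starts, ratios):
    print(f"probe at {start:>3} bp: log2 ratio = {ratio:+.2f}")
```

Positive ratios mark probes enriched in the mononucleosomal channel, and negative ratios mark probes that are depleted. Because probes overlap, a single nucleosome shows up as a run of several adjacent positive probes.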

Nucleosome positioning in yeast
Figure labels: MFA2, HIS3, CHA1, MATa, centromere; MATa nucs; predicted positioned nucs; literature positioned nucs; fuzzy nucs.
Fuzzy nucleosomes are real. Here is what they look like in our data. MFA2 (Watson) is the mating pheromone a-factor, made by a cells. HIS3 (Watson) catalyzes the sixth step in histidine biosynthesis; transcription is regulated by Gcn4p. CHA1 (Crick) catalyzes the degradation of both L-serine and L-threonine; required to use serine or threonine as the sole nitrogen source.
Yuan et al., Science, (2005)

Stereotyped pattern
Average signal (aligned by ATG codon) shows a regular pattern.
Figure axes: Log2 Ratio vs. Distance to ATG (aligned by ATG), with 95% CI.
Notes: You might expect that nucleosome positions at different promoters all look different. But look: nucleosome positioning has a common pattern, suggesting there may be a basic principle underlying nucleosome positioning. Show alignment wrt NFRs. Inter-nucleosome distance 160~170 bp. Predict the length of 5' UTR.
Yuan et al., Science, (2005)
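A minimal sketch of how an ATG-aligned average with a 95% band could be computed. It assumes that each promoter's profile is already expressed as distance to the ATG and that a normal approximation (mean ± 1.96 standard errors) is acceptable. The function name, data layout and values are hypothetical, not the paper's procedure.

```python
import math
import statistics

def aligned_average(profiles, z=1.96):
    """Mean and normal-approximation 95% CI at each aligned position.

    `profiles` maps a promoter name to {distance_to_ATG: log2 ratio}.
    A position missing from one promoter is skipped for that promoter.
    Returns {position: (mean, lower, upper)}.
    """
    positions = sorted({pos for prof in profiles.values() for pos in prof})
    summary = {}
    for pos in positions:
        values = [prof[pos] for prof in profiles.values() if pos in prof]
        mean = statistics.mean(values)
        if len(values) > 1:
            half = z * statistics.stdev(values) / math.sqrt(len(values))
        else:
            half = float("nan")  # no interval from a single observation
        summary[pos] = (mean, mean - half, mean + half)
    return summary

# Hypothetical profiles for three promoters (not real data).
profiles = {
    "geneA": {-200: 0.5, -100: -1.0, 0: 0.8},
    "geneB": {-200: 0.7, -100: -1.2, 0: 0.6},
    "geneC": {-100: -0.8, 0: 1.0},
}
summary = aligned_average(profiles)
for pos, (mean, low, high) in summary.items():
    print(f"{pos:>5} bp: mean {mean:+.2f}, 95% CI [{low:+.2f}, {high:+.2f}]")
```

A dip in the averaged signal that persists across many promoters, like the one at -100 bp in this toy input, is the kind of shared feature the slide calls a stereotyped pattern.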

Transcription factor binding sites (TFBSs) are likely to be nucleosome-depleted
TFBSs tend to be nucleosome-depleted. Motif sites that are unbound in our condition but bound in other conditions also tend to be nucleosome-depleted. Motif sites that are always unbound do not show nucleosome depletion.
Notes: Show one color at a time. Highly transcribed genes tend to be more delocalized in ORF.
Q: Why does bound (other) also have a strong signal? A: Maybe nucleosome depletion keeps accessible the TFBSs that are used in other conditions as well. Thus it indicates the potential for activity, not the activity itself.
Yuan et al., Science, (2005)

Histone modification in yeast Liu et al., PLoS Biology, (2005)

Co-regulated histone modifications Liu et al., PLoS Biology, (2005)

Nucleosome positioning in human Ozsolak et al., Nat Biotech, (2007)

Histone modification in human Guenther et al., Cell, (2007)

Distinct histone modification pattern in embryonic stem (ES) cells
ES cells contain both repressive and active markers. Differentiated cells contain either repressive or active markers but not both.
Figure labels: Gene; ES; Differentiated cell type 1, Differentiated cell type 2, …, Differentiated cell type n.
H3K27 methylation: repressive. H3K4 methylation: active.
Bernstein et al. Cell (2006)

Euchromatin and heterochromatin http://respiratory-research.com

Large-scale chromatin domain Rinn et al. Cell (2007)

Large-scale chromatin domain ENCODE, Nature, 2007

Large-scale chromatin domain Open Closed ENCODE, Nature, 2007

Large-scale chromatin domain Open Closed ENCODE, Nature, 2007

DNA methylation in human Eckhardt et al. Nat Gen. (2007)

DNA-methylation pattern in human. Figure 1: Type and distribution of amplicons. In total, we analyzed 2,524 amplicons from six distinct categories: 43.7% 5′-UTRs, 22.5% evolutionary conserved regions (ECR), 14.3% intronic regions, 13.3% exonic regions, 3.6% Sp1 transcription factor binding sites and 2.6% ‘other’. Eckhardt et al. Nat Gen. (2007)

Histone modification Acetyl Ubiquityl Methyl Phosphoryl Luger et al. Nature, (1997) Histone tails can be covalently modified in multiple ways at multiple sites Felsenfeld and Groudine, Nature, (2003)

Histone code hypothesis
“… multiple histone modifications, acting in a combinatorial or sequential fashion on one or multiple histone tails, specify unique downstream functions …” (Strahl and Allis, Nature, 2000)
Notes: Don't get into a long discussion of the code. Simply, different combinations can have different effects. Don't get into details of Dion's experiment. Simply, mutagenesis suggests that the code is probably much simpler. H4-lysine acetylation seems to be cumulative.
A remarkable hypothesis proposed by Strahl and Allis is that … But this hypothesis also leads to a dilemma: since the number of possible combinations of histone modifications is overwhelming, how can we possibly decode the histone modifications? On the other hand, there is plenty of evidence that the “histone code” is not as complicated as conjectured. For example, our group mutated H4 tail lysines to arginine, which mimics unacetylatable lysine, in all possible combinations. The overall effect seems to be cumulative rather than combinatorial.

Statistical assessment of the global impact of histone acetylation on gene expression
Integrative analysis using multiple genomic data resources (sequence, gene expression, histone modification).
Linear regression model: y_i, expression; A_ij, acetylation; S_i, promoter sequence.
Notes: Key is to estimate sequence-dependent regulatory effects. If the model fits well, then it suggests it is not so complicated. Data come from …, expand on sequence part.
Yuan et al. Gen Bio (2006)

Estimating sequence-dependent regulation effects
Linear regression model with transcription factor binding motifs. S_ij: motif score. Scan motifs (MDscan, AlignACE). Filter out insignificant motifs (RSIR). Linear f(S_i).
R^2 is about 0.27, reasonably good for this kind of data. Including interaction coefficients increases R^2 by less than 0.01. Repressors have negative coefficients; e.g., RFX1 has negative coefficients.
Notes: The effect of the motifs is fitted from the data. Repressor corresponds to negative weights? Say linear model of sequences. Change S_ij to motif scores. Explain. S_ij looks similar to S_i, which it is not. Say a few words about Beer-Tavazoie's motifs. Are they better? One RSIR direction is selected.
Q: What if there is more than one RSIR direction? Would it still help to include the variables corresponding to both directions?
A: Yes. RSIR is only an exploratory tool. Andrew Gelman did an experiment: x^2 + y^2 = 1 to generate data. And a linear model can fit the data very well. More than one RSIR direction can be caused by 1) a nonlinear effect; or 2) a linear effect but an inaccurate SIR direction estimate. In the first case, the variables in the 2nd SIR direction are important factors and should be included in the model. On the other hand, it will be difficult to estimate the full nonlinear effect, so we use the simplified linear model as a proxy. In the second case, the variables selected based on the 1st SIR direction are unreliable. Therefore, using these variables alone may actually ignore some important factors.
R-square is about 0.3.
Yuan et al. Gen Bio (2006)
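To make the R^2 comparison concrete, here is a small pure-Python sketch that fits an additive linear model to motif scores and then refits with all pairwise interaction terms. Everything in it is an assumption for illustration: the data are simulated, the three motif scores and their coefficients are made up, and the least-squares solver is a plain normal-equations implementation rather than the method used in Yuan et al.

```python
import itertools
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def r_squared(rows, y):
    """Fit y ~ intercept + rows by ordinary least squares; return R^2."""
    design = [[1.0] + list(row) for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def with_interactions(rows):
    """Append all pairwise products of the columns to each row."""
    pairs = list(itertools.combinations(range(len(rows[0])), 2))
    return [list(row) + [row[j] * row[k] for j, k in pairs] for row in rows]

# Simulated data (not from the paper): expression is additive in three
# hypothetical motif scores; the negative weight plays the repressor role.
rng = random.Random(0)
scores = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
expr = [0.8 * s[0] - 0.5 * s[1] + 0.3 * s[2] + rng.gauss(0, 1.5)
        for s in scores]

r2_additive = r_squared(scores, expr)
r2_interact = r_squared(with_interactions(scores), expr)
print(f"additive R^2: {r2_additive:.3f}")
print(f"with pairwise interactions R^2: {r2_interact:.3f}")
```

Adding terms to a least-squares model can never lower R^2, so the question is only how large the gain is. In this simulation the gain is small by construction, because the simulated expression is purely additive.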

Performance of the linear regression model

Performance of the linear regression model

Performance of the linear regression model

Cumulative effect of histone acetylation
Test whether including quadratic interactions between different acetylation sites would improve model performance.
p-value for quadratic interaction coefficients (g_jk): statistically insignificant. Data available at three sites.
Notes: Write out the formula on top. The question is whether including quadratic interaction terms would improve model performance. Coding region acetylation may not be regulatory but serve as a mark (don't discuss unless pressed).
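One way to write out the formula, assuming the model has the form suggested by the symbols on the slides. Only y_i, A_ij, S_i, f and g_jk appear in the deck; the intercept \alpha, the additive coefficients \beta_j and the error term \varepsilon_i are notation introduced here for illustration.

```latex
% Additive (cumulative) model
y_i = \alpha + \sum_{j} \beta_j A_{ij} + f(S_i) + \varepsilon_i

% Model with quadratic interactions between acetylation sites j and k
y_i = \alpha + \sum_{j} \beta_j A_{ij}
      + \sum_{j<k} g_{jk} A_{ij} A_{ik} + f(S_i) + \varepsilon_i

% Null hypothesis under test
H_0 : g_{jk} = 0 \quad \text{for all } j < k
```

If the g_jk are statistically indistinguishable from zero, the second model reduces to the first, which is what a cumulative rather than combinatorial effect of acetylation would look like under this notation.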

Reading List
Strahl and Allis 2000: proposed the histone code hypothesis
Bernstein et al. 2007: an up-to-date review of epigenomics
Yuan et al. 2005: nucleosome positions in yeast
Yuan et al. 2006: statistical analysis of histone-related gene expression