Epigenetic Analysis BIOS 691- 803 Statistics for Systems Biology Spring 2008.


1 Epigenetic Analysis BIOS 691- 803 Statistics for Systems Biology Spring 2008

2 Kinds of Questions Where are the epigenetic modifications? How do they co-vary? How do epigenetic changes affect expression of genes?

3 Covariation of Epigenetic Measures Motivating questions –How are epigenetic modifications related? –What are the major determinants of epigenetic state? Statistical techniques –Covariance calculation –Principal component analysis –Linear models

4 Location and Covariance Question: do epigenetic modifiers act on specific targets or do they act on whole regions of DNA? Direct experimental evidence is contradictory Statistics may help: –Covariation patterns may be evidence

5 CalcA in NCI60 Calcitonin A gene Two CpG clusters plus 3 odd CpGs High correlation within clusters

6 CDH1 in NCI 60

7 Covariation in Methylation of 7 Genes Individual genes have multiple CpG sites Most variation: overall methylation Epigenomic Analysis Correlation Map of 108 CpG sites in 6 genes across 5 ECOG pilot samples (Red = 1, White = 0, Blue < 0)

8 Methylation and Expression Single gene (E-cadherin) results suggest overall methylation correlated with expression

9 Methylation and Expression HELP assay gives genome-wide sampling of methylation sites at 15K genes If we select genes with S/N > 2 in both measures, then correlations with associated genes are bi-modal

10 What Causes Methylation? NCI-60 derived from various tissues Tissue characteristic profile + specific history of cells Fit linear model to each methylation site –9 tissues for 60 observations, leaving 51 error df Overall 41% of variance attributable to tissue What causes the remainder of methylation differences?
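
As a rough illustration of the per-site linear model, the sketch below computes the fraction of variance attributable to a grouping factor (between-group sum of squares over total sum of squares) for a single methylation site. The function name, the toy values, and the tissue labels are assumptions made for illustration; they are not from the NCI-60 data.

```python
from collections import defaultdict

def fraction_explained_by_group(values, groups):
    """Return (R^2, error df) for a one-way linear model value ~ group."""
    n = len(values)
    grand_mean = sum(values) / n
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    # Between-group sum of squares: group size times squared offset of group mean.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    error_df = n - len(by_group)  # e.g. 60 observations - 9 tissues = 51
    return ss_between / ss_total, error_df

# Hypothetical methylation levels at one site in two tissues.
methylation = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
tissue = ["lung", "lung", "lung", "colon", "colon", "colon"]
r2, error_df = fraction_explained_by_group(methylation, tissue)
print(f"variance attributable to tissue: {r2:.2f}, error df: {error_df}")
```

Averaging such per-site fractions over all sites would be one way to arrive at an overall figure like the 41% quoted on the slide, although the slide does not say how that figure was pooled.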

11 PCA for Cell-specific Factors Residual variance has one strong PC Remainder are ‘noise’ 1st PC is almost constant –Reflects overall level of methylation –Is this an artifact or is it real? –Significantly correlated with expression of DNMT1 & DNMT3A
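
To make the "almost constant first PC" concrete, here is a minimal sketch that extracts the leading principal component of a residual matrix by power iteration and reports its share of the total variance. The residual values are invented (rows are cell lines, columns are CpG sites), and power iteration is used only to keep the example free of dependencies; it is not claimed to be the method used in the course.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, variance share) of the leading PC of a data matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the columns.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance

# Hypothetical residuals: every site shifts up or down together in each cell line.
residuals = [
    [1.0, 1.1, 0.9],
    [-1.0, -0.9, -1.1],
    [0.5, 0.4, 0.6],
    [-0.5, -0.6, -0.4],
]
loadings, share = first_principal_component(residuals)
print("loadings:", [round(x, 2) for x in loadings], "share:", round(share, 3))
```

Loadings that are all of similar size and the same sign are what "almost constant" means here: the component tracks the overall methylation level of a cell line rather than any particular site.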

12 Relations Between Epigenetic Measures - III Stem Cells & Cancer

13 Issue: Cancer Stem Cells? Hypothesis: cancers arise from stem cells rather than differentiated epithelial cells How would you tell the difference between partially differentiated stem cells and de-differentiated epithelial cells? Proposal: compare characteristic epigenetic modifications of stem cells with cancers Epigenetic modifications are distinct –PRC2 (stem cells) vs methylation (cancer)

14 Statistical Methodology Test of association 2 x 2 table Fisher Exact p ~ 10^-5 (columns: PRC2, not; rows: Methylated 3443, Not 397)

15 Statistical Methodology Test of association 2 x 2 table Fisher Exact p ~ 10^-5 Alternatives –T-test (predictor: PRC2) –Linear model (predictor: methylation: T – N) (columns: PRC2, not; rows: Methylated 3443, Not 397)
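
For readers who want to see what the Fisher exact test computes on a 2 x 2 table, the following sketch sums hypergeometric probabilities of all tables no more likely than the observed one (the usual two-sided definition). The counts in the example are hypothetical and deliberately small; they are not the PRC2/methylation counts from the slide.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )

# Hypothetical counts: rows = methylated / not, columns = PRC2 target / not.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 34/70, about 0.4857
```

The t-test and linear-model alternatives listed on the slide use the quantitative methylation difference instead of a methylated/not dichotomy, so they avoid choosing a cutoff.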

16 PRC2 – Methylation Association

17 Are CIMP’s Stem Cell Clones? Distinctive PRC2 sites appear preferentially methylated in CIMP tumors

18 Correlations between epigenetic and expression measures – I Copy Number and Expression

19 Large sections of DNA containing many genes are often copied or deleted We think most control elements are copied or deleted also If more (or fewer) copies of a gene then ceteris paribus there should be more (fewer) copies of RNA

20 Integrative Studies of CGH & Gene Expression Expect to see strong correlation between copy number and expression in data Previous studies report weak effects –Average correlations range from 0.04 to 0.27 NCI 60 study average correlation 0.16

21 Why Not? H1: there really isn’t much effect – biology –Somehow the cells are compensating –In any case there shouldn’t be any effect on non-expressed genes H2: we may not be able to measure the effect that is there – technical error –Probes may be insensitive/cross-hybridizing –Signal/noise too low even when probes are sensitive

22 Eliminating Uninformative Genes Genes which are silenced will not show effect of copy number variation –Mean signal a rough proxy for silencing –Keep only genes with mean signal above 6.3 Only genes with significant copy number variation (above measurement noise) will show effect –Select genes with SD of copy number > 0.5
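
The two filters can be sketched as follows. This assumes the 6.3 threshold is meant as a floor on mean expression signal (silenced genes have low signal), and the gene names and values are made up for illustration.

```python
from statistics import mean, stdev

def informative_genes(expression, copy_number,
                      min_mean_signal=6.3, min_copy_number_sd=0.5):
    """Keep genes that look expressed and whose copy number varies above noise.

    expression, copy_number: dicts mapping gene name -> list of per-sample values.
    """
    kept = []
    for gene, signal in expression.items():
        if gene not in copy_number:
            continue
        expressed = mean(signal) > min_mean_signal        # proxy for "not silenced"
        varies = stdev(copy_number[gene]) > min_copy_number_sd
        if expressed and varies:
            kept.append(gene)
    return kept

# Hypothetical log-scale expression signals and copy-number values.
expression = {
    "GENE_A": [8.0, 9.0, 8.5],   # expressed
    "GENE_B": [5.0, 5.5, 6.0],   # mean below 6.3, treated as silenced
    "GENE_C": [9.0, 9.0, 9.5],   # expressed
}
copy_number = {
    "GENE_A": [1.0, 2.5, 2.0],   # SD above 0.5
    "GENE_B": [1.0, 3.0, 2.0],
    "GENE_C": [2.0, 2.1, 1.9],   # SD below 0.5
}
kept = informative_genes(expression, copy_number)
print(kept)
```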

23 Correlations of Selected Measures Black: All correlations Red: Reliably measured correlations

24 Estimating True Correlations If measurement noise of SD ~ 0.3 degrades expression measures, then correlations of the measures will be mostly closer to 0 than the true correlations of the variables Given a correlation and measured standard deviations, what are the most likely true standard deviations and true correlation?

25 MLE of Noisy Correlations Noise can be estimated from replicates If N is large, the SD of the original (noise-free) variables can be estimated by ML Given s and e, the MLE of correlation can be inferred For NCI 60 median MLE correlation ~ 0.65
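
A simple way to see how noise shrinks a correlation is the classical correction for attenuation, sketched below. It treats s as the measured SD and e as the noise SD of each variable (an assumed reading of the slide's symbols), subtracts noise variance from measured variance, and rescales the observed correlation. This is a method-of-moments stand-in, not necessarily the ML procedure the slide refers to, and the numbers are invented.

```python
import math

def disattenuated_correlation(r_observed, s_x, s_y, e_x, e_y):
    """Estimate the noise-free correlation from an observed one.

    s_x, s_y: measured standard deviations (signal plus noise).
    e_x, e_y: noise standard deviations, e.g. estimated from replicates.
    """
    true_var_x = s_x ** 2 - e_x ** 2
    true_var_y = s_y ** 2 - e_y ** 2
    if true_var_x <= 0 or true_var_y <= 0:
        raise ValueError("noise SD must be smaller than the measured SD")
    r = r_observed * s_x * s_y / math.sqrt(true_var_x * true_var_y)
    # Sampling error can push the estimate past +/-1, so clip it.
    return max(-1.0, min(1.0, r))

# Hypothetical gene: observed r = 0.3, measured SDs 0.5, noise SDs 0.3 and 0.4.
r_true = disattenuated_correlation(0.3, 0.5, 0.5, 0.3, 0.4)
print(round(r_true, 3))  # 0.625
```

With noise SDs that are a large fraction of the measured SDs, a weak observed correlation can correspond to a much stronger underlying one, which is the direction of the gap between the average of 0.16 and the median of about 0.65 reported for the NCI 60.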

26 Correlations between epigenetic and expression measures – II Chromatin and Expression

27 Do Epigenetic Marks Regulate Transcription? Several studies finding only weak evidence by correlation analysis Same technical issue: S/N ratio Questions –Does methylation shut down most genes? –Which histone marks indicate active transcription?

28 Methylation and Expression HELP assay gives genome-wide sampling of methylation sites at 15K genes Select genes with S/N > 2 in both measures Correlations with gene expression values are bi-modal
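
The selection step might look like the sketch below. It assumes S/N means the across-sample SD of a measure divided by its replicate noise SD, which the slide does not define, and all gene names, values, and noise levels are hypothetical.

```python
import math
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def filtered_correlations(methylation, expression,
                          methylation_noise_sd, expression_noise_sd,
                          min_signal_to_noise=2.0):
    """Correlate methylation with expression for genes passing S/N in both."""
    result = {}
    for gene, meth in methylation.items():
        if gene not in expression:
            continue
        expr = expression[gene]
        # Assumed S/N definition: across-sample SD over replicate noise SD.
        sn_meth = stdev(meth) / methylation_noise_sd
        sn_expr = stdev(expr) / expression_noise_sd
        if sn_meth > min_signal_to_noise and sn_expr > min_signal_to_noise:
            result[gene] = pearson(meth, expr)
    return result

# Hypothetical data for two genes across three samples.
methylation = {"G1": [0.1, 0.5, 0.9], "G2": [0.50, 0.52, 0.51]}
expression = {"G1": [9.0, 7.0, 5.0], "G2": [8.0, 6.0, 7.0]}
correlations = filtered_correlations(methylation, expression, 0.1, 0.5)
print(correlations)
```

A histogram of the resulting per-gene correlations is what would show the two modes described on the slide.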

29 Interpretation of Meth-Expr Corrs MLE of negative mode ~ -0.8; ~ 2/3 of genes under that hump Unclear whether positive hump is real or an artifact of small sample size Possible explanations: –True induction by methylation (e.g. methylation of an insulator) –Irrelevant CpG site

30 Acetylation and Expression Histones often acetylated during expression Histone 3 lysine 9 (H3K9) acetylation measured Measures corrupted by noise –Blue: S/N > 2.5 –Red: S/N > 2 –Black: S/N > 1.5

31 Biological Prediction H3K9 acetylation gene expression Is this real? –Experimental test: find genes with high acetylation variance, and little expression variance by microarray Results (7 genes) Confirm hypothesis Implies: –Expression arrays are not sensitive

