Analysis of protein-DNA interactions with tiling microarrays
Srinivasan (Vasan) Yegnasubramanian
Sidney Kimmel Comprehensive Cancer Center, Oncology Dept., Genitourinary Division
March 7, 2007
Identical genetic sequence, but very different gene expression and phenotypes… These differences are due to epigenetic changes.
Epigenetics is the study of heritable processes that alter gene expression without an accompanying change in gene sequence. These processes are usually mediated by factors, such as proteins and ribonucleoproteins, that bind genomic DNA.
(3.4 × 10⁻¹⁰ meters/bp) × (6 × 10⁹ bp/genome) = ~2 meters/genome. Radius of the nucleus is ~10 µm! Klug and Cummings, 1997
(6 × 10⁹ bp/genome) / (195 bp/nucleosome) = ~30.8 × 10⁶ nucleosomes/genome. ~5% of nuclear volume.
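The packing problem on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the nuclear radius is taken as roughly 10 micrometres.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the slide.
BP_RISE_M = 3.4e-10        # rise per base pair, in metres
GENOME_BP = 6e9            # base pairs per (diploid) genome
NUCLEUS_RADIUS_M = 10e-6   # assumed nuclear radius, ~10 micrometres


def dna_length_m(n_bp, rise_m=BP_RISE_M):
    """Contour length of n_bp base pairs of linear DNA, in metres."""
    return n_bp * rise_m


length_m = dna_length_m(GENOME_BP)
# How many nuclear diameters the stretched-out DNA would span.
diameters = length_m / (2 * NUCLEUS_RADIUS_M)

print(f"DNA length: {length_m:.2f} m")
print(f"Nuclear diameters spanned: {diameters:,.0f}")
```

With these inputs the DNA is about 2 metres long, on the order of 100,000 times the nuclear diameter, which is why extensive compaction is needed.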
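The nucleosome estimate follows from the same kind of division. This is a minimal sketch with a hypothetical helper name, assuming one nucleosome per 195 bp as stated on the slide.

```python
# Illustrative arithmetic only; the helper name is hypothetical.
def nucleosome_count(genome_bp, bp_per_nucleosome):
    """Approximate nucleosomes per genome for a given repeat length."""
    return genome_bp / bp_per_nucleosome


count = nucleosome_count(6e9, 195)
print(f"{count / 1e6:.1f} million nucleosomes per genome")
```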
http://www.albany.edu/~achm110/solenoidchriomatin.html
DNA methylation occurs at CpG dinucleotides in mammalian genomes
[Figure: sequence 5’…ACGT…3’ with the cytosine of the CG labeled 5-me]
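To make the CpG definition concrete, here is a small sketch that lists candidate methylation sites in a sequence. The function name is hypothetical, and the code only finds the dinucleotide; it says nothing about whether a site is actually methylated.

```python
# Minimal sketch; cpg_positions is a hypothetical helper, not from the talk.
def cpg_positions(seq):
    """Return 0-based positions of the C in every CpG (a C followed by a G)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


print(cpg_positions("ACGT"))      # the example sequence from the slide
print(cpg_positions("ttCGcgAA"))  # case-insensitive, adjacent CpGs
```

Because CG reads the same on the complementary strand, each position found this way corresponds to a CpG on both strands.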
DNA methylation patterns in normal and cancer cell genomes Herman and Baylin, NEJM, 2003
DNA methylation can lead to silencing of gene expression >2 MDalton Complex Robertson and Wolffe, Nat Rev Genet, 2000
Struhl, Cell, 2004 http://www.berkeley.edu/news/features/1999/12/09_3dimage.html
Diameter of DNA Double helix: 20 Angstroms Diameter of Transcriptional machinery: >1,000 Angstroms
Developing an understanding of epigenetic processes…
DNA Modifications (e.g. Methylation); Gene Transcriptional Changes; DNA-Protein Interactions
Characteristics of Tiling Microarrays
[Figure: probes tiled along a genomic region, with spacings labeled d1 to d7]
A microarray contains n probes of length L distributed across x base pairs of a genomic region of interest. That is, n probes are tiled across the region. The average resolution, or sampling/window size, is then R = x / n.
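The resolution formula R = x / n is easy to apply directly. The sketch below uses a hypothetical function name and rounds the chromosome 21/22 array figures to 35 million bp and 1 million probes, which are approximations of the ">35 million" and ">1 million" values quoted for that array.

```python
# Minimal sketch of R = x / n; the function name is hypothetical.
def average_resolution(region_bp, n_probes):
    """Average tiling resolution R (bp per probe) for n probes over x bp."""
    if n_probes <= 0:
        raise ValueError("n_probes must be positive")
    return region_bp / n_probes


# Rounded figures for the chromosome 21/22 arrays (assumed, approximate).
r = average_resolution(35_000_000, 1_000_000)
print(f"R = {r:.0f} bp")
```

A smaller R means denser tiling, so for a fixed region finer resolution requires proportionally more probes.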
Affymetrix Tiling microarrays
Human Chromosome 21/22 microarrays: >35 million bp of non-repetitive sequence on Chrom 21/22 represented with >1 million probe sets on three microarrays (currently on a single array). R ~ 35 bp.
ENCODE arrays: representation of 1% of the genome, corresponding to the ENCODE regions, at 35 bp resolution on a single microarray.
Tiled arrays of 10 human chromosomes: 74,180,611 probe pairs interrogating 30% of the human genome (i.e. 10 complete chromosomes) on >90 microarrays. R ~ 5 bp.
Tiled arrays of whole genome: interrogation of the whole genome (1.7 Gb) on 7 microarrays (~50,000,000 PM probes only) or 14 microarrays (~50,000,000 PM + MM probe sets). R ~ 35 bp.
Promoter Tiling arrays: interrogation of all 5’ upstream regions of known genes on a single microarray.
All probes are 25-mers.
Strategy DNA Methylation Transcriptome Chromatin Structure Analysis DNA Methylation (In Vitro DNA/Protein Interactions) Chromatin Structure (In vivo DNA/Protein Interactions) Label and Hybridize Samples To Tiling Microarrays Biostatistical Analysis to Identify Genomic Regions of Interest
ChIP-Chip for “in vivo” DNA-protein interactions
Crosslink, then lyse & sonicate.
Total: reverse crosslinks, amplify, label/hybridize.
IP (antibody): reverse crosslinks, amplify, label/hybridize.
Other controls for IP (e.g., no antibody, non-specific antibody).
Current limitations for ChIP-Chip
Process is very inefficient and requires large amounts of input material.
Sonication step can be quite variable and cannot be easily quality controlled with small amounts of starting material.
Currently difficult to perform on clinical specimens.
Labor-intensive.
Genome-wide, high-resolution DNA methylation detection by taking advantage of tiling arrays and DNA-protein interactions in vitro
Endogenous methyl-CpG binding domain proteins MECP2 MBD1 MBD2 (Anti-5mC Ab) MBD3 MBD4
6His-MBD2-MBD binds symmetrically methylated oligonucleotides Yegnasubramanian et al., Nucleic Acids Res, 2006
Use of 6His-MBD2-MBD for enrichment of methylated genomic DNA
Steps: fragment; enrich for densely methylated fragments (figure label: Fe); real-time PCR. Yegnasubramanian et al., Nucleic Acids Res, 2006
Whole-genome DNA methylation assay
Fragment the DNA (figure label: Fe).
Total input: amplify, then fragment/label/hybridize.
Enrich methylated fragments: amplify, then fragment/label/hybridize.
Fragmentation techniques Restriction Enzyme Sonication
Middle ground Pool different restriction enzyme digests
Dynamics of amplification and fold enrichment…
Total Amplify to 20
Fold enrichment dependent on:
Amount of each species after enrichment
Total amount of all enriched species
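One way to see why fold enrichment depends on both quantities is a toy model. It assumes that the total and enriched samples are each amplified to the same final mass without bias, so a species' signal is proportional to its fraction of its pool. The function name, species names and amounts below are all hypothetical.

```python
# Toy model under stated assumptions; not the speaker's actual analysis.
def fold_enrichment(species, total_pool, enriched_pool):
    """Ratio of a species' fraction in the enriched pool to its fraction in the total pool."""
    frac_total = total_pool[species] / sum(total_pool.values())
    frac_enriched = enriched_pool[species] / sum(enriched_pool.values())
    return frac_enriched / frac_total


# Arbitrary mass units before enrichment (hypothetical values).
total = {"methylated": 1.0, "unmethylated": 1.0, "background": 98.0}

# Same amount of the methylated species recovered, different background carry-over.
clean = {"methylated": 0.8, "unmethylated": 0.01, "background": 0.19}
dirty = {"methylated": 0.8, "unmethylated": 0.01, "background": 1.19}

print(f"clean pull-down: {fold_enrichment('methylated', total, clean):.0f}-fold")
print(f"dirty pull-down: {fold_enrichment('methylated', total, dirty):.0f}-fold")
```

In this model the same recovered amount of the methylated species gives half the apparent fold enrichment when the total enriched material doubles, illustrating both dependencies listed on the slide.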
Ongoing and future work Preprocessing Analysis DNA Modifications (e.g. Methylation) Cancer Meta-Analysis DNA-Protein Interactions Preprocessing Analysis Normal Gene Transcriptional Changes Preprocessing Analysis