Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome ECS289A.

1 Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome ECS289A Presentation By Hua Chen

2 Background Knowledge: A notable property of cis-regulatory sites is that binding sites for different transcription factors tend to cluster together in one region around a gene, forming a cis-regulatory module (CRM). Searching for individual cis-regulatory sites yields too many candidate positions, which makes it difficult to pick out the true ones. This clustering property provides a feasible way to identify cis-regulatory sites in the genome.

3 One example of CRM in Drosophila: eve gene

4 Targets: Use the clustering of binding sites into cis-regulatory modules as a method to identify functional motifs; test the method on known CRM regions; search the genome to discover new CRMs and confirm the results by experiment. The System Investigated: The early Drosophila embryo. Five transcription factors are investigated: Bcd, Cad, Hb, Kr and Kni.

5 Methods: Collect transcription factor binding sequences from previous laboratory work and align them; construct position weight matrices (PWMs) for the conserved motifs; test the method on known CRMs; search genome-wide for unknown regulatory regions; use RNA in situ hybridization and microarray hybridization to test whether the predicted regions lie near genes regulated by the transcription factors; one special case, the giant gene, is investigated further with transgenic and mutant embryos.

6 Step 1: Collection and Alignment of TF Binding Sites: The Bcd, Cad, Hb, Kr and Kni binding sequences were determined by in vitro DNase protection assays. The sequences are aligned with MEME.
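
As a rough illustration of how aligned binding sites feed into the weight matrices used in the following step, here is a minimal Python sketch that turns a gap-free alignment into a log-odds matrix and scores a candidate sequence against it. It is not the MEME or Patser implementation. The example sites, the uniform background frequencies, the pseudocount and the function names build_pwm and score_site are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(aligned_sites, background=None, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Returns one dict per alignment column, mapping base -> log2 odds score.
    The uniform background is an assumption; a real genome is not uniform.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    total = len(aligned_sites) + pseudocount
    pwm = []
    for col in range(length):
        # Start from a background-weighted pseudocount so no base has zero frequency.
        counts = {b: pseudocount * background[b] for b in BASES}
        for site in aligned_sites:
            counts[site[col].upper()] += 1
        pwm.append({b: math.log2((counts[b] / total) / background[b]) for b in BASES})
    return pwm

def score_site(pwm, seq):
    """Sum the per-column log-odds scores of a sequence the same length as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq.upper()))

# Hypothetical aligned sites, not taken from the presentation.
sites = ["TTTATG", "TTTATG", "TTTACG", "TTTATA"]
pwm = build_pwm(sites)
print(score_site(pwm, "TTTATG"), score_site(pwm, "GGGCGC"))
```

A sequence that resembles the aligned sites scores higher than an unrelated one, which is the property the genome search relies on.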


8 Step 2: Construction of PWMs and Searching: Patser is used to construct the position weight matrices. Cis-Analyst is used to identify potential binding sites in the Drosophila genome that match the PWMs: a user-defined cutoff parameter (site_p) eliminates predicted low-affinity sites; the sequence is scanned with a window of specified length; windows that contain at least min_sites binding sites are retained; all overlapping windows are merged into a cluster.
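
The window-and-merge procedure on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Cis-Analyst code: site_p is treated here as a p-value ceiling, each hit is a hypothetical (position, p_value) pair, windows are anchored at site positions, and a cluster is reported as the span from its first to its last site. The function names filter_sites and find_clusters are made up for this example.

```python
def filter_sites(hits, site_p):
    """Keep positions of hits whose p-value is at or below the cutoff (assumed meaning of site_p)."""
    return [pos for pos, p_value in hits if p_value <= site_p]

def find_clusters(site_positions, window=700, min_sites=13):
    """Find windows holding at least min_sites sites and merge overlapping ones.

    Windows are anchored at each site position; any qualifying window can be
    slid right until it starts at its first site, so this loses no clusters.
    Returns a list of (start, end) spans measured from first to last site.
    """
    positions = sorted(site_positions)
    clusters = []
    right = 0
    for left, start in enumerate(positions):
        # Advance the right pointer past every site inside [start, start + window).
        while right < len(positions) and positions[right] < start + window:
            right += 1
        if right - left >= min_sites:
            span = (start, positions[right - 1])
            if clusters and span[0] <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it instead of adding a new one.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], span[1]))
            else:
                clusters.append(span)
    return clusters

# Hypothetical hits as (position, p_value); small window and min_sites for readability.
hits = [(100, 1e-4), (150, 2e-4), (200, 1e-4), (250, 5e-2), (5000, 1e-4), (5100, 1e-4)]
strong = filter_sites(hits, site_p=3e-4)
print(find_clusters(strong, window=300, min_sites=3))
```

Lowering min_sites or widening the window makes the search more permissive, which is the trade-off the parameters on this slide control.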

9 Binding Site Sequence for Cad:

10 Binding Sites:


12 Step 3: Collection of Known CRMs:

13 Result: 14 of the 19 known CRMs are recovered with the search criteria window size = 700 bp and at least 13 predicted sites per window.
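
One way to score such a test is to count how many known CRM intervals overlap at least one predicted cluster. The Python sketch below assumes that simple overlap rule, which may differ from the criterion used in the paper, and its coordinates are hypothetical. Only the final line uses figures from the slide: 14 of 19 is about 73.7%.

```python
def overlaps(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) share any position."""
    return a[0] <= b[1] and b[0] <= a[1]

def recovery_rate(known_crms, clusters):
    """Return (recovered, total, fraction) of known CRMs overlapped by any cluster."""
    recovered = sum(
        1 for crm in known_crms if any(overlaps(crm, cluster) for cluster in clusters)
    )
    return recovered, len(known_crms), recovered / len(known_crms)

# Hypothetical coordinates, for illustration only.
known = [(100, 800), (2000, 2600), (5000, 5400)]
predicted = [(300, 900), (5300, 6000)]
print(recovery_rate(known, predicted))

# Figures from the slide: 14 of 19 known CRMs recovered.
print(round(14 / 19 * 100, 1))
```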

14 Step 4: Genome-wide Searching: 28 clusters were identified; 23 of the 28 fall in regions between genes and 5 fall in introns; 49 genes lie near the clusters.
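
Sorting clusters into intergenic and intronic ones, and listing the genes beside them, could be done roughly as follows. The gene records, their coordinates and the rule of taking the nearest gene on each side are assumptions for illustration. The presentation does not state how the 49 nearby genes were chosen.

```python
def classify_cluster(cluster, genes):
    """Label a cluster 'intronic', 'intergenic' or 'other' against a gene list.

    genes: list of dicts with 'name', 'start', 'end' and 'introns' (list of (start, end)).
    """
    c_start, c_end = cluster
    for gene in genes:
        for i_start, i_end in gene["introns"]:
            if i_start <= c_start and c_end <= i_end:
                return "intronic", gene["name"]
    if all(c_end < g["start"] or g["end"] < c_start for g in genes):
        return "intergenic", None
    return "other", None  # overlaps a gene without lying inside a single intron

def flanking_genes(cluster, genes):
    """Return names of the nearest gene on each side of the cluster (hypothetical rule)."""
    c_start, c_end = cluster
    upstream = [g for g in genes if g["end"] < c_start]
    downstream = [g for g in genes if g["start"] > c_end]
    left = max(upstream, key=lambda g: g["end"], default=None)
    right = min(downstream, key=lambda g: g["start"], default=None)
    return [g["name"] for g in (left, right) if g is not None]

# Hypothetical annotation, for illustration only.
genes = [
    {"name": "geneA", "start": 1000, "end": 3000, "introns": [(1500, 2500)]},
    {"name": "geneB", "start": 8000, "end": 9000, "introns": []},
]
print(classify_cluster((1600, 2300), genes))
print(classify_cluster((4000, 4700), genes), flanking_genes((4000, 4700), genes))
```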

15 Step 5: Examine the expression pattern of the 49 genes by RNA in situ hybridization and microarray hybridization: The 49 genes are examined to see whether their expression patterns are consistent with regulation by the TFs. 10 of the 28 clusters are near at least one gene that shows an anterior-posterior expression pattern (consistent with regulation by the five TFs).

16 Step 6: The special case: the giant gene: The posterior expression of giant is regulated by Cad, Hb and Kr, but its cis-regulatory sites are still unknown. The predicted CRM nearest to the giant gene is cloned upstream of a lacZ reporter gene. The lacZ reporter shows an expression pattern similar to that of giant mRNA. (Figure panels: +/+ and Kr/Kr embryos.)

17 Conclusions: Binding site clustering is an effective method for identifying cis-regulatory modules. A major obstacle is the paucity of binding data for most transcription factors, which calls for systematic data collection. Real CRM structure is more complex, so the method needs to incorporate more complex rules.

18 Reference: Berman, B.P., Nibu, Y. et al. Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. P. N. A. S. 99:

