Identification of Compositionally Similar Cis-element Clusters in Coordinately Regulated Genes Anil G Jegga, Ashima Gupta, Andrew T Pinski, James W Carman,

Identification of Compositionally Similar Cis-element Clusters in Coordinately Regulated Genes
Anil G Jegga, Ashima Gupta, Andrew T Pinski, James W Carman, Bruce J Aronow
Cincinnati Children’s Hospital Medical Center, Cincinnati, OH

Abstract: A single efficient method for deciphering the underlying transcriptional control elements in higher eukaryotic genomes remains elusive. We have explored extending comparative genomics approaches to this problem using known TF binding sites. Starting from an earlier method for identifying conserved cis-elements contained within evolutionarily conserved genomic regions, we extended the query to identify compositionally similar cis-regulatory element clusters that occur in groups of co-expressed genes, within the evolutionarily conserved cis-regulatory regions of each ortholog pair (“peak analyzer”). We tested a series of co-regulated ortholog pairs of promoters and genes, using known regulatory regions as training sets and genes co-expressed in microarray profile data as test sets, in the central nervous system, liver, olfactory and immuno-hematologic systems. Our results suggest that this combinatorial approach is broadly sensitive for identifying known and potential regulatory regions containing conserved cis-elements for known compartment-specific trans-acting factors. However, sensitive detection of some known regulatory regions comes with an abundance of apparent false positives. We believe the approach can be substantially refined by improving the compositional similarity algorithms and by weighted detection of preferred architecture models.
Method:

Workflow figure (panel labels): GeneChip Experiments; A set of Coordinately Expressed Genes; BlastN/Blat Search for genomic sequence retrieval (ESTs/cDNAs); BlastZ; TF Binding Sites; Trafac; Peak-Analyzer (Gene 1: Hs-Mm, Gene 2: Hs-Mm, Gene 3: Hs-Mm, … Gene n: Hs-Mm).

Example input sequences:
>Seq 1 Human/Mouse Genomic
AGAGAAAATTGCTAGAGCTCA GGAGTTTGAGACCAGCCTGG GCAATAGAGTAAGACTTTGTCT CTATCAAAAATTTAAAAATTAAC TGGGCTTGGCGGTGTGCACC TGTGGTCCAGCTACTCAGGAG GCTGAGGTGGGAGGATTGCTT GAGCCCAAGA
>Seq 2 Mouse/Human Genomic
GACTGAGGGCTTGTGAAACAG CAAGAACCTGTCTCAAAAAACA GTGGGCAGGGAGGGGATTAAT GAATAGGCAGCTACGTTCTGGG ACTGGAGGGACTCGAGGTGGC TAGAAAGCAAGAGGTACTGGGA GACAAGGCTGCAGACATTTCTT TTTTTACTAGAGTC

Example local alignment: Similarity Score: 3074; Match Percentage: 51 %; Number of Matches: 96; Number of Mismatches: 39; Total Length of Gaps: 52; Begins at (8281,8874) and Ends at (8416,9059).

Alignment block tables (one per ortholog pair: Seq 1/Seq 2, Seq 3/Seq 4, Seq 5/Seq 6, Seq 7/Seq 8, Seq 9/Seq 10), with columns Seq 1, Seq 2, Sim%, Nt, Hits and blocks of 20, 10, 14, 52, 9 and 30 nt.

Coordinately Expressed Genes in Olfactory Mucosa: Three genes with high levels of expression in olfactory mucosa shared several clusters of cis-elements. Each of these clusters was also conserved between human and mouse. The window size ranged from 200 to 300 base pairs. Two of the genes (XM_ and XM_143313) depicted here encode hypothetical proteins, while the third is TPD52 (Tumor protein D52) (Genter et al., 2003).
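The local alignment statistics quoted in the Method figure can be cross-checked with a few lines of Python. This is only a sketch: the poster does not say how the match percentage is computed, so the formula below (matches divided by all alignment columns, counting gap positions) is an assumption, and `match_percentage` is a hypothetical helper name. Under that assumption the quoted counts are consistent with the reported 51 %.

```python
def match_percentage(matches: int, mismatches: int, gap_length: int) -> float:
    """Percentage of alignment columns that are exact matches.

    Assumption (not stated on the poster): the denominator counts
    matched, mismatched and gapped columns alike.
    """
    columns = matches + mismatches + gap_length
    return 100.0 * matches / columns


# Counts taken from the example local alignment on the poster.
pct = match_percentage(matches=96, mismatches=39, gap_length=52)
print(f"{pct:.1f} %")  # 51.3 %, which truncates to the reported 51 %
```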
Coordinately Expressed Genes in Cerebellum: Conserved cis-element clusters (base pair window) between human and mouse homologs, shared by four genes highly expressed in cerebellum: ATP2A2 (Ca++-ATPase), HPCAL1 (Hippocalcin-like 1), CACNA1A (P/Q type Ca channel alpha 1A) and PLA2G7 (phospholipase A2 group VII) (Zhang et al., 2003).

Skeletal Muscle Genes - Regulogram depiction of shared cis-elements: Horizontal bars with colored segments (exons) are the human and mouse genomic sequences. The different colored quadrilaterals are regions of alignment. Within each of these blocks, the % sequence similarity and the number of TF-binding sites are plotted as two separate line graphs.

TraFaC images of the experimentally validated regulatory regions of skeletal muscle genes (shown as blue circles on the regulograms): The two gray vertical bars are the two genes being compared. TF-binding sites that occur in both genes are highlighted as colored bars drawn across the two genes. Panels: DES: Upstream Enhancer Region; MYL1: Intronic Enhancer Region; CKM: Upstream Enhancer Region; ENO3: Intronic Enhancer Region.

Peak Analyzer: After the initial genomic sequence alignment of the orthologous skeletal muscle genes (DES (Desmin), MYL1 (Myosin light polypeptide 1), CKM (creatine kinase muscle) and ENO3 (enolase 3 beta, muscle)), the “peaks” or “hits” (cis-elements common to an orthologous gene pair and occurring in conserved genomic regions) were compared across genes to identify shared cis-regulatory modules. The identified cis-clusters included the experimentally validated regulatory regions in each of these genes and comprised multiple muscle regulatory cis-elements (Wasserman and Fickett, 1998). The horizontal lines are the genomic sequences of the base species (human in this case). Yellow vertical bars are the exons. The different colored boxes represent the different cis-clusters.
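The comparison of peaks across genes described above can be sketched in Python. This is a toy illustration, not the Peak Analyzer implementation: the function names, the 250 bp window, the `min_shared` threshold, and all positions and TF labels in the example data are made up. The input is assumed to be already filtered to hits conserved between the human and mouse orthologs, given as (position on the base sequence, TF name) pairs.

```python
from itertools import product


def candidate_clusters(hits, window=250):
    """For each hit, collect the TF names found within `window` bp downstream."""
    hits = sorted(hits)
    clusters = []
    for i, (start, _) in enumerate(hits):
        members = {tf for pos, tf in hits[i:] if pos < start + window}
        clusters.append((start, members))
    return clusters


def shared_clusters(genes, window=250, min_shared=3):
    """Return window combinations (one per gene) sharing >= min_shared TF names."""
    if not genes:
        return []
    names = list(genes)
    per_gene = [candidate_clusters(genes[g], window) for g in names]
    found = []
    # Exhaustive comparison: fine for a handful of genes, slow for many.
    for combo in product(*per_gene):
        common = set.intersection(*(members for _, members in combo))
        if len(common) >= min_shared:
            starts = {g: start for g, (start, _) in zip(names, combo)}
            found.append((starts, common))
    return found


# Hypothetical conserved hits; positions and TF labels are illustrative only.
genes = {
    "DES": [(100, "MEF2"), (150, "MYOD"), (220, "SRF"), (900, "SP1")],
    "CKM": [(40, "MYOD"), (90, "SRF"), (200, "MEF2"), (1500, "MEF2")],
}
results = shared_clusters(genes)
for starts, common in results:
    print(starts, sorted(common))
```

The sketch compares clusters only by which TF names they contain. Weighting by site order, spacing or count, as the abstract suggests for future refinement, would need a richer similarity score.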
Limitations:
1. Cis-elements that are not conserved across the orthologous genes cannot be identified, even when they occur in regions of sequence similarity across the species.
2. Cis-elements that occur in non-aligned genomic regions across the two species cannot be identified by this approach.

Conclusions:
1. The combinatorial approach of identifying coordinately regulated genes that share compositional similarity of cis-elements within their orthologous non-coding genomic regions offers a powerful filter that can aid in the identification of potential functional cis-clusters.
2. Peak analyzer appears capable of identifying known and novel regulatory modules within a cluster of coordinately regulated genes.
3. These novel cis-element modules may be usable as probes for genome-wide annotation of potential regulatory regions.

Support: HHMI and NIEHS U01 ES11038 Mouse Centers Genomics Consortium