© 2005 by Genomatix Software GmbH Genomatix Microarray Evaluation for Gene Regulation Analysis Dr. Martin Seifert Genomatix Software GmbH Landsberger Strasse.

Presentation transcript:

© 2005 by Genomatix Software GmbH Genomatix Microarray Evaluation for Gene Regulation Analysis Dr. Martin Seifert Genomatix Software GmbH Landsberger Strasse 6, D München

The general goal in microarray analysis
Biological functionality is not directly evident from microarrays.
Targets: classification / diagnostics, metabolic pathways, regulatory networks, disease mechanisms.
Diagram labels: Cell, Microarray experiment, Microarrays today, ?

How to reach the general goal in microarray analysis?
Methods for microarray data analysis: statistical analysis, cellular processes, literature analysis, sequence analysis (genome annotation and promoter analysis).
Genomatix knowledge transfer approach.

A real-life example: evaluation of the role of PDGF in fibroblasts
Microarray experiment: PDGF stimulation of fibroblasts (Demoulin et al., JBC 279, No. 34, 2004; 35392–35402).
Chip data -> statistical analysis; clustering -> cluster -> Genomatix evaluation of chip clusters.
What is the biological functionality behind the chip data?

Technology: linking genomic sequence analysis and literature mining
Automatic evaluation of gene relationships.
Promoter source for functional promoter analysis.
Analysis of promoter sequences / database scans.

Analysis strategy: workflow of the project
1. Find statistical clusters.
2. Project statistical clusters onto biology and categorize the results by z-scoring (BiblioSphere).
3. Analyze functional groups for co-regulation (ElDorado & GEMS) and find additional potentially co-regulated genes (ModelInspector).
4. Carry out additional statistical analysis.
5. Merge results into biological context.

Step 1: Statistical analysis
Methods for microarray data analysis: statistical analysis, cellular processes, literature analysis, sequence analysis.

Cluster analysis: statistically analyzed microarray data
Significance Analysis of Microarrays (SAM; FDR: 4.3%): 105 of 9928 gene spots are significantly up-regulated (chip: Hver1.2.1).
Plot label: hours PDGF induction.
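As a rough reading of the SAM numbers on this slide, an FDR of 4.3% on 105 called spots corresponds to about 4 to 5 expected false positives. The following is a minimal sketch of that arithmetic only; the function name is illustrative and not part of SAM.

```python
def expected_false_positives(n_called: int, fdr: float) -> float:
    """Expected number of false positives among the called spots."""
    return n_called * fdr


n_called = 105   # significantly up-regulated spots (from the slide)
n_spots = 9928   # gene spots on the chip (from the slide)
fdr = 0.043      # 4.3% false discovery rate (from the slide)

print(f"fraction of spots called: {n_called / n_spots:.4f}")
print(f"expected false positives: {expected_false_positives(n_called, fdr):.1f}")
```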

Workflow step 2: Project statistical clusters onto biology and categorize the results by z-scoring (BiblioSphere)
Methods: statistical analysis, cellular processes, literature analysis, sequence analysis.

Characterisation of the experimental cluster with BiblioSphere (Large Cluster Query)
The gene cluster contains 107 genes: too many for biologically meaningful co-regulation.
Strategy: knowledge-driven sub-clustering to find functional correlations.
Functional correlations are retrieved by categorization.

Knowledge-driven sub-clustering
Ontology-based functional ranking: Genomatix z-scoring (highest z-score).

Knowledge-driven sub-clustering
Ontology-based functional ranking by Genomatix z-scoring: retrieval of the genes over-represented in the GO category sterol biosynthesis.
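The slides do not define how the Genomatix z-score is computed. One common way to score over-representation, shown here purely as an assumption, compares the observed number of category genes in the cluster with the hypergeometric expectation. The cluster size (107) and the six sterol biosynthesis genes come from the slides; the background size `N` and category size `K` are hypothetical.

```python
import math


def overrepresentation_z(k: int, n: int, K: int, N: int) -> float:
    """z-score for observing k category genes in a cluster of n genes,
    when K of the N background genes belong to the category.
    Uses the mean and variance of the hypergeometric distribution."""
    p = K / N
    mean = n * p
    var = n * p * (1 - p) * (N - n) / (N - 1)
    return (k - mean) / math.sqrt(var)


# n = 107 and k = 6 are from the slides; K and N are made-up background values.
z = overrepresentation_z(k=6, n=107, K=40, N=10000)
print(f"z = {z:.1f}")
```

Ranking all GO categories by such a score and taking the top one would mirror the "highest z-score" selection on the slide.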

Gene group analysis
BiblioSphere subgroup analysis, connecting TFs: re-enter the six over-represented genes into BiblioSphere.

Knowledge-driven sub-clustering. Towards regulatory networks: connecting TFs
BiblioSphere (sentence level; at least 4 co-citations with input genes): co-citation of HMGCS1, HMGCR, SC4MOL, and DHCR7 with SREBF1.
ElDorado: prediction of SREBF1 (EBOX) binding sites in the promoters of HMGCS1, HMGCR, and DHCR7.
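The co-citation filter can be sketched as follows. This is one possible reading of "at least 4 co-citations with input genes" (counting every sentence-level TF/gene pair), not BiblioSphere's actual implementation. The sentences are made-up toy data, and `TF_X` is a hypothetical second transcription factor.

```python
from collections import defaultdict


def connecting_tfs(sentences, input_genes, tfs, min_cocitations=4):
    """Return TFs co-cited in the same sentence with input genes at least
    min_cocitations times, mapped to the input genes they were cited with."""
    counts = defaultdict(int)
    partners = defaultdict(set)
    for mentioned in sentences:            # each sentence = set of gene symbols
        genes = mentioned & input_genes
        for tf in mentioned & tfs:
            for gene in genes - {tf}:
                counts[tf] += 1
                partners[tf].add(gene)
    return {tf: sorted(partners[tf])
            for tf in counts if counts[tf] >= min_cocitations}


input_genes = {"DHCR24", "DHCR7", "EBP", "HMGCR", "HMGCS1", "SC4MOL"}
tfs = {"SREBF1", "TF_X"}
sentences = [                              # toy sentences, not literature data
    {"SREBF1", "HMGCS1"},
    {"SREBF1", "HMGCR"},
    {"SREBF1", "SC4MOL"},
    {"SREBF1", "DHCR7", "HMGCR"},
    {"TF_X", "HMGCR"},
    {"EBP", "DHCR24"},
]
print(connecting_tfs(sentences, input_genes, tfs))
```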

Experimental verification: SREBP1 (= SREBF1) expression is experimentally confirmed.

Workflow step 3: Analyze functional groups for co-regulation (Gene2Promoter & GEMS) and find additional potentially co-regulated genes (ModelInspector)
Methods: statistical analysis, cellular processes, literature analysis, sequence analysis.

Sequence analysis: promoter analysis by GEMS based on ElDorado data
Results from literature analysis are used to guide sequence analysis.
Literature analysis -> promoter analysis (GEMS, ElDorado + Gene2Promoter).

Sequence analysis. Analysis strategies: inter-genomic and intra-genomic
Inter-genomic: comparative genomics of promoters (human, mouse, rat) -> phylogenetic conservation.
Intra-genomic: comparative analysis of promoters within one species -> co-regulation.
107 genes -> 6 genes (sterol synthesis): DHCR24, DHCR7, EBP, HMGCR, HMGCS1, SC4MOL.

Comparative promoter analysis (intra-genomic co-regulation): intra-genomic approach
ElDorado + Gene2Promoter: extraction of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL.
GEMS: analysis of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL with FrameWorker.
Frameworks underlie the functional conservation of promoters.

Regulatory genome annotation. Promoter resource: ElDorado / Gene2Promoter
ElDorado: alternative promoters / transcripts, regulatory SNPs, regulatory regions (promoter), promoter modules.
Interconnected to: BiblioSphere, GEMS.

Regulatory genome annotation. Promoter retrieval: ElDorado / Gene2Promoter

Analysis of promoter organization: promoter analysis with FrameWorker

Analysis of promoter organization
Frameworks are conserved in the order and distance of TFBSs.
EBOX (SREBF1) frameworks are found in a subset of the genes.
Framework: EBOX-ECAT-ZBPF. Genes sharing the framework: DHCR7, EBP, HMGCS1.
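What "conserved in order and distance" means can be illustrated with a small matcher. This is a simplified sketch, not FrameWorker's or ModelInspector's algorithm: it ignores strand orientation and matrix similarity scores, and the positions and distance ranges are hypothetical.

```python
def matches_framework(sites, families, distances):
    """Check whether a promoter contains the framework.

    sites:     list of (family, position) TFBS predictions on one promoter
    families:  ordered family names, e.g. ["EBOX", "ECAT", "ZBPF"]
    distances: one (min_bp, max_bp) range per consecutive pair of families
    """
    def extend(index, last_pos):
        if index == len(families):
            return True                     # all elements placed in order
        lo, hi = distances[index - 1]
        return any(
            extend(index + 1, pos)
            for fam, pos in sites
            if fam == families[index] and lo <= pos - last_pos <= hi
        )

    return any(extend(1, pos) for fam, pos in sites if fam == families[0])


framework = ["EBOX", "ECAT", "ZBPF"]
distances = [(20, 40), (50, 80)]            # hypothetical spacing ranges in bp

promoter_a = [("EBOX", 100), ("ECAT", 130), ("ZBPF", 195)]  # order and spacing fit
promoter_b = [("ECAT", 100), ("EBOX", 130), ("ZBPF", 195)]  # wrong order
promoter_c = [("EBOX", 100), ("ECAT", 190), ("ZBPF", 250)]  # spacing too large

for name, sites in [("a", promoter_a), ("b", promoter_b), ("c", promoter_c)]:
    print(name, matches_framework(sites, framework, distances))
```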

Analysis of promoter organization
EBOX (SREBF1) frameworks are found in a subset of the genes.
Framework: EBOX-ECAT-ZBPF.

Beyond the microarray: ModelInspector search
The EBOX-ECAT-ZBPF framework is searched against the Genomatix human promoter database (GPD).

ModelInspector results: results of the database search
Table: framework EBOX-ECAT-ZBPF; columns: # of hits in human promoters, steroid biosynthesis, z-score.
The model is highly selective: no additional genes for steroid metabolism have been found so far.
The selectivity is reduced by modifying the model: the distance variability is increased (application of FastM).

Model modification with FastM: the distance variability is increased to bp
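The trade-off behind this modification is that a wider distance tolerance lets more promoters match, so the model becomes less selective. The toy illustration below shows only that trade-off; it is not FastM's interface, and the spacings and tolerances are hypothetical.

```python
def count_hits(spacings, center, tolerance):
    """Number of promoters whose element spacing lies within
    center +/- tolerance (in bp)."""
    return sum(abs(s - center) <= tolerance for s in spacings)


# Hypothetical spacings (bp) between two framework elements in ten promoters.
spacings = [28, 30, 31, 35, 42, 47, 55, 61, 70, 90]

for tolerance in (2, 10, 30):
    hits = count_hits(spacings, center=30, tolerance=tolerance)
    print(f"tolerance +/-{tolerance} bp: {hits} hits")
```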

Beyond the microarray: additional ModelInspector search
The EBOX-ECAT-ZBPF framework with modified distance variability is searched against the Genomatix human promoter database (GPD).

ModelInspector results: results of the database search
Table: framework EBOX-ECAT-ZBPF; columns: # of hits in human promoters, four categories related to “steroid metabolism”, z-score.
Additionally found genes related to steroid metabolism: LSS, MVK, SC5DL, SREBF2.
LSS and MVK are present on the chip and up-regulated, but not statistically significant.
SC5DL is not present on the microarray.
Possibility to re-evaluate the statistical results.

Re-analysis of promoter organization: additional framework analysis
All sterol-metabolism-related genes identified by microarray analysis and ModelInspector are included: HMGCS1, MVK, SC5DL, DHCR7, EBP, SREBF2, LSS, HMGCR, SC4MOL, DHCR24.
An additional framework consisting of three TFBSs was found: ECAT-EGRF-ZBPF.
It matches 8 of the 10 input genes: HMGCS1, DHCR7, HMGCR, EBP, LSS, MVK, SC5DL, SREBF2.

Beyond the microarray: the second framework (ECAT-EGRF-ZBPF) is searched in human promoters by ModelInspector
Is the framework also part of other human promoters? Search target: Genomatix human promoter database (GPD).
Several frameworks may be important for sterol-related pathways/networks.
Matches may overlap with the first framework but are basically distinct.

ModelInspector results: results of the second database search
Table: framework EBOX-ECAT-ZBPF; columns: # of hits in human promoters, four categories related to “steroid metabolism”, z-score.
Genes: CYP46A1, FDPS, HMGCR, HSD17B8, OPRS1, SREBF1 (!), STARD5.
SREBF1/2 are potential regulators of the previous framework!
SREBF1/2 may be mediators between the two frameworks identified so far.

Workflow step 4: Carry out additional statistical analysis
Methods: statistical analysis, cellular processes, literature analysis, sequence analysis.

Relaxed statistical approach: clustering by the profile of the initially selected 105 genes
The expression cluster is extended by Pavlidis Template Matching (PTM).
The cluster of 105 significantly regulated genes is taken as the template.
The threshold p-value is 0.1.
The cluster is extended to 798 genes (including all 105 initial genes).
Relaxed statistics require cross-validation by a second line of evidence.
Panels: initial profile, profile cluster.
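Template matching of this kind can be sketched as follows: each gene profile is correlated with the template profile and kept if the correlation's p-value is at or below the threshold. To stay dependency-free, this sketch swaps in an exact one-sided permutation p-value for the parametric test usually used, so it is an approximation of the idea and not the published PTM procedure. The template and profiles are hypothetical, and profiles are assumed to be non-constant.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p(template, profile):
    """One-sided exact permutation p-value: fraction of reorderings of the
    profile that correlate with the template at least as well as observed."""
    observed = pearson(template, profile)
    perms = list(itertools.permutations(profile))
    hits = sum(pearson(template, p) >= observed - 1e-12 for p in perms)
    return hits / len(perms)


def template_match(template, profiles, p_threshold=0.1):
    """Genes whose profile matches the template at the given p-value."""
    return [gene for gene, prof in profiles.items()
            if permutation_p(template, prof) <= p_threshold]


template = [0.0, 0.5, 1.0, 1.5, 2.0]        # hypothetical mean cluster profile
profiles = {                                 # hypothetical gene profiles
    "gene_a": [0.1, 0.4, 1.1, 1.4, 2.2],     # follows the template
    "gene_b": [2.0, 1.5, 1.1, 0.4, 0.1],     # opposite trend
    "gene_c": [1.0, 0.2, 1.3, 0.1, 0.9],     # no clear trend
}
print(template_match(template, profiles, p_threshold=0.1))
```

Enumerating all permutations is only feasible for short time courses such as this five-point example.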

Workflow step 5: Merge results into biological context
Methods: statistical analysis, cellular processes, literature analysis, sequence analysis.

Merging profile and database searches: comparison of ModelInspector results with the profile cluster
52 genes share a common framework and are co-expressed.
8 genes belong to the GO category "steroid biosynthesis": DHCR24, DHCR7, EBP, HMGCR, HMGCS1, LSS, MVK, SC4MOL.
These eight genes associated with steroid metabolism are supported by three lines of evidence:
1. Common up-regulation
2. Common framework
3. Common functional class (GO annotation)
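Merging the three lines of evidence amounts to intersecting three gene sets. The sketch below uses the eight gene names from the slide. SC5DL is included as a framework hit that is absent from the chip, as stated in the ModelInspector results. `GENE_X`, `GENE_Y`, and `GENE_Z` are hypothetical placeholders standing in for the many other genes in each set.

```python
eight = {"DHCR24", "DHCR7", "EBP", "HMGCR", "HMGCS1", "LSS", "MVK", "SC4MOL"}

# Illustrative sets only; the real lists are much longer.
framework_hits = eight | {"SC5DL", "GENE_X"}     # ModelInspector matches
profile_cluster = eight | {"GENE_X", "GENE_Y"}   # PTM-extended expression cluster
go_steroid = eight | {"GENE_Z"}                  # GO "steroid biosynthesis"

# Evidence 1 + 2: co-expressed genes that also share the framework.
co_expressed_with_framework = framework_hits & profile_cluster

# Evidence 3: restrict to the common functional class.
supported = co_expressed_with_framework & go_steroid

print(sorted(supported))
```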

Sterol biosynthesis and regulatory networks
Frameworks: ECAT-EGRF-ZBPF and EBOX-ECAT-ZBPF.

Confirmation of results by GNF tissue profiles
Example: profile of HMGCS1. Find correlates with a cut-off of 0.6.
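Finding correlates of a reference tissue profile with a cut-off can be sketched as below. The slide does not say which correlation measure is used; Pearson correlation is an assumption here. All expression values are hypothetical, and only the cut-off of 0.6 comes from the slide.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlates(reference, profiles, cutoff=0.6):
    """Genes whose tissue profile correlates with the reference at or above cutoff."""
    return sorted(gene for gene, prof in profiles.items()
                  if pearson(reference, prof) >= cutoff)


# Hypothetical expression values across six tissues.
reference = [1, 2, 3, 4, 5, 6]            # stands in for the HMGCS1 profile
profiles = {
    "gene_a": [2, 3, 5, 7, 8, 12],        # strongly correlated
    "gene_b": [6, 5, 4, 3, 2, 1],         # anti-correlated
    "gene_c": [3, 1, 4, 1, 5, 2],         # weakly correlated
}
print(correlates(reference, profiles, cutoff=0.6))
```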

Sterol biosynthesis and regulatory networks
Frameworks: ECAT-EGRF-ZBPF and EBOX-ECAT-ZBPF; GNF profile.

Additional gene group: tubulins
Framework: CDEF-EGRF-MAZF.

Sterol biosynthesis / cell structure proteins and regulatory networks
Frameworks: ECAT-EGRF-ZBPF and EBOX-ECAT-ZBPF.

Conclusions: evaluation of microarray data
No individual method can reveal networks and pathway mechanisms.
An alternating combinatorial approach can achieve this.
Several independent functional groups may be derived from one chip.
However, the final focus is usually on a few genes (usually 30 or fewer).
All of this is possible based on available tools.
Genomatix technology elucidates the biology behind the chip data!

Let’s have a break…