Alvis Brazma, Johan Rung, Ugis Sarkans, Thomas Schlitt, Jaak Vilo
European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

Microarrays, one of the latest breakthroughs in experimental molecular biology, are producing considerable amounts of gene expression and other functional genomics data. The handling, storage, and analysis of these data are becoming the major bottlenecks in the use of microarray technology.

Storing and annotating these data is not trivial, for several reasons. The raw microarray data are images, which have to be transformed into gene expression matrices: tables in which rows represent genes, columns represent samples such as different tissues, and the value at each position characterizes the expression level of a particular gene in a particular sample. This transformation is complicated by replicate measurements, replicate spots, different oligos reporting the expression level of the same gene, sequence homology and potential cross-hybridisation, cross-platform comparisons, and so forth. The high-level gene expression matrices also have to be integrated with other genomic data and analysed further if any knowledge about the underlying biological processes is to be extracted (see [1]).

The European Bioinformatics Institute initiated an international effort to establish standards for microarray data representation, annotation and exchange [2]. The MIAME (Minimum Information About a Microarray Experiment) recommendations specify the minimum information that must be reported about a microarray (or any DNA array) based gene expression monitoring experiment to ensure that the results can be interpreted, and potentially verified, by third parties.
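The gene expression matrix described above (rows are genes, columns are samples) can be illustrated with a minimal Python sketch. It assumes that image analysis has already produced one numeric level per spot and that replicate spots are simply averaged; the gene names, sample names, values and the function `build_expression_matrix` are hypothetical and only serve the illustration. Real pipelines handle replicates, cross-hybridisation and normalisation in more careful ways.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical spot-level measurements: (gene, sample, expression level).
# Replicate spots for the same gene and sample appear as repeated pairs.
spots = [
    ("geneA", "liver", 2.0),
    ("geneA", "liver", 4.0),  # replicate spot for geneA in liver
    ("geneA", "brain", 1.0),
    ("geneB", "liver", 0.5),
    ("geneB", "brain", 3.0),
]


def build_expression_matrix(measurements):
    """Return (genes, samples, matrix) with replicate spots averaged.

    matrix[i][j] is the mean level of genes[i] in samples[j],
    or None when no spot reported that combination.
    """
    grouped = defaultdict(list)
    for gene, sample, level in measurements:
        grouped[(gene, sample)].append(level)

    genes = sorted({gene for gene, _ in grouped})
    samples = sorted({sample for _, sample in grouped})
    matrix = [
        [mean(grouped[(g, s)]) if (g, s) in grouped else None for s in samples]
        for g in genes
    ]
    return genes, samples, matrix


genes, samples, matrix = build_expression_matrix(spots)
for gene, row in zip(genes, matrix):
    print(gene, dict(zip(samples, row)))
```

Averaging is only one possible way to combine replicates; it is used here because it keeps the sketch short, not because the source prescribes it.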
An XML-based data exchange format, Microarray Markup Language (MAML), is being developed in collaboration with the Microarray Gene Expression Database (MGED) Group. EBI is establishing ArrayExpress, a public repository for microarray data, which will accept data in MAML format.

Expression Profiler, a set of online tools for gene expression data analysis, has been developed at the EBI and is available for public use. The analysis software in Expression Profiler supports the clustering, exploration, and visualization of gene expression data, and links the analysis results to tools and databases elsewhere. It also includes tools for analysing expression data in connection with other data types. Currently, DNA sequence data can be analysed and visualized alongside expression data, permitting users to discover, study, and visualize putative transcription factor binding sites [3].

One of the prospects of microarray data analysis is the reverse engineering of gene regulatory networks from gene expression and other genomics data. We have been successfully using our tools for in silico prediction of transcription factor binding sites [3]. Furthermore, we are developing models for describing gene regulatory networks, and we use this modelling approach to gain insight into how gene expression is regulated in response to the activity of other molecules in the cell and to extracellular signals.
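As an illustration of what clustering gene expression data means in practice, the following Python sketch groups genes whose expression profiles rise and fall together. It is not Expression Profiler's implementation: the distance measure (one minus Pearson correlation), the average-linkage merging rule, the function names and the example profiles are all assumptions chosen for brevity.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equally long profiles.

    Assumes neither profile is constant (otherwise the divisor is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def cluster(profiles, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        # Find and merge the two closest clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical expression profiles over four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.2, 3.9, 2.0],
}
result = cluster(profiles, 2)
print(result)
```

A correlation-based distance groups genes by the shape of their profile rather than by absolute level, which is why geneA and geneB end up together despite different magnitudes.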
ArrayExpress – a public repository for microarray data
Helen Parkinson, Mohammadreza Shojatalab, Ugis Sarkans and Alvis Brazma
European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

[Figure: MIAME based ArrayExpress conceptual model, "MIAME – six parts of a microarray experiment". Labels: Publication, Hybridisation, Array, External databases for gene/sequence IDs, Sample, Ontologies for sample description, Data, Experiment, Normalisation.]

[Figure: submission tool schema and screenshots. Panels: Login/contact info; Experiment submission; Experiment details; Sample details; Authors; Laboratory; Protocols (Extract, Hyb, Label, Scan, Other; browse public/user protocols, add new protocol); Array submission; Browse existing arrays from ArrayExpress; Hybs; Data upload; Pending experiment and array submissions; Overview (Samples, Hybs, Protocols, Authors, Data files); Submit.]

Minimum Information About A Microarray Experiment (MIAME) suggests that the recorded information should be sufficient to interpret and replicate the experiment, and that the information should be structured so that querying and automated data analysis and mining are feasible.

MIAME based Submission Tool
The ArrayExpress model is designed around the six MIAME sections. The prototype submission tool is a GUI implementation of the MIAME questionnaire ( wg/index.html). The submission tool writes to a MySQL database that retains the MIAME concept but does not support the complex query capability of the full ArrayExpress model, as this is not required for data submission. The schema for the submission tool and screenshots are shown below left.
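The idea of a submission organised around six MIAME sections can be sketched in Python as a record with one slot per section and a check for sections still to be filled in. This is not the submission tool's actual schema: the section names are taken from the figure labels above as an assumption, and the class `MiameSubmission`, the function `missing_sections` and all keys and values are hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MiameSubmission:
    """One dictionary per assumed MIAME section; inner keys are hypothetical."""

    experiment: dict = field(default_factory=dict)
    array: dict = field(default_factory=dict)
    sample: dict = field(default_factory=dict)
    hybridisation: dict = field(default_factory=dict)
    data: dict = field(default_factory=dict)
    normalisation: dict = field(default_factory=dict)


def missing_sections(submission):
    """Return the names of sections that are still empty, in declaration order."""
    return [f.name for f in fields(submission) if not getattr(submission, f.name)]


# A partially completed draft, as a submitter might leave it between sessions.
draft = MiameSubmission(
    experiment={"title": "Heat shock time course"},
    sample={"organism": "Saccharomyces cerevisiae"},
)
pending = missing_sections(draft)
print(pending)
```

Keeping the sections separate mirrors the point made in the text: a store used only for submission needs to preserve the MIAME structure, not to support complex queries.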
The tool is currently set up as a generic submission tool for all species but has the potential for species-specific or experiment-specific implementations. Additionally, it will be freely available for use as a LIMS by users who have limited local bioinformatics support. It has been designed for small-scale users, has full contextual help, and will be supported by ArrayExpress database staff. The tool will be further developed according to user need. Large-scale users with local databases are expected to use the MAGE-ML (XML-based) data submission format; this process is analogous to the way that sequencing centres deposit data into sequence databases.

One of the most important requirements of MIAME is sample annotation. Complete and accurate sample description is complex and will require the construction of an ontology and the inclusion of controlled vocabularies, which are referenced by the submission tool. The ontology will need to encompass tissues, cell lines, developmental stages, disease states, compounds, drugs, strains (and just about anything else that you can think of) related to a microarray experiment sample. Some of these terms have been defined and included in an ontology by Chris Stoeckert (U. Penn.) as part of the MGED (Microarray Gene Expression Database group) ontology working group. The submission tool will be used as a source of terms and controlled vocabulary for the MGED ontology.

Acronym Key
ArrayExpress: the public database based on MAGE-OM.
MAGE-OM: Microarray Gene Expression Object Model, developed by MGED and Rosetta and submitted to the OMG (Object Management Group) for adoption as a specification for expression data exchange.
MGED: The MGED group is an open discussion group established at the Microarray Gene Expression Database meeting MGED 1 (1999). The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods. The underlying goal is to facilitate the establishment of gene expression data repositories, the comparability of gene expression data from different sources, and the interoperability of different gene expression databases and data analysis software.
MAGE-ML: Microarray Gene Expression Markup Language, an XML data exchange format able to capture MIAME, based on MAGE-OM.

ArrayExpress Prototype Query Interface
A prototype query interface has been developed for the ArrayExpress database. It supports complex queries across biosource, experimental factors, etc.
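The role of controlled vocabularies in sample annotation, described above, can be sketched in Python as follows. The vocabulary categories, the terms and the function `validate_annotation` are hypothetical placeholders and are not taken from the MGED ontology. The sketch accepts values that match a known term and sets aside the rest as candidate terms, in the spirit of the submission tool acting as a source of new terms.

```python
# Hypothetical controlled vocabularies; the terms are placeholders only.
CONTROLLED_VOCABULARY = {
    "organism": {"Homo sapiens", "Mus musculus", "Arabidopsis thaliana"},
    "developmental_stage": {"embryo", "juvenile", "adult"},
}


def validate_annotation(annotation, vocabulary=CONTROLLED_VOCABULARY):
    """Split a sample annotation into accepted values and candidate new terms.

    A value is accepted when its category exists in the vocabulary and the
    value is one of that category's terms; anything else is a candidate.
    """
    accepted, proposed = {}, {}
    for category, value in annotation.items():
        if value in vocabulary.get(category, set()):
            accepted[category] = value
        else:
            proposed[category] = value
    return accepted, proposed


accepted, proposed = validate_annotation(
    {"organism": "Mus musculus", "developmental_stage": "neonate"}
)
print("accepted:", accepted)
print("candidate terms for curation:", proposed)
```

Collecting unmatched values instead of rejecting them is a design choice for this sketch; a real tool might instead require the submitter to pick an existing term.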