ArrayExpress. Ugis Sarkans, EBI.

Presentation transcript:

1 ArrayExpress Ugis Sarkans, EBI

2 Overview
Underlying standards
–MIAME
–MAGE*
Data submission
Data access
–annotations
–actual data
–array design descriptions
Some technical details
Future developments

3 What information should be exchanged?
MIAME - Minimum Information About a Microarray Experiment
–informal specification
–paper published in Nature Genetics
–goal - to initiate discussion: which details are important and which may not be
–ArrayExpress can store MIAME data (and more)

4 MAGE-OM
MAGE-OM: MicroArray Gene Expression Object Model
–in January 2002 became an “adopted” OMG specification
–January to August finalization process
–in September became an “available” specification
–should be set in stone for the next 2 years
–thinking about MAGE v2 started: user feedback; support for other types of functional genomics data; more precise handling of data manipulation

5 UML Packages of MAGE
Packages (from diagram): BioEvent, Experiment, ArrayDesign, BioMaterial, BioAssayData, BioAssay, DesignElement, HigherLevelAnalysis, BioSequence, Array, QuantitationType, Description, Protocol, Measurement, AuditAndSecurity, BQS
Diagram groupings: what was used, what was done, results, miscellaneous

6 MAGE-ML
MAGE-ML: MicroArray Gene Expression Markup Language
–generated from MAGE-OM, therefore evolved automatically
–translation from Jan 2002 to Sep 2002 DTD quite easy

7 ArrayExpress: data
currently - 9 experiments, 4 array designs:
–from EMBL - human, yeast
–from Sanger - pombe
coming:
–array descriptions: Affymetrix, Agilent
–labs: TIGR, Utrecht, more from Sanger,...
–export from existing DBs: SMD, RAD
–tools - MAGE-ML export: Jexpress, BASE,...
–ILSI project
journal requirements: Nature, Lancet,...

8 Help with MAGE-ML: MAGEstk
MAGE-ML - the only way of getting data into ArrayExpress
MAGEstk: MicroArray Gene Expression Software ToolKit
–Jamboree IV in Stanford, beginning of December
–used in MIAMExpress (MAGE-ML export)

9 MAGEstk Programming APIs
Mapping of MAGE-OM to language-specific OMs
APIs are automatically generated from the OM specifications
–get/set methods for associations
–get/set methods for attributes
XML language-specific OM marshallers/unmarshallers - also automatically generated
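As a rough illustration of what a generated API of this kind might look like, the Python sketch below defines two classes with get/set methods for an attribute and for an association, plus a marshaller/unmarshaller pair built on the standard `xml.etree.ElementTree` module. The class names, method names, and XML element layout are hypothetical and heavily simplified; they are not the actual MAGEstk API or the MAGE-ML DTD.

```python
import xml.etree.ElementTree as ET


class Protocol:
    """Hypothetical object-model class with one plain attribute (name)."""

    def __init__(self, identifier, name=None):
        self._identifier = identifier
        self._name = name

    def get_identifier(self):
        return self._identifier

    # get/set pair for an attribute
    def get_name(self):
        return self._name

    def set_name(self, name):
        self._name = name


class Experiment:
    """Hypothetical object-model class with one association (protocols)."""

    def __init__(self, identifier):
        self._identifier = identifier
        self._protocols = []

    def get_identifier(self):
        return self._identifier

    # get/set pair for an association
    def get_protocols(self):
        return list(self._protocols)

    def set_protocols(self, protocols):
        self._protocols = list(protocols)

    def add_to_protocols(self, protocol):
        self._protocols.append(protocol)


def marshal(experiment):
    """Object model -> XML text (element names are assumptions)."""
    root = ET.Element("Experiment", {"identifier": experiment.get_identifier()})
    assoc = ET.SubElement(root, "Protocols_assnlist")
    for protocol in experiment.get_protocols():
        attrs = {"identifier": protocol.get_identifier()}
        if protocol.get_name() is not None:
            attrs["name"] = protocol.get_name()
        ET.SubElement(assoc, "Protocol", attrs)
    return ET.tostring(root, encoding="unicode")


def unmarshal(xml_text):
    """XML text -> object model, the inverse of marshal()."""
    root = ET.fromstring(xml_text)
    experiment = Experiment(root.get("identifier"))
    for node in root.findall("./Protocols_assnlist/Protocol"):
        experiment.add_to_protocols(
            Protocol(node.get("identifier"), node.get("name"))
        )
    return experiment
```

In a generated toolkit, code of this shape would be emitted once per class in the object model, so a change to the model propagates to both the accessors and the XML handling.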

10 MAGEstk (cont.)
Use opensource/standard modules/packages
–Xerces, JDBC, etc.
Implementation in Java, C++, Perl, Python
database access modules on top of these APIs
–Postgres schema
–DB access layer
annotation tools - planned

11 ArrayExpress data retrieval
main objective - help in finding and initial exploration of data; download for detailed analysis
data repository (now) + data warehouse (in development)

12 Queries - logical structure
Queryable fields (from diagram): Array Design - accession, name; Protocol - accession; Experiment - accession; Organisation - name; Person - last name
Other diagram nodes: Array, Species, Sample, Hybridisation, Experiment Design, Experiment Type, Experimental Factor, Protocol Type

13 Query form

14 Annotation browsing

15 Data representation
in MAGE/ArrayExpress:
–BioAssays (hybridizations, data transformations)
–QuantitationTypes (signal intensity, ratio etc.)
–DesignElements (spots, genes)
in Expression Profiler: spots, measurements

16 Exporting data to Expression Profiler
Cube axes: BioAssays (hybridizations, data transformations), QuantitationTypes (signal intensity, ratio etc.), DesignElements (spots)
Cubes shown: BioAssayData1, BioAssayData2
Steps: select BioAssayData cubes, select QuantitationTypes, select BioAssays
Result: DesignElements, (QT,BA) pairs
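The selection steps on this slide can be sketched as a flattening of a three-dimensional data cube into a two-dimensional matrix whose rows are DesignElements and whose columns are (QuantitationType, BioAssay) pairs. The sketch below is a minimal illustration only: the function name, the nested-list representation, and the axis order `cube[design_element][bioassay][quantitation_type]` are assumptions, and it handles a single cube rather than several BioAssayData objects.

```python
from itertools import product


def flatten_cube(cube, bioassays, quantitation_types,
                 selected_bioassays, selected_qts):
    """Flatten one data cube into a DesignElement x (QT, BA) matrix.

    cube[d][b][q] holds the value for design element d, bioassay b and
    quantitation type q (this axis order is an assumption of the sketch).
    """
    ba_index = {name: i for i, name in enumerate(bioassays)}
    qt_index = {name: i for i, name in enumerate(quantitation_types)}

    # One output column per selected (QT, BA) pair, grouped by QT.
    columns = list(product(selected_qts, selected_bioassays))

    rows = []
    for per_design_element in cube:
        rows.append([
            per_design_element[ba_index[ba]][qt_index[qt]]
            for qt, ba in columns
        ])
    return columns, rows


# Two spots, two hybridizations, two quantitation types.
example_cube = [
    [[1.0, 0.5], [2.0, 0.6]],   # spot1: hyb1 (signal, ratio), hyb2 (signal, ratio)
    [[3.0, 0.7], [4.0, 0.8]],   # spot2
]
example_columns, example_rows = flatten_cube(
    example_cube,
    bioassays=["hyb1", "hyb2"],
    quantitation_types=["signal", "ratio"],
    selected_bioassays=["hyb1", "hyb2"],
    selected_qts=["ratio"],
)
```

Selecting only the ratio values here yields one column per hybridization, which matches the spots-by-measurements shape that an analysis tool would typically expect.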

17 Data export form

18 Array representation - ADF format

19 Experiment plan display

20 ArrayExpress Infrastructure
Components (from diagram): ArrayExpress (Oracle + Tomcat); MIAMExpress (MySQL); local MIAMExpress installations; EBI Expression Profiler; other microarray databases; external bioinformatics databases; array manufacturers; LIMS; microarray software; data analysis software
Flows (from diagram): submissions; MAGE-ML; MAGE-ML import, export; data pipelines; queries (www); data analysis (www)

21 ArrayExpress architecture
Components (from diagram): Tomcat; ArrayExpress (Oracle); MAGE-OM; MAGE-ML (DTD); MAGE-ML (doc); MAGE loader; MAGE validator; MAGE unloader; error.log; Castor object/relational mapping; Velocity template engine; web page templates; Java servlets

22 ArrayExpress: other technical details
Data matrices - stored in NetCDF format:
–binary format for efficient storage of multidimensional arrays
Arrays - stored as ADF spreadsheets (in addition to normal MAGE structures)
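To illustrate why a binary layout suits large data matrices, the sketch below packs a matrix into a small header followed by row-major 64-bit floats, then reads one cell back by computing its byte offset. This is not NetCDF and makes no claim about how ArrayExpress lays out its files; the header layout and function names are assumptions chosen for a standard-library-only example of the general idea.

```python
import struct

# Assumed toy layout: two little-endian unsigned ints (rows, columns),
# followed by the values as little-endian 64-bit floats in row-major order.
HEADER = struct.Struct("<II")
CELL = struct.Struct("<d")


def pack_matrix(rows):
    """Serialise a rectangular list-of-lists of floats into bytes."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    out = bytearray(HEADER.pack(n_rows, n_cols))
    for row in rows:
        if len(row) != n_cols:
            raise ValueError("matrix must be rectangular")
        out += struct.pack("<%dd" % n_cols, *row)
    return bytes(out)


def read_cell(blob, row, col):
    """Read a single value without decoding the rest of the matrix."""
    n_rows, n_cols = HEADER.unpack_from(blob, 0)
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("cell out of range")
    offset = HEADER.size + (row * n_cols + col) * CELL.size
    return CELL.unpack_from(blob, offset)[0]
```

Each value costs a fixed 8 bytes and any cell can be located by arithmetic alone, which is the property that makes fixed-width binary formats attractive for expression matrices with many spots and hybridizations.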

23 In development
Immediate:
–interface efficiency improvements
–BioAssays - graphical display
–better integration with Expression Profiler
Medium-term:
–user management: non-public data (e.g., for reviewers)
–MAGE-ML export
Curation tool

24 Data warehouse - for gene- and data-driven queries
Properties (from diagram): ratio; absolute change; confidence measure; name; design element type; species; sample type; bioassay type; performer lab; exper. type; array design name; platform type; provider; name; biological entity type

25 Microarray Informatics team at EBI
Alvis Brazma - group leader
Groups: ArrayExpress; Curation; MIAMExpress; Expression Profiler; Research, students
Members: Ugis Sarkans, Gonzalo Garcia, Helen Parkinson, Mohammadreza Shojatalab, Jaak Vilo, Thomas Schlitt, Katja Kivinen, Johan Rung, Patrick Kemmeren, Misha Kapushesky, Lev Soinov, Koichi Tazaki, Anastasia Samsonova, Susanna Sansone, Philippe Rocca-Serra, Ele Holloway, Niran Abeygunawardena, Ahmet Oezcimen, Gaurab Mukherjee, Sergio Contrino, Anjan Sharma, Aurora Torrente