1 Outline
Standardization - necessary components
– what information should be exchanged
– how the information should be exchanged
– common terms (ontologies)
– common ways of describing data processing
– how to query information
ArrayExpress
– public repository for microarray data

2 What information should be exchanged?
MIAME - Minimum Information About a Microarray Experiment
– informal specification
– paper published in Nature Genetics
– goal - to initiate discussion: which details are important and which may not be

3 Ultimate dream
Minimum information is the following table:
[Table labels: Samples; Genes; Gene expression levels (in mRNA counts/cell); Pointers to (a) well-established gene database(s); Pointers to a well-established sample ontology]

4 Currently: MIAME six parts
1. Experimental design: the set of the hybridisation experiments as a whole
2. Array design: each array used and each element (spot) on the array
3. Samples: samples used, the extract preparation and labeling
4. Hybridizations: procedures and parameters
5. Measurements: images, quantitation, specifications
6. Controls: types, values, specifications
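The six parts above read naturally as a checklist. A minimal Python sketch of such a check follows; the key names, the dictionary layout, and the sample values are assumptions made for illustration, not part of the MIAME specification or of any ArrayExpress tool.

```python
# The six MIAME parts from the slide, as illustrative dictionary keys.
# The key spelling and the submission layout are assumptions for this sketch.
MIAME_PARTS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "controls",
)


def missing_miame_parts(submission):
    """Return the MIAME parts that are absent or empty in a submission."""
    return [part for part in MIAME_PARTS if not submission.get(part)]


# Hypothetical submission: hybridizations are empty, controls are missing.
submission = {
    "experimental_design": "time course, 4 hybridisations",
    "array_design": "hypothetical array design",
    "samples": ["sample 1", "sample 2"],
    "hybridizations": [],
    "measurements": ["scan1.tif"],
}

print(missing_miame_parts(submission))  # ['hybridizations', 'controls']
```

A check like this only tests that each part is present; judging whether the content is detailed enough is the part MIAME leaves to discussion.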

5 MIAMExpress submission procedure
[Flow diagram. Labels: Create account; Login; Pending/New Experiment; Samples 1…n; Sample protocol; Extracts 1…n; Extraction protocol; Hybridisations; Hyb protocol; Arrays 1…n; Scanning protocol; Data 1…n; Image analysis protocol; Combined Experiment Data; Transformation protocol; Final free text comment; Submit; MAGE-ML]

6 How should the information be exchanged?
MAGE-OM - MicroArray Gene Expression Object Model
– formal specification - UML (Unified Modeling Language) model
– described by a set of diagrams
– standardized through Object Management Group
– describes the domain of microarray data
– can serve as a source for generating various software artifacts

7 MAGE - brief history
August: Life Sciences Research group formed within the Object Management Group
March: gene expression RFP issued
December: initial submissions of proposals for gene expression data standards:
– EBI (on behalf of MGED) - MAML
– Rosetta (on behalf of GEML community) - GEML + some IDLs
– NetGenics - IDLs

8 MAGE - brief history (2)
Decision to proceed with a joint submission
Decision to base the standard on UML
Submitters’ meetings throughout 2001
End of January: MAGE becomes an adopted specification
October: MAGE becomes an available specification
MAGE-ML - XML language - automatically derived from MAGE
(More than) MIAME-compliant; only a subset can be used

9 MAGE – an example diagram

10 Use case of MAGE: ArrayExpress architecture
[Architecture diagram. Labels: ArrayExpress (Oracle); MAGE-OM; MAGE-ML (DTD); MAGE-ML (doc); data loader; Castor object/relational mapping; Java servlets; Tomcat; Velocity template engine; Web page templates; Browser; MIAMExpress]

11 ArrayExpress Infrastructure
[Infrastructure diagram. Labels: ArrayExpress (Oracle); MIAMExpress (MySQL); Local MIAMExpress Installations; Submissions; MAGE-ML; MAGE-ML import, export; Data pipelines; Array Manufacturers; LIMS; Microarray software; Data Analysis software; Other Microarray databases; EBI Expression Profiler; External Bioinformatics databases; Queries (www); Data analysis (www)]

12 Common terms (ontologies)
What is an ontology?
– formal model of some domain
– simplest ontologies – controlled vocabularies
– hierarchical, other relations, constraints, …
MGED Ontology
– maintained by Chris Stoeckert, UPenn
– enables unambiguous annotation and, therefore, queries
– currently: sample description
– to come: experiment design description
– multiple formats: RDFS, DAML+OIL
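To illustrate how a controlled vocabulary with a hierarchy makes annotation unambiguous and, in turn, makes queries possible, here is a minimal Python sketch. The terms are ordinary taxonomy names chosen for illustration, not entries of the MGED Ontology, and the function names are hypothetical.

```python
# Hypothetical controlled vocabulary with an is-a hierarchy (child -> parent).
# Illustrative taxonomy terms only; not actual MGED Ontology content.
IS_A = {
    "Homo sapiens": "Mammalia",
    "Mus musculus": "Mammalia",
    "Danio rerio": "Actinopterygii",
    "Mammalia": "Vertebrata",
    "Actinopterygii": "Vertebrata",
}
VOCABULARY = set(IS_A) | set(IS_A.values())


def annotate(annotations, sample_id, term):
    """Record an annotation, rejecting terms outside the vocabulary."""
    if term not in VOCABULARY:
        raise ValueError(f"unknown term: {term!r}")
    annotations[sample_id] = term


def broader_terms(term):
    """Return all broader terms of `term`, nearest first."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def find_samples(annotations, query_term):
    """Return samples annotated with `query_term` or any narrower term."""
    return sorted(
        sample_id
        for sample_id, term in annotations.items()
        if term == query_term or query_term in broader_terms(term)
    )


annotations = {}
annotate(annotations, "sample-1", "Homo sapiens")
annotate(annotations, "sample-2", "Mus musculus")
annotate(annotations, "sample-3", "Danio rerio")

print(find_samples(annotations, "Mammalia"))  # ['sample-1', 'sample-2']
```

With free-text annotation ("mouse", "M. musculus", "mice") the same query would miss samples; rejecting unknown terms at annotation time is what keeps the later query reliable.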

13 Ontologies and ArrayExpress
Curation team
– led by Helen Parkinson
– currently 5 curators
Curation tool under development
– management of all relevant ontologies “under one roof”
– support in distributed ontology development
– submission tracking
– accession numbers
– ...

14 Common ways of describing data processing
no “deliverables” yet
MAGE can describe data processing
– just syntax, too much free text
Laboratory Activity Broker process within OMG - common points?
problem:
– it is possible to come up with a universal framework that can describe all possible scenarios of data processing
– however, how will it be used in real life?

15 [Workflow diagram. Labels: process type (data type in, data type out, workflow parameters); workflow enactment; process instance (data in, data out, parameter values); example processes: data filtering, clustering, pattern discovery, visualization, ...]
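The distinction the diagram draws between a process type (typed inputs and outputs) and a process instance (actual parameter values) can be sketched in Python. This is an illustration under stated assumptions only: the class names, the data type strings, and the parameter names are hypothetical and do not come from MAGE or from the Laboratory Activity Broker work.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessType:
    """A kind of processing step with a declared input and output data type."""
    name: str
    input_type: str
    output_type: str


@dataclass
class ProcessInstance:
    """One enactment of a process type with concrete parameter values."""
    process_type: ProcessType
    parameter_values: dict = field(default_factory=dict)


def first_type_mismatch(workflow):
    """Return the index of the first step whose input type differs from the
    previous step's output type, or None if the chain is consistent."""
    for i in range(1, len(workflow)):
        previous = workflow[i - 1].process_type
        current = workflow[i].process_type
        if previous.output_type != current.input_type:
            return i
    return None


def provenance(workflow):
    """Describe what was done, step by step, to obtain the final result."""
    return [f"{step.process_type.name} {step.parameter_values}" for step in workflow]


# Hypothetical process types and parameters, for illustration only.
filtering = ProcessType("data filtering", "expression matrix", "expression matrix")
clustering = ProcessType("clustering", "expression matrix", "clusters")
visualization = ProcessType("visualization", "clusters", "image")

workflow = [
    ProcessInstance(filtering, {"min_ratio": 2.0}),
    ProcessInstance(clustering, {"k": 5}),
    ProcessInstance(visualization),
]

print(first_type_mismatch(workflow))  # None
for line in provenance(workflow):
    print(line)
```

Keeping types and instances separate lets the same type definitions validate many workflows, while the instances carry the record of what was actually run.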

16 Benefits
compile “best practices” of data analysis
document what has been done to obtain final results
enable “high-throughput” data analysis work

17 How to query information
again no “deliverables”
initial plan - MAGE will include query support
– all methods were dropped, leaving a data model
ArrayExpress - 2 large components:
– repository - retrieve experiments as units, MAGE-based
– warehouse - gene- & data-oriented queries, work across experiments
G2G (Jason Stewart) - protocol + query language for distributed queries
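The difference between the two ArrayExpress components can be shown with a toy example in Python. The accession strings, gene names, values, and function names below are all hypothetical; the sketch mirrors only the two access patterns, not the actual ArrayExpress schema or query interface.

```python
# Toy data: experiment accession -> {gene: expression value}.
# All identifiers and values are made up for illustration.
EXPERIMENTS = {
    "EXP-1": {"geneA": 2.0, "geneB": 0.5},
    "EXP-2": {"geneA": 1.5, "geneC": 3.0},
    "EXP-3": {"geneB": 0.8},
}


def get_experiment(accession):
    """Repository-style access: retrieve one experiment as a unit."""
    return EXPERIMENTS[accession]


def gene_across_experiments(gene):
    """Warehouse-style access: one gene's values across all experiments."""
    return {
        accession: data[gene]
        for accession, data in EXPERIMENTS.items()
        if gene in data
    }


print(get_experiment("EXP-2"))           # {'geneA': 1.5, 'geneC': 3.0}
print(gene_across_experiments("geneA"))  # {'EXP-1': 2.0, 'EXP-2': 1.5}
```

The repository pattern keeps each experiment intact for exchange, while the warehouse pattern cuts across experiments, which is why the two are served by separate components.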

18 Properties
[Diagram labels: ratio; absolute change; confidence measure; name; design element type; species; sample type; bioassay type; performer; lab; exper. type; array design name; platform type; provider]

19 Summary