The European Bioinformatics Institute: MGED ontology for consistent annotation of microarray experiments. Manchester Bioinformatics Week, Ontologies Workshop.

Presentation transcript:

The European Bioinformatics Institute
MGED ontology for consistent annotation of microarray experiments
Manchester Bioinformatics Week, Ontologies Workshop, March 2002
Philippe Rocca-Serra, Microarray Informatics Team, EBI-EMBL, Hinxton, Cambridge

ArrayExpress: a database for gene expression studies
[Figure: gene expression data matrix; labels: Samples, Genes]

ArrayExpress goals
- To create a public repository for gene expression data:
  - apply a standard format
  - apply curation to the data (high quality control)
  - easy access to information
  - search and retrieve information
- To compare experiments
- To perform analysis and data mining using complex querying

What kind of data should be stored?
- Gene expression data matrix
- Experiment (platform, conditions…)
- Samples
- Genes & transcription units annotations

Important issues about data annotation
- Sufficient annotation of the experiment, genes and samples
- Efficient annotation:
  - Machine processable: effective mining agents
  - Homogeneous: consistent annotation
  - Unambiguous: accurate description, sample discrimination

MIAME requirements: addressing the issue of sufficient annotation
- Experimental design: the set of hybridisation experiments as a whole
- Array design: each array used and each element (spot) on the array
- Samples: samples used, extract preparation and labelling
- Hybridisations: procedures and parameters
- Measurements: images, quantitation, specifications
- Normalisation controls: types, values, specifications
(Brazma et al., Nature Genetics, 2001)
Recorded information should be sufficient to interpret and replicate the experiment.
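The six MIAME sections on this slide can be read as a checklist. The following is a minimal sketch, assuming a submission is held as a plain Python dict; the key names and the function `missing_miame_sections` are hypothetical illustrations, not part of MIAME or any ArrayExpress tool.

```python
# Hypothetical dict keys standing in for the six MIAME sections on the slide.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridisations",
    "measurements",
    "normalisation_controls",
)


def missing_miame_sections(submission):
    """Return the MIAME sections that are absent or empty in a submission dict."""
    return [name for name in MIAME_SECTIONS if not submission.get(name)]


# Illustrative submission: three sections are missing and one is empty.
example = {
    "experimental_design": "dose response, three replicates",
    "samples": "mouse liver, labelled extract",
    "measurements": "",
}
print(missing_miame_sections(example))
```

A check like this only tests that each section is present; whether the recorded information is sufficient to interpret and replicate the experiment still needs curation.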

Second challenge: addressing the issue of annotation efficiency
- Requires machine-understandable annotations:
  - Avoid free text and natural language
  - Avoid synonyms: adrenaline / epinephrine
  - General use of controlled vocabularies (CV) and ontologies
- Gene annotation using e.g. GO and pathway analysis
- Create a new ontology where necessary:
  - Task assigned to MGED for biomaterial (sample) description
One of the main MGED goals is to facilitate the adoption of standards for DNA-array experiment annotation and data representation.
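The synonym problem (adrenaline / epinephrine) can be illustrated with a small controlled-vocabulary lookup. This is a sketch under stated assumptions: the vocabulary contents, the choice of "epinephrine" as the preferred term, and the function `normalise_term` are all hypothetical and not taken from the MGED ontology.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted spellings.
# Picking "epinephrine" as the preferred term is an arbitrary assumption.
CONTROLLED_VOCABULARY = {
    "epinephrine": {"epinephrine", "adrenaline"},
}


def normalise_term(free_text):
    """Map a submitted term to its preferred form, or reject it."""
    cleaned = free_text.strip().lower()
    for preferred, accepted in CONTROLLED_VOCABULARY.items():
        if cleaned in accepted:
            return preferred
    raise ValueError(f"{free_text!r} is not in the controlled vocabulary")


# Both synonyms resolve to the same machine-comparable term.
print(normalise_term("Adrenaline"))
print(normalise_term("epinephrine"))
```

Rejecting unknown terms rather than storing them as free text is a design choice: it keeps annotations homogeneous at the cost of requiring the vocabulary to be extended when a new term is needed.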

Ontology integration in the object model describing the ArrayExpress database
- ArrayExpress DB is an implementation of the MAGE-OM model (a UML model)
- The MAGE model by construction includes the use of ontology entries:
  - 37 locations for an "Ontology Entry"
  - 36 cases of simple controlled vocabularies, e.g. Image Format (TIFF, JPEG)
  - 1 has required development of specific modelling: biomaterial (sample) description

MAGE BioMaterial Model

Facts about MGED biomaterial ontology
- Authors: developed by Chris Stoeckert, U. Penn and Helen Parkinson, EBI
- Coordinated with the ArrayExpress database model (mapping available)
- Technical choices: use of the OIL language
  - A new standard for building ontologies that provides support for formal semantics and reasoning
  - Class/property modelling primitives based on frame-based systems
  - Semantics capture based on description logics
  - Syntax for encoding primitives and semantics based on existing Web languages: XML
- Availability:

MGED ontology: features & complexity
- Facts about the ontology:
  - 75 classes
  - 70 slots
  - 98 individuals
  - more individuals to be added

Using MGED Ontology: a browseable form

MGED defined concepts: internal terms

Linking to external ontologies: an application

MGED Ontology class       Instance                             External reference
BioMaterialDescription
Biosource Property
Organism                  Mus musculus musculus id:            NCBI Taxonomy
Age                       7 weeks after birth
DevelopmentStage          Stage 28                             Mouse Anatomical Dictionary
Sex                       Female
StrainOrLine              C57BL/6                              International Committee on Standardized Genetic Nomenclature for Mice
BiosourceProvider         Charles River, Japan
OrganismPart              Liver                                Mouse Anatomical Dictionary
BioMaterialManipulation
EnvironmentalHistory
CultureCondition
Temperature               22 ± 2 °C
Humidity                  55 ± 5%
Light                     12 hours light/dark cycle
PathogenTests             Specified pathogen free conditions
Water                     ad libitum
Nutrients                 MF, Oriental Yeast, Tokyo, Japan
Treatment
CompoundBasedTreatment
(Compound)                Fenofibrate, CAS                     ChemIDplus
(Treatment_application)   in vivo, oral gavage
(Measurement)             100 mg/kg body weight
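The sample description on this slide pairs ontology classes with values, some of which point to an external resource. A minimal sketch of how such annotations could be held in code follows; the `OntologyEntry` class here is a simplified, hypothetical stand-in and not the MAGE-OM definition, and only a few values from the slide are used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyEntry:
    """One annotation: an ontology class, a value, and an optional external source."""
    category: str
    value: str
    source: Optional[str] = None  # external ontology or database, if any


# A few annotations taken from the slide's example sample description.
sample = [
    OntologyEntry("Organism", "Mus musculus musculus", "NCBI Taxonomy"),
    OntologyEntry("Sex", "Female"),
    OntologyEntry("OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    OntologyEntry("Compound", "Fenofibrate", "ChemIDplus"),
]


def externally_referenced(entries):
    """Return {category: source} for entries that cite an external resource."""
    return {e.category: e.source for e in entries if e.source is not None}


for category, source in externally_referenced(sample).items():
    print(f"{category}: {source}")
```

Keeping the external source as a separate field lets internal MGED terms (such as Sex) and externally referenced terms (such as Organism) share one structure.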

Referencing external ontologies
- NCBI taxonomy database
- Jackson Lab mouse strains and genes
- Edinburgh mouse atlas anatomy
- GO Gene Ontology
- HUGO nomenclature for human genes
- Chemical and compound ontologies: Merck Index
- TAIR
- FlyBase
- ... and many more

Planning MGED ontology's future
- Making the ontology available where it's needed:
  - Develop a browser or other interface for the ontology and link to LIMS
  - Incorporate the ontology into submission/annotation and curation tools (MIAMExpress)

Planning MGED ontology's future: ontology availability made simple?
[Diagram labels: ArrayExpress DB; direct submission in MAGE-ML (large centres, LIMS); submission via MIAMExpress (other submitters); curation DB; MGED/ArrayExpress ontology; external ontologies]

Planning MGED ontology's future
- Making the ontology available where it's needed:
  - Develop a browser or other interface for the ontology and link to LIMS
  - Incorporate the ontology into submission/annotation and curation tools (MIAMExpress)
- Further ontology development: new instances, class refinement
- Better integration of available ontologies
- Writing guidelines on how to use ontologies for annotating data: developing use cases (a non-trivial task)

Resources
- List of ontology resources from MGED pages
- MAGE-MIAME-ontology mappings, MIAME glossary
- Schemas for both ArrayExpress and MIAMExpress
- Annotation examples in MAGE-ML
URL: mailing

Acknowledgements
EBI-EMBL: H. Parkinson, S. Sansone, E. Holloway, A. Brazma, and the Microarray Informatics Team
University of Pennsylvania: C. Stoeckert