The European Bioinformatics Institute MGED ontology for consistent annotation of microarray experiments Manchester Bioinformatics Week Ontologies Workshop1.


1 The European Bioinformatics Institute
MGED ontology for consistent annotation of microarray experiments
Manchester Bioinformatics Week, Ontologies Workshop1, March 23-24th 2002
Philippe Rocca-Serra, Microarray Informatics Team, EBI-EMBL, Hinxton, Cambridge

2 ArrayExpress: a database for Gene Expression Studies
[Figure: Gene expression data matrix; labels: Samples, Genes]

3 ArrayExpress goals
• To create a public repository for gene expression data:
  - apply a standard format
  - apply curation to the data (high quality control)
  - easy access to information
  - search and retrieve information
• To compare experiments
• To perform analysis and data mining using complex querying

4 What kind of data should be stored?
[Figure labels: Gene expression data matrix; Experiment (platform, conditions…); Samples; Genes & transcription units annotations]

5 Important issues about data annotation
• Sufficient annotation of the experiment, genes and samples
• Efficient annotation:
  - Machine processable: effective mining agents
  - Homogeneous: consistent annotation
  - Unambiguous: accurate description, sample discrimination

6 MIAME Requirements: addressing the issue of sufficient annotation
• Experimental design: the set of hybridisation experiments as a whole
• Array design: each array used and each element (spot) on the array
• Samples: samples used, extract preparation and labelling
• Hybridisations: procedures and parameters
• Measurements: images, quantitation, specifications
• Normalisation controls: types, values, specifications
(Brazma et al., Nature Genetics, 2001)
Recorded info should be sufficient to interpret and replicate the experiment.

7 Second Challenge: addressing the issue of annotation efficiency
• Requires machine-understandable annotations:
  - Avoid free text and natural language
  - Avoid synonyms: adrenaline / epinephrine
  - General use of controlled vocabularies (CV) and ontologies
• Gene annotation using e.g. GO and pathway analysis
• Create a new ontology where necessary:
  - Task assigned to MGED for Biomaterial (sample) description
One of the main MGED goals: to facilitate the adoption of standards for DNA-array experiment annotation and data representation.
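The synonym problem named on this slide can be illustrated with a minimal Python sketch. The lookup table and the function name `normalise_term` are hypothetical, invented here for illustration; they are not part of the MGED ontology or of any ArrayExpress tool, and the choice of "epinephrine" as the preferred term is arbitrary.

```python
# Hypothetical controlled vocabulary: every accepted synonym maps to one
# preferred term, so two annotators cannot describe the same compound
# in two different ways.
PREFERRED_TERM = {
    "adrenaline": "epinephrine",
    "epinephrine": "epinephrine",
}


def normalise_term(free_text: str) -> str:
    """Return the preferred term for a synonym, or raise if it is unknown."""
    key = free_text.strip().lower()
    if key not in PREFERRED_TERM:
        # Rejecting unknown input is what keeps free text out of the annotation.
        raise ValueError(f"term not in controlled vocabulary: {free_text!r}")
    return PREFERRED_TERM[key]
```

Under these assumptions, `normalise_term("Adrenaline")` and `normalise_term("epinephrine")` return the same string, which is the property a machine-processable annotation needs.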

8 Ontology integration in the object model describing the ArrayExpress database
• ArrayExpress DB is an implementation of the MAGE-OM model (a UML model)
• The MAGE model by construction includes the use of ontology entries:
  - 37 locations for an “Ontology Entry”
  - 36 cases of simple Controlled Vocabularies, e.g. Image Format (TIFF, JPEG)
  - 1 has required development of specific modelling: Biomaterial (sample) description
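A simple controlled vocabulary of the kind mentioned on this slide can be sketched as an enumeration. This is an illustration only: the class `ImageFormat` and the helper `parse_image_format` are hypothetical names and do not reproduce the MAGE-OM classes; only the two values TIFF and JPEG come from the slide.

```python
from enum import Enum


class ImageFormat(Enum):
    """Hypothetical closed vocabulary with the two values named on the slide."""
    TIFF = "TIFF"
    JPEG = "JPEG"


def parse_image_format(value: str) -> ImageFormat:
    """Map user input to a vocabulary member; Enum raises ValueError otherwise."""
    return ImageFormat(value.strip().upper())
```

A closed list like this is enough for the simple cases; the Biomaterial description needs more structure, which is why it required specific modelling.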

9 MAGE BioMaterial Model

10 Facts about MGED biomaterial ontology
• Authors: developed by Chris Stoeckert, U. Penn and Helen Parkinson, EBI
• Coordinated with the ArrayExpress database model (mapping available)
• Technical choices: use of the OIL language
  - A new standard for building ontologies, providing support for formal semantics and reasoning:
    - Class/property modelling primitives based on frame-based systems
    - Semantics capturing based on Description Logics
    - Syntax for encoding primitives and semantics based on existing Web languages: XML
• Availability: http://mged.sourceforge.net/Ontologies.shtml

11 MGED ontology: features & complexity
• Facts about the ontology:
  - 75 classes
  - 70 slots
  - 98 individuals
  - more individuals to be added

12 Using MGED Ontology: a Browseable Form

13 MGED defined concepts: internal terms

14 Linking to external ontologies: an application

15 [Figure: MGED ontology class tree (©) with instances (MGED Ontology) and external references]
• BioMaterialDescription
• Biosource Property
• Organism: Mus musculus musculus, id: 39442 (external reference: NCBI Taxonomy)
• Age: 7 weeks after birth
• DevelopmentStage: Stage 28 (external reference: Mouse Anatomical Dictionary)
• Sex: Female
• StrainOrLine: C57BL/6 (external reference: International Committee on Standardized Genetic Nomenclature for Mice)
• BiosourceProvider: Charles River, Japan
• OrganismPart: Liver (external reference: Mouse Anatomical Dictionary)
• BioMaterialManipulation
• EnvironmentalHistory
• CultureCondition
• Temperature: 22 ± 2 °C
• Humidity: 55 ± 5%
• Light: 12 hours light/dark cycle
• PathogenTests: Specified pathogen free conditions
• Water: ad libitum
• Nutrients: MF, Oriental Yeast, Tokyo, Japan
• Treatment
• CompoundBasedTreatment
• (Compound): Fenofibrate, CAS 49562-28-9 (external reference: ChemIDplus)
• (Treatment_application): in vivo, oral gavage
• (Measurement): 100mg/kg body weight
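The distinction this slide draws between MGED-internal instances and terms taken from external references can be sketched as a small data structure. The `Annotation` class and the `external_terms` helper are hypothetical, and the pairing of classes with values is an assumption based on the order in which they appear on the slide; this is not the MAGE-ML encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    ontology_class: str           # MGED ontology class name, as on the slide
    value: str                    # instance value or external term
    source: Optional[str] = None  # external reference; None for internal instances


# A subset of the slide's example sample, as read from the slide.
sample = [
    Annotation("Organism", "Mus musculus musculus, id: 39442", "NCBI Taxonomy"),
    Annotation("Age", "7 weeks after birth"),
    Annotation("Sex", "Female"),
    Annotation("StrainOrLine", "C57BL/6",
               "International Committee on Standardized Genetic Nomenclature for Mice"),
    Annotation("OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    Annotation("Compound", "Fenofibrate, CAS 49562-28-9", "ChemIDplus"),
]


def external_terms(annotations: List[Annotation]) -> List[Annotation]:
    """Keep only the annotations whose value comes from an external reference."""
    return [a for a in annotations if a.source is not None]
```

Keeping the source next to each value is one way to make it explicit which terms are defined by the MGED ontology itself and which are delegated to another resource.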

16 Referencing external ontologies
• NCBI taxonomy database
• Jackson Lab mouse strains and genes
• Edinburgh mouse atlas anatomy
• GO (Gene Ontology)
• HUGO nomenclature for Human genes
• Chemical and compound ontologies - Merck index
• TAIR
• FlyBase
• …and many more: www.mged.org/ontology/

17 Planning MGED ontology’s future
• Making the ontology available where it’s needed:
  - Develop browser or other interface for the ontology and link to LIMS
  - Incorporate the ontology into submission/annotation and curation tools (MIAMExpress)

18 Planning MGED ontology’s future: ontology availability made simple?
[Figure labels: ArrayExpress DB; Direct Submission in MAGE-ML; Large centres; LIMS; Submission via MIAMExpress; Curation DB; Other submitters; MGED/ArrayExpress ontology; External Ontologies]

19 Planning MGED ontology’s future
• Making the ontology available where it’s needed:
  - Develop browser or other interface for the ontology and link to LIMS
  - Incorporate the ontology into submission/annotation and curation tools (MIAMExpress)
• Further ontology development: new instances, class refinement
• Better integration of available ontologies
• Writing guidelines on how to use ontologies for annotating data: developing use cases (non-trivial task)

20 Resources
• List of ontology resources from MGED pages
• MAGE-MIAME-ontology mappings, MIAME glossary
• Schemas for both ArrayExpress and MIAMExpress
• Annotation examples in MAGE-ML
URLs: www.mged.org, www.ebi.ac.uk/microarray
Mailing lists: microarray-ontol-request@ebi.ac.uk, microarray-annot-request@ebi.ac.uk

21 Acknowledgements
EBI-EMBL: H. Parkinson, S. Sansone, E. Holloway, A. Brazma
University of Pennsylvania: C. Stoeckert
And the Microarray Informatics Team.

