MGED Ontology: An Ontology of Biomaterial Descriptions for Microarrays Microarray Data Analysis and Management: Bio-ontologies for Microarrays EMBL-EBI,

Presentation transcript:

MGED Ontology: An Ontology of Biomaterial Descriptions for Microarrays Microarray Data Analysis and Management: Bio-ontologies for Microarrays EMBL-EBI, Hinxton, Cambridge, UK Dec. 5, 2001 Chris Stoeckert, U. Penn

Ontology Usage for Genes in EpoDB
EpoDB is a prototype system of genes expressed during erythropoiesis.
It was built before microarrays were readily available.
It illustrates usage of an ontology of gene parts and controlled vocabularies of gene (and gene family) names.

EpoDB “Gene Ontology”

Stoeckert, Salas, Brunk, Overton (1999) Nucl. Acids Res. 26:288

EpoDB Gene Landmark Query

What is an ontology? (In the computer science sense, not the philosophy sense)
An ontology is a specification of concepts that includes the relationships between those concepts.
It removes ambiguity.
It provides semantics and constraints.
It allows for computational inferences and reliable comparisons.

Types of Ontologies
Taxonomy
– Tree structure, IS-A hierarchy
– Variants: Gene Ontology (DAG)
Frame-based (object-oriented)
– Classes and attributes
– EcoCyc
Description logic (DL)
– Reasoning about concept (class) relationships
– Combine terms with constraints (sanctioning)
– GRAIL (GALEN, TAMBIS)
Ontology Inference Layer (OIL)
– Combines frames and DLs
– Uses Web standards XML and RDF
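The difference between a strict tree and a DAG-style variant can be sketched in a few lines of Python. This is only an illustration: the term names and the `PARENTS` table are hypothetical and are not taken from any Gene Ontology release. The point is that a term may have more than one IS-A parent, so collecting ancestors means walking a graph rather than a single path.

```python
# Hypothetical mini-DAG: each term maps to its direct IS-A parents.
# Term names are illustrative only, not copied from the Gene Ontology.
PARENTS = {
    "transcription factor activity": ["DNA binding", "regulator activity"],
    "DNA binding": ["nucleic acid binding"],
    "nucleic acid binding": ["binding"],
    "regulator activity": ["molecular function"],
    "binding": ["molecular function"],
    "molecular function": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following IS-A links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen
```

In a pure tree every term would have exactly one parent and `ancestors` would return a single chain; here the first term inherits from two branches.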

Taxonomy
Terms for common usage
– Homo sapiens, not human, not homo sapeins
– NCBI ID = 9606
Hierarchy provides unambiguous levels of equivalence
– Homo sapiens and Mus musculus are of the class Mammalia, but Drosophila melanogaster is not.
Can use taxonomic hierarchies for other types of information
– e.g., Human Developmental Anatomy (U. of Edinburgh)
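A minimal Python sketch of both points on this slide, assuming a heavily abbreviated lineage: free-text labels are mapped to one controlled term, and class membership is tested by walking up the hierarchy. The `SYNONYMS` and `PARENT` tables are hypothetical stand-ins for a real taxonomy resource; only the ID 9606 for Homo sapiens comes from the slide.

```python
# Hypothetical synonym table: free-text labels -> controlled term.
SYNONYMS = {
    "homo sapiens": "Homo sapiens",
    "human": "Homo sapiens",
    "mus musculus": "Mus musculus",
    "mouse": "Mus musculus",
    "drosophila melanogaster": "Drosophila melanogaster",
}

# Taxon ID from the slide; other IDs are deliberately left out.
NCBI_ID = {"Homo sapiens": 9606}

# Heavily abbreviated lineage (assumption): child -> parent.
PARENT = {
    "Homo sapiens": "Mammalia",
    "Mus musculus": "Mammalia",
    "Drosophila melanogaster": "Insecta",
    "Mammalia": "Metazoa",
    "Insecta": "Metazoa",
}


def canonical_name(label):
    """Map a free-text label to the controlled term, or None if unknown."""
    key = " ".join(label.split()).lower()
    return SYNONYMS.get(key)


def is_a(taxon, ancestor):
    """True if `ancestor` is `taxon` itself or lies above it in the hierarchy."""
    current = taxon
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False
```

A misspelling such as "homo sapeins" is simply not in the table, so `canonical_name` returns `None` instead of silently creating a new term.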

Microarray Information to be Captured
Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

Tables Describing Samples in RAD (RNA Abundance Database)
[Schema diagram; tables shown: Experiment, ExpGroups, Groups, RelExperiments, Exp.ControlGenes, ControlGenes, Hybridization Conditions, Label, Sample, Treatment, Disease, Devel. Stage, ExperimentSample, Taxon, Anatomy]

CBIL Anatomy Hierarchy

Anatomy Table Used by RAD

Usage of Anatomy Hierarchy to Query RAD
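As a rough illustration of how a hierarchy table can drive such a query, the sketch below uses Python's built-in `sqlite3` module and a recursive query to find samples annotated with a term or any term beneath it. The two-table schema, column names, and rows are assumptions made for this example; they are not the actual RAD schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical anatomy hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE anatomy (
            anatomy_id INTEGER PRIMARY KEY,
            name TEXT,
            parent_id INTEGER
        );
        CREATE TABLE sample (
            sample_id INTEGER PRIMARY KEY,
            anatomy_id INTEGER
        );
        INSERT INTO anatomy VALUES
            (1, 'immune system', NULL),
            (2, 'thymus', 1),
            (3, 'spleen', 1),
            (4, 'liver', NULL);
        INSERT INTO sample VALUES (10, 2), (11, 3), (12, 4);
        """
    )
    return conn


def samples_under(conn, term):
    """Return sample IDs annotated with `term` or any of its descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(anatomy_id) AS (
            SELECT anatomy_id FROM anatomy WHERE name = ?
            UNION
            SELECT a.anatomy_id
            FROM anatomy a JOIN subtree s ON a.parent_id = s.anatomy_id
        )
        SELECT sample_id FROM sample
        WHERE anatomy_id IN (SELECT anatomy_id FROM subtree)
        ORDER BY sample_id
        """,
        (term,),
    ).fetchall()
    return [row[0] for row in rows]
```

Querying the broader term returns samples from every organ part below it, which is what a flat controlled vocabulary without a hierarchy cannot do.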

Standardisation of Microarray Data and Annotations: MGED Group
The MGED group is a grassroots movement initially established at the Microarray Gene Expression Database meeting MGED 1 (14-15 November 1999, Cambridge, UK). The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods. Members come from academia, government, and industry around the world.

MGED Working Groups
Annotation: Experiment description and data representation standards (Alvis Brazma, EMBL-EBI)
Format: Microarray data XML exchange format (Paul Spellman, UC Berkeley)
Ontology: Ontologies for sample description (Chris Stoeckert, U Penn)
Normalization: Normalization, quality control and cross-platform comparison (Gavin Sherlock, Stanford U)

MGED Documents
Annotation -> Minimum Information About a Microarray Experiment (MIAME)
– What should go into a microarray database
– Brazma et al. Nature Genetics 29: , 2001
Format -> Microarray Gene Expression (MAGE) Object Model and XML DTD
– How microarray databases will talk to each other

Relationship of MGED Efforts
[Diagram; components shown: MAGE, MIAME, DB, External Ontologies/CVs, MGED Ontology. Legend: Annotation, Format, Ontologies (External, Internal)]
Ontologies provide common terms and their definitions for describing microarray experiments.

MGED Ontology Working Group Goals
1. Identify concepts
2. Collect available controlled vocabularies and ontologies for concepts
3. Define concepts
4. Formalize concept relationships

Species Resources

Concept Definitions

MGED Ontology Working Group Goals
1. Identify concepts
2. Collect available controlled vocabularies and ontologies for concepts
3. Define concepts
4. Formalize concept relationships

Usage of Concepts and Resources for Microarrays
MIAME glossary
– Provide definitions for types of information (concepts) listed in MIAME
MIAME qualifier, value, source
– Provide pointers to relevant sources that can be used to

MIAME Section on Sample Source and Treatment
sample source and treatment
ID as used in section 1
organism (NCBI taxonomy)
additional "qualifier, value, source" list; the list includes:
– cell source and type (if derived from primary sources (s))
– sex
– age
– growth conditions
– development stage
– organism part (tissue)
– animal/plant strain or line
– genetic variation (e.g., gene knockout, transgenic variation)
– individual
– individual genetic characteristics (e.g., disease alleles, polymorphisms)
– disease state or normal
– target cell type
– cell line and source (if applicable)
– in vivo treatments (organism or individual treatments)
– in vitro treatments (cell culture conditions)
– treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
– compound
– is additional clinical information available (link)
– separation technique (e.g., none, trimming, microdissection, FACS)
laboratory protocol for sample treatment

Excerpts from a Sample Description
courtesy of M. Hoffman, S. Schmidtke, Lion BioSciences
Organism: mus musculus [ NCBI taxonomy browser ]
Cell source: in-house bred mice (contact:
Sex: female [ MGED ]
Age: weeks after birth [ MGED ]
Growth conditions:
– normal
– controlled environment
– °C average temperature
– housed in cages according to German and EU legislation
– specified pathogen free conditions (SPF)
– 14 hours light cycle
– 10 hours dark cycle
Developmental stage: stage 28 (juvenile (young) mice) [ GXD "Mouse Anatomical Dictionary" ]
Organism part: thymus [ GXD "Mouse Anatomical Dictionary" ]
Strain or line: C57BL/6 [ International Committee on Standardized Genetic Nomenclature for Mice ]
Genetic Variation: Inbr (J) 150. Origin: substrains 6 and 10 were separated prior to This substrain is now probably the most widely used of all inbred strains. Substrain 6 and 10 differ at the H9, Igh2 and Lv loci. Maint. by J,N, Ola. [ International Committee on Standardized Genetic Nomenclature for Mice ]
Treatment: in vivo [ MGED ] intraperitoneal injection of Dexamethasone into mice, 10 microgram per 25 g bodyweight of the mouse
Compound: drug [ MGED ] synthetic glucocorticoid Dexamethasone, dissolved in PBS
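The "qualifier, value, source" pattern in this sample description maps naturally onto a small record type. The Python sketch below is one possible representation, not a MIAME or MAGE data structure; the class name `Annotation` and the helper functions are invented for illustration, while the field values are copied from the excerpt above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One MIAME-style descriptor: what is described, its value, and the
    vocabulary or resource the value was taken from."""
    qualifier: str
    value: str
    source: str


# A few entries transcribed from the sample description.
SAMPLE = [
    Annotation("Organism", "mus musculus", "NCBI taxonomy browser"),
    Annotation("Sex", "female", "MGED"),
    Annotation("Organism part", "thymus", 'GXD "Mouse Anatomical Dictionary"'),
    Annotation(
        "Strain or line",
        "C57BL/6",
        "International Committee on Standardized Genetic Nomenclature for Mice",
    ),
]


def values_for(annotations, qualifier):
    """Return all values recorded for one qualifier."""
    return [a.value for a in annotations if a.qualifier == qualifier]


def by_source(annotations):
    """Group qualifiers by the resource that supplied their values."""
    grouped = {}
    for a in annotations:
        grouped.setdefault(a.source, []).append(a.qualifier)
    return grouped
```

Keeping the source next to each value is what lets a database tell an internal MGED term from a term borrowed from an external resource.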

MGED Ontology Working Group Goals
1. Identify concepts
2. Collect available controlled vocabularies and ontologies for concepts
3. Define concepts
4. Formalize concept relationships

MGED Biomaterial Ontology
Under construction
– Using OILed (Not wedded to any one tool)
– Generate multiple formats: RDFS, DAML+OIL
Define classes, provide relations and constraints, identify instances
Motivated by MIAME and coordinated with MAGE
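To give a feel for what "define classes" and "generate RDFS" amount to, here is a small Python sketch that writes a class hierarchy as RDF Schema statements. Several things are assumptions: the `ex:` namespace is a placeholder, the parent/child arrangement of the classes is guessed for illustration, and Turtle syntax is used only because it is compact, not because the working group published in that form.

```python
# Prefix declarations; the ex: namespace is a placeholder, not a real MGED URI.
PREFIXES = (
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <http://example.org/mged-sketch#> .\n"
)

# Hypothetical class table: class name -> parent class (None for a root).
CLASSES = {
    "Treatment": None,
    "CompoundBasedTreatment": "Treatment",
    "Compound": None,
}


def to_turtle(classes):
    """Serialize the class table as RDFS class and subClassOf statements."""
    lines = [PREFIXES]
    for name in sorted(classes):
        lines.append(f"ex:{name} a rdfs:Class .")
        parent = classes[name]
        if parent is not None:
            lines.append(f"ex:{name} rdfs:subClassOf ex:{parent} .")
    return "\n".join(lines) + "\n"
```

Because the class table is kept separate from the serializer, the same table could feed a second writer for another format, which matches the slide's point about not being tied to one tool.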

MAGE BioMaterial Model

Building a Microarray Ontology

Ontology Available as RDFS

Ontology in Browseable Form

Example of Internal Terms

Example of External Terms

Example of Combined Internal and External: Treatment

OWG Use Cases
Return a summary of all experiments that use a specified type of biosource.
– Use “age” to select and order experiments
– Use Mouse Anatomical Dictionary Stage 28 to pick experiments according to “organism part”
Return a summary of all experiments done examining effects of a specified treatment
– E.g., Look for “CompoundBasedTreatment”, “in vivo”
– Select “Compound” based on CAS registry number
– Order based on “CompoundMeasurement”
Build gene networks based on biomaterial description
– Generate a distance metric based on biosource and use in calculation of correlation with gene expression level
– Generate an error estimation based on biosample (i.e., even when biosources are identical, there will be variation resulting from different treatments)
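The second use case can be sketched as a filter followed by a sort. The Python below assumes experiments are available as simple records; the record layout, the experiment IDs, the doses, and the placeholder CAS string "0000-00-0" are invented for the example. 50-02-2 is the CAS registry number of dexamethasone, the compound in the sample description.

```python
# Hypothetical experiment records; field names are illustrative only.
EXPERIMENTS = [
    {"id": "E1", "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "cas": "50-02-2", "measurement_ug": 10.0},
    {"id": "E2", "treatment": "CompoundBasedTreatment", "setting": "in vitro",
     "cas": "50-02-2", "measurement_ug": 1.0},
    {"id": "E3", "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "cas": "50-02-2", "measurement_ug": 2.5},
    {"id": "E4", "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "cas": "0000-00-0", "measurement_ug": 5.0},  # placeholder CAS number
]


def experiments_for_compound(records, cas, setting="in vivo"):
    """Select compound-based treatments for one CAS number and setting,
    ordered by the recorded compound measurement."""
    hits = [
        r for r in records
        if r["treatment"] == "CompoundBasedTreatment"
        and r["setting"] == setting
        and r["cas"] == cas
    ]
    hits.sort(key=lambda r: r["measurement_ug"])
    return [r["id"] for r in hits]
```

The query only works because treatment type, setting, and compound are drawn from controlled terms; free-text descriptions would need fuzzy matching instead of equality tests.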

Ontology Working Group Highlights
First pass ontology of biomaterial descriptions
Participated in Bio-ontologies Consortium Meeting at ISMB
Mail list of about 200 subscribers

Ontology Working Group Plans
Finish building biomaterial description ontology
Expand efforts to include remaining parts of a microarray experiment
Demonstrate usage to the microarray community

Acknowledgements
Past and present members of CBIL for their work on EpoDB and RAD
The members of the MGED Ontology Working Group for their contributions
The Bio-Ontologies Consortium for encouragement and guidance
This presentation is available at