From MIAME to MAML: Microarray Gene Expression Database (MGED)
Chris Stoeckert
Center for Bioinformatics, University of Pennsylvania
Sept. 19, 2001


Standardisation of Microarray Data and Annotations: MGED Group
The MGED group is a grassroots movement initially established at the Microarray Gene Expression Database meeting MGED 1 (14-15 November, 1999, Cambridge, UK).
The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods.
Members are from academia, government, and industry from around the world.

Why Microarray Data Standards?
Standards are needed for:
– Evaluating microarray data (standards in quality measures, protocols)
– Exchanging microarray data (standards in data exchange)
– Analysing microarray data (standards in annotations, data provided)

How to Create Microarray Data Standards
– Understand thoroughly what minimum information about a microarray experiment is needed to interpret it unambiguously, and how that information is structured (objects and relationships)
– Create a technical data format able to capture this information
– Find or generate appropriate controlled vocabularies and ontologies
– Create standards for the experiments themselves (standard controls and protocols)

MGED Working Groups
– Experiment description and data representation standards (Alvis Brazma, EMBL-EBI)
– Microarray data XML exchange format (Paul Spellman, UC Berkeley)
– Ontologies for sample description (Chris Stoeckert, U Penn)
– Normalisation, quality control and cross-platform comparison (Frank Holstege, UMC Utrecht; Roger Bumgarner, U Wash)

MGED Milestones
– MGED 2 meeting in Heidelberg in 2000, MGED 3 in Stanford in 2001, both ~300 participants
– Minimum Information About a Microarray Experiment: MIAME version 1.0 posted
– Collaboration with OMG on data formats: MAML + GEML = MAGE-ML and MAGE-OM
– MGED 4 meeting in Boston in February 2002
– MGED will become an ISCB Special Interest Group

MIAME v1.0: Minimum Information About a Microarray Experiment
Approved at the MGED 3 meeting, Stanford University, March 28, 2001
The goal of MIAME is to specify the minimum information that must be reported about an array-based gene expression monitoring experiment in order to ensure the interpretability of the results, as well as their potential verification by third parties.
This is intended to facilitate the establishment of repositories and a data exchange format for array-based gene expression data.
The MGED group will encourage scientific journals and funding agencies to adopt policies requiring data submission to repositories once MIAME-compliant repositories and annotation tools are established.

MIAME Descriptions
Definition: The minimum information about a published microarray-based gene expression experiment should include a description of the:
1. Experimental design: the set of hybridisation experiments as a whole
2. Array design: each array used and each element (spot) on the array
3. Samples: samples used, extract preparation and labeling
4. Hybridisations: procedures and parameters
5. Measurements: images, quantitation, specifications
6. Normalisation controls: types, values, specifications
An additional section dealing with data quality assurance will be added in the next MIAME release.
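The six sections above can be read as a checklist. The following is a minimal Python sketch of such a check; the key names and the `missing_miame_sections` helper are illustrative assumptions, not part of MIAME or of any MGED software.

```python
# Hypothetical key names, one per MIAME v1.0 section listed on the slide.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridisations",
    "measurements",
    "normalisation_controls",
)


def missing_miame_sections(record):
    """Return the section names that are absent or empty in `record`."""
    return [name for name in MIAME_SECTIONS if not record.get(name)]


# An invented, deliberately incomplete experiment description.
example = {
    "experimental_design": "treated vs. untreated comparison",
    "array_design": "spotted array, one element per gene",
    "samples": "mouse thymus extract",
}

print(missing_miame_sections(example))
```

A real compliance check would also have to look inside each section; this sketch only tests that every section is present and non-empty.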

MIAME Section on Sample Source and Treatment
sample source and treatment ID as used in section 1
organism (NCBI taxonomy)
additional "qualifier, value, source" list; the list includes:
– cell source and type (if derived from primary source(s))
– sex
– age
– growth conditions
– development stage
– organism part (tissue)
– animal/plant strain or line
– genetic variation (e.g., gene knockout, transgenic variation)
– individual
– individual genetic characteristics (e.g., disease alleles, polymorphisms)
– disease state or normal
– target cell type
– cell line and source (if applicable)
– in vivo treatments (organism or individual treatments)
– in vitro treatments (cell culture conditions)
– treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
– compound
– is additional clinical information available (link)
– separation technique (e.g., none, trimming, microdissection, FACS)
laboratory protocol for sample treatment

MAGE SourceForge

MAGE BioMaterial Model

MAGE Programming Jamboree
Toronto, Sept
Hosted by Jason Goncalves, Iobion
APIs, Importers, Exporters
Perl, Java, C++

MGED OWG home page

What is an ontology?
An ontology is a specification of concepts that includes the relationships between those concepts.
– Provides semantics and constraints
– Allows for computational inferences and reliable comparisons
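To illustrate what "computational inferences" can mean here, the sketch below stores a few concepts with a single "is a" relationship and follows it transitively. The terms and the hierarchy are invented for illustration and are not taken from the MGED ontology.

```python
# Hypothetical "is a" links: each concept points to its parent concept.
IS_A = {
    "thymus": "lymphoid organ",
    "spleen": "lymphoid organ",
    "lymphoid organ": "organ",
}


def ancestors(term, is_a=IS_A):
    """Follow the is-a links upward and return every broader concept."""
    found = []
    while term in is_a:
        term = is_a[term]
        found.append(term)
    return found


def is_kind_of(term, concept, is_a=IS_A):
    """True if `term` equals `concept` or is a descendant of it."""
    return term == concept or concept in ancestors(term, is_a)


print(ancestors("thymus"))
print(is_kind_of("spleen", "organ"))
```

With the relationships stored explicitly, a query for "organ" can match samples annotated as "thymus" or "spleen", which plain free-text comparison would miss.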

OWG Use Cases
Return a summary of all experiments that use a specified type of biosource.
– Group the experiments according to treatment.
Return a summary of all experiments examining the effects of a specified treatment.
– Group the experiments according to biosource.
Return a summary of all experiments measuring the expression of a specified gene.
– Indicate when experiments confirm results, provide new information, or conflict.
Generate a distance metric for experiment types.
Generate an error estimation for experimental descriptions.
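The first two use cases share one shape: select experiments on one annotation and group them by another. A minimal Python sketch follows; the records, field names and the `summarize` helper are hypothetical and do not reflect any actual database schema.

```python
from collections import defaultdict

# Invented experiment annotations, for illustration only.
EXPERIMENTS = [
    {"id": "exp1", "biosource": "thymus", "treatment": "dexamethasone"},
    {"id": "exp2", "biosource": "thymus", "treatment": "heat shock"},
    {"id": "exp3", "biosource": "liver", "treatment": "dexamethasone"},
    {"id": "exp4", "biosource": "thymus", "treatment": "dexamethasone"},
]


def summarize(experiments, select_key, select_value, group_key):
    """Select experiments where select_key == select_value, grouped by group_key."""
    groups = defaultdict(list)
    for exp in experiments:
        if exp[select_key] == select_value:
            groups[exp[group_key]].append(exp["id"])
    return dict(groups)


# Use case 1: a specified biosource, grouped by treatment.
by_treatment = summarize(EXPERIMENTS, "biosource", "thymus", "treatment")
# Use case 2: a specified treatment, grouped by biosource.
by_biosource = summarize(EXPERIMENTS, "treatment", "dexamethasone", "biosource")

print(by_treatment)
print(by_biosource)
```

The exact string comparison works only if annotators use the same term for the same thing, which is the motivation for controlled vocabularies.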

Species Resources

Concept Definitions

Excerpts from a Sample Description
courtesy of M. Hoffman, S. Schmidtke, Lion BioSciences
Organism: mus musculus [ NCBI taxonomy browser ]
Cell source: in-house bred mice (contact:
Sex: female [ MGED ]
Age: weeks after birth [ MGED ]
Growth conditions: normal controlled environment
– o C average temperature
– housed in cages according to German and EU legislation
– specified pathogen free conditions (SPF)
– 14 hours light cycle
– 10 hours dark cycle
Developmental stage: stage 28 (juvenile (young) mice) [ GXD "Mouse Anatomical Dictionary" ]
Organism part: thymus [ GXD "Mouse Anatomical Dictionary" ]
Strain or line: C57BL/6 [ International Committee on Standardized Genetic Nomenclature for Mice ]
Genetic Variation: Inbr (J) 150. Origin: substrains 6 and 10 were separated prior to This substrain is now probably the most widely used of all inbred strains. Substrain 6 and 10 differ at the H9, Igh2 and Lv loci. Maint. by J,N, Ola. [ International Committee on Standardized Genetic Nomenclature for Mice ]
Treatment: in vivo [ MGED ] intraperitoneal injection of Dexamethasone into mice, 10 microgram per 25 g bodyweight of the mouse
Compound: drug [ MGED ] synthetic glucocorticoid Dexamethasone, dissolved in PBS
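Each line of this description pairs a qualifier with a value and the source of the term, matching the "qualifier, value, source" list in MIAME. A few of these lines can be held as triplets, as in the Python sketch below; the `Annotation` type and the `qualifiers_by_source` helper are illustrative assumptions, while the values are copied from the excerpt.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """One "qualifier, value, source" entry (hypothetical representation)."""
    qualifier: str
    value: str
    source: str


# Entries taken from the sample description excerpt above.
SAMPLE = [
    Annotation("organism", "mus musculus", "NCBI taxonomy browser"),
    Annotation("sex", "female", "MGED"),
    Annotation("organism part", "thymus", 'GXD "Mouse Anatomical Dictionary"'),
    Annotation("treatment", "in vivo", "MGED"),
]


def qualifiers_by_source(annotations):
    """Group qualifier names by the resource that supplied each term."""
    grouped = {}
    for entry in annotations:
        grouped.setdefault(entry.source, []).append(entry.qualifier)
    return grouped


print(qualifiers_by_source(SAMPLE))
```

Keeping the source next to each value shows which external resource a term came from, so the same word drawn from two vocabularies is not silently treated as identical.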

MGED Biomaterial Ontology
Under construction
– Using OILed (may use others)
– Generating an RDF schema file
Motivated by MIAME and coordinated with MAGE
Extend classes, provide constraints, provide terms to use

MGED Plans
MIAME 2.0
– Add/extend sections on normalisation, quality assurance, data analysis
MAGE Software
– Importers, exporters
– Reflect MIAME changes and ontologies
Ontologies
– Identified resources
– Ontology of entire microarray experiment
Normalization
– Discussion of methods
– Common controls
User's Queries
– Community needs

MGED Summary
International grassroots organization for microarray standards
– Public databases
– Published experiments
Generated MIAME and MAGE
– MIAME: Guidelines for information capture
– MAGE: Common object model
Building ontologies and normalization standards
– Ontologies: Common language
– Normalization: New web page
MGED 4 in Boston, MA, Feb , 2002

MGED-Related sites MGED: MIAME: MAGE: OWG: NWG: