MIAMExpress and the development of annotation ontologies for gene expression experiments Ele Holloway Microarray Informatics European Bioinformatics Institute.

MIAMExpress and the development of annotation ontologies for gene expression experiments
Ele Holloway, Microarray Informatics, European Bioinformatics Institute
Microarrays and Data Mining, 10th-11th December 2002

Outline
• Capturing information
• Ontologies
• MIAMExpress

Capturing information
• Lab book – only useful for the individual
• Annotate in a controlled way
• Submit information to a database / LIMS
Need information understandable by all
Allows easy retrieval
Available to other researchers

What is an ontology?
• A kind of controlled vocabulary (CV) expressed in a structured way.

Components of an ontology
• Class = container for information. Has a definition and a relationship to other classes (is-a, part-of, kind-of), e.g. an exon is part of a gene.
• Instance: terms that are contained within a class.
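
As a rough illustration of the two components above, the following Python sketch models a class as a named container with a definition, typed relationships and instances. The `OntologyClass` type, its field names and the placeholder definitions are assumptions made for this example and are not the MGED Ontology format. Only the exon/gene relationship comes from the slide; the `Gender` instance names are borrowed from the MGED Ontology excerpt later in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """Hypothetical model of an ontology class (not the MGED Ontology format)."""
    name: str
    definition: str
    # Relationship type (e.g. "is-a", "part-of") -> names of related classes.
    relations: dict = field(default_factory=dict)
    # Terms contained within the class.
    instances: set = field(default_factory=set)

    def relate(self, kind, target):
        """Record a typed relationship from this class to another class."""
        self.relations.setdefault(kind, []).append(target.name)


# Placeholder definitions, for illustration only.
gene = OntologyClass("gene", "placeholder definition of a gene")
exon = OntologyClass("exon", "placeholder definition of an exon")
exon.relate("part-of", gene)  # "An exon is part of a gene"

gender = OntologyClass(
    "Gender",
    "placeholder definition",
    instances={"female", "hermaphrodite", "male", "mixed_sex", "unknown_sex"},
)

print(exon.relations)
print("male" in gender.instances)
```

Keeping the relationship type as an explicit key is what lets software tell "part-of" from "is-a", which a flat list of terms cannot express.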

An ontology – what can it do?
• Captures knowledge
• Shared understanding
• Structure enriches CV
• Computer ‘readable’

Why do we need an ontology for the database?
• To help users annotate their data usefully and easily
• To perform structured queries
• To accurately compare data
• To avoid problems with free text searching
• To avoid excessive curation workload in future
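
The free-text problem listed above can be sketched in a few lines of Python. The annotation values below are invented for illustration, and the helper names `count_matches` and `rejected_terms` are assumptions. The allowed terms are the Gender instances quoted later in the talk. This is a minimal sketch of the idea, not how MIAMExpress or ArrayExpress implement queries.

```python
# Instance names taken from the Gender class shown later in the talk.
ALLOWED_SEX = {"female", "hermaphrodite", "male", "mixed_sex", "unknown_sex"}

# Invented example annotations: the same five samples described twice.
free_text = ["Female", "female", "F", "fem.", "male"]
controlled = ["female", "female", "female", "female", "male"]


def count_matches(annotations, query):
    """Count annotations that match the query string exactly."""
    return sum(1 for value in annotations if value == query)


def rejected_terms(annotations, allowed):
    """Return the annotations that are not in the controlled vocabulary."""
    return [value for value in annotations if value not in allowed]


print(count_matches(free_text, "female"))      # 1
print(count_matches(controlled, "female"))     # 4
print(rejected_terms(free_text, ALLOWED_SEX))  # ['Female', 'F', 'fem.']
```

The same query finds one sample in the free-text annotations and four in the controlled ones. The rejected list is the kind of material a curator would otherwise have to reconcile by hand.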

[Figure labels: Annotation; Data mining; Controlled vocabulary; Free text; Database; Natural language processing]

Standards and Ontologies for Functional Genomics
Aim: To bring together scientists (biologists and bioinformaticians) developing standards and ontologies
17–20th November 2002, Hinxton

Examples of ontologies and CVs
• MGED Ontology – For describing samples used in microarray experiments
• GO – Gene Ontology
• EMAP – Edinburgh Mouse Atlas Project
• FlyBase – Drosophila genome database
• NCBI Taxonomy – All organisms represented in the genetic databases

Infrastructure
[Figure labels: EBI; ArrayExpress (Oracle); MIAMExpress; Expression Profiler; external bioinformatics databases; submissions, queries and data analysis via www; MAGE-ML; local MIAMExpress installations; array manufacturers; LIMS; data pipelines; other microarray databases; data analysis software; microarray software with MAGE-ML import/export]

MIAME requirements
• Experimental design
• Array design
• Samples
• Measurements
• Normalization controls
• Hybridizations
Nature Genetics 29(4):
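
A submission tool can treat the six MIAME parts as a checklist. The sketch below is one possible way to do that in Python. The dictionary keys, the `missing_sections` helper and the example values are all hypothetical, and nothing here reflects the actual MIAMExpress schema.

```python
# The six parts named on the slide, written as hypothetical dictionary keys.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "normalization_controls",
)


def missing_sections(submission):
    """Return the MIAME sections that are absent or empty in a submission dict."""
    return [name for name in MIAME_SECTIONS if not submission.get(name)]


# Invented, incomplete submission used only to exercise the check.
example = {
    "experimental_design": "time course",
    "array_design": "hypothetical-array",
    "samples": ["sample 1", "sample 2"],
    "hybridizations": [],
}

print(missing_sections(example))
# ['hybridizations', 'measurements', 'normalization_controls']
```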

6 parts of a microarray experiment: Experiment, Array, Sample, Hybridization, Data, Normalization
External links (figure labels): MEDLINE (publication details); MGED (experiment details); NCBI taxonomy (species); CAS/Merck (chemical compound); EMAP (mouse stage); EMBL, GO and Genew (gene accession no., gene name)

MGED Ontology
• Community effort
• Supports efforts of MAGE - MGED Society
• Describes the parts of a microarray experiment
• References out to external ontologies

MGED Ontology
• Structured in DAML+OIL using OilEd 3.4

MIAMExpress
• Submission and annotation tool
• Based on MIAME concepts
• Array, Experiment and Protocol submissions
• Perl-CGI, MySQL database

Submission process

Tour of MIAMExpress
• Login + password
• Multi-user environment
• Control over data access

[Figure, built up over four slides]
Login → New/Pending Experiment → Sample 1, Sample 2, Sample 3, Sample 4
Extracts 1…n: E1, E2, … En (for each sample)
Labelled extracts (LE) 1…n
Hybridizations: Array 1, Array 2, Array 3, … Array n
Data: Data 1, Data 2, Data 3, … Data n
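
The nesting shown in the submission figure (experiment, samples, extracts, labelled extracts, hybridizations with their array and data) can be sketched as plain Python data classes. All class and field names are assumptions chosen for this example, and the sketch uses one extract per sample for brevity, whereas the figure allows 1…n. It is not the MIAMExpress MySQL schema.

```python
from dataclasses import dataclass, field


@dataclass
class Hybridization:
    array: str      # e.g. "Array 1"
    data_file: str  # e.g. "Data 1"


@dataclass
class LabelledExtract:
    name: str
    hybridizations: list = field(default_factory=list)


@dataclass
class Extract:
    name: str
    labelled_extracts: list = field(default_factory=list)


@dataclass
class Sample:
    name: str
    extracts: list = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    samples: list = field(default_factory=list)

    def hybridizations(self):
        """Walk sample -> extract -> labelled extract and collect hybridizations."""
        return [
            hyb
            for sample in self.samples
            for extract in sample.extracts
            for labelled in extract.labelled_extracts
            for hyb in labelled.hybridizations
        ]


# Build a toy experiment with four samples, one chain per sample.
experiment = Experiment("pending experiment")
for i in range(1, 5):
    hyb = Hybridization(f"Array {i}", f"Data {i}")
    labelled = LabelledExtract(f"LE{i}", [hyb])
    extract = Extract(f"E{i}", [labelled])
    experiment.samples.append(Sample(f"Sample {i}", [extract]))

print([h.array for h in experiment.hybridizations()])
```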

Submission successful
• Curation
• Export of MAGE-ML
• Loading to ArrayExpress

[Figure: ArrayExpress, MIAMExpress and RAD exchange data as MAGE-ML. Ontology instances are propagated to submission/annotation web forms. User defined terms are collected via forms and curated before inclusion in the ontology.]
MGED Ontology classes shown: BiomaterialDescription, Sex, Gender
Gender documentation: Subclass of sex applicable to heterogametic species (i.e., those in which the sexes produce gametes of markedly different size). Males produce small numerous gametes. Females produce small numbers of large gametes. Hermaphrodites are individuals with both male and female characteristics. Mixed refers to a population of individuals with more than one type of gender.
Used in individuals: female, hermaphrodite, male, mixed_sex, unknown_sex
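
The loop described on this slide (ontology instances fill the web forms, user defined terms are collected, and a curator decides whether they enter the ontology) might be sketched as follows. The `TermRegistry` class, its method names, the status strings and the term `example_user_term` are hypothetical. Only the five Gender instances come from the slide.

```python
class TermRegistry:
    """Hypothetical holder for approved ontology instances and pending user terms."""

    def __init__(self, instances):
        self.approved = set(instances)
        self.pending = set()

    def form_options(self):
        """Terms to offer in a submission form drop-down, in a stable order."""
        return sorted(self.approved)

    def submit(self, term):
        """Accept a known term, or queue an unknown one for curation."""
        if term in self.approved:
            return "accepted"
        self.pending.add(term)
        return "pending_curation"

    def approve(self, term):
        """Curator decision: move a pending term into the approved set."""
        self.pending.remove(term)  # raises KeyError if the term was never submitted
        self.approved.add(term)


registry = TermRegistry(["female", "hermaphrodite", "male", "mixed_sex", "unknown_sex"])
print(registry.form_options())
print(registry.submit("male"))
print(registry.submit("example_user_term"))
registry.approve("example_user_term")
print("example_user_term" in registry.form_options())
```

Holding new terms in a pending set until a curator approves them keeps the form vocabulary from drifting back toward free text.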

Resources
• Microarray Informatics Group
• MIAMExpress
• MGED Ontology Working Group
• Sourceforge

Acknowledgements
ArrayExpress: Ugis Sarkans, Gonzalo Garcia, Ahmet Oezcimen, Anjan Sharma
Curation: Helen Parkinson, Gaurab Mukherjee, Philippe Rocca-Serra, Susanna Sansone
MIAMExpress: Mohammad Shojatalab, Niran Abeygunawardena, Sergio Contrino
Alvis Brazma
MGED Ontology: Chris Stoeckert (U. Penn)
