1 MIAME The MIAME website: © 2002 Norman Morrison for Manchester Bioinformatics.


1 MIAME
The MIAME website: http://www.mged.org
© 2002 Norman Morrison for Manchester Bioinformatics.

2 Overview
Why capture meta-data?
The data capture challenges
–What to capture?
–How to capture it?
–Who agrees what to capture?

3 Post-genome data
bioinformatics
genome, transcriptome, proteome, interactome, metabolome, textome, mobileome, phenome

4 Why meta-data?
Genome data is static
Post-genome data is very state-dependent
–Transcriptome = no. of cell types × no. of environmental conditions
–Annotation matters
–Data comparisons matter
–Learn from the gene debacle (many names in use for the same gene product):
Protein-tyrosine phosphatase, non-receptor type 6; Protein-tyrosine phosphatase 1C; PTP-1C; Hematopoietic cell protein-tyrosine phosphatase; SH-PTP1; Protein-tyrosine phosphatase SHP-1
LARD; death receptor 3 beta; WSL-1R protein; lymphocyte associated receptor of death; death receptor 3
We need repositories

5 Microarray Repositories
A repository is a primary source of data generated by experimentalists.
Its main role is to enforce standards and quality thresholds and to make data widely available.
Needs standards.

6 Microarray Repositories II
Repositories allow for easier data exchange between groups
Ensure that key details are kept
BUT:
–What should be captured, and how?
Requires international cooperation
–Minimum Information About a Microarray Experiment (MIAME)
–Developed within MGED

7 MIAME – Major Sections
Array design
–Reporters
–Features
–Control elements
Experimental design
–Experiment type
–Sample details
–Hybridisations
–Measurements

8 The Six Parts of MIAME
1. Experimental design: the set of hybridization experiments as a whole
2. Array design: each array used and each element (spot, feature) on the array
3. Samples: samples used, extract preparation and labeling
4. Hybridizations: procedures and parameters
5. Measurements: images, quantification and specifications
6. Normalization controls: types, values and specifications
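
The six parts above can be treated as a checklist. The following is a minimal sketch, assuming an annotation is held as a plain dictionary keyed by part name; the key names, the `missing_parts` helper, and the example record are illustrative assumptions, not part of MIAME itself.

```python
# The six MIAME parts, named after the list on the slide above.
# The snake_case keys are an assumption made for this sketch.
MIAME_PARTS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "normalization_controls",
)


def missing_parts(annotation):
    """Return the MIAME parts that are absent or empty in `annotation`."""
    return [part for part in MIAME_PARTS if not annotation.get(part)]


# Hypothetical, partially annotated experiment.
example = {
    "experimental_design": "time course, 6 hybridizations",
    "samples": "yeast, heat-shock treated",
}

print(missing_parts(example))
```

A repository could run a check of this kind before accepting a submission, although a real check would look inside each part rather than only testing for its presence.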

9 MIAME Glossary

10 Value of audit
Based on (qualifier, value, source)
Qualifier: cell type
Value: epithelial
Source: Gray’s anatomy (38th ed.)
or
Qualifier: treatment
Value: 15heat shock
Source: Smith and Jones, Nature Genet. (1992)
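
The (qualifier, value, source) triple maps naturally onto a small record type. This is a sketch only: the `Annotation` class and `describe` helper are hypothetical names, and the example values are the ones given on the slide.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """One auditable annotation: what is described, its value, and who says so."""
    qualifier: str
    value: str
    source: str


def describe(annotation):
    """Render an annotation as a single human-readable line."""
    return f"{annotation.qualifier} = {annotation.value} [source: {annotation.source}]"


cell_type = Annotation("cell type", "epithelial", "Gray's anatomy (38th ed.)")
print(describe(cell_type))
```

Keeping the source next to every value is what makes the annotation auditable: a reader can trace each term back to the reference that defines it.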

11 MIAME definitions
Available from www.mged.org
A minimum document to be read
All details mentioned in MIAME should be captured somewhere
–Know where they are
Latest draft: Version 1.1 (Draft 5, March 5, 2002)
–Discussed at MGED IV
See also: A. Brazma et al., Nature Genetics, vol. 29 (December 2001), pp. 365-371

12 MIAME part 1: Array description
In principle this is someone else’s problem
–(e.g. Affymetrix, Clontech, etc.)
Three levels of array design elements:
–feature – the location on the array
–reporter – the nucleotide sequence present in a particular location on the array
–composite sequence – a set of reporters used collectively to measure the expression of a particular gene, exon, or splice-variant
Array design has 5 parts:
1.1 Array related information
1.2 Reporter information
1.3 Feature information
1.4 Composite sequences
1.5 Control elements
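
The three levels of array design elements can be sketched as three linked record types. The class names follow the slide's terms, but the fields, the `features_for` helper, and the example sequences are assumptions made for illustration and are not taken from any array vendor's format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Reporter:
    """The nucleotide sequence placed at one or more locations on the array."""
    name: str
    sequence: str


@dataclass(frozen=True)
class Feature:
    """A physical location on the array, holding one reporter."""
    row: int
    column: int
    reporter: Reporter


@dataclass
class CompositeSequence:
    """Reporters used together to measure one gene, exon, or splice variant."""
    target: str
    reporters: List[Reporter] = field(default_factory=list)


def features_for(composite, features):
    """Return every feature whose reporter belongs to the composite sequence."""
    return [f for f in features if f.reporter in composite.reporters]


# Hypothetical reporters and layout.
r1 = Reporter("probe-1", "ACGTACGT")
r2 = Reporter("probe-2", "TTGACCAA")
r3 = Reporter("probe-3", "GGCATCGA")
layout = [Feature(0, 0, r1), Feature(0, 1, r2), Feature(1, 0, r1), Feature(1, 1, r3)]
gene_x = CompositeSequence("gene X", [r1, r2])

print(len(features_for(gene_x, layout)))
```

Note that one reporter may sit at several features (here `probe-1` is spotted twice), which is why feature and reporter are kept as separate levels.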

13 MIAME part 2: Experimental design
This is your problem
Experimental design has four parts:
2.1 Experimental design
2.2 Sample
2.3 Hybridisation
2.4 Measurements

14 2.1 Experimental design
Design and purpose of the set of hybridisations
Author, lab and contact
Experiment type
Experimental factors
Number of hybridisations
Common reference
QC steps
Experiment description (plus refs)
Anything else

15 2.2 Sample
Biosource properties – organism, contact, cell type, sex, …
Biomaterial manipulation – growth conditions, in vivo treatment, compound
Sample labelling – label used, amount, method
Spiked controls – feature, type
Anything else

16 2.3 Hybridisation
Relationship between samples and arrays
Protocol – full description
Anything else

17 2.4 Measurement
Raw data – scanner files, scanning protocol
Scanning protocol – parameter settings
Analysis and quantification – analysis output, protocol (e.g. algorithms)
Normalisation – strategies and algorithms, final gene expression table
Anything else

18 MAGE, ontologies and maxd
The MAGE website: http://www.mged.org
© 2002 Norman Morrison for Manchester Bioinformatics.

19 Outline
MIAME is useful, but …..
–How can we represent it computationally?
–How can we use it to share and exchange data?
–The wonderful world of XML
–The evil that is free text
–Ontologies and controlled vocabularies
Maxd – MIAME supportive, MAGE-ML compliant analysis of microarray data.

20 MAGE-ML, MAGE-OM
MIAME sets a standard for what knowledge (meta-data) to capture
But how to do it?
Need a knowledge model – a schema to represent the pieces of knowledge and the relationships between them.

21 Knowledge capture
UML (Unified Modeling Language) provides a methodology for capturing knowledge in ways that are computationally tractable (cf. database schemas)
MAGE-OM is the MGED-approved UML model which attempts to capture the concepts in MIAME

22 XML
A UML diagram is not useful by itself
MAGE-ML is an attempt to capture MAGE-OM in XML (eXtensible Markup Language) – the next generation HTML
MAGE-ML provides a structure for a text document (marked up with tags) which describes a microarray experiment
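
To show what "a text document marked up with tags" means in practice, here is a minimal sketch that writes and re-reads a tiny XML description of an experiment. The element and attribute names (`Experiment`, `Sample`, `Organism`, `Hybridisations`) are invented for this example and are NOT the real MAGE-ML schema, which is far richer; the experiment name and organism are also placeholders.

```python
import xml.etree.ElementTree as ET


def build_experiment_xml(name, organism, n_hybridisations):
    """Build a toy XML description of an experiment (not real MAGE-ML)."""
    root = ET.Element("Experiment", {"name": name})
    sample = ET.SubElement(root, "Sample")
    ET.SubElement(sample, "Organism").text = organism
    ET.SubElement(root, "Hybridisations", {"count": str(n_hybridisations)})
    return ET.tostring(root, encoding="unicode")


doc = build_experiment_xml("heat-shock-demo", "Saccharomyces cerevisiae", 6)
print(doc)

# Any XML-aware tool can read the structure back without custom parsing.
parsed = ET.fromstring(doc)
print(parsed.find("Sample/Organism").text)
```

The point of the round trip is that the tags carry the structure, so software at another site can recover the same fields without knowing how the file was produced.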

23 MAGE-ML
MAGE-ML is not nice!
–Complex
–Not easily human readable
–Needs software tools to help create it
–Very rich
MAGE-ML is the standard we have to work to.

24 ArrayExpress
ArrayExpress is the new public microarray data repository based at the EBI
Provides tools to help create MAGE-ML
Experiments will not be entered unless the annotation is of a high quality

25 Making MAGE usable
For a repository we need a relational database – not an object model
We have created a relational implementation of the MAGE-OM which is MIAME compliant (based on an early UML diagram for ArrayExpress) – maxdSQL
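
The step from an object model to a relational one can be illustrated with two linked tables. This sketch uses Python's built-in `sqlite3` module; the table and column names are hypothetical and are not taken from the maxdSQL schema.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE experiment (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE hybridisation (
        id            INTEGER PRIMARY KEY,
        experiment_id INTEGER NOT NULL REFERENCES experiment(id),
        array_name    TEXT NOT NULL
    );
    """
)

# Hypothetical experiment with two hybridisations.
conn.execute("INSERT INTO experiment (id, name) VALUES (1, 'heat-shock-demo')")
conn.executemany(
    "INSERT INTO hybridisation (experiment_id, array_name) VALUES (?, ?)",
    [(1, "array-A"), (1, "array-B")],
)

# An object reference (experiment -> hybridisations) becomes a foreign-key join.
rows = conn.execute(
    """
    SELECT e.name, COUNT(h.id)
    FROM experiment AS e
    JOIN hybridisation AS h ON h.experiment_id = e.id
    GROUP BY e.id
    """
).fetchall()
print(rows)
```

Each class in the object model becomes a table and each association becomes a foreign key, which is the general pattern a relational version of an object model follows.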

26 Data repositories
Relational version of MAGE-OM

27 Outstanding issues – free text
MAGE provides a structure for the knowledge – not a prescription for what gets put in
How to control what people put in the free text areas of MIAME/MAGE (the Mickey Mouse /. problem)?
How do we define what is meant in ways that other people/software understand?

28 Solution 1
Controlled vocabularies
–Agreed lists of terms (and definitions) that a community agrees to use
Pros: technically simple, easy to implement
Cons: limiting; how to get agreement?; terms on their own are not very descriptive
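
A controlled vocabulary is simple enough to sketch in a few lines: a fixed set of agreed terms and a check against it. The term list and the `check_term` helper below are hypothetical and only illustrate the idea.

```python
# Hypothetical agreed list of cell-type terms, stored in lower case.
CELL_TYPES = {"epithelial", "fibroblast", "neuron"}


def check_term(term, vocabulary):
    """Return True if `term` (ignoring case and outer spaces) is an agreed term."""
    return term.strip().lower() in vocabulary


print(check_term("Epithelial ", CELL_TYPES))
print(check_term("mickey mouse", CELL_TYPES))
```

This shows both sides of the trade-off on the slide: the check is trivial to implement, but any legitimate term missing from the list is rejected until the community agrees to add it.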

29 Solution 2
Ontologies
–Can be thought of as a set of agreed terms and the relationships between them (a taxonomy is a simple ontology in which the only relationship allowed is an is-a relationship)
Pros: a very rich and powerful infrastructure
Cons: complex
Many developments – a space to watch
–Chris Stoeckert and Helen Parkinson http://www.cbil.upenn.edu/Ontology/MGED_ontology.html
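
The taxonomy case, where is-a is the only relationship, can be sketched as a mapping from each term to its parent. The terms below are illustrative assumptions and are not taken from the MGED ontology.

```python
# Hypothetical taxonomy: each term maps to its single is-a parent.
IS_A = {
    "epithelial cell": "cell",
    "keratinocyte": "epithelial cell",
    "neuron": "cell",
}


def ancestors(term):
    """Follow is-a links upward and return the chain of parent terms."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(term, ancestor):
    """Return True if `ancestor` is reachable from `term` via is-a links."""
    return ancestor in ancestors(term)


print(ancestors("keratinocyte"))
print(is_a("neuron", "epithelial cell"))
```

Unlike a flat vocabulary, the relationships let software answer questions the annotator never stated directly, for example that a sample annotated as "keratinocyte" is also an epithelial cell. A full ontology would add further relationship types beyond is-a.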

