1 Outline Standardization - necessary components –what information should be exchanged –how the information should be exchanged –common terms (ontologies)

1 Outline
Standardization - necessary components
–what information should be exchanged
–how the information should be exchanged
–common terms (ontologies)
–common ways of describing data processing
–how to query information
ArrayExpress
–public repository for microarray data
–www.ebi.ac.uk/arrayexpress

2 What information should be exchanged?
MIAME - Minimum Information About a Microarray Experiment
–informal specification
–paper published in Nature Genetics
–goal - to initiate discussion: which details are important and which may not be

3 Ultimate dream
Minimum information is the following table:
[Figure: table of Samples by Genes holding gene expression levels (in mRNA counts/cell), with pointers to (a) well-established gene database(s) and pointers to a well-established sample ontology]

4 Currently: MIAME six parts
1. Experimental design: the set of the hybridisation experiments as a whole
2. Array design: each array used and each element (spot) on the array
3. Samples: samples used, the extract preparation and labeling
4. Hybridizations: procedures and parameters
5. Measurements: images, quantitation, specifications
6. Controls: types, values, specifications
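The six parts above can be read as a checklist. The following is a minimal sketch, assuming one free-text field per MIAME part; the class `MiameRecord`, the function `missing_parts`, and the example values are hypothetical illustrations and are not part of MIAME or MIAMExpress.

```python
from dataclasses import dataclass, fields


@dataclass
class MiameRecord:
    """Hypothetical container: one free-text field per MIAME part."""
    experimental_design: str = ""
    array_design: str = ""
    samples: str = ""
    hybridizations: str = ""
    measurements: str = ""
    controls: str = ""


def missing_parts(record):
    """Return the names of the MIAME parts that are still empty."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]


# Made-up example values, for illustration only.
record = MiameRecord(
    experimental_design="time course, 4 hybridisations",
    samples="total RNA extract, Cy3/Cy5 labeling",
)
print(missing_parts(record))
```

A check of this kind only tells a submitter which parts are empty; whether the supplied details are sufficient is still a matter for curation.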

5 MIAMExpress submission procedure
http://www.ebi.ac.uk/miamexpress
[Figure: submission flow. Labels: Create account; Login; Pending/New Experiment; Samples 1…n (sample protocol); Extracts 1…n for each sample (extraction protocol); Hybridisations (hyb protocol); Arrays 1…n (scanning protocol); Data 1…n (image analysis protocol); Combined Experiment Data (transformation protocol); Final free text comment; Submit; MAGE-ML]

6 How should the information be exchanged?
MAGE-OM - MicroArray Gene Expression Object Model
–formal specification - UML (Unified Modeling Language) model
–described by a set of diagrams
–standardized through Object Management Group
–describes the domain of microarray data
–can serve as a source for generating various software artifacts

7 MAGE - brief history
August 1997 - Life Sciences Research group formed within the Object Management Group
March 2000 - gene expression RFP issued
December 2000 - initial submissions of proposals for gene expression data standards:
–EBI (on behalf of MGED) - MAML
–Rosetta (on behalf of GEML community) - GEML + some IDLs
–NetGenics - IDLs

8 MAGE - brief history (2)
Decision to proceed with a joint submission
Decision to base the standard on UML
Submitters’ meetings throughout 2001
End of January 2002 - MAGE becomes an adopted specification
October 2002 - MAGE becomes an available specification
MAGE-ML - XML language - automatically derived from MAGE
(More than) MIAME-compliant; only subset can be used
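The idea of deriving an XML language mechanically from an object model can be sketched as follows. This is an illustration of the general technique only: the classes `Array` and `Hybridization`, their fields, and the resulting element and attribute names are hypothetical and are not taken from MAGE-OM or the MAGE-ML DTD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields, is_dataclass


# Hypothetical two-class object model (NOT the real MAGE-OM classes).
@dataclass
class Array:
    identifier: str
    name: str


@dataclass
class Hybridization:
    identifier: str
    array: Array


def to_xml(obj):
    """Map a model object to XML: class name -> element,
    simple fields -> attributes, nested objects -> child elements."""
    elem = ET.Element(type(obj).__name__)
    for f in fields(obj):
        value = getattr(obj, f.name)
        if is_dataclass(value):
            elem.append(to_xml(value))
        else:
            elem.set(f.name, str(value))
    return elem


hyb = Hybridization("H1", Array("A1", "demo array"))
xml_text = ET.tostring(to_xml(hyb), encoding="unicode")
print(xml_text)
```

Because the mapping rules are generic, a change to the object model changes the XML format without any hand-written serialisation code, which is the appeal of deriving the exchange format from the model.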

9 MAGE – an example diagram

10 Use case of MAGE: ArrayExpress architecture
[Figure: architecture diagram. Labels: ArrayExpress (Oracle); Browser; MIAMExpress; MAGE-ML (DTD); MAGE-OM; MAGE-ML (doc); data loader; Castor object/relational mapping; Velocity template engine; Web page templates; Java servlets; Tomcat]

11 ArrayExpress Infrastructure
[Figure: infrastructure diagram. Labels: ArrayExpress (Oracle); MIAMExpress (MySQL); Local MIAMExpress Installations; Submissions; MAGE-ML; MAGE-ML import, export; Data pipelines; Array Manufacturers; LIMS; Microarray software; Data Analysis software; Other Microarray databases; External Bioinformatics databases; EBI Expression Profiler; Queries (www); Data analysis (www)]

12 Common terms (ontologies)
What is an ontology?
–formal model of some domain
–simplest ontologies – controlled vocabularies
–hierarchical, other relations, constraints, …
MGED Ontology
maintained by Chris Stoeckert, UPenn
enables:
–unambiguous annotation
–therefore, queries
currently: sample description
to come: experiment design description
multiple formats: RDFS, DAML+OIL
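The step from a controlled vocabulary to a hierarchical ontology, and why unambiguous annotation enables queries, can be sketched as below. The terms and the `IS_A` table are made up for illustration and are not taken from the MGED Ontology.

```python
# Hypothetical is-a hierarchy: child term -> parent term.
IS_A = {
    "liver": "organ",
    "kidney": "organ",
    "organ": "sample_source",
    "cell_line": "sample_source",
}


def is_known(term):
    """Controlled vocabulary check: only listed terms may be used."""
    return term in IS_A or term in IS_A.values()


def ancestors(term):
    """Follow is-a links upwards and return all broader terms."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def matches(annotation, query):
    """A query term matches itself and every narrower term."""
    return annotation == query or query in ancestors(annotation)


# Made-up sample annotations.
samples = {"s1": "liver", "s2": "kidney", "s3": "cell_line"}
hits = sorted(s for s, term in samples.items() if matches(term, "organ"))
print(hits)  # ['s1', 's2']
```

With free-text annotation, a query for "organ" would miss samples labeled "liver"; with a hierarchy, the broader query finds them.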

13 Ontologies and ArrayExpress
Curation team
–led by Helen Parkinson
–currently 5 curators
Curation tool under development
–management of all relevant ontologies “under one roof”
–support for distributed ontology development
–submission tracking
–accession numbers
–...

14 Common ways of describing data processing
no “deliverables” yet
MAGE can describe data processing
–just syntax, too much free text
Laboratory Activity Broker process within OMG - common points?
problem:
–it is possible to come up with a universal framework that can describe all possible scenarios of data processing
–however, how will it be used in real life?

15 [Figure: workflow diagram. Labels: process type; data type (in, out); workflow parameters; process instance; data (in, out); parameter values; workflow enactment; example processes: data filtering, clustering, pattern discovery, visualization, ...]
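The distinction between a process type (typed inputs and outputs, named parameters) and a process instance (actual data and parameter values) can be sketched as follows. All names here (`ProcessType`, `ProcessInstance`, `enact`, the two toy processing steps, and the data type labels) are hypothetical and do not come from MAGE or any OMG specification; the "clustering" step is a trivial stand-in for a real algorithm.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ProcessType:
    """Describes a kind of processing step: name, in/out data types, code."""
    name: str
    input_type: str
    output_type: str
    run: Callable[..., Any]


@dataclass
class ProcessInstance:
    """One concrete use of a process type with actual parameter values."""
    process_type: ProcessType
    parameter_values: Dict[str, Any] = field(default_factory=dict)


def enact(workflow, data, data_type):
    """Run the steps in order, checking data types and logging each step."""
    log = []
    for step in workflow:
        ptype = step.process_type
        if ptype.input_type != data_type:
            raise TypeError(
                f"{ptype.name} expects {ptype.input_type}, got {data_type}")
        data = ptype.run(data, **step.parameter_values)
        data_type = ptype.output_type
        log.append((ptype.name, dict(step.parameter_values)))
    return data, log


def filter_values(values, min_abs):
    """Toy data filtering: keep values with absolute value >= min_abs."""
    return [v for v in values if abs(v) >= min_abs]


def split_up_down(values):
    """Toy stand-in for clustering: group values by sign."""
    return {"up": [v for v in values if v > 0],
            "down": [v for v in values if v < 0]}


FILTERING = ProcessType("data filtering", "log_ratios", "log_ratios",
                        filter_values)
CLUSTERING = ProcessType("clustering", "log_ratios", "clusters",
                         split_up_down)

workflow = [ProcessInstance(FILTERING, {"min_abs": 1.0}),
            ProcessInstance(CLUSTERING)]
result, log = enact(workflow, [0.1, 2.0, -1.5, -0.2, 3.0], "log_ratios")
print(result)
print(log)
```

The returned log records which steps ran and with which parameter values, which is the kind of record needed to document how final results were obtained.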

16 Benefits
compile “best practices” of data analysis
document what has been done to obtain final results
enable “high-throughput” data analysis work

17 How to query information
again no “deliverables”
initial plan - MAGE would include query support
–all methods were dropped - MAGE remains a data model only
ArrayExpress - 2 large components:
–repository - retrieve experiments as units, MAGE-based
–warehouse - gene & data-oriented queries, work across experiments
G2G (Jason Stewart) - protocol + query language for distributed queries
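The difference between the two access patterns, whole experiments versus gene-oriented queries across experiments, can be sketched with an in-memory stand-in. The accession strings, gene names, and values below are made up, and the functions are hypothetical; this does not reflect the actual ArrayExpress schema or interfaces.

```python
# Made-up data: accession -> {gene name -> expression values}.
EXPERIMENTS = {
    "EXP-1": {"geneA": [1.2, 0.8], "geneB": [0.1, 0.3]},
    "EXP-2": {"geneA": [2.5], "geneC": [0.9]},
}


def get_experiment(accession):
    """Repository-style access: return one experiment as a unit."""
    return EXPERIMENTS[accession]


def gene_across_experiments(gene):
    """Warehouse-style access: collect one gene's data from every
    experiment that measured it."""
    return {acc: data[gene]
            for acc, data in EXPERIMENTS.items() if gene in data}


print(get_experiment("EXP-1"))
print(gene_across_experiments("geneA"))
```

The second function only works if gene names mean the same thing in every experiment, which is where common identifiers and ontologies come in.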

18 Properties
[Figure: properties diagram. Labels: ratio; absolute change; confidence measure; name; design element type; species; sample type; bioassay type; performer; lab; exper. type; array design name; platform type; provider]

19 Summary

