SysMO-DB and ISA Katy Wolstencroft, University of Manchester, UK.


1 SysMO-DB and ISA Katy Wolstencroft, University of Manchester, UK

2 Data Exchange in SysMO
  Public data sources: model organism databases (e.g. SGD), BRENDA, ...
  Data produced by SysMO
  SABIO-RK, iChiP, MeMo, ...
  Local databases & files
  Excel spreadsheets: the most common format for experimental data
  [Diagram labels: Proteomics, Metabolomics, Microarray, Single Cell Data, Metadata]
  Variable descriptions of data
  Little adoption of community controlled vocabulary terms

3 Challenges
  Enable data to be easily exchanged & integrated
  Preserving project autonomy
  Working with existing resources: wikis; CMS (Alfresco, eGroupWare, MediaWiki); databases (BASE, maxD); files and spreadsheets
  Falling in with common work practices
  Exploiting existing resources in the community

4 Extracting Data
  [Diagram labels: COSMIC, BaCell-SysMO, SysMOLab, MOSES; Alfresco, Wiki, another data store]

5 JERM: "Just Enough Results Model"
  Minimum information to exchange data
  What type of data is it? Microarray, growth curve, enzyme activity, ...
  What was measured? Gene expression, OD, metabolite concentration, ...
  What do the values in the datasets mean? Units, time series, repeats, ...
  Which experiment does it relate to?
  How was the data created? SOPs and protocols
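
The JERM questions above could be captured as one small record per dataset. The following Python sketch is illustrative only: the class name JermRecord, its field names and the missing_fields helper are assumptions made for this example, not the actual JERM schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JermRecord:
    """Hypothetical minimal record; field names are illustrative, not the real JERM schema."""
    data_type: str    # what type of data it is, e.g. "microarray"
    measured: str     # what was measured, e.g. "gene expression"
    units: str        # what the values in the dataset mean
    experiment: str   # which experiment the data relates to
    sops: List[str] = field(default_factory=list)  # how the data was created

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that are empty."""
        required = ["data_type", "measured", "units", "experiment"]
        return [name for name in required if not getattr(self, name).strip()]


record = JermRecord(
    data_type="growth curve",
    measured="OD",
    units="",  # deliberately left empty to show the check
    experiment="S. solfataricus at 70 and 80 degrees",
    sops=["growth SOP"],
)
print(record.missing_fields())  # ['units']
```

A check like this is one way to enforce "just enough" information before a dataset is exchanged, without requiring a full standard-specific description.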

6 The Idea
  For each data type (transcriptomics, proteomics, metabolomics, single cell data):
  Generate and apply: JERM template; JERM extractor for data host
  Subset registered in SEEK; access / export through JERM interface / template
  Define a JERM: top-down analysis of standards; bottom-up analysis of practice
  [Diagram labels: 1, 2, 3; ISA-TAB]

7 For publishing
  JERM data needs to be related to SOPs, experimental context and other data
  JERM must be "MIBBI" compliant for exporting to public repositories, e.g. microarray data needs to be MIAME compliant

8 Minimum Information Models (http://www.mibbi.org/index.php/MIBBI_portal)
  CIMR: Core Information for Metabolomics Reporting
  MIABE: Minimal Information About a Bioactive Entity
  MIACA: Minimal Information About a Cellular Assay
  MIAME: Minimum Information About a Microarray Experiment
  MIAME/Env: MIAME / Environmental transcriptomic experiment
  MIAME/Nutr: MIAME / Nutrigenomics
  MIAME/Plant: MIAME / Plant transcriptomics
  MIAME/Tox: MIAME / Toxicogenomics
  MIAPA: Minimum Information About a Phylogenetic Analysis
  MIAPAR: Minimum Information About a Protein Affinity Reagent
  MIAPE: Minimum Information About a Proteomics Experiment
  MIARE: Minimum Information About a RNAi Experiment
  MIASE: Minimum Information About a Simulation Experiment
  MIENS: Minimum Information about an ENvironmental Sequence
  MIFlowCyt: Minimum Information for a Flow Cytometry Experiment
  MIGen: Minimum Information about a Genotyping Experiment
  MIGS: Minimum Information about a Genome Sequence
  MIMIx: Minimum Information about a Molecular Interaction Experiment
  MIMPP: Minimal Information for Mouse Phenotyping Procedures
  MINI: Minimum Information about a Neuroscience Investigation
  MINIMESS: Minimal Metagenome Sequence Analysis Standard
  MINSEQE: Minimum Information about a high-throughput SeQuencing Experiment
  MIPFE: Minimal Information for Protein Functional Evaluation
  MIQAS: Minimal Information for QTLs and Association Studies
  MIqPCR: Minimum Information about a quantitative Polymerase Chain Reaction experiment
  MIRIAM: Minimal Information Required In the Annotation of biochemical Models
  MISFISHIE: Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments
  STRENDA: Standards for Reporting Enzymology Data
  TBC: Tox Biology Checklist
  BioPAX: Biological Pathways Exchange (http://www.biopax.org/)
  FuGE: Functional Genomics Experiment
  MGED: Microarray Experimental Conditions

9 Example investigation description (tabular)
  Investigation Title | Invasive vs. non-invasive strains of yeast
  Experimental Design | individual_genetic_characteristics_design | growth_condition_design
  Experimental Factor Name | EF_Genotype | EF_GrowthCond
  Experimental Factor Type | genotype | growth_condition
  Person Last Name | Falstaff | Shakespeare
  Person First Name | John | Bill
  Person Roles | submitter;investigator | investigator
  Experiment Description | An experiment was performed to...
  Protocol Name | Yeast Growth | RNA extraction
  Protocol Type | grow | nucleic_acid_extraction
  Protocol Description | S. cerevisiae cultures were grown on... | Total cellular RNA was extracted...
  Protocol Parameters | carbon source;temperature
  SDRF File | my_sdrf_file.txt
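
Because the description above is a simple label-then-values layout, it can be read with a generic tab-delimited reader. The Python sketch below is a minimal illustration under that assumption: the function name parse_idf and the choice of tab as the delimiter are assumptions for this example, and it is not a full MAGE-TAB or ISA-TAB parser.

```python
import csv
import io

# A few rows from the example slide, written tab-delimited (label, then values).
IDF_TEXT = (
    "Investigation Title\tInvasive vs. non-invasive strains of yeast\n"
    "Experimental Factor Name\tEF_Genotype\tEF_GrowthCond\n"
    "Experimental Factor Type\tgenotype\tgrowth_condition\n"
    "Person Last Name\tFalstaff\tShakespeare\n"
    "SDRF File\tmy_sdrf_file.txt\n"
)


def parse_idf(text):
    """Map each row label to the list of non-empty values that follow it."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: [value for value in row[1:] if value] for row in rows if row}


idf = parse_idf(IDF_TEXT)

# Columns line up across rows, so factor names pair with factor types by position.
factors = dict(zip(idf["Experimental Factor Name"], idf["Experimental Factor Type"]))
print(factors)  # {'EF_Genotype': 'genotype', 'EF_GrowthCond': 'growth_condition'}
```

The positional pairing is the key property of this kind of layout: the second value on each row describes the same factor, person or protocol throughout.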

10 ISA-TAB
  Relating data and its experimental context
  ISA = Investigation, Study, Assay
  TAB = tabular: a format suitable for spreadsheets

11 “assists in the reporting and local management of experimental metadata (i.e. sample characteristics, technologies used, type of measurements) from studies employing one or a combination of technologies; facilitates submission to international public repositories of genomics, transcriptomics and proteomics studies”
  Originally developed for multiple ‘omics data

12 Current situation @ EBI
  Existing production systems (submission, storage, retrieval):
  ArrayExpress: transcriptomics data files + required experimental descriptors; MGED standards; MIAMExpress, MAGE-TAB, MAGE-ML
  PRIDE: proteomics data files + required experimental descriptors; HUPO-PSI standards; Proteome Harvest, PSI-XML(s)
  NO common representation of complex studies
  Independent databases, different metadata representation, format, diverse terminologies etc.

13 ISA Provides
  A common framework for describing how your data relates to its experimental context
  A common framework for relating different types of data

14 ISA Provides
  Cross walking between the Omics data stores
  Relating microarrays and proteomics etc. if they are part of the same study
  Providing a single mechanism for submission to multiple data silos

15 ISA Defined
  Investigation: high level description of the area and the main aims of a project
  Study: a particular biological hypothesis or analysis
  Assay: specific, individual experiments required to be undertaken together in order to address the study hypotheses

16 ISA in SysMO
  Investigation: main aims of SysMO projects
    Analysis of Central Carbon Metabolism of Sulfolobus solfataricus under varying temperatures
  Study: a collection of experiments designed to answer a particular biological question
    Comparison of S. solfataricus grown at 70 and 80 degrees
  Assay: individual experiments in the study
    Comparison of transcriptome at 70 and 80 °C (cDNA microarray)
    Comparison of proteome at 70 and 80 °C (Protein expression profiling)
    Enzyme activity tests for S. solfataricus (Assay types)
    Intracellular metabolomics of S. solfataricus at 70 and 80 °C (Metabolomics)
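
The Investigation / Study / Assay nesting in this example can be sketched as three small Python classes. This is only an illustration of the hierarchy: the class and field names are assumptions for this example and do not reflect the SEEK data model or the ISA-TAB file layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    assay_type: str


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


study = Study(
    "Comparison of S. solfataricus grown at 70 and 80 degrees",
    [
        Assay("Comparison of transcriptome at 70 and 80 C", "cDNA microarray"),
        Assay("Comparison of proteome at 70 and 80 C", "Protein expression profiling"),
        Assay("Intracellular metabolomics at 70 and 80 C", "Metabolomics"),
    ],
)
investigation = Investigation(
    "Analysis of Central Carbon Metabolism of Sulfolobus solfataricus "
    "under varying temperatures",
    [study],
)

# Walk the hierarchy: every assay is reachable from its investigation.
for s in investigation.studies:
    for a in s.assays:
        print(f"{s.title} > {a.title} ({a.assay_type})")
```

Keeping assays of different types under one study is what lets transcriptome, proteome and metabolome results be related to the same biological question.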

17 ISA in SysMO
  Assays linked to data files
  Data files linked together
  Assays and data files linked to protocols and SOPs
  ISA data is available to all in consortium
  Data files and SOPs may be shared or kept private

18 Advantages
  A common structure across consortium
  Can be bundled together with data files to produce a common export format
  Allows automated submission to public omics stores (ArrayExpress, PRIDE etc.)
  Requires SysMO consortium members to record metadata only once
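
Bundling metadata with data files could look like the following Python sketch, which packs one tabular metadata file and its data files into a single zip archive. The function name bundle, the archive layout and the file names are all hypothetical; the slides do not specify the actual export format.

```python
import io
import zipfile


def bundle(metadata_tsv, data_files):
    """Pack one metadata file plus data files into an in-memory zip archive.

    data_files maps file name -> bytes. The layout is a hypothetical
    illustration, not the actual SysMO-DB export format.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("investigation.txt", metadata_tsv)
        # Sort names so the archive contents are deterministic.
        for name, content in sorted(data_files.items()):
            archive.writestr(f"data/{name}", content)
    return buffer.getvalue()


archive_bytes = bundle(
    "Investigation Title\tExample investigation\n",
    {"transcriptome.csv": b"gene,value\n", "proteome.csv": b"protein,value\n"},
)

# Re-open the archive to confirm what was packed.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
print(names)  # ['investigation.txt', 'data/proteome.csv', 'data/transcriptome.csv']
```

A single archive of this kind keeps the experimental context next to the data it describes, which is the point of recording the metadata only once.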

19 SEEK + JERM
  Based on ISA-TAB
  [Diagram labels: Investigation, Study, Assay, Experimental Data, Metadata, People, Projects, Experimental conditions, Factors studied, Models, SOPs, Workflows]
  Homogenised terminology and values in the datasets themselves

20 Acknowledgements
  SysMO-DB Team
  SysMO-PALS
  myGrid, EML and JWS Online teams
  OMII-UK, Uni Southampton
  EMBL-EBI, MCISB

