SysMO-DB: A pragmatic approach to sharing information amongst Systems Biology projects in Europe http://www.sysmo-db.org Carole Goble, University of Manchester, UK
Pan-European collaboration. Systems Biology of Microorganisms. Project topics include: the transition from growing to non-growing Bacillus subtilis cells; energy and Saccharomyces cerevisiae; biology of Clostridium acetobutylicum; gene interaction networks and models of cation homeostasis in Saccharomyces cerevisiae. http://www.sysmo.net
Eleven individual projects, 91 institutes. Different research outcomes. A cross-section of microorganisms, incl. bacteria, archaea and yeast. Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way. Present these processes in the form of computerized mathematical models. Pool research capacities and know-how. Running since April 2007. Two phases – more later! http://www.sysmo.net Projects: BaCell-SysMO, COSMIC, SUMO, KOSMOBAC, SysMO-LAB, PSYSMO, Valla, MOSES, TRANSLUCENT, STREAM, SulfoSYS.
Types of stuff. Multiple ‘omics: genomics, transcriptomics, proteomics, metabolomics. Images. Reaction kinetics. Models. Relationships between data sets/experiments. Procedures, experiments, data, results and models. Analysis of data. The same across many Systems Biology projects.
The Problem (1). No one concept of experimentation or modelling. No planned, shared infrastructure for pooling.
SysMO-DB. Started July 2008, 3 years + 3 years. 4 people, 3 teams over 3 sites. Sensitively retrofit a data access, model handling and data integration platform. Support and manage the diversity of data, models and competencies. Web-based solution: exchange of data, models and processes; search across the initiative's assets; dissemination of results.
The Problem (2). Own solutions: own data solutions and collaboration environments (wikis, e-Groupware, PHProjekt, BaseCamp, PLONE, Alfresco, bespoke commercial …), files and spreadsheets. Suspicion: suspicion and caution over sharing; interesting interplay between modellers, experimentalists and bioinformaticians. Data issues: many do not have data, do not follow the standards that exist, or do not know who is doing what; much of the data cannot be compared; different organisms, different strains. Resource issues: no extra resources for the consortiums; 91 institutes, 11 consortiums, some overlapping.
Principles… A series of small victories. Realistic. Don't reinvent. Sustainable and extensible. Migrate to community standards. Provide instant gratification. Address doubt and anxiety. Keep barriers low.
Social Approach. PALs - Power Contributors! 18 postdocs and PhD students. All three kinds of people. Design and technical collaboration team. Very intense collaboration. UK and Continental PALs Chapters. Audits and Sharing: methods, data, models, standards, software, schemas, spreadsheets, SOPs….. 20 questions they want answered. Summer Schools.
Communication via PALs. DB team, PALs, Projects. Show what is there. Suggest what is possible. Ask for requirements. Give requirements. Tell priorities. Rate outcomes. Suggest improvements. Double check. Transmit. Disseminate. Collect answers.
Picking Pain Points. Keeping it Real. Project Directors: Data remains with us. We control who sees what. Just enough exchange. Responsibility. PALs: Spreadsheets. Yellow Pages. Standard Operating Procedures.
SysMO SEEK: Assets Catalogue. Archive. Social Network. Sharing Space. Gateway. Yellow Pages: People. Expertise. Projects. Institutions. Facilities. Studies. Data: Experimental data sets and analysed results. Gateway to data stores – SABIO-RK, ‘omics. Models: Store. Stimulate. Publish. Curate. Gateway to COPASI, JWS Online, BioModels. Processes: Laboratory protocols – Standard Operating Procedures. Bioinformatics analyses – computational workflows – Taverna. Model population and validation – workflows – Taverna. Gateway to myExperiment, MolMeth, OpenWetWare…. Interlinking.
SysMO SEEK. Is there any group generating kinetic data? Is this data available? Who is working with which organism? What methods are being used to determine enzyme activity? Under which experimental conditions are my partners working for the measurement of glucose concentration?
Access Permissions. Protect: Just Enough Sharing. Reusing myExperiment.
Attribution. Credit, Reward and Provenance. Reusing myExperiment.
[Architecture diagram: human-readable web pages, Yellow Pages and web service access over the assets catalogue and asset archive; JERM; plug-in architecture connecting applications and resources (workflows, SysMO CMS sites, models, community databases, SOPs, processes; myExperiment, JWS Online, SABIO-RK); backup; monitor; SysMO users.]
Just Enough Results Model (JERM). Harvest standards, e.g. MIAME (MIBBI.org), consortium schemas and spreadsheets. JERMs for each data type: microarray, metabolomics, proteomics. Map to projects. Distribute as spreadsheet templates. “I only want to collect and share just enough results.”
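To illustrate the idea of a per-data-type template distributed as a spreadsheet, here is a minimal Python sketch that checks a sheet's header row against the minimum columns for its data type. The template contents, column names and the `missing_columns` helper are all hypothetical assumptions made for illustration; they are not the actual JERM definitions.

```python
import csv
import io

# Hypothetical "just enough" templates: the minimum columns per data type.
# These column names are illustrative assumptions, not the real JERM.
JERM_TEMPLATES = {
    "microarray": ["project", "organism", "strain", "sop", "sample_id"],
    "metabolomics": ["project", "organism", "strain", "sop", "metabolite", "unit"],
}


def missing_columns(data_type, csv_text):
    """Return the template columns that are absent from the sheet's header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip().lower() for name in header}
    return [col for col in JERM_TEMPLATES[data_type] if col not in present]
```

A sheet that passes this check carries just enough metadata to be catalogued, and everything else in the spreadsheet can stay as the project already has it.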
Just Enough Results Model. ISA-TAB compliant: Investigation, Study, Assay. Metadata: People, Projects, Experimental conditions, Factors studied. Experimental Data, Models, SOPs, Workflows. Homogenised terminology and values in the datasets themselves.
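The Investigation, Study, Assay nesting named on this slide can be sketched as a small set of Python dataclasses. This is an assumption-laden illustration: the field names and the `all_sops` helper are hypothetical and do not reproduce the ISA-TAB file format or the SEEK schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    experimental_conditions: List[str] = field(default_factory=list)
    factors_studied: List[str] = field(default_factory=list)
    data_files: List[str] = field(default_factory=list)
    sops: List[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    project: str
    studies: List[Study] = field(default_factory=list)

    def all_sops(self):
        """Collect every SOP referenced anywhere below this investigation."""
        return sorted({sop for study in self.studies
                       for assay in study.assays
                       for sop in assay.sops})
```

Hanging data, SOPs and conditions off the assay keeps the link between a result and the procedure that produced it explicit.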
COSMIC and BaCell (Alfresco, document management system)
Keeping data safe at home (harvest). Project X Content Management System, Harvester, Extractor, Register, Assets Catalogue, Search, Fetch.
Keeping data safe at home (upload). Project X Content Management System, Upload, Extractor, Register, Assets Catalogue, Search, Fetch.
Keeping data safe with SEEK (upload). Project X, Upload, Content Management System, Extractor, Register, Assets Catalogue, Search, Fetch.
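The harvest variant of this flow (harvest, extract, register, then search and fetch) can be sketched in Python as below. Everything here is a hypothetical illustration: the project store is a plain dictionary standing in for a project's content management system, and the class and function names are assumptions, not the SEEK implementation.

```python
class AssetsCatalogue:
    """Central catalogue that holds metadata only; payloads stay at home."""

    def __init__(self):
        self._entries = {}

    def register(self, asset_id, metadata):
        self._entries[asset_id] = metadata

    def describe(self, asset_id):
        return self._entries[asset_id]

    def search(self, term):
        """Return ids of assets whose metadata mentions the term."""
        term = term.lower()
        return sorted(
            asset_id for asset_id, meta in self._entries.items()
            if any(term in str(value).lower() for value in meta.values())
        )


def extract(record):
    """Extractor: keep descriptive metadata, leave the payload behind."""
    return {"title": record["title"], "organism": record["organism"]}


def harvest(project_store, catalogue):
    """Harvester: walk the project's own store and register each asset."""
    for asset_id, record in project_store.items():
        catalogue.register(asset_id, extract(record))


def fetch(asset_id, project_store):
    """Fetch: the payload is served from the project's own store."""
    return project_store[asset_id]["payload"]
```

In this sketch the catalogue never copies a payload, which mirrors the slide's point that a project's data can be found centrally while remaining under the project's control.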
Models Exchange. Experiment Data Exchange. Verification. Comparison. Population. Prediction. ISA-TAB, SBML, MIRIAM. MIBBI Standards. OBO Controlled Vocabularies.
Models Exchange. Experiment Data Exchange. Verification. Comparison. Population. Prediction. Just Enough Results Model. ISA-TAB, SBML, MIRIAM, SBRML, SB-TAB. MIBBI Standards. OBO Controlled Vocabularies.
Quality of Data – Reliable Interpretation. Publication standards by stealth. Controlled vocabulary plug-in. BioPortal.
Observations - PALs. Dissemination of standards. Debunking myths. Tools exchange. Modeller – Experimentalist Trust. Like, talking together. Transcended the projects. Project power politics. PALs did their jobs….
Observations - Sharing. Methods sharing. Protective of models. In-progress vs published models. Access and version management. Curator-Rival conflict. Reluctant to share data, even within their own projects. Legacy spreadsheets dominate. Curation practices vary. Centralised archive take-up. Point-to-point exchange. Nature 461, 145 (10 Sept 2009)
SysMO2. Musical Chairs. Incentive Model for Sharing. Future Funding Phase 2 - SysMO2. Projects dropped and added. People dropped and added. Institutions dropped and added. Others reconstituted and added. Incentive Model for Sharing? Convenience, Added Value? Personal benefit? Consortium Policies?
A Platform for Systems Biology Exchange. Preservation and archiving. Widen participation of mothership. Community Exchange Bazaar. Widen adoption of platform and enable exchange. Accelerant to standards. Adoption of JERM. Curation tools. CMS + JERM bundling. Widen access to external resources, incl. publication. Added value and convenience. Preparation for publishing. EMBL-EBI ‘omics datasets. Public model repositories. ISA-TAB, SBML.
Research Objects and e-Laboratories. Packaged Assets: workflows linked to models linked to data linked to SOPs. Community standards. Mixed resources. External and central. Trust. Spreadsheets. Integration via RDF linked data. myExperiment, MethodBox, NEMA, BioCatalogue.
Summary. http://www.sysmo-db.org Reality is messy. Extreme Technology Determinism vs Voluntarist Sociocultural shaping. Extreme and continuous partnership with users. Act Local, Think Global. Agile development environment facilitated a stream of features to tackle pain points. Leverage other e-Laboratories. Maintaining scientists’ buy-in. Socio-Political Axis dominates the Technical Axis. Collaboration evolutions. Confidence in exchange. Consortium Policies.
SysMO-DB Team. University of Stellenbosch, South Africa, and University of Manchester, UK: Jacky Snoep. EML Research gGmbH, Germany: Isabel Rojas, Olga Krebs, Wolfgang Müller. University of Manchester, UK: Carole Goble, Stuart Owen, Katy Wolstencroft, Sergejs Aleksejevs, Finn Bacall.