Johannes Griss PSI Meeting Heidelberg, April 2011 EBI is an Outstation of the European Molecular Biology Laboratory. mzTab Proposal for.

Presentation transcript:

Johannes Griss PSI Meeting Heidelberg, April 2011 EBI is an Outstation of the European Molecular Biology Laboratory. mzTab Proposal for A Simple Data Format for Proteomics Results

Current Situation
The necessity of standard data formats has become generally accepted
Proteomics techniques are constantly evolving
Proposed standard formats had to become very complex to adequately capture proteomics data
  mzIdentML for identification data
  mzQuantML for quantitative data
Effective use of these data formats requires sophisticated bioinformatics knowledge
Many researchers are still accustomed to using MS Excel to “look” at their data

Communication of Proteomics Results
Proteomics resources require a mechanism to exchange basic proteomics results simply and efficiently
Collaboration with colleagues from other scientific fields is increasingly important
Necessity to share proteomics results with researchers outside of proteomics
Need to make proteomics data easily accessible

Potential Current Problems
Currently proposed standard formats are difficult to use without the Java APIs
“Complete” standard formats are too complex and too large to quickly share the essential results
Quick scripts (e.g. in Perl) for specific research questions are not easily possible
A large amount of potential innovation could be lost
Reading files requires special software
Further processing of the data (e.g. with statistical tools) is not easily possible
No standard tools to read / write mz*ML files are available
Custom-built software is required for many use cases otherwise fulfilled by “Excel & friends”

mzTab - Aim
To provide a simple and efficient way of exchanging proteomics data
  Which protein / peptide was identified in a given experimental setting
Easy to update and maintain
Easy to use by the proteomics community, systems biologists as well as providers of knowledge bases

mzTab – Target Audience
Proteomics repositories (e.g. PRIDE, PeptideAtlas)
Knowledge base resources (e.g. UniProt, HPRD)
Researchers outside of proteomics
Researchers analyzing proteomics data with limited bioinformatics knowledge / support

mzTab – proposed concept
A tab-delimited file format
Goals
  Content should be “readable” using MS Excel
  Should contain the minimal information needed for proteomics repositories / knowledge bases to exchange data
  Data should be easily accessible using e.g. scripting languages
  One file should be able to contain multiple experiments / proteins from different resources
Aim: To represent the result of a query to e.g. PRIDE using this format
Provide a simplistic summary of proteomics results
Every entry contains a reference to the source data (in mzIdentML / mzQuantML format)

mzTab – proposed concept
What the format does NOT aim at:
  Replace mzIdentML or mzQuantML
  Contain the complete data of a proteomics experiment
  Provide detailed evidence for the data
  Allow a researcher to recreate the process which led to the results
  Conform to requirements (MIAPE, journal guidelines, etc.)
In short: be complete in any way

mzTab – Possible Format Specification
Three sections
  (Optional) Metadata section
  (Required) Protein section
  (Optional) Peptide section
Can report proteomics data at different levels
  Single experiments
  Multiple (possibly linked) experiments
  Data generated as the result of a query (possibly to multiple resources)
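The three-section layout can be illustrated with a short Python sketch that groups the lines of a file by section. It is not part of the proposal: the function name `split_sections` is hypothetical, and the code assumes that a section opens with a line of the form `----<name>` and closes with `----END` (or a bare `END`), which is how the markers appear on the example slides.

```python
def split_sections(text):
    """Group the lines of an mzTab-like document by section name.

    Assumption: a section starts with '----<name>' and ends with
    '----END' or 'END', as the markers appear on the example slides.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("----END", "END"):
            current = None                 # section closed
        elif stripped.startswith("----"):
            current = stripped[4:]         # e.g. 'metadata', 'proteins'
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(line)
    return sections


# Illustrative input only; the column selection is shortened.
example = (
    "----metadata\n"
    "PRIDE_16649-species: [NEWT, 10090, Mouse,]\n"
    "----END\n"
    "----peptides\n"
    "sequence\taccession\tunit\n"
    "DIIL\tO00160\tPRIDE_3381\n"
    "----END\n"
)
sections = split_sections(example)

# The protein section is the only required one, so a reader can check for it.
has_required = "proteins" in sections
```

Because the protein section is the only required one, a reader of the format could reject a file such as `example`, for which `has_required` is false.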

mzTab – Metadata Section
----metadata
PRIDE_16649-title: The Synaptic Proteome during Development and Plasticity of the Mouse Visual Cortex
PRIDE_16649-species: [NEWT, 10090, Mouse,]
PRIDE_16649-tissue: [EFO, EFO: , visual cortex,]
PRIDE_16649-instrument[1]-type: [MS, MS: , TOF-MS,]
PRIDE_16649-search_engine: [MS, MS: , Mascot, ]
PRIDE_16649-contact[1]-name: August B Smit
PRIDE_16649-contact[1]-
PRIDE_16649-url:
END
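The metadata lines above follow a `UNIT-key: value` pattern, where some values are bracketed, comma-separated parameters. A minimal Python sketch of reading one such line follows. The function names are hypothetical, and the interpretation of the bracketed fields (vocabulary label, accession, name, value) is an assumption based on the species example, not something the slide states.

```python
def parse_metadata_line(line):
    """Split 'UNIT-key: value' into its three parts.

    The key itself may contain hyphens (e.g. 'instrument[1]-type'),
    so only the first hyphen separates the unit from the key.
    """
    left, _, value = line.partition(":")
    unit, _, key = left.partition("-")
    return unit.strip(), key.strip(), value.strip()


def parse_param(value):
    """Split a bracketed parameter into its comma-separated fields.

    Returns None for free-text values. Naive: a field that itself
    contains a comma would be split incorrectly.
    """
    if not (value.startswith("[") and value.endswith("]")):
        return None
    return [field.strip() for field in value[1:-1].split(",")]


unit, key, value = parse_metadata_line("PRIDE_16649-species: [NEWT, 10090, Mouse,]")
species = parse_param(value)
contact = parse_param("August B Smit")
```

Splitting on the first colon keeps values that contain colons themselves, such as accessions or URLs, intact.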

mzTab – Protein Section
----proteins
Accession … reliability peptides … ambiguity_members
P P12346,P …
----END
A table holding the basic identification information
Suggestions of how to include
  quantitative data
  multiple search engine scores
  ambiguous modification positions

mzTab – Peptide Table
----peptides
sequence accession unit unique … reliability …
DIIL O00160 PRIDE_3381 false 5 …
VESVDL O00160 PRIDE_3381 true 4 …
----END
A table holding the basic peptide information
Suggestions of how to include
  quantitative data
  multiple search engine scores
  ambiguous modification positions
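Because the peptide table is plain tab-delimited text, a few lines of a scripting language are enough to read it, which is the accessibility goal stated earlier. The Python sketch below uses only the standard library. It assumes the first row of the section holds the column names, and it uses just the four leading columns visible on the slide; the elided columns are left out rather than guessed.

```python
import csv
import io

# The two rows shown on the slide, restricted to the first four columns.
peptide_section = (
    "sequence\taccession\tunit\tunique\n"
    "DIIL\tO00160\tPRIDE_3381\tfalse\n"
    "VESVDL\tO00160\tPRIDE_3381\ttrue\n"
)


def read_table(section_text):
    """Read a tab-delimited section into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(section_text), delimiter="\t")
    return list(reader)


rows = read_table(peptide_section)

# Example question: which peptides are flagged as unique to their protein?
unique_peptides = [row["sequence"] for row in rows if row["unique"] == "true"]
```

The same rows open directly in MS Excel when saved as a tab-delimited file, so no format-specific software is needed for either route.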

Links, collaborations and funding
PRIDE collaborators
Funding