
1 XML Standards for Proteomics Data Andrew Jones, Dr Jonathan Wastling and Dr Ela Hunt Department of Computing Science and the Institute of Biomedical and Life Sciences, University of Glasgow

2 Proteomics
1. 2D-PAGE to separate proteins
2. Image analysis to determine the volume of protein spots
3. Mass spectrometry (MS) to characterise protein spots
4. Database searches to identify proteins
[Figure: four workflow panels: 1. 2D-PAGE, 2. Image Analysis, 3. Mass Spectrometry, 4. Database Search]

3 Proteomics Data Issues
Many different instruments for data collection
Great variety of software used for analysis
Access to external databases
  –For protein identification
  –Protein characterisation after ID
High-throughput techniques generate very large data sets
[Figure: Instruments (Scanner, MS); Software (Image analysis, MS viewer); Databases (Genome, microarray, publications, more...)]

4 A Standard Model for Proteomics
Improve management of laboratory workflows
Data integration: link local data to external data sources
Development of public databases, enabling:
  –Queries over protocols, raw data and analysis
  –Experiments to be reproduced or re-analysed by other research groups
  –Co-analysis of proteome data with genome, transcriptome and other resources

5 Biological Collaborators
Parasitology research group
  –Investigating host-parasite response with Toxoplasma gondii
Ras/Raf pathway research at the Beatson Institute
Functional Genomics Facility at the IBLS
Functional Genomics Facility - http://www.gla.ac.uk/departments/ibls/ASU/fgf/

6 MAGE model for Proteomics
The MAGE model has been developed to store microarray protocols, data and analysis
A similar model will facilitate integration between microarray and proteome data
Aspects of the model require few modifications to be applicable to proteomics
We are developing a new representation of 2D gel analysis and MS data

7 Experimental Protocols in MAGE
MAGE model is extensible
Protocol is generated as an ordered list: events, materials and hardware
Few changes required to focus on protein extraction rather than mRNA production
[Diagram packages: Array, Protocol, BioAssay, BioEvent, BioMaterial, ArrayDesign]

8 Experimental Protocols for 2D gels
MAGE model is extensible
Protocol is generated as an ordered list: events, materials and hardware
Few changes required to focus on protein extraction rather than mRNA production
[Diagram packages: 2D_PAGE, Protocol, BioAssay, BioEvent, BioMaterial, 2D_PAGE_Setup]
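
The idea of a protocol as an ordered list of events, materials and hardware can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `Step` element, its attributes and the example step names are hypothetical and are not taken from MAGE or from the authors' model.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class ProtocolStep:
    """One entry in the ordered protocol list (hypothetical structure)."""
    kind: str  # "event", "material" or "hardware"
    name: str


@dataclass
class Protocol:
    """A protocol as an ordered list of steps (illustrative, not MAGE itself)."""
    name: str
    steps: List[ProtocolStep] = field(default_factory=list)

    def add(self, kind: str, name: str) -> "Protocol":
        # Appending preserves the order in which steps were carried out.
        self.steps.append(ProtocolStep(kind, name))
        return self

    def to_xml(self) -> ET.Element:
        # Serialise the list, recording each step's position explicitly.
        root = ET.Element("Protocol", {"name": self.name})
        for position, step in enumerate(self.steps, start=1):
            ET.SubElement(
                root,
                "Step",
                {"order": str(position), "kind": step.kind, "name": step.name},
            )
        return root


# Hypothetical example steps for a 2D gel protocol.
protocol = (
    Protocol("2D_PAGE sample preparation")
    .add("material", "cell lysate")
    .add("event", "protein extraction")
    .add("hardware", "IEF unit")
    .add("event", "second-dimension SDS-PAGE")
)
protocol_xml = ET.tostring(protocol.to_xml(), encoding="unicode")
```

Swapping the "protein extraction" event for an mRNA-related one would leave the structure untouched, which is the sense in which few changes are needed to move from microarray to proteomics protocols.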

9 Proteomics Data Model
Image analysis identifies spots observable on the gel
Important to store raw data and analysis from MS
Separate package for cross-gel analysis, e.g. time series
[Diagram packages: 2D_PAGE, Protein_Spots, MS_Setup, MS_Data, Multiple_Analysis, Data_Analysis, BioSequence; link from Protocol]
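
A small XML instance makes the relationship between gel, spots, MS data and analysis easier to picture. The document below is a hypothetical sketch: element names loosely follow the package names on the slide, while the attributes, values and the accession `EXAMPLE_0001` are invented for illustration. XML names cannot begin with a digit, so the 2D_PAGE package is written as `Gel_2D_PAGE` here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
DOC = """
<Gel_2D_PAGE id="gel1">
  <Protein_Spots>
    <Spot id="s1" volume="1520.5">
      <MS_Data setup="ms1">
        <Peak mz="842.51" intensity="310"/>
        <Peak mz="1045.56" intensity="120"/>
      </MS_Data>
      <Data_Analysis>
        <BioSequence accession="EXAMPLE_0001"/>
      </Data_Analysis>
    </Spot>
    <Spot id="s2" volume="310.0"/>
  </Protein_Spots>
</Gel_2D_PAGE>
"""

root = ET.fromstring(DOC)


def spots_with_ms(gel):
    """Return ids of spots that have raw MS data attached."""
    return [
        spot.get("id")
        for spot in gel.iter("Spot")
        if spot.find("MS_Data") is not None
    ]


def identifications(gel):
    """Map spot id to the sequence accession assigned by the analysis."""
    result = {}
    for spot in gel.iter("Spot"):
        sequence = spot.find("Data_Analysis/BioSequence")
        if sequence is not None:
            result[spot.get("id")] = sequence.get("accession")
    return result
```

Keeping the raw peaks next to the analysis result means a spot can be re-analysed later without returning to the instrument files.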

10 Proteomics Model
Experimental protocol packages require few changes from MAGE
New data model includes MS data and statistical analysis between gels
Model incorporates storage of external database searches
[Diagram packages: 2D_PAGE, Protocol, BioAssay, BioEvent, BioMaterial, 2D_PAGE_Setup, Protein_Spots, MS_Setup, MS_Data, Multiple_Analysis, Data_Analysis, BioSequence, Experiment, Audit & Security, Description, Measurement, Common, BQS; further labels: Annotation, Data, Protocol]

11 Proteomics Database and Indexing Technology
A prototype database for proteomics has been developed
We have developed a specialised index structure for XML to improve query performance
The performance of the index has so far been tested with 800MB of protein data [1]
[Figure: Data Stores, XML Index, XML Dictionary (1 Experiment, 2 gelImage, 3 spots, 4 spot, …) and Data Path Tree with numbered nodes]
[1] Protein Information Resource - http://pir.georgetown.edu/
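
The slide does not describe how the index works internally, so the following is a generic path-index sketch rather than the authors' structure. It assumes a dictionary that maps element names to integer codes (as in the figure's Experiment, gelImage, spots, spot numbering) and a table from each distinct root-to-node path to the elements found on it. The class and method names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class PathIndex:
    """Generic path index over an XML tree (illustrative assumption only)."""

    def __init__(self):
        self.dictionary = {}            # element name -> integer code
        self.paths = defaultdict(list)  # tuple of codes -> elements on that path

    def _code(self, tag):
        # Codes are assigned in first-seen order, starting at 1.
        return self.dictionary.setdefault(tag, len(self.dictionary) + 1)

    def build(self, root):
        def walk(element, prefix):
            path = prefix + (self._code(element.tag),)
            self.paths[path].append(element)
            for child in element:
                walk(child, path)

        walk(root, ())
        return self

    def lookup(self, *tags):
        """Return all elements on the exact path given by tag names."""
        try:
            path = tuple(self.dictionary[tag] for tag in tags)
        except KeyError:
            return []  # a tag that never occurs cannot match any path
        return self.paths.get(path, [])


# Hypothetical document using the element names shown in the figure.
doc = ET.fromstring(
    "<Experiment><gelImage><spots>"
    "<spot id='a'/><spot id='b'/>"
    "</spots></gelImage></Experiment>"
)
index = PathIndex().build(doc)
spots = index.lookup("Experiment", "gelImage", "spots", "spot")
```

A path query then becomes a single dictionary lookup instead of a traversal of the whole document, which is the kind of saving an index of this sort aims for on large data sets.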

12 Related Research
Databases: SWISS-2DPAGE, LIMS systems
Standards: Proteomics Standards Initiative (PSI)
  –Standards for protein-protein interactions and mass spectrometry
PEDRo system with PEML: Proteomics Experiment Markup Language
PSI: http://psidev.sourceforge.net/

13 Work In Progress
Working towards an XML standard for proteomics
Creating standards for capturing statistical processing of large data sets
Developing XML indexing technology to improve data integration and query power
Developing a proteome database utilising XML indexing and a standard model

14 Contact
jonesa@dcs.gla.ac.uk
Bioinformatics Research Centre - www.brc.dcs.gla.ac.uk
The Functional Genomics Facility is supported by a Wellcome Trust grant for £2.4M.
My research is supported by an MRC Bioinformatics PhD studentship; Ela Hunt is supported by an MRC Fellowship.
Acknowledgements
Researchers in Jonathan Wastling's lab for input into the model.
Dr Ashwin Kotiwaliwale at the Beatson for the collaboration on the prototype database.

