Daehee Hwang Leroy Hood Institute for Systems Biology.


Presentation transcript:

1 Daehee Hwang Leroy Hood Institute for Systems Biology

2 Why Prequips for systems biology with proteomic data? Need for visualization, analysis, and integration of multiple proteomic datasets at the raw data level, the peptide level, and the protein level, and in multi-sample analysis. Need for an interface between proteomic data and systems biology analytical tools such as network/pathway analyses.

3 Integration of proteomic data at various levels. Diagram (repeated three times): Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; ? (Trans-Proteomic Pipeline). Communication not possible!

4 Pep3D: Quality Assessment. Pep3D properties: quality assessment; 2D gel-like visualization. Diagram: Prequips (Multi Sample). Trans-Proteomic Pipeline: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; ?. Gaggle: Network Analysis (Cytoscape); Interaction Database (STRING); Pathway Database (KEGG); Microarray Data Analysis (Mayday, TIGR).

5 Pep3D: Quality Assessment. Pep3D Instance 1; Pep3D Instance 2. Communication not possible!

6 Interface to Systems Biology. Gaggle: Network Analysis (Cytoscape); Interaction Database (STRING); Pathway Database (KEGG); Microarray Data Analysis (Mayday, TIGR). Trans-Proteomic Pipeline: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; ?. Communication not possible!

7 Prequips Overview. Key properties: handles multiple samples at all levels; integrates high-level analysis tools; is extensible. Diagram: Prequips (Multi Sample). Trans-Proteomic Pipeline: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; ?. Gaggle: Network Analysis (Cytoscape); Interaction Database (STRING); Pathway Database (KEGG); Microarray Data Analysis (Mayday, TIGR).

8 Integration of proteomic datasets at various levels. Data levels: raw data (e.g. mzXML, mzData, ...); peptide-level data (e.g. pepXML, AnalysisXML, ...); protein-level data (e.g. protXML, ...). Steps shown (Trans-Proteomic Pipeline): Mass Spectrometer; Database Search; Validation; Peptide Quantification; Protein Inference; Protein Quantitation. Also shown: annotation; further analysis results.

9 Data model. Levels: Raw Data; Peptide Level; Protein Level. Components: Core; Meta; Single-Sample Analysis; Multi-Sample Analysis; Project; Data Providers; Data Structures; Viewers; Perspectives. Data sources: protein-level data source, e.g. protXML files; peptide-level data source, e.g. pepXML, dta, or AnalysisXML files; raw data level, e.g. mzXML or mzData files.
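The three-level data model can be illustrated with a small sketch. This is not Prequips code (the slides describe a Java application); it is a minimal Python illustration, assuming that each file's level can be read from a hypothetical file suffix taken from the format names on the slide. The names `LEVEL_BY_SUFFIX`, `level_of`, and `build_project` are invented for this example.

```python
from collections import defaultdict
from pathlib import PurePath

# Assumption: suffix-to-level mapping based on the format names on the slide.
# Real files may use other naming schemes; this table is purely illustrative.
LEVEL_BY_SUFFIX = {
    ".mzxml": "raw",
    ".mzdata": "raw",
    ".pepxml": "peptide",
    ".dta": "peptide",
    ".analysisxml": "peptide",
    ".protxml": "protein",
}


def level_of(filename):
    """Return 'raw', 'peptide', or 'protein' for a known suffix, else None."""
    return LEVEL_BY_SUFFIX.get(PurePath(filename).suffix.lower())


def build_project(files_by_sample):
    """Group each sample's files by data level.

    files_by_sample maps a sample name to a list of file names.
    Files with an unknown suffix are skipped.
    """
    project = {}
    for sample, files in files_by_sample.items():
        levels = defaultdict(list)
        for name in files:
            level = level_of(name)
            if level is not None:
                levels[level].append(name)
        project[sample] = dict(levels)
    return project
```

A project built this way keeps every sample's raw, peptide-level, and protein-level sources side by side, which is the precondition for the multi-sample views described on later slides.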

10 Case Study: Toponomic change in drug-treated Mø. Proteins: Calreticulin, BiP, Bcl2, ATPase, Lamp1. [Figure: fraction # axis with ticks 2 to 20 in steps of 2; labels 8% and 28%; labels 114, 115, 116, 117; samples Mock1, Mock2, Thapsigargin.]

11 Visualization: Single exp. Screenshot annotations: project manager; all scans of Mock 1 experiment; peak map for run 29; level 1 spectrum & corresponding CID spectra (level 1, level 2); CID spectra that have been selected; detailed information about one of the level 2 spectra.

12 Visualization: Multiple exps. (Polymer?) contamination in all 4 runs (this would be hard to see with Pep3D). Color scale: green = 0, red = 1.

13 Visualization: assess, quantify, etc. Mock-up (software is under development). [Figure: axes m/z (min to max) and retention time (min to max); panels map 1 to map 6; a second set of panels map 1 to map 4 with X X X marks.] Doesn't really match the remaining 3 maps!

14 Prequips & the Gaggle. Tools: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; Browser (KEGG, DAVID). Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.
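The exchange pattern on this slide, where a central boss relays data structures between connected tools, can be sketched as follows. This is a minimal in-process illustration in Python, not the Gaggle API; the `Boss` and `Tool` classes and their methods are hypothetical names chosen for the example.

```python
class Boss:
    """Hypothetical stand-in for a central relay between analysis tools."""

    def __init__(self):
        self._handlers = {}  # tool name -> callable(kind, payload)

    def register(self, name, handler):
        self._handlers[name] = handler

    def broadcast(self, sender, kind, payload, target=None):
        """Deliver payload to `target`, or to every tool except the sender.

        Returns the names of the tools that received the message.
        """
        receivers = []
        for name, handler in self._handlers.items():
            if name == sender:
                continue
            if target is not None and name != target:
                continue
            handler(kind, payload)
            receivers.append(name)
        return receivers


class Tool:
    """Hypothetical tool that records every message it receives."""

    def __init__(self, name, boss):
        self.name = name
        self.inbox = []
        self._boss = boss
        boss.register(name, self.receive)

    def receive(self, kind, payload):
        self.inbox.append((kind, payload))

    def send(self, kind, payload, target=None):
        return self._boss.broadcast(self.name, kind, payload, target)


# Usage: one tool sends a name list to another through the boss.
boss = Boss()
prequips = Tool("Prequips", boss)
cytoscape = Tool("Cytoscape", boss)
prequips.send("namelist", ["calreticulin"], target="Cytoscape")
```

Because tools only talk to the boss, adding a new tool does not require changes to any of the existing ones.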

15 Mayday

16 Cytoscape. Overall mouse protein/protein interaction map in Cytoscape.

17 Analysis: Feature extraction. Protein table; filters; Gaggle plugin for interaction with other tools.

18 Analysis: Feature extraction. Gaggle plugin: selection for broadcast (calreticulin).

19 Analysis: Feature selection. Samples: Mock1, Mock2, Thapsigargin.

20 Broadcast to Gaggle

21 Prequips to Gaggle. Tools: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; Browser (KEGG, DAVID). Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.

22 Gaggle Boss

23 Gaggle to Cytoscape. Tools: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; Browser (KEGG, DAVID). Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.

24 Integration: Network Analysis. Network modules: proteasome complex; ribosome large subunit; chaperones; actin filament regulation. Legend: Thapsigargin 114 iTRAQ ratio.

25 Cytoscape to Prequips. Tools: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; Browser (KEGG, DAVID). Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.

26 Analysis: Feature extraction, module selection. The ids sent from Cytoscape through the Gaggle: proteasome proteins.

27 Prequips & the Gaggle. Tools: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; Browser (KEGG, DAVID). Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.

28 Analysis: Functional enrichment. The proteasome complex is enriched compared to a mouse genome background.
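The slide does not say which statistic was used to call the proteasome complex enriched. One common choice for this kind of comparison against a background is a hypergeometric tail probability, sketched below in Python as an assumption rather than as the method used in the presentation. The function name and parameter names are invented for this example.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Probability of seeing at least k annotated items in a selection.

    N: size of the background (e.g. all genes considered)
    K: number of background items carrying the annotation
    n: size of the selected list
    k: number of selected items carrying the annotation
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total
```

A small value means that the observed overlap would be unlikely if the selected list had been drawn at random from the background. When many annotation terms are tested at once, the resulting values would normally also need a multiple-testing correction.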

29 Prequips Summary. Key properties: handles multiple samples at all levels; integrates high-level analysis tools; is extensible. Diagram: Prequips (Multi Sample). Trans-Proteomic Pipeline: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; ?. Gaggle: Network Analysis (Cytoscape); Interaction Database (STRING); Pathway Database (KEGG); Microarray Data Analysis (Mayday, TIGR).

30 Conclusion. General and extensible software for systems biology research with proteomics mass spectrometry data. Capability to integrate data from various sources for visualization and analysis. An interactive environment that supports (visual) data exploration.

31 Software details. Implemented in Java, based on the Eclipse Rich Client Platform. Extremely modular architecture. Multiple plugin interfaces, e.g. viewers, data providers, algorithms. Meta information framework: analysis results, sequence information, annotation, ...; data structures as plugins; requirement to support future analytical tools and data sources.
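The idea of multiple plugin interfaces can be sketched as a registry with one slot per extension point. The slides state that the actual software is written in Java on the Eclipse Rich Client Platform, which has its own extension mechanism; the Python below is only a language-neutral illustration, and `PluginRegistry` and its methods are hypothetical.

```python
class PluginRegistry:
    """Hypothetical registry with one namespace per extension point."""

    # Extension point names taken from the examples on the slide.
    EXTENSION_POINTS = ("viewer", "data_provider", "algorithm")

    def __init__(self):
        self._plugins = {point: {} for point in self.EXTENSION_POINTS}

    def register(self, point, name, plugin):
        """Add a plugin under a known extension point."""
        if point not in self._plugins:
            raise ValueError(f"unknown extension point: {point}")
        self._plugins[point][name] = plugin

    def get(self, point, name):
        """Look up a plugin; raises KeyError if it was never registered."""
        return self._plugins[point][name]

    def names(self, point):
        """Return the registered plugin names for a point, sorted."""
        return sorted(self._plugins[point])


# Usage: contribute an algorithm without touching the core.
registry = PluginRegistry()
registry.register("algorithm", "mean", lambda values: sum(values) / len(values))
```

The core only knows the extension points, so new viewers, data providers, or algorithms can be added later without changing it, which matches the extensibility requirement stated on the slide.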

32 Acknowledgements. Special thanks to Nils Gehlenborg. Hood Lab: Inyoul Lee; Kay Nieselt. Aebersold Lab: Nichole King, James Eddes, Eric Deutsch, Ning Zhang, David Shteynberg, Wei Yan, and Andrew Garbutt. Paul Shannon for help with the Gaggle.

33 Prequips. Diagram labels: Core; Mayday; Database; Gaggle; R; Visualization; Excel; PostgreSQL database; MySQL database; R environment; Bioconductor; SBEAMS installation; Machine Learning; WEKA Library; anything else.

34 Cytoscape

