Application of meta-search, grid-computing, and machine-learning can significantly improve the sensitivity of peptide identification. The PepArML meta-search.


Boosting Peptide Identification Performance by Combining Many Search Engines, Spectral Matching, and Proteotypic and Physicochemical Peptide Properties

Prabhakar Gubbala and Nathan J. Edwards, Georgetown University Medical Center
US HUPO 2010

Introduction

Application of meta-search, grid-computing, and machine-learning can significantly improve the sensitivity of peptide identification. The PepArML meta-search engine is publicly available, free of charge, on-line from: http://edwardslab.bmcb.georgetown.edu

Unified MS/MS Search Interface

Automatic search engine configuration and execution, parameterized by:
- Instrument & proteolytic agent
- Fixed and variable modifications
- Protein sequence database & MS/MS spectra file
- Peptide candidate selection

MS/MS spectra reformatting:
- Charge and precursor enumeration for peptide candidate selection (for charge & 13C peak correction)
- Search engine formatting constraints (MGF/mzXML)
- Consistent MS/MS spectrum identifier tracking
- Spectrum file "chunking"

Peptide Identification Meta-Search via Grid-Computing

- Meta-search with seven search engines: Mascot, Tandem, OMSSA, KScore, SScore, MyriMatch, InsPecT
- Automatic target & decoy searches
- Compute resources: NSF TeraGrid (1000+ CPUs); Edwards Lab scheduler & 80+ CPUs
- Secure communication
- Heterogeneous compute resources
- Scales to 250+ simultaneous searches
- Free, instant registration
- Simple search description, job management, and result combining

PepArML: Evaluation of non-Search Engine Features

Dataset and searches:
- OMICS 17 Protein Mix LCQ MS/MS dataset
- Semi-tryptic search of SwissProt
- 39408 spectra searched ~36 times: target + 2 decoys, 7 engines, 1+ vs 2+/3+ charge
- 3969 search jobs, weeks of CPU time
- Total elapsed time (Mascot bottleneck): < 28 hours
- All non-Mascot jobs: < 19 hours

Combiners compared:
- Heuristic: Voting & FDR based heuristic.
- PepArML: Publicly available PepArML combiner.
- PepArML-PSF: PepArML combiner without spectrum or peptide-based proteotypic and physicochemical features.
- PepArML-Dig,PSF: PepArML combiner without trypsin digest, spectrum or peptide-based proteotypic and physicochemical features.
- PepArML+KM: PepArML combiner plus spectral similarity to the peptide's synthetic spectrum (Zhang's Kinetic Model [2,3]).

Feature Rankings by Info. Gain

(Figure; not captured in the transcript.)

Conclusions

The PepArML meta-search engine provides:
- A unified MS/MS search interface for Mascot, X!Tandem, OMSSA, KScore, SScore, MyriMatch, and InsPecT.
- Search job scheduling on independent large-scale heterogeneous computational grids.
- Additional features including tryptic digest, peptide physicochemical, and proteotypic [1] properties; spectra and precursor isotope cluster properties; plus retention-time modeling.
- Spectral match to synthetic spectra using Zhang's KineticModel [2,3].
- Unsupervised, model-free result combining using machine-learning (PepArML [4]).

The PepArML meta-search engine improves peptide identification sensitivity, significantly increasing the number of peptide ids at 10% FDR.

References

1. P. Mallick, M. Schirle, S. S. Chen, M. R. Flory, H. Lee, D. Martin, J. Ranish, B. Raught, R. Schmitt, T. Werner, B. Kuster, and R. Aebersold. "Computational prediction of proteotypic peptides for quantitative proteomics." Nature Biotechnology (2006), 25(1).
2. Z. Zhang. "Prediction of low-energy collision-induced dissociation spectra of peptides." Anal. Chem. (2004), 76(14).
3. Z. Zhang. "Prediction of Low-Energy Collision-Induced Dissociation Spectra of Peptides with Three or More Charges." Anal. Chem. (2005), 77(19).
4. N. Edwards, X. Wu, and C.-W. Tseng. "An Unsupervised, Model-Free, Machine-Learning Combiner for Peptide Identifications from Tandem Mass Spectra." Clinical Proteomics (2009), 5(1).
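
The evaluation above reports peptide identifications at 10% FDR, using one target search and two decoy searches per engine. The poster does not say how the FDR is computed, so the following is only a minimal sketch of a common target-decoy estimate, assuming decoy hits are averaged over the number of decoy searches and that a higher score is better. The function name, its parameters, and the toy scores are hypothetical and are not part of PepArML.

```python
from typing import Iterable, Tuple


def count_ids_at_fdr(
    psms: Iterable[Tuple[float, bool]],
    max_fdr: float = 0.10,
    n_decoy_searches: int = 2,
) -> int:
    """Return the largest number of target hits accepted at or below max_fdr.

    psms: (score, is_decoy) pairs; higher score is assumed to be better.
    n_decoy_searches: decoy hits are divided by this count, so two decoy
    searches contribute the equivalent of one (an assumption of this sketch).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0:
            # Estimated FDR among target hits at this score threshold.
            est_fdr = (decoys / n_decoy_searches) / targets
            if est_fdr <= max_fdr:
                best = targets
    return best


# Toy example: made-up scores, True marks a decoy hit.
toy = [
    (10.0, False), (9.0, False), (8.0, False), (7.0, True),
    (6.0, False), (5.0, True), (4.0, True), (3.0, False),
]
n_ids = count_ids_at_fdr(toy, max_fdr=0.10, n_decoy_searches=2)
```

Applying the same function to each combiner's output at a fixed threshold is one way to compare sensitivity across combiners; ties in score are not handled specially here.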

