ProReP - Protein Results Parser v3.0©


1 ProReP - Protein Results Parser v3.0©
A Tool for Handling Tandem Mass Spectrometer Protein Database Search Results
Capstone Presentation
Kiran Annaiah (M.S. Bioinformatics)
Advisors: Dr. Randy Arnold, Dr. Haixu Tang

2 Outline
Background
Data generation from mass spec experiment
Mascot search engine
Why parse Mascot results?
Parser features
Results
Conclusions
Acknowledgments

3 Background
High-throughput “shotgun” proteomics
  Identify, characterize and quantify all expressed proteins in a mixture simultaneously
Mass spectrometry
  Peptide mass fingerprinting
  Collision-induced dissociation (CID) spectra from MS/MS analysis
  LC/MS/MS approach used to identify protein components in a complex mixture
  Tandem mass spectra help in inferring amino acid sequences of peptides

4 Peptide Mass Fingerprinting vs. MS/MS protein identification
James S. Eddes et al., 2002, Proteomics

5 Database Searching
[Figure: peptide L M G S E I P K with termini labeled NH2 and CO2, fragment ions b1 to b7 and y ions marked along the sequence, and a spectrum with peaks y7 to y1 on an m/z axis]
Database searching software: MASCOT®
Database (SwissProt)
  Actin MYTCVPIASEQUENCEMIMEWTPQSDLIRPTVCIMNERCVGGPYILCMTEND
  Amylase DSLIKRNYTIPMCSQIRECNHIPLMTRCHGYYKWSIALAINTQSFGIVRIVAMNKLPSSCRTIVGHWEDRICTMQNCISPPEKELIAVARGTSP
Results
  Proteins found: Hemoglobin, beta chain
  Table columns: Pept. Mass, Score, Sequence
  Sequences: HLDNLK, VHLTDAEK, AAVNGLWGK, VINAFNDGLK, VVAGVASALAHK, LVINAFNDGLK
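The b and y ion ladder on this slide can be reproduced numerically. The following is a minimal Python sketch, not part of ProReP or Mascot: it assumes singly protonated fragments and standard monoisotopic residue masses, and the function name `fragment_ions` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids in the slide's peptide.
RESIDUE_MASS = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "L": 113.08406,
    "I": 113.08406, "K": 128.09496, "E": 129.04259, "M": 131.04049,
}
WATER = 18.01056   # H2O, added to C-terminal (y) fragments
PROTON = 1.00728   # charge carrier for a singly protonated ion


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values, index 0 = b1 / y1."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)            # first i residues
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)   # last i residues
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("LMGSEIPK")
for n, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{n} = {b:8.3f}   y{n} = {y:8.3f}")
```

An eight-residue peptide yields seven b ions and seven y ions, which matches the b1 to b7 and y7 to y1 labels in the figure.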

6 Mascot Search Engine
Uses mass spectrometry data to identify proteins from primary sequence databases
MS/MS ion search
  Enzyme cleavage rules applied to sequences in the protein databases
  Experimental mass values compared with calculated fragment ion mass values
  Scoring algorithm used to identify the closest match or matches
  Probability-based MOWSE scoring algorithm
Databases
  MSDB – non-identical protein sequence DB
  NCBInr
  SwissProt
  dbEST – “single-pass” cDNA sequences or ESTs
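As an illustration of "enzyme cleavage rules applied to sequences", here is a minimal Python sketch of an in-silico tryptic digest. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P) and no missed cleavages; it is not Mascot's implementation, and `tryptic_digest` is a hypothetical name. The test sequence is simply three peptides from the previous slide joined together, not a real database entry.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after each K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing peptide without K/R
    return peptides


# Illustrative input only: three slide peptides concatenated.
example = "VHLTDAEKAAVNGLWGKHLDNLK"
print(tryptic_digest(example))
```

A search engine would then compare the calculated masses of such peptides, and of their fragment ions, with the experimental values.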

7 A Typical Experiment
Analysis of liver / brain tissue
Digest with trypsin
Liquid chromatography
LC-eluting sample electrosprayed into mass spec
APAAIGAYSQAVLVDR from 14.5 kDa translational inhibitor protein
MS/MS on intense peak of a parent ion
Raw data converted to a DTA file
Mascot search
Generates Html file

8 Mascot output – Html file (avg. size 5 MB)

9 Motivation
Mass spectrometry generates an enormous amount of data
Mascot returns, on average, hundreds of proteins matching the mass spectral data
Analyzing the Mascot results manually is time-consuming
Need different ways of looking at data
Comparison of various data sets (experiments)
No tools were available in the public domain to analyze Mascot results

10 Protein Results Parser v3.0
Features
  Single-file parsing
  Sequence coverage (with single-file parsing)
  Two-file comparison
  Multiple files
    Compare
    Combine
Tool was developed using Perl/Tk
Windows application

11 Single File Parsing

12 Screened Html Result (smaller file size)

13 Sequence Coverage
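Sequence coverage is conventionally the fraction of a protein's residues that fall inside at least one matched peptide. The Python sketch below shows that calculation under this assumption; it is not taken from ProReP (which was written in Perl/Tk), and the function name and the test sequence are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:                      # mark every occurrence
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein) if protein else 0.0


# Illustrative input only: a made-up 23-residue sequence and two matched peptides.
protein = "VHLTDAEKAAVNGLWGKHLDNLK"
matched = ["VHLTDAEK", "HLDNLK"]
print(f"coverage = {sequence_coverage(protein, matched):.1%}")
```

Overlapping peptides are counted once per residue, so coverage never exceeds 100%.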

14 Two file Comparison

15 Results – Comparison of Two Experiments

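16 Combine and Compare Feature
[Diagram: workflow for two treatments, Drug A and Drug B]
Treatments (protein digest): Drug A, Drug B
Fractions (SCX)
Triplicates (LC/MS/MS)
15 data files per treatment
Combine the data files of each treatment, then Compare the two combined results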
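One way to picture the combine-then-compare workflow is with set operations. This is a minimal Python sketch under the assumption that each result file has already been reduced to a set of protein names; the functions `combine` and `compare` and the sample data are hypothetical and do not describe ProReP's internals.

```python
def combine(result_sets):
    """Merge the protein sets of several result files into one set."""
    combined = set()
    for proteins in result_sets:
        combined |= proteins
    return combined


def compare(a, b):
    """Report proteins shared by two combined results and unique to each."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Made-up data: two files per treatment instead of the 15 on the slide.
drug_a_files = [{"Actin", "Amylase"}, {"Actin", "Hemoglobin, beta chain"}]
drug_b_files = [{"Actin"}, {"Amylase"}]

report = compare(combine(drug_a_files), combine(drug_b_files))
for key, proteins in report.items():
    print(key, sorted(proteins))
```

Combining first means a protein seen in any fraction or replicate counts once for its treatment before the two treatments are compared.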

17 Multiple File Comparison

18 Results – Multiple file comparison (sequential display)

19 Results – Multiple file comparison (tabular display)

20 Combine – Merging of multiple experiments

21 Results – combining multiple experiments

22 Conclusion
Decreased data analysis and processing time.
Search results are reduced automatically using user-specified criteria.
Removal of low-scoring peptide matches greatly improves the accuracy of data interpretation.
A single result file can be processed multiple times, using a different set of parsing criteria each time, without the need to repeat the database search.
The ability to compare two or more result files in an automated fashion makes determination of sample similarity a nearly effortless endeavor.
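The idea of re-screening one result file with different criteria can be sketched as follows. This Python example assumes the parsed results are a list of records with `protein`, `peptide` and `score` fields; that structure, the `screen` function, the extra protein and peptide, and all score values are hypothetical and chosen only for illustration.

```python
def screen(matches, min_score, min_peptides=1):
    """Keep peptide matches at or above min_score, then keep proteins
    that still have at least min_peptides matches."""
    kept = {}
    for m in matches:
        if m["score"] >= min_score:
            kept.setdefault(m["protein"], []).append(m["peptide"])
    return {p: peps for p, peps in kept.items() if len(peps) >= min_peptides}


# Made-up scores; peptide sequences for hemoglobin are taken from slide 5.
matches = [
    {"protein": "Hemoglobin, beta chain", "peptide": "HLDNLK", "score": 21},
    {"protein": "Hemoglobin, beta chain", "peptide": "VHLTDAEK", "score": 48},
    {"protein": "Hemoglobin, beta chain", "peptide": "AAVNGLWGK", "score": 55},
    {"protein": "Hypothetical protein X", "peptide": "PEPTIDEK", "score": 12},
]

# The same parsed results screened twice, without repeating the search.
print(screen(matches, min_score=30))
print(screen(matches, min_score=10, min_peptides=2))
```

Because screening works on already parsed results, changing a threshold only requires another call to the filter, not another database search.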

23 Acknowledgements
Dr. Randy Arnold – Manager and Research Scientist (Proteomics Research and Development Facility – Dept. of Chemistry)
Dr. Haixu Tang – Asst. Prof., School of Informatics
Abhijit Mahabal – Grad student, CS Dept.
Kranthi Varala – Grad student, Bioinformatics

