La Cristalera, Miraflores de la Sierra, 10-11 December 2012. HPP: Use of SEQUEST search results with ProteoRed.org MIAPE Extractor.



2 INDEX
1. A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using ProteoRed MIAPE Toolkit (Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián)
2. Data dependent acquisition using inclusion list (IL) (Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal)

3 MASCOT WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MGF, Mass Spectra Identification (Mascot), Output file (mzIdentML), MIAPE Extractor, MIAPE Generation (MIAPE Generator Tool), MIAPE MS, MIAPE MSI.

4 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor.

5 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MGF. Tool note: (GPL), LP-CSIC/UAB 2011-2012.

6 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MGF.

7 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor.

8 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: MSF, mzIdentML. Note: A. Medina, August 2012.

9 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: MSF, .Prot.XML, mzIdentML.

Converter log output shown on the slide:

    ...........................................................67% finished.....................
    TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai.
    TaxID for organismName unknown: Sphaerochaeta globosa...
    TaxID for organismName unknown: Leptospira borgpetersenii serovar.....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
    Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
        at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
        at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)

1. ProCon 0.9.162 was unable to correctly interpret the Controlled Vocabulary used by Proteome Discoverer to identify Post Translational Modifications (PTMs).
2. ProCon 0.9.162 also had problems with its internal array references.
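The "CV term for unknown modification ... not found." lines in the log above can be collected automatically when checking a conversion. The following is a minimal sketch that assumes the log keeps exactly the wording shown on the slide; the function name `unmapped_modifications` is hypothetical and is not part of ProCon.

```python
import re

# Pattern built from the log wording shown on the slide (an assumption:
# other converter versions may phrase this message differently).
PATTERN = re.compile(
    r"CV term for unknown modification (.+?) / ([+-]?\d+(?:\.\d+)?) Da \((.+?)\) not found\."
)


def unmapped_modifications(log_text):
    """Return the distinct modifications the converter could not map to a CV term."""
    found = []
    for name, delta, sites in PATTERN.findall(log_text):
        entry = {"name": name, "delta_mass": float(delta), "sites": sites}
        if entry not in found:  # the same message is often printed several times
            found.append(entry)
    return found


if __name__ == "__main__":
    sample = (
        "SequenceCollection written "
        "CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found. "
        "CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found."
    )
    for mod in unmapped_modifications(sample):
        print(mod["name"], mod["delta_mass"], mod["sites"])
```

A list like this makes it easy to see which PTMs will end up labelled as unknown in the resulting mzIdentML file.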

10 OMSSA WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MGF, OMX, Mass Spectra Identification, Output file, MIAPE Extractor. Tool note: (GPL), LP-CSIC/UAB 2010-2012.

11 OMSSA WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MGF, OMX, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor.

12 OMSSA WORKFLOW (Ó. Gallardo). Diagram labels: OMX, mzIdentML, "BIG, real-world, file". Note: A. Medina, August 2012.

Log output shown on the slide:

    ...........................................................67% finished.....................
    TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai.
    TaxID for organismName unknown: Sphaerochaeta globosa...
    TaxID for organismName unknown: Leptospira borgpetersenii serovar.....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
    Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
        at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)

mzIdentML parsers were unable to process big OMX files because of internal memory management problems.

13 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: MSF, .Prot.XML, mzIdentML.

Converter log output shown on the slide:

    ...........................................................67% finished.....................
    TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai.
    TaxID for organismName unknown: Sphaerochaeta globosa...
    TaxID for organismName unknown: Leptospira borgpetersenii serovar.....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
    Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
        at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
        at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)

1. ProCon 0.9.163 was unable to correctly identify Post Translational Modifications (PTMs), marking all of them as "unknown modification" in the resulting mzIdentML file.
2. ProCon 0.9.163 still had problems with its internal array references.

14 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: MSF, .Prot.XML, mzIdentML.

15 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, .Prot.XML, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor, MIAPE Generation (MIAPE Generator Tool).

16 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, .Prot.XML, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor, MIAPE Generation.

Converter log output shown on the slide:

    ...........................................................67% finished.....................
    TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai.
    TaxID for organismName unknown: Sphaerochaeta globosa...
    TaxID for organismName unknown: Leptospira borgpetersenii serovar.....
    MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
    SequenceCollection written
    CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
    CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.

Spectra IDs didn't match between the MGF file and the mzIdentML file. Matching diagram labels: ID mgf, ID mzid, PepMSChargeRT.
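One way to reconcile spectrum IDs that differ between the two files is to match spectra on properties that both files carry. The sketch below reads the "PepMSChargeRT" label as a composite key of precursor m/z, charge and retention time; that reading is an assumption, and this is not the code used by MIAPE Extractor. The MGF keys (`TITLE`, `PEPMASS`, `CHARGE`, `RTINSECONDS`) are the standard ones, while the mzIdentML side is represented by plain dictionaries with hypothetical field names (`id`, `mz`, `charge`, `rt`).

```python
def parse_mgf(text):
    """Collect the KEY=value header lines of every BEGIN IONS / END IONS block."""
    spectra, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value  # peak lines have no "=" and are skipped
    return spectra


def match_key(mz, charge, rt, mz_decimals=3, rt_decimals=1):
    """Composite key; the rounding precision is an arbitrary illustrative choice."""
    return (round(float(mz), mz_decimals), int(charge), round(float(rt), rt_decimals))


def mgf_key(spectrum):
    mz = spectrum["PEPMASS"].split()[0]       # PEPMASS may also carry an intensity
    charge = spectrum["CHARGE"].rstrip("+")   # e.g. "2+" -> "2"
    return match_key(mz, charge, spectrum["RTINSECONDS"])


def map_ids(mgf_spectra, mzid_results):
    """Map each identification ID to the MGF TITLE with the same key (or None)."""
    by_key = {mgf_key(s): s["TITLE"] for s in mgf_spectra}
    return {
        r["id"]: by_key.get(match_key(r["mz"], r["charge"], r["rt"]))
        for r in mzid_results
    }
```

Rounding-based keys are simple but can miss pairs whose values fall on either side of a rounding boundary; a tolerance-based comparison would be more robust at the cost of a slower lookup.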

17 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, .Prot.XML, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor, MIAPE Generation (MIAPE Generator Tool), MIAPE MS, MIAPE MSI. Matching diagram labels: ID, PepMSChargeRT.

18 PROTEOME DISCOVERER WORKFLOW (Ó. Gallardo). Diagram labels: RAW, MSF, MGF, .Prot.XML, mzIdentML, Mass Spectra Identification, Output file, MIAPE Extractor, MIAPE Generation (MIAPE Generator Tool), MIAPE MS, MIAPE MSI.

19 WORK IN PROGRESS (Ó. Gallardo)

1. Uploading of MSF + mzIdentML files through MIAPE Extractor is not yet automated.
2. Although we can generate MIAPE data from Sequest search results, MIAPE Toolkit doesn't work very well with this data at the analysis stage: we cannot retrieve the identified proteins, there are problems with the Sequest Score fields, …

1. We are working on a script to automate MIAPE Extractor data extraction: MIAPE Extractor Automator v.2
2. Development of MIAPE Extractor and MIAPE Generator Tool continues, with improvements in each version.

1. Export of Prot.XML files from the MSF files, and the subsequent conversion of MSF + Prot.XML files to mzIdentML files, is not automated.
2. ProCon still has some errors, is very slow with large files, and is memory hungry.

ProCon developers are working on a new version that doesn't need Prot.XML files, making the conversion process much faster and easier.

20 INDEX
1. A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using ProteoRed MIAPE Toolkit (Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián)
2. Data dependent acquisition using inclusion list (IL) (Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal)

21 Data dependent acquisition with inclusion list (J. Villanueva)

RATIONALE FOR USING DDP WITH INCLUSION LIST (IL):
a.- Most target proteins assigned to the groups of the shotgun project were not detected using shotgun approaches.
b.- The few detected peptides were not optimal for MRM analysis (not proteotypic, containing Met/Cys, with missed cleavages).
c.- Preliminary tests at LP-CSIC/UAB using targeted approaches require a limited list of peptides (the list of target m/z values must be restricted to 20-30) and failed to detect the target proteins.

DDP with an inclusion list increases the probability of positively detecting low-abundance proteins/peptides without the constraints of targeted approaches.

16 PROTEINS SELECTED FOR INCLUSION LIST
- 6 proteins assigned to the LPCSICUAB laboratory
- 10 proteins assigned to MRM labs and not detected by shotgun

Laboratory  Uniprot  Name
Canals      P69905   HBA_HUMAN
FB          Q6GPI1   CTRB2_HUMAN
CG          P24855   DNAS1_HUMAN
MPV         Q6A1A2   PDPK2_HUMAN
FC          P16444   DPEP1_HUMAN
CG          Q9BSW7   SYT17_HUMAN
CG          P11597   CETP_HUMAN
MPV         P15391   CD19_HUMAN
CG          Q53FZ2   ACSM3_HUMAN
FV          Q8N4N3   KLH36_HUMAN
Abian       Q9BUU2   METTL22_HUMAN
Abian       P33076   CIITA_HUMAN
Abian       Q9Y661   HS3ST4_HUMAN
Abian       Q14703   MBTPS1_HUMAN
Abian       B7ZMK8   PRSS36_HUMAN
Abian       A4GXA9   EME2_HUMAN

22 Procedure: Data Dependent with IL (J. Villanueva)

Samples CCD18 and MCF7
Aliquot 250 µg protein
OffGel (12 fractions)
FASP digestion
LC-MS/MS (DDP, IL, Targeted)
Proteome Discoverer

To obtain the inclusion list:
1.- All tryptic peptides of 7-25 AA.
2.- m/z values assuming z=2 and z=3 for all peptides.
3.- Filter duplicate m/z values (software requirement).

Number of m/z values in the inclusion list: 556 (num peptides 282)

Signal ID              m/z
P33076_GCTLLLTARPR     400.9013
P11597_VFHSLAK         401.2348
P16444_YPDLIAELLR      401.5646
Q53FZ2_EGWGNLK         402.2062
P24855_YDIALVQEVR      402.5561
Q8N4N3_VASMNQR         403.2032
Q8N4N3_VKPAVCSLLPK     404.5779
Q14703_APCPGCSHLTLK    409.5392
Q9Y661_AISDYTQTLSK     409.5473
Q9BSW7_TAVEQWHSLR      409.5478
P69905_VDPVNFK         409.7243
P16444_TLEQMDVVHR      409.8769
A4GXA9_MGLLAVGPDLSR    410.2292
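The three steps for building the inclusion list can be sketched as follows. This is an illustration under stated assumptions, not the script used for the slides: digestion is taken to be fully tryptic (cleavage after K or R, not before P, no missed cleavages), masses are monoisotopic, and cysteine is treated as carbamidomethylated by default. The last two choices are guesses that happen to land within about 0.001 m/z of the listed values (for example, roughly 409.724 for VDPVNFK at z=2 against the listed 409.7243). All function names are hypothetical.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464  # assumed fixed modification on Cys


def tryptic_peptides(sequence, min_len=7, max_len=25):
    """Step 1: cleave after K/R unless followed by P; keep peptides of 7-25 AA."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def mz(peptide, charge, alkylated_cys=True):
    """Step 2: m/z of the protonated peptide at the given charge state."""
    mass = sum(MONO[aa] for aa in peptide) + WATER
    if alkylated_cys:
        mass += peptide.count("C") * CARBAMIDOMETHYL
    return (mass + charge * PROTON) / charge


def inclusion_list(proteins, charges=(2, 3), decimals=4):
    """Step 3: one row per distinct m/z value, sorted by m/z."""
    seen, rows = set(), []
    for accession, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            for z in charges:
                value = round(mz(peptide, z), decimals)
                if value in seen:  # duplicate m/z values are dropped
                    continue
                seen.add(value)
                rows.append((f"{accession}_{peptide}", value))
    return sorted(rows, key=lambda row: row[1])
```

The duplicate filter is consistent with the numbers on the slide: 282 peptides at two charge states give 564 candidate values, and 556 remain in the list, so 8 values were removed.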

23 DATA DEPENDENT WITH INCLUSION LIST: LTQ-ORBITRAP (J. Villanueva). Figure: MS traces, sample VH: MCF-7; panels Offgel Fr6 and Offgel Fr7.

24 RESULTS: Inclusion list and targeted (J. Villanueva)

DATA PROCESSING FOR IL DATA:
1.- MGF generation with PDv1.3
2.- Database search: Proteome Discoverer and Mascot
3.- FDR 5%

RESULT: Data dependent with IL: the 282 listed peptides were undetected (the same as in the targeted experiments).
- Low amount of target proteins
- Proteins not expressed in these cells

25 Chromosome 16 protein description: Data Dependent Analysis (J. Villanueva)

DATA PROCESSING:
1.- MGF generation with PDv1.3
2.- Database search: Proteome Discoverer (and Mascot)
3.- Search results and filtering (1% FDR): MIAPE Extractor (Data Inspector Module) and Proteome Discoverer.

Work in progress:
MIAPE EXTRACTOR: The data could be uploaded and the FDR filtering could be carried out.
Data Inspector Module: Detected errors still to be solved: unable to extract protein information from SEQUEST data.
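The FDR filtering step can be illustrated with a small score-threshold calculation. The slides do not say how MIAPE Extractor or Proteome Discoverer estimate the FDR; the sketch below assumes a target-decoy search in which every identification is flagged as target or decoy, higher scores are better, and FDR is estimated as decoys divided by targets above the cutoff. The function name `fdr_threshold` is hypothetical.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score meaning a better match.
    Returns None if no cutoff satisfies the limit. Score ties are not treated
    specially in this simplified version.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


def passing_targets(psms, max_fdr=0.01):
    """Target identifications that survive the FDR filter."""
    cutoff = fdr_threshold(psms, max_fdr)
    if cutoff is None:
        return []
    return [(score, decoy) for score, decoy in psms if score >= cutoff and not decoy]
```

Calling `passing_targets(psms, max_fdr=0.01)` corresponds to the 1% filter named above, and `max_fdr=0.05` to the 5% filter used for the inclusion list data.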

26 Work in progress... (J. Villanueva)

Number of peptides and proteins that passed the 1% FDR filter:

                                        MIAPE EXTRACTOR               PROTEOME DISCOVERER
Sample  Acquisition    Search method    Num peptides  Num proteins    Num peptides  Num proteins
MCF7    DDP            MASCOT           3079          2316            -             -
MCF7    DDP            SEQUEST          3561          1422            3616          1282
CCD18   DDP            MASCOT           3102          2370            3765          1180
CCD18   DDP            SEQUEST          2250          980             2475          946

1.- Significant differences between search algorithms. An in-depth data revision is needed.
