Error tolerant search


1 Error tolerant search
A large number of spectra remain without a significant score. A reasonable number of fragment ion peaks might not have matched. Possible causes:
– Underestimated mass measurement error (should be visible in the peptide view graphs),
– Incorrect determination of the precursor charge state,
– The peptide sequence is not in the database,
– Missed cleavage and unexpected cleavage,
– Unexpected chemical and post-translational modifications.
The biological structure, function and activity of a protein can be determined by its modifications. A growing share of the proteins that have been linked to, e.g., different diseases change not only in expression level but also, or exclusively, in their level of post-translational modification.

2 Post-Translational Modifications (PTMs)
A PTM alters the mass of an amino acid, and therefore of the peptide, which results in peak shifts in the spectrum.
Fragment ions of the peptide H Q S V M V G M V Q K:
b1:  H             y10: QSVMVGMVQK
b2:  HQ            y9:  SVMVGMVQK
b3:  HQS           y8:  VMVGMVQK
b4:  HQSV          y7:  MVGMVQK
b5:  HQSVM         y6:  VGMVQK
…                  …
b10: HQSVMVGMVQ    y1:  K
[Figure: spectrum along the m/z axis (ticks at 200, 400, 1000) with the b and y ion peaks labeled; each cleavage site yields a complementary pair b1/y10, b2/y9, ..., b10/y1.]
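A minimal sketch of the peak shift described on this slide. It assumes singly charged b and y ions and standard monoisotopic residue masses; the function name `fragment_ions` and the choice of modification (a +15.9949 Da oxidation on the first methionine) are illustrative assumptions, not taken from the slides.

```python
# Monoisotopic residue masses (Da) for the residues that occur in HQSVMVGMVQK.
RESIDUE_MASS = {
    "H": 137.05891, "Q": 128.05858, "S": 87.03203, "V": 99.06841,
    "M": 131.04049, "G": 57.02146, "K": 128.09496,
}
PROTON = 1.00728   # mass of a proton (Da)
WATER = 18.01056   # mass of H2O (Da)


def fragment_ions(peptide, mods=None):
    """Return (b, y): singly charged b1..b(n-1) and y1..y(n-1) m/z values.

    mods maps a 0-based residue position to a mass shift in Da (hypothetical input format).
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0) for i, aa in enumerate(peptide)]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]              # N-terminal prefixes
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]  # C-terminal suffixes
    return b, y


peptide = "HQSVMVGMVQK"
b_plain, y_plain = fragment_ions(peptide)
# Assumed example: oxidation (+15.9949 Da) on the methionine at position 5 (index 4).
b_mod, y_mod = fragment_ions(peptide, {4: 15.9949})

shifted_b = [i + 1 for i in range(len(b_plain)) if abs(b_mod[i] - b_plain[i]) > 1e-6]
shifted_y = [i + 1 for i in range(len(y_plain)) if abs(y_mod[i] - y_plain[i]) > 1e-6]
print("shifted b ions:", shifted_b)
print("shifted y ions:", shifted_y)
```

Only the fragments that contain the modified residue move: here b5 to b10 and y7 to y10, while b1 to b4 and y1 to y6 stay where they were.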

3 PTMs
Complete modifications (chemical modifications)
Variable modifications

4 PTMs
Obstacles
– Complexity (means longer execution time)
  Can increase the search space 1, 10, ... 10000 fold
– Significance

5 Obstacles - Complexity
Let the theoretical peptide be:
– HQSVMVGMVQK (11 amino acids)
– Each amino acid can be modified by, let's say, 5 PTMs

# included PTMs   # modified theoretical spectra   time
0                 1                                1 sec
1                 11*5 = 55                        55 seconds (1 min)
2                 55*25 = 1375                     23 mins
3                 11*15*125 = 20625                5.7 hours
...
10                11*5^10 = 107421875              29839 hours (about 3.4 years)

In general, with peptide length L, K included PTMs and M possible PTMs per amino acid, the number of modified theoretical spectra is C(L, K) * M^K.
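The growth can be checked with a few lines of code. This is a sketch under two assumptions: the count follows the binomial form C(L, K) * M^K (which reproduces the rows for 0, 1, 3 and 10 included PTMs), and scoring one theoretical spectrum takes 1 second, as in the first row. The function name `modified_spectra` is made up for illustration.

```python
from math import comb


def modified_spectra(length, included, ptms_per_residue):
    """Number of modified theoretical spectra: choose which residues carry a PTM,
    then choose one of the possible PTMs for each chosen residue."""
    return comb(length, included) * ptms_per_residue ** included


SECONDS_PER_SPECTRUM = 1  # assumption taken from the first table row

for k in (0, 1, 2, 3, 10):
    count = modified_spectra(11, k, 5)
    hours = count * SECONDS_PER_SPECTRUM / 3600
    print(f"K={k:2d}  spectra={count:>11d}  time={hours:10.2f} h")
```

Because the binomial coefficient and the power M^K both grow with K, each additional allowed PTM multiplies the run time rather than adding to it.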

6 [Figure: plot with one axis running from 0% to 100% and the other from 0 to 1; the axis labels were not preserved.]

7 Significance
Allowing PTMs increases the number of random matches.
– Inserting many PTMs makes the theoretical spectra too flexible, so that in the end every theoretical spectrum can be aligned to the experimental spectrum.

8 Computational Identification of PTMs
3 approaches:
– Targeted
– Untargeted, also called restricted
– Unrestricted (de novo, blind search)

9 Targeted approach
Almost all search engines support it.
– The experimenter needs to guess the PTMs in the sample.
Two-pass strategy
– Two rounds, refinement on a smaller
– Sequest, Mascot

10 Targeted approach – X!Tandem

11 Targeted approach – InsPecT

12 Untargeted approaches
Use large databases of known modifications.
– The search space is limited, but it can still be very large.
– If we allow 5 of the 10 most frequent modifications to occur in a peptide at the same time, the search space grows by 3 orders of magnitude.
– The growth is even more dramatic if, instead of 10 types of modifications, we wish to consider all of the roughly 500 known types.

13 Database of PTMs
Unimod
– http://www.unimod.org
– Contains 906 modifications
RESID
– http://www.ebi.ac.uk/RESID
– 559 entries

14 Untargeted
PILOT_PTM
– Uses a large dataset of modifications.
– Binary linear programming.
  The objective function is the number of matched peaks.
  Linear constraints guarantee meaningful modifications of the peptide.

15 Unrestricted
No a priori information about PTMs.
De novo identification of PTMs.
The search space is infinite.
In practice, no more than one or two PTMs can be identified on the same peptide.

16 TwinPeaks approach
Based on the Sequest idea.
Shifts the experimental spectrum over a range of mass offsets and plots the similarity score as a function of the mass shift.

17 TwinPeaks approach
[Figure: sum of matched intensity]
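A toy sketch of the shift-and-score idea, using the sum of matched intensity as the similarity score. The peak lists, the 0.5 Da tolerance, the shift range and the function name `shift_profile` are all made up for illustration; this is not the TwinPeaks implementation.

```python
def shift_profile(experimental, theoretical, shifts, tol=0.5):
    """For each mass shift, sum the intensity of the experimental peaks that land
    within tol of any theoretical peak after the shift is applied."""
    profile = {}
    for delta in shifts:
        total = 0.0
        for mz, intensity in experimental:
            if any(abs(mz + delta - t) <= tol for t in theoretical):
                total += intensity
        profile[delta] = total
    return profile


# Hypothetical data: two experimental peaks sit where the unmodified peptide
# predicts them, and two are displaced by +16 Da (as a modification would do).
theoretical = [100.0, 200.0, 300.0, 400.0]
experimental = [(100.0, 10.0), (200.0, 8.0), (316.0, 6.0), (416.0, 4.0)]

profile = shift_profile(experimental, theoretical, range(-20, 21))
for delta, score in profile.items():
    if score > 0:
        print(f"shift {delta:+d} Da: matched intensity {score}")
```

With this toy input the profile is zero everywhere except at a shift of 0 (the unmodified fragments) and at -16 (the shift that brings the displaced fragments back onto the theoretical peaks), so the second maximum reveals the mass of the unknown modification.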

18 MS-Alignment
Based on the alignment of the theoretical spectrum to the experimental spectrum.

19 [Figure: theoretical spectrum and experimental spectrum]

20 MS-Alignment

21 Comparison of targeted and unrestricted results

X!Tandem (targeted)
Scan ID   log(E)   Peptide
3.1.1     -13.8    fqyr 295 ILTAAALCHFTSIEVVK 311 kasg (130)
6.1.1     -6.6     rihr 159 FVEKPQVFVSNK 170 inag (471)
11.1.1    -3.4     rtcr 30 SPEPGPSSSIGSPQASSPPRPN 51 hyll (48)
12.1.1    -4.0     dvtr 473 TMHFGTPTAYEK 484 ecft (306)
13.1.1    -10.0    ietk 133 FFDDDLLVSTSR 144 vrlf (176)
24.1.1    -4.2     pskr 237 QTNGCLNGYTPSR 249 krqa (112)
25.1.1    -2.5     ntpr 149 KNGGLGHMNIALLSDLTK 166 qisr (1776)
27.1.1    -7.4     pqgr 19 IHQIEYAMEAVK 30 qgsa (10317)
31.1.1    -2.0     kefk 80 DREDLVPYTGEK 91 rgkv (137)
34.1.1    -1.6     dyhr 131 YLAEFATGNDR 141 keaa (9406)
35.1.1    -7.0     grar 16 QYTSPEEIDAQLQAEK 31 qkar (2754)
36.1.1    -2.0     rlar 172 QDPQLHPEDPER 183 raai (644)
37.1.1    -8.1     iflh 92 ISDVEGEYVPVEGDEVTYK 110 mcsi (73)
38.1.1    -3.9     mrsr 328 TASGSSVTSLDGTR 341 srsh (2698)
40.1.1    -3.7     lgnk 29 YVQLNVGGSLYYTTVR 44 altr (71)
42.1.1    -1.9     dlqk 183 EGEFSTCFTELQR 195 dflk (239)
45.1.1    -2.9     pkek 135 QPVAGSEGAQYR 146 kkql (694)
46.1.1    -10.3    lsar 446 ASNAWILQQHIATVPSLTHLCR 467 leir (107)
53.1.1    -6.8     evyr 175 NSMPASSFQQQK 186 lrvc (7099)
57.1.1    -4.7     iygk 81 QFEDELHPDLK 91 ftga (491)

MS-Alignment (unrestricted, de novo)
Scan ID   P-value       Peptide
3         1.00E-05      R.ILTAAALCHFTSIEVVK.K
6         1.00E-05      R.FVEKPQVFVSNK.I
13        1.00E-05      K.FFDDDLLVSTSR.V
27        1.00E-05      R.IHQIEYAMEAVK.Q
47        0.028806584   A.V+172LTAFANGR.S
57        1.00E-05      K.QFEDELHPDLK.F
58        0.004739336   R.ETFY+18LAQDFFDR.F
59        1.00E-05      R.TCLSQLLDIMK.S
71        1.00E-05      K.EYFSTFGEVLM+16VQVK.K
75        1.00E-05      K.QH-18LENDPGSNEDTDIPK.G
97        0.004672897   Q.L+128GVSHVFEYIR.S
98        0.004830918   C.T+160EDMTEDELR.E
99        1.00E-05      R.EFFD-18SNGNFLYR.I
100       1.00E-05      R.LVLESPAPVEVNLK.L
105       1.00E-05      K.LQEFAYVTDGAC+14SEEDILR.M
108       1.00E-05      K.SFDENGFDYLLTYSDNPQTVFP+156.R
115       1.00E-05      R.GPATVEDLPSAFEEK.A
119       1.00E-05      Y.ITD+163VLTEEDALEILQK.G
147       1.00E-05      R.IYSYQMALTPVVVTLWYR.A

22 Validate your results

23 Summary
What you should remember:
– PTM identification is computationally expensive.
– There are 3 approaches (targeted, untargeted, unrestricted).
– Always examine the results and omit weird PTMs.
– Allowing more PTMs decreases the statistical significance.
– The more you look for, the less you get (due to significance).

