Hanyang Univ. Introduction to Data Analyses for Mass Spectrometry-based Proteomics 1.

1 Hanyang Univ. Introduction to Data Analyses for Mass Spectrometry-based Proteomics 1

2 Hanyang Univ. Peptide Assignment: DEAR vs. READ. Are they differentiable?

3 Hanyang Univ. Data Analysis - Peptide Assignment
Protein (DEARREAD) → digestion → peptides (DEAR, READ) → mass spectrometry → mass spectrum (MS), in which DEAR and READ appear together at m/z 471.
Peptide fragmentation followed by mass spectrometry gives a Mass/Mass spectrum (MS/MS) per peptide:
- DEAR fragments as D|EAR, DE|AR, DEA|R
- READ fragments as R|EAD, RE|AD, REA|D
[Figure: MS and MS/MS spectra, intensity vs. m/z.]

4 Hanyang Univ. The mass-spectrometry/proteomic experiment
Protein → digestion → peptides → mass spectrometry → mass spectrum (MS, parent peak at m/z 471) → peptide fragmentation → mass spectrometry → Mass/Mass spectrum (MS/MS). The protein, peptide, and fragment sequences are unknown (????).
- Digestion: Trypsin, Pepsin, Lys-C
- Mass spectrometry: Quadrupole, Time of flight, FTICR
- Peptide fragmentation: CID, ECD, ETD
[Figure: example MS/MS spectra with peaks labeled 399, 356, 311, 157, 201, 130 and 289, 345, 156, 115, 234.]

5 Hanyang Univ. Peptide Fragmentation
- b-ion: labeled from the N-terminal (amino-terminal) to the C-terminal (carboxy-terminal)
- y-ion: labeled from the C-terminal (carboxy-terminal) to the N-terminal (amino-terminal)

6 Hanyang Univ. Peptide Fragmentation

7 Hanyang Univ. Calculating b-/y-ion mass

8 Hanyang Univ. Average vs. Monoisotopic mass

9 Hanyang Univ. MS/MS Peptide Assignment: DEAR
Amino acid masses: A = 71, D = 115, E = 129, R = 156.
Fragmentation of the protonated peptide (H-DEAR-OH + H+): (1) D|EAR, (2) DE|AR, (3) DEA|R.
b-ions: b1 = 116, b2 = 245, b3 = 316. y-ions: y1 = 175, y2 = 246, y3 = 375.
[Figure: MS/MS spectrum of DEAR, intensity vs. m/z from 100 to 400.]
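The peak values on this slide can be reproduced with a few lines of code. The following is a minimal sketch, assuming the integer residue masses from the slide, +1 for the proton on a b-ion, and +18 (water) +1 (proton) on a y-ion; the function names `b_ions` and `y_ions` are illustrative and not part of the presentation.

```python
# Integer residue masses as listed on the slide.
RESIDUE_MASS = {"A": 71, "D": 115, "E": 129, "R": 156}
PROTON = 1   # assumed nominal mass of H+
WATER = 18   # assumed nominal mass of H2O (H on the N-terminus, OH on the C-terminus)

def b_ions(peptide):
    """Return b1..b(n-1): N-terminal prefix masses plus one proton."""
    masses, total = [], 0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses

def y_ions(peptide):
    """Return y1..y(n-1): C-terminal suffix masses plus water and one proton."""
    masses, total = [], 0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses

print("DEAR b-ions:", b_ions("DEAR"))
print("DEAR y-ions:", y_ions("DEAR"))
```

Calling the same functions on "READ" gives the second ladder used in the DEAR vs. READ comparison.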

10 Hanyang Univ. MS/MS Peptide Assignment: READ
Amino acid masses: A = 71, D = 115, E = 129, R = 156.
Fragmentation of the protonated peptide (H-READ-OH + H+): (1) R|EAD, (2) RE|AD, (3) REA|D.
b-ions: b1 = 157, b2 = 286, b3 = 357 (156 + 129 + 71 + 1). y-ions: y1 = 134, y2 = 205, y3 = 334.
[Figure: MS/MS spectrum of READ, intensity vs. m/z from 100 to 400.]

11 Hanyang Univ. DEAR vs. READ (MS/MS)
- DEAR: b1 = 116, b2 = 245, b3 = 316; y1 = 175, y2 = 246, y3 = 375
- READ: b1 = 157, b2 = 286, b3 = 357; y1 = 134, y2 = 205, y3 = 334
[Figure: the two MS/MS spectra side by side, intensity vs. m/z from 100 to 400.]

12 Hanyang Univ. MS/MS simulation 1: one DEAR molecule fragments as D|EAR, adding peaks at 116 (b1) and 375 (y3).

13 Hanyang Univ. MS/MS simulation 2: a second molecule fragments as DEA|R, adding peaks at 316 (b3) and 175 (y1).

14 Hanyang Univ. MS/MS simulation 3: a third molecule fragments as D|EAR again, so the peaks at 116 and 375 grow.

15 Hanyang Univ. MS/MS simulation 4: a fourth molecule fragments as DE|AR, adding peaks at 245 (b2) and 246 (y2).

16 Hanyang Univ. MS/MS simulation 5: a fifth molecule fragments as DE|AR again, so the peaks at 245 and 246 grow.

17 Hanyang Univ. MS/MS simulation 100: after 100 molecules, the fragmentations DE|AR, D|EAR, and DEA|R are shown with counts 60, 30, and 10, in that order. The resulting MS/MS spectrum has peaks at 116, 175, 245, 246, 316, and 375 (b1, b2, b3, y1, y2, y3).
[Figure for each step: MS spectrum (m/z 200 to 800) with DEAR and READ, and the accumulating MS/MS spectrum (m/z 100 to 400).]
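The simulation slides build the MS/MS spectrum one molecule at a time: each molecule breaks at one backbone position and contributes one b-ion and one y-ion. Here is a minimal sketch of that idea, assuming the integer masses from the earlier slides and cleavage weights of 30, 60, and 10 for the three sites; the weights and the function name `simulate_msms` are illustrative assumptions, not values or code from the presentation.

```python
import random
from collections import Counter

RESIDUE_MASS = {"A": 71, "D": 115, "E": 129, "R": 156}

def simulate_msms(peptide, site_weights, n_molecules, seed=0):
    """Accumulate peak counts from n_molecules random single cleavages.

    site_weights maps a cleavage site k (break after the k-th residue)
    to its relative likelihood; these weights are an assumption.
    """
    rng = random.Random(seed)
    sites = list(site_weights)
    weights = [site_weights[s] for s in sites]
    spectrum = Counter()
    for site in rng.choices(sites, weights=weights, k=n_molecules):
        b = sum(RESIDUE_MASS[aa] for aa in peptide[:site]) + 1    # prefix + proton
        y = sum(RESIDUE_MASS[aa] for aa in peptide[site:]) + 19   # suffix + water + proton
        spectrum[b] += 1
        spectrum[y] += 1
    return spectrum

spectrum = simulate_msms("DEAR", {1: 30, 2: 60, 3: 10}, n_molecules=100)
for mz in sorted(spectrum):
    print(mz, "#" * spectrum[mz])
```

Peak height here is simply the number of molecules that produced the ion, which is why the more frequent cleavage sites dominate the spectrum.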

18 Hanyang Univ. Peptide assignment
Observed peaks for the protonated peptide (H-DEAR-OH + H+): 116, 245, 316 (b1, b2, b3) and 375, 246, 175 (y3, y2, y1).
Difficulties in practice:
- Not known whether an ion is a b-ion or y-ion
- Some ions may be missing
- Various ion types (neutral loss)
- Amino acid modification

19 Hanyang Univ. Database search - Peptide assignment using MS/MS
Input: an MS/MS spectrum (parent mass = 471) and a sequence database.
Sequence database types:
- Raw genomic
- Transcript or EST
- Protein sequence

>Protein A
MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNR
YRDVSPFDHSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSG
SWAAIYQDIRHEASDFHEASDFPCRVAKLPKNKDEARYMEKEFEQ
IDKGAGVDADIRHEMEKEFEQIDKSGSWAAIYQDIRHE
>Protein B
MKVLILACLVALALAEGDRLNVPGEIVESLSSSEESITRINKKIE
KFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPEGDVAPQNIPP
LTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPF
>Protein C...

21 Hanyang Univ. Peptide assignment using MS/MS
Candidate peptides for parent mass = 471: READ, DVGAE, DEAR, GAGVDA, EGDVA, …
Each candidate is compared with the experimental MS/MS spectrum.
[Figure: experimental MS/MS spectrum, intensity vs. m/z from 100 to 400.]
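Candidate selection can be pictured as scanning every substring of a database protein and keeping those whose residue masses sum to the parent mass. Below is a minimal sketch under stated assumptions: standard integer (nominal) residue masses, the Protein A sequence from the database slide, no enzyme specificity, and a hypothetical function name `candidate_peptides`. Real search engines work with accurate masses and digestion rules, so this only illustrates the filtering step.

```python
# Nominal (integer) residue masses of the 20 standard amino acids.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

# Protein A as shown on the database slide.
PROTEIN_A = (
    "MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNR"
    "YRDVSPFDHSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSG"
    "SWAAIYQDIRHEASDFHEASDFPCRVAKLPKNKDEARYMEKEFEQ"
    "IDKGAGVDADIRHEMEKEFEQIDKSGSWAAIYQDIRHE"
)

def candidate_peptides(protein, parent_mass, tolerance=0):
    """Return all distinct substrings whose residue-mass sum is within tolerance."""
    found = set()
    for start in range(len(protein)):
        total = 0
        for end in range(start, len(protein)):
            total += NOMINAL_MASS[protein[end]]
            if total > parent_mass + tolerance:
                break  # masses are positive, so longer substrings only get heavier
            if abs(total - parent_mass) <= tolerance:
                found.add(protein[start:end + 1])
    return sorted(found)

print(candidate_peptides(PROTEIN_A, 471))
print(candidate_peptides(PROTEIN_A, 471, tolerance=1))
```

With integer masses GAGVDA sums to 470, so it only shows up once a tolerance of 1 is allowed.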

22 Hanyang Univ. Peptide assignment using MS/MS
For each candidate peptide (READ, DVGAE, DEAR, GAGVDA, EGDVA, …) a theoretical MS/MS spectrum is generated and set against the experimental MS/MS spectrum (parent mass = 471).
[Figure: one experimental and five theoretical MS/MS spectra, intensity vs. m/z from 100 to 400.]

23 Hanyang Univ. Peptide assignment using MS/MS
Each theoretical MS/MS spectrum is compared with the experimental spectrum to give a match score, and the top-scoring candidate is selected.
Candidate peptides and match scores, in the order shown: READ 0.99, DVGAE 2.25, DEAR 3.07, GAGVDA 1.55, EGDVA 1.98, …
[Figure: pairs of experimental and theoretical MS/MS spectra, intensity vs. m/z from 100 to 400.]
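The compare-and-select step can be sketched with the simplest possible match score: the number of theoretical b-/y-ion peaks that also occur in the experimental spectrum. This shared-peak count is an assumption chosen for illustration; it is not the scoring function behind the numbers on the slide, and the names `theoretical_peaks` and `shared_peak_count` are hypothetical.

```python
RESIDUE_MASS = {"G": 57, "A": 71, "V": 99, "D": 115, "E": 129, "R": 156}

def theoretical_peaks(peptide):
    """Set of nominal b- and y-ion masses for a singly protonated peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b = {sum(masses[:i]) + 1 for i in range(1, len(masses))}    # prefix + proton
    y = {sum(masses[i:]) + 19 for i in range(1, len(masses))}   # suffix + water + proton
    return b | y

def shared_peak_count(experimental, peptide):
    """Number of experimental peaks explained by the peptide's theoretical ions."""
    return len(set(experimental) & theoretical_peaks(peptide))

# Experimental peaks taken from the DEAR example spectrum.
experimental = [116, 175, 245, 246, 316, 375]
candidates = ["READ", "DVGAE", "DEAR", "GAGVDA", "EGDVA"]

scores = {p: shared_peak_count(experimental, p) for p in candidates}
best = max(scores, key=scores.get)
print(scores)
print("Top candidate:", best)
```

Because READ shares no fragment masses with DEAR, the two peptides are separated even though their parent masses are identical.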

24 Hanyang Univ. Post-translational modification (PTM)
A modified protein arises through the addition of chemical groups or through structural changes, and modifications underlie various cellular functions (dynamic proteome).
[Figure: variants of PROTEIN, including PROT(PO4)EIN, PROTEINS, PROTE(CH2)IN, ROTEIN, and PROT(PO4)EINS(PO4).]

25 Hanyang Univ. MS/MS of modified peptides
Modified protein → digestion → MS/MS.
[Figure: MS/MS spectrum, intensity vs. m/z.]

26 Hanyang Univ. MS/MS spectrum of modified peptides: TVTAMDVVY vs. TVTAMΔDVVY
TVTAMΔDVVY is the peptide TVTAMDVVY with a modification of +Δ mass on M.
- Prefix fragments: T, TV, TVT, TVTA, TVTAM, TVTAMD, TVTAMDV, TVTAMDVV
- Suffix fragments: VTAMDVVY, TAMDVVY, AMDVVY, MDVVY, DVVY, VVY, VY, Y
In the modified spectrum, every fragment that contains M (TVTAMΔ, TVTAMΔD, TVTAMΔDV, TVTAMΔDVV, and the suffix fragments down to MΔDVVY) is shifted by Δ; the other fragments keep their positions.
[Figure: the two MS/MS spectra, intensity 0 to 100 vs. m/z from 200 to 800, with the Δ shift marked.]
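The Δ shift can be checked numerically by computing both ion ladders with and without the modification. This is a minimal sketch, assuming standard integer residue masses and Δ = 16 as an example value (the slide leaves Δ unspecified); `ion_ladders` is a hypothetical helper name.

```python
RESIDUE_MASS = {"T": 101, "V": 99, "A": 71, "M": 131, "D": 115, "Y": 163}

def ion_ladders(peptide, mod_index=None, delta=0):
    """Return (b_ions, y_ions); optionally add delta to the residue at mod_index."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    if mod_index is not None:
        masses[mod_index] += delta
    n = len(masses)
    b = [sum(masses[:i]) + 1 for i in range(1, n)]           # b1 .. b(n-1)
    y = [sum(masses[i:]) + 19 for i in range(n - 1, 0, -1)]  # y1 .. y(n-1)
    return b, y

peptide = "TVTAMDVVY"
m_index = peptide.index("M")
delta = 16  # assumed example modification mass

b_plain, y_plain = ion_ladders(peptide)
b_mod, y_mod = ion_ladders(peptide, mod_index=m_index, delta=delta)

for k, (plain, mod) in enumerate(zip(b_plain, b_mod), start=1):
    print(f"b{k}: {plain} -> {mod}")
for k, (plain, mod) in enumerate(zip(y_plain, y_mod), start=1):
    print(f"y{k}: {plain} -> {mod}")
```

Only the ions that include the modified M move, which is what lets a search engine localize the modification site from the spectrum.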

27 Hanyang Univ. Database search – modification analysis
>Protein A
MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPK
NKNRNRYRDVSPFDHSRKREADDNDYINASLIKMEEAQR
SYILTQQIDKSGSWAAIYQDIRHEASDFHEASDFPCRVA
KLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQ
IDKSGSWAAIYQDIRHE
>Protein B …
- Without modifications, the candidate peptides for parent mass = 471 are DVGAE, READ, DEAR, GAGVDA, …
- In modification analysis, every substring is a candidate peptide, which leads to an explosion of the number of candidate peptides.
[Figure: MS/MS spectrum, intensity vs. m/z from 100 to 400, with peaks labeled 425 and 471.]

28 Hanyang Univ. Complexity for analyzing modified peptides: O(N)
- Considering one modification per peptide
Example: MS/MS parent mass = 769, unmodified PTMPEPT = 753, so the mass difference is 16. The modification may sit on any of the N residues of PTMPEPT, giving N candidates.
[Figure: MS/MS spectrum, intensity vs. m/z from 100 to 400.]

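29 Hanyang Univ. Complexity for analyzing modified peptides: O(dN^2)
- Considering two modifications per peptide
For PTMPEPT, a total mass difference of 100 can be split between two modifications in many ways: 100 = 1 + 99 = 2 + 98 = 3 + 97 = … = 101 + (-1) = 102 + (-2) = … = 300 + (-200). Here d is the range of modification masses considered (-200 ~ +200), and the two modifications can be placed in N(N-1) ways.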
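The growth from O(N) to O(dN^2) can be made concrete by counting candidates for PTMPEPT. The sketch below rests on several assumptions: integer residue masses, integer modification masses, both modification masses restricted to -200 ~ +200 and nonzero, and ordered position pairs N(N-1) as written on the slide (which counts each placement twice when the two masses are swapped). The function names are hypothetical.

```python
RESIDUE_MASS = {"P": 97, "T": 101, "M": 131, "E": 129}

def unmodified_mass(peptide):
    """Sum of integer residue masses."""
    return sum(RESIDUE_MASS[aa] for aa in peptide)

def one_mod_candidates(peptide, parent_mass):
    """One modification: the whole mass difference sits on one of N residues."""
    delta = parent_mass - unmodified_mass(peptide)
    return [(position, delta) for position in range(len(peptide))]

def split_count(total_delta, max_delta=200):
    """Ways to write total_delta = d1 + d2 with both nonzero and within +/- max_delta."""
    count = 0
    for d1 in range(-max_delta, max_delta + 1):
        d2 = total_delta - d1
        if d1 != 0 and d2 != 0 and abs(d2) <= max_delta:
            count += 1
    return count

def two_mod_candidate_count(peptide, total_delta, max_delta=200):
    """Ordered position pairs N(N-1) times the number of mass splits."""
    n = len(peptide)
    return n * (n - 1) * split_count(total_delta, max_delta)

peptide = "PTMPEPT"
print("one modification:", len(one_mod_candidates(peptide, 769)), "candidates")
print("two modifications:", two_mod_candidate_count(peptide, 100), "candidates")
```

Even for a seven-residue peptide the candidate count jumps by several orders of magnitude, which motivates the restrictive search with a fixed list of input modifications.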

30 Hanyang Univ. Standard method for modification analysis: restrictive search
Instead of trying every split for PTMPEPT (100 = 1 + 99 = 2 + 98 = 3 + 97 = … = 101 + (-1) = 102 + (-2) = … = 300 + (-200)), the search is restricted to a list of input modifications:
- +1 on N
- +3 on M
- +97 on E
- +102 on T

31 Hanyang Univ. Spectral Library - Peptide assignment using MS/MS 31

32 Hanyang Univ. Spectral Library - Peptide assignment using MS/MS Consensus Spectrum 32

33 Hanyang Univ. Peptide Validation
Peptide assignment
- Each MS/MS spectrum is interpreted independently
- Results are hard to reconcile when different software was used
Manual validation
- Filtering by search scores, NTT (Number of Tryptic Termini)
- Subjective judgment can creep in
- There is no way to know how large the error rate is
- What if the dataset grows large?
Statistical validation
- Gives the probability that each peptide assignment is correct, based on a probability model of the search score (PeptideProphet)
- Estimates the false discovery rate from matches to decoy peptides

34 Hanyang Univ. (Un)reliability of Manual Validation
[Figure: search results judged by manual authenticators, split into Correct Validation, Incorrect Validation, and Validation Withheld.]

35 Hanyang Univ. Peptide Validation: PeptideProphet
Each spectrum's peptide assignment (placeholder peptides AAAA, CCCC, GGGG, KKKK, LLLL, TTTT, LLLL, QQQQ, IIII) has a search score, which is converted into a probability of being correct.
Search score → probability, in the order shown: 4.5 → 1.0, 3.4 → 0.87, 0.97 → 0.01, 1.15 → 0.01, 0.4 → 0.0, 2.97 → 0.7, 3.15 → 0.84, 4.14 → 0.95, 1.97 → 0.3.
[Figure: nine MS/MS spectra, intensity vs. m/z.]

36 Hanyang Univ. Peptide Validation: Target/Decoy
>Protein A (Target Sequence)
MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPK
NKNRNRYRDVSPFDHSRKREADDNDYINASLIKMEEAQR
SYILTQQIDKSGSWAAIYQDIRHEASDFHEASDFPCRVA
KLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQ
IDKSGSWAAIYQDIRHE
>Reversed Protein A (Decoy Sequence)
EHRIDQYIAAWSGSKDIQEFEKEMEHRIDADVGAGKDIQ
EFEKEMYRAEDKNKPLKAVRCPFDSAEHFDSAEHRIDQY
IAAWGSGKDIQQTLIYSRQAEEMKILSANIYDNDDAERK
RSHDFPSVDRYRNRNKNKPLKAVRCPFDEAGVDIDQYIA
AWSGSKDIQEFEKEMEM
Example matches: YILT (found in the target sequence) and DAER (found in the decoy sequence), with scores 3.15 and 2.47 shown.
- T = 1000: # of matches to the target sequence (above score threshold)
- D = 20: # of matches to the decoy sequence (above score threshold)
False Discovery Rate = ?
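The slide leaves the FDR as an open question. Two estimators are in common use, and which one applies depends on whether target and decoy databases were searched separately or as one concatenated database. The sketch below is a minimal illustration of both conventions, assuming T and D are the counts above the score threshold; the function names are hypothetical and neither formula is stated in the presentation.

```python
def fdr_separate(target_hits, decoy_hits):
    """Estimate for separate searches: decoy hits approximate the number
    of false hits among the target hits, so FDR is about D / T."""
    return decoy_hits / target_hits

def fdr_concatenated(target_hits, decoy_hits):
    """Estimate for a concatenated search: each decoy hit is assumed to
    have one false target counterpart, so FDR is about 2D / (T + D)
    over all accepted hits."""
    return 2 * decoy_hits / (target_hits + decoy_hits)

T, D = 1000, 20
print(f"separate:     {fdr_separate(T, D):.4f}")
print(f"concatenated: {fdr_concatenated(T, D):.4f}")
```

Both estimates assume that a false match is equally likely to land on a target or a decoy sequence, which is why the choice of decoy database matters.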

37 Hanyang Univ. Peptide Validation: Target/Decoy
Target/Decoy is meaningful only for large datasets.
What makes a suitable decoy database? (amino acid composition, peptide sequence redundancy, precursor mass distribution)
- Reversed sequence
- Random sequences
- Pseudo-reverse sequence
Separated or concatenated?
- Threshold score: 30
- Match to the target: score 50
- Match to the decoy: score 40
- Is this counted as a false positive?
Calculating FDR
- Concatenated: [formula not preserved in the transcript]

38 Hanyang Univ. Semi-parametric PeptideProphet
- Decoy search results give the distribution for incorrect assignments
- The EM algorithm is used to estimate the distributions of correct assignments
- NTT (number of tryptic termini)

39 Hanyang Univ. Protein Assignment 39

40 Hanyang Univ. Protein Assignment
>Protein A
MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNRYRDVSPFD
HSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSGSWAAIYQDIRHEASDF
HEASDFPCRVAKLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQIDK
SGSWAAIYQDIRHE
Identified peptides:
- VAKLPKNKNR: p = 0.96
- YMEKEFEQIDK: p = 0.65
- EADDNDYINASLIK: p = 0.83
P = 1 - (1 - 0.83)(1 - 0.65)(1 - 0.96)
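The formula on this slide treats the protein as present unless every one of its peptide assignments is wrong. A minimal sketch of that calculation follows, assuming the peptide assignments are independent; `protein_probability` is a hypothetical name.

```python
from math import prod

def protein_probability(peptide_probs):
    """P(protein present) = 1 - product of (1 - p) over its peptides,
    assuming the peptide assignments are independent."""
    return 1 - prod(1 - p for p in peptide_probs)

# Peptide probabilities for Protein A from the slide.
p_protein_a = protein_probability([0.83, 0.65, 0.96])
print(f"P(Protein A) = {p_protein_a:.5f}")
```

The independence assumption is the weak point: repeated identifications of the same peptide are not independent evidence, so they cannot simply be multiplied in this way.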

41 Hanyang Univ. Protein Assignment
>Protein A
MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNRYRDVSPFD
HSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSGSWAAIYQDIRHEASDF
HEASDFPCRVAKLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQIDK
SGSWAAIYQDIRHE
The same peptide identified three times:
- EADDNDYINASLIK: p = 0.83
- EADDNDYINASLIK: p = 0.62
- EADDNDYINASLIK: p = 0.95
Probability(Protein A) = ?

42 Hanyang Univ. Protein Assignment - ProteinProphet 42

43 Hanyang Univ. Protein Assignment - ProteinProphet: probabilistic model with NSP (Number of Sibling Peptides) as a random variable

44 Hanyang Univ. Protein Assignment - ProteinProphet Degenerate Peptides (alternative splicing, paralogs, database redundancies) 44

45 Hanyang Univ. Protein Assignment - IDPicker
Bipartite graph: A. Initialize, B. Collapse, C. Separate, D. Reduce

46 Hanyang Univ. Peptide Quantitation: Labeled Quantitation
- Use of stable isotope containing compound
- Peptide assignment from MS/MS
- Peptide quantitation from MS: single ion chromatogram

47 Hanyang Univ. Peptide Quantitation: Label-free Quantitation
- Matching peptide features
- AMT (Accurate Mass & Time) approach: normalized elution time
- Spectral counting: number of spectra identified for a given peptide

48 Hanyang Univ. Pipeline: integrated tools for MS/MS proteomics
- Input: spectrum data, (protein database)
- Peptide assignment: SEQUEST, PEAKS, MODi
- Peptide validation: manual validation, PeptideProphet, Target/Decoy
- Protein assignment & validation: ProteinProphet, IDPicker
- Output: interpretation
- Quantitation: ASAPRatio, MaxQuant
