Protein sequencing and Mass Spectrometry. Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation.


1 Protein sequencing and Mass Spectrometry

2 Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation

3 Single Stage MS
[Figure labels: Mass Spectrometry; LC-MS: 1 MS spectrum / second]

4 Tandem MS
[Figure labels: Secondary Fragmentation; Ionized parent peptide]

5 The peptide backbone
H...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH
(N-terminus on the left, C-terminus on the right; side chains R_{i-1}, R_i, R_{i+1} belong to AA residues i-1, i, i+1)
The peptide backbone breaks to form fragments with characteristic masses.

6 Ionization
H...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH
(N-terminus on the left, C-terminus on the right; side chains R_{i-1}, R_i, R_{i+1} belong to AA residues i-1, i, i+1)
The peptide backbone breaks to form fragments with characteristic masses.
Ionized parent peptide (H+)

7 Fragment ion generation
H...-HN-CH(R_{i-1})-CO   NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH
(backbone shown cleaved between the CO of residue i-1 and the NH of residue i; N-terminus on the left, C-terminus on the right)
The peptide backbone breaks to form fragments with characteristic masses.
Ionized peptide fragment (H+)

8 Tandem MS for Peptide ID
Residue   b ion (m/z)   y ion (m/z)
S            88           1166
G           145           1080
F           292           1022
L           405            875
E           534            762
E           663            633
D           778            504
E           907            389
L          1020            260
K          1166            147
[Figure: MS/MS spectrum with the [M+2H]2+ peak marked; x-axis m/z (250 to 1000), y-axis % Intensity (0 to 100)]

9 Peak Assignment
[Figure: the spectrum and b/y ion ladder of slide 8 (S G F L E E D E L K), with peaks labeled y2 through y9 and b3 through b9 and the [M+2H]2+ peak marked; x-axis m/z, y-axis % Intensity]
Peak assignment implies sequence (residue tag) reconstruction!

10 Database Searching for peptide ID
For every peptide from a database
–Generate a hypothetical spectrum
–Compute a correlation between the hypothetical and experimental spectra
–Choose the best
Database searching is very powerful and is the de facto standard for MS.
–Sequest, Mascot, and many others
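The three steps above can be sketched as follows. This is only a toy illustration, not how Sequest or Mascot score: the "correlation" is replaced by a plain shared-peak count, residue masses are rounded to integers, and the function names, the tolerance, and the example peak list are all assumptions made for the sketch.

```python
# Nominal (integer) residue masses; an assumption for illustration only.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
                "C": 103, "L": 113, "I": 113, "N": 114, "D": 115, "K": 128,
                "Q": 128, "E": 129, "M": 131, "H": 137, "F": 147, "R": 156,
                "Y": 163, "W": 186}

def theoretical_spectrum(peptide):
    """Hypothetical spectrum: singly charged b and y ions for every cut point."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    peaks, prefix = set(), 0
    for m in masses[:-1]:
        prefix += m
        peaks.add(prefix + 1)            # b-ion = prefix mass + 1
        peaks.add(total - prefix + 19)   # y-ion = suffix mass + 19
    return peaks

def shared_peaks(observed, peptide, tol=0.5):
    """Stand-in score: number of theoretical peaks found in the observed list."""
    theo = theoretical_spectrum(peptide)
    return sum(any(abs(o - t) <= tol for o in observed) for t in theo)

def best_match(observed, database):
    """Choose the database peptide with the highest score."""
    return max(database, key=lambda pep: shared_peaks(observed, pep))

observed = [88, 145, 147, 274, 276, 333]   # made-up peak list
print(best_match(observed, ["SGEK", "GSEK", "KEGS"]))
```

A real search engine would first filter candidates by parent mass and use a more discriminating score; the shared-peak count is used here only because it is the simplest thing that fits the "generate, compare, choose" outline.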

11 Spectra: the real story
Noise peaks
Ions, not prefixes & suffixes
Mass to charge ratio, and not mass
–Multiply charged ions
Isotope patterns, not single peaks

12 Peptide fragmentation possibilities (ion types)
[Figure: backbone segment -HN-CH(R_i)-CO-NH-CH(CH-R'R'')-CO-NH- with cleavage sites labeled a_i, b_i, c_i, x_{n-i}, y_{n-i}, z_{n-i}, y_{n-i-1}, b_{i+1}, d_{i+1}, v_{n-i}, w_{n-i}, grouped into low energy fragments and high energy fragments]

13 Ion types, and offsets
P = prefix residue mass
S = suffix residue mass
b-ions = P+1
y-ions = S+19
a-ions = P-27
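As a quick check of these offsets, the sketch below computes the b, y and a ladders for a peptide. The integer residue masses and the function name `ion_ladders` are assumptions made for illustration; real instruments report non-integer masses.

```python
# Nominal (integer) residue masses; an assumption for illustration only.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
                "C": 103, "L": 113, "I": 113, "N": 114, "D": 115, "K": 128,
                "Q": 128, "E": 129, "M": 131, "H": 137, "F": 147, "R": 156,
                "Y": 163, "W": 186}

def ion_ladders(peptide):
    """Return b, y and a ion m/z values (charge 1) for every backbone cut."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    b, y, a = [], [], []
    prefix = 0
    for m in masses[:-1]:
        prefix += m                     # P: prefix residue mass
        suffix = total - prefix         # S: suffix residue mass
        b.append(prefix + 1)            # b-ions = P + 1
        a.append(prefix - 27)           # a-ions = P - 27
        y.append(suffix + 19)           # y-ions = S + 19
    y.reverse()                         # so that y[0] is y1
    return {"b": b, "y": y, "a": a}

ladders = ion_ladders("SGFLEEDELK")
print(ladders["b"])
print(ladders["y"])
```

With these integer masses the b1 to b9 and y1 to y8 values agree with the ladder on slide 8; y9 comes out as 1079 rather than the 1080 shown there, presumably because the slide rounds non-integer masses.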

14 Mass-Charge ratio
The X-axis is (M+Z)/Z
–Z=1 implies that the peak is at M+1
–Z=2 implies that the peak is at (M+2)/2
–M=1000, Z=2: the peak position is at 501
–Suppose you see a peak at 501. Is the mass 500, or is it 1000?
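The ambiguity in the last question can be made concrete with a small sketch. The function names and the range of charges tried are assumptions; the formula is the (M+Z)/Z from the slide.

```python
def peak_position(mass, z):
    """Position on the m/z axis of an ion with the given mass and charge."""
    return (mass + z) / z

def candidate_masses(mz, charges=(1, 2, 3)):
    """Invert (M + Z) / Z for each charge state we are willing to consider."""
    return {z: mz * z - z for z in charges}

print(peak_position(1000, 2))   # 501.0
print(candidate_masses(501))    # {1: 500, 2: 1000, 3: 1500}
```

A single peak cannot settle the charge on its own; the isotope pattern mentioned on slide 11 is one source of additional evidence.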

15 Spectral Graph
Each prefix residue mass (PRM) corresponds to a node.
Two nodes are connected by an edge if the mass difference is a residue mass.
A path in the graph is a de novo interpretation of the spectrum.
[Figure: nodes 87 and 144 joined by an edge labeled G]

16 Spectral Graph
Each peak, when assigned to a prefix/suffix ion type, generates a unique prefix residue mass.
Spectral graph:
–Each node u defines a putative prefix residue mass M(u).
–(u,v) in E if M(v)-M(u) is the residue mass of an a.a. (tag) or 0.
–Paths in the spectral graph correspond to an interpretation.
[Figure: mass axis from 0 to 401 with nodes at 0, 87, 144, 146, 273, 275, 332 and 401; the path 0, 87, 144, 273, 401 spells S, G, E, K]
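A minimal sketch of this construction, under stated assumptions: integer residue masses, peaks read only as b ions (PRM = peak - 1) or y ions (PRM = M - (peak - 19)), and edges only for single residues (the "or 0" case is left out). The peak list 88, 145, 147 is hypothetical; it was chosen because it yields the node masses that appear in the slide's figure.

```python
# Nominal residue masses keyed by mass (assumption); L also stands for I, K for Q.
RESIDUES = {57: "G", 71: "A", 87: "S", 97: "P", 99: "V", 101: "T", 103: "C",
            113: "L", 114: "N", 115: "D", 128: "K", 129: "E", 131: "M",
            137: "H", 147: "F", 156: "R", 163: "Y", 186: "W"}

def prm_nodes(peaks, total):
    """Each peak yields two candidate PRMs: one as a b ion, one as a y ion."""
    nodes = {0, total}
    for p in peaks:
        nodes.add(p - 1)                # read as b ion: P = peak - 1
        nodes.add(total - (p - 19))     # read as y ion: S = peak - 19, P = M - S
    return sorted(nodes)

def spectral_graph(nodes):
    """Edges (u, v, residue) wherever the mass difference is a residue mass."""
    return [(u, v, RESIDUES[v - u])
            for u in nodes for v in nodes if (v - u) in RESIDUES]

nodes = prm_nodes([88, 145, 147], total=401)
print(nodes)
for edge in spectral_graph(nodes):
    print(edge)
```

Note that the graph also contains edges such as 87 to 273 (186, the mass of W, which equals G + E), so several paths compete and some of them reuse a peak.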

17 Re-defining de novo interpretation
Find a subset of nodes in the spectral graph s.t.
–0, M are included
–Each peak contributes at most one node (interpretation) (*)
–Each adjacent pair (when sorted by mass) is connected by an edge (valid residue mass)
–An appropriate objective function (ex: the number of peaks interpreted) is maximized
[Figure: mass axis from 0 to 401 with nodes at 0, 87, 144, 146, 273, 275, 332 and 401; the path 0, 87, 144, 273, 401 spells S, G, E, K; nodes 87 and 144 joined by an edge labeled G]

18 Two problems
Too many nodes.
–Only a small fraction correspond to b/y ions (leading to true PRMs) (learning problem)
–Even if the b/y ions were correctly predicted, each peak generates multiple possibilities, only one of which is correct. We need to find a path that uses each peak only once (algorithmic problem).
–In general, the forbidden pairs problem is NP-hard
[Figure: mass axis from 0 to 401 with nodes at 0, 87, 144, 146, 273, 275, 332 and 401; the path 0, 87, 144, 273, 401 spells S, G, E, K]

19 However, ...
The b,y ions have a special non-interleaving property.
Consider pairs (b_1, y_1), (b_2, y_2)
–If (b_1 < b_2), then y_1 > y_2

20 Non-Intersecting Forbidden pairs
If we consider only b,y ions, 'forbidden' node pairs are non-intersecting.
The de novo problem can then be solved efficiently using a dynamic programming technique.
[Figure: mass axis from 0 to 400 with the path S, G, E, K and the forbidden pair 87, 332 marked]

21 The forbidden pairs method
There may be many paths that avoid forbidden pairs.
We choose a path that maximizes an objective function.
–EX: the number of peaks interpreted

22 The forbidden pairs method
Sort the PRMs according to increasing mass values.
For each node u, f(u) represents the forbidden pair.
Let m(u) denote the mass value of the PRM.
[Figure: mass axis from 0 to 400 with a node u and its forbidden pair f(u) marked at 87 and 332]
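A small sketch of how f(u) could be tabulated, assuming only b and y readings of each peak, so that the two PRMs from one peak always sum to M + 18. The function names and the peak list are hypothetical, chosen to reproduce the node masses in the figure.

```python
def forbidden_pairs(peaks, total):
    """For each peak, pair its b-ion PRM with its y-ion PRM (they sum to M + 18)."""
    pairs = []
    for p in peaks:
        as_b = p - 1                    # PRM if the peak is a b ion
        as_y = total - (p - 19)         # PRM if the peak is a y ion
        pairs.append((min(as_b, as_y), max(as_b, as_y)))
    return sorted(pairs)

def forbidden_map(pairs):
    """f(u) as a dictionary, in both directions."""
    f = {}
    for lo, hi in pairs:
        f[lo], f[hi] = hi, lo
    return f

def is_nested(pairs):
    """Non-interleaving check: as lower ends increase, upper ends must decrease."""
    uppers = [hi for _, hi in sorted(pairs)]
    return all(a > b for a, b in zip(uppers, uppers[1:]))

pairs = forbidden_pairs([88, 145, 147], total=401)
print(pairs)                 # [(87, 332), (144, 275), (146, 273)]
print(forbidden_map(pairs)[87])
print(is_nested(pairs))
```

Because each pair sums to the same constant, a larger lower end always means a smaller upper end, which is the nesting that the dynamic program relies on.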

23 D.P. for forbidden pairs
Consider all pairs u,v
–m[u] < M/2, m[v] > M/2
Define S(u,v) as the best score of a forbidden pair path from 0->u, v->M
Is it sufficient to compute S(u,v) for all u,v?
[Figure: mass axis from 0 to 400 with the forbidden pair 87, 332 and nodes u, v marked]

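24 D.P. for forbidden pairs
Note that the best interpretation is given by the maximum of S(u,v) over all (u,v) in E.
[Figure: mass axis from 0 to 400 with the forbidden pair 87, 332 and nodes u, v marked]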

25 D.P. for forbidden pairs
Note that we have one of two cases.
1. Either u < f(v) (and f(u) > v)
2. Or, u > f(v) (and f(u) < v)
Case 1.
–Extend u, do not touch f(v)
[Figure: mass axis from 0 to 400 with nodes u, f(u) and v marked]

26 The complete algorithm
for all u /* increasing mass values from 0 to M/2 */
  for all v /* decreasing mass values from M to M/2 */
    if (u > f[v])
    else if (u < f[v])
    if (u,v) ∈ E
      /* maxI is the score of the best interpretation */
      maxI = max {maxI, S[u,v]}
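The update rules inside the two branches did not survive in this transcript, so the following is only a sketch of one way to fill them in, not necessarily the recurrences on the original slide. It assumes integer residue masses, single-residue edges, a score of one per interpreted node, and forbidden pairs defined by PRMs that sum to M + 18. In each state (u, v) the node lying deeper inside the nesting is treated as the one added last, and its predecessor on the same side is looked up. The names `best_interpretation` and `spell` are hypothetical.

```python
from functools import lru_cache

# Nominal residue masses keyed by mass (assumption); L also stands for I, K for Q.
RESIDUES = {57: "G", 71: "A", 87: "S", 97: "P", 99: "V", 101: "T", 103: "C",
            113: "L", 114: "N", 115: "D", 128: "K", 129: "E", 131: "M",
            137: "H", 147: "F", 156: "R", 163: "Y", 186: "W"}

def best_interpretation(prms, total, water=18):
    """Return (score, path) of the best path 0 -> total avoiding forbidden pairs."""
    c = total + water                                   # f(x) = c - x
    left = sorted({0} | {m for m in prms if 0 < m < c / 2})
    right = sorted({total} | {m for m in prms if c / 2 < m < total})

    @lru_cache(maxsize=None)
    def S(u, v):
        """(best score, previous state) for paths 0 -> u and v -> total."""
        if u == 0 and v == total:
            return 0, None
        best = (float("-inf"), None)
        if u > c - v:        # u > f(v): u is innermost, so extend the left path to u
            for p in left:
                if p < u and (u - p) in RESIDUES and p != c - v:
                    score = S(p, v)[0] + 1
                    if score > best[0]:
                        best = (score, (p, v))
        elif u < c - v:      # u < f(v): v is innermost, so extend the right path to v
            for q in right:
                if q > v and (q - v) in RESIDUES and q != c - u:
                    score = S(u, q)[0] + 1
                    if score > best[0]:
                        best = (score, (u, q))
        return best

    max_i, end = float("-inf"), None
    for u in left:                                      # mass increasing from 0
        for v in reversed(right):                       # mass decreasing from M
            if (v - u) in RESIDUES and u != c - v:      # (u, v) in E closes the path
                score = S(u, v)[0]
                if score > max_i:
                    max_i, end = score, (u, v)
    if end is None:
        return None
    nodes, state = set(end), end
    while S(*state)[1] is not None:                     # walk the back pointers
        state = S(*state)[1]
        nodes.update(state)
    return max_i, sorted(nodes)

def spell(path):
    """Read the residues off consecutive mass differences."""
    return "".join(RESIDUES[b - a] for a, b in zip(path, path[1:]))

score, path = best_interpretation([87, 144, 146, 273, 275, 332], total=401)
print(score, path, spell(path))
```

On the node set from the figures, the competing path 0, 87, 273, 401 (S, W, K) interprets only two nodes, so the three-node path through 144 wins. The table S has one entry per (u, v) pair, and each entry scans one side for predecessors, which keeps the work polynomial in the number of nodes.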

