Fa 05CSE182 CSE182-L7 Protein sequencing and Mass Spectrometry.

2 Fa 05CSE182 CSE182-L7 Protein sequencing and Mass Spectrometry

3 Fa 05CSE182 Announcements Midterm 1: Nov 1, in class. Assignment 2: Online, due October 20.

4 Fa 05CSE182 Trivia Quiz What research won the Nobel prize in Chemistry in 2004? In 2002?

5 Fa 05CSE182 How are Proteins Sequenced? Mass Spec 101:

6 Fa 05CSE182 Nobel Citation 2002

7 Fa 05CSE182 Nobel Citation, 2002

8 Fa 05CSE182 Mass Spectrometry

9 Fa 05CSE182 Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation

10 Fa 05CSE182 Single Stage MS Mass Spectrometry LC-MS: 1 MS spectrum / second

11 Fa 05CSE182 Tandem MS [Figure labels: ionized parent peptide; secondary fragmentation]

12 Fa 05CSE182 The peptide backbone
N-terminus H...-HN-CH(R i-1)-CO-NH-CH(R i)-CO-NH-CH(R i+1)-CO-…OH C-terminus
The side chains R i-1, R i, and R i+1 belong to amino acid residues i-1, i, and i+1.
The peptide backbone breaks to form fragments with characteristic masses.

13 Fa 05CSE182 Ionization
N-terminus H...-HN-CH(R i-1)-CO-NH-CH(R i)-CO-NH-CH(R i+1)-CO-…OH C-terminus
The side chains R i-1, R i, and R i+1 belong to amino acid residues i-1, i, and i+1.
The peptide backbone breaks to form fragments with characteristic masses.
[Figure label: ionized parent peptide, carrying H+]

14 Fa 05CSE182 Fragment ion generation
N-terminus H...-HN-CH(R i-1)-CO | NH-CH(R i)-CO-NH-CH(R i+1)-CO-…OH C-terminus
Here "|" marks the backbone break between residue i-1 and residue i.
The peptide backbone breaks to form fragments with characteristic masses.
[Figure label: ionized peptide fragment, carrying H+]

15 Fa 05CSE182 Tandem MS for Peptide ID
Residue (N- to C-terminus)   b ion   y ion
S                               88    1166
G                              145    1080
F                              292    1022
L                              405     875
E                              534     762
E                              663     633
D                              778     504
E                              907     389
L                             1020     260
K                             1166     147
[Figure: spectrum of % Intensity (0 to 100) against m/z (250 to 1000), with the [M+2H]2+ peak marked]

16 Fa 05CSE182 Peak Assignment
Residue (N- to C-terminus)   b ion   y ion
S                               88    1166
G                              145    1080
F                              292    1022
L                              405     875
E                              534     762
E                              663     633
D                              778     504
E                              907     389
L                             1020     260
K                             1166     147
[Figure: spectrum of % Intensity (0 to 100) against m/z (250 to 1000), with peaks labeled y2 to y9 and b3 to b9, and the [M+2H]2+ peak marked]
Peak assignment implies sequence (residue tag) reconstruction!

17 Fa 05CSE182 Database Searching for peptide ID
For every peptide from a database:
–Generate a hypothetical spectrum
–Compute a correlation between the hypothetical and experimental spectra
–Choose the best
Database searching is very powerful and is the de facto standard for MS.
–Sequest, Mascot, and many others
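The search loop above can be sketched in a few lines. This is a minimal illustration, not how Sequest or Mascot score: it assumes nominal integer residue masses, singly charged b and y ions with the usual offsets (prefix + 1, suffix + 19), and a plain shared-peak count in place of a real correlation score. The function names and the tiny database are hypothetical.

```python
# Nominal (integer) residue masses; an assumption made for readability.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def hypothetical_spectrum(peptide):
    """Singly charged b and y ion masses for every internal cleavage."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    peaks = set()
    prefix = 0
    for m in masses[:-1]:
        prefix += m
        peaks.add(prefix + 1)           # b ion: prefix residue mass + 1
        peaks.add(total - prefix + 19)  # y ion: suffix residue mass + 19
    return peaks

def shared_peak_count(observed, peptide, tol=0.5):
    """Number of observed peaks within tol of some hypothetical peak."""
    theory = hypothetical_spectrum(peptide)
    return sum(1 for p in observed if any(abs(p - t) <= tol for t in theory))

def best_peptide(observed, database, tol=0.5):
    """Return the database peptide with the highest shared-peak count."""
    return max(database, key=lambda pep: shared_peak_count(observed, pep, tol))

observed = [88, 145, 147, 276]
database = ["KEGS", "GASK", "SGEK"]  # hypothetical toy database
print(best_peptide(observed, database))  # SGEK
```

A production scorer would also weight peak intensities and handle multiply charged fragments; the shared-peak count is only the simplest stand-in for "correlation".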

18 Fa 05CSE182 Spectra: the real story
Noise peaks
Ions, not prefixes & suffixes
Mass-to-charge ratio, and not mass
–Multiply charged ions
Isotope patterns, not single peaks

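19 Fa 05CSE182 Peptide fragmentation possibilities (ion types) [Figure: backbone segment -HN-CH(R i)-CO-NH-CH(CH-R'R'')-CO-NH- with cleavage sites labeled a i, b i, c i and x n-i, y n-i, z n-i, together with b i+1, y n-i-1, d i+1, v n-i, and w n-i; the ion types are grouped into low energy fragments and high energy fragments]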

20 Fa 05CSE182 Ion types, and offsets
P = prefix residue mass
S = suffix residue mass
b-ions = P+1
y-ions = S+19
a-ions = P-27
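As a small worked example of these offsets, the sketch below lists the b, y, and a ion masses for every prefix and suffix of a peptide. It assumes nominal integer residue masses and singly charged ions; `ion_ladders` is a hypothetical helper name.

```python
# Nominal (integer) residue masses; an assumption made for readability.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def ion_ladders(peptide):
    """Return (b, y, a) ion mass lists using the offsets P+1, S+19, P-27."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, a_ions, y_ions = [], [], []
    prefix = 0
    for m in masses:                 # prefixes grow from the N-terminus
        prefix += m
        b_ions.append(prefix + 1)
        a_ions.append(prefix - 27)
    suffix = 0
    for m in reversed(masses):       # suffixes grow from the C-terminus
        suffix += m
        y_ions.append(suffix + 19)
    return b_ions, y_ions, a_ions

b, y, a = ion_ladders("SGEK")
print(b)  # [88, 145, 274, 402]
print(y)  # [147, 276, 333, 420]
print(a)  # [60, 117, 246, 374]
```

Note that each a ion sits 28 below the matching b ion, which is the kind of "support ion" relationship that later scoring can exploit.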

21 Fa 05CSE182 Mass-Charge ratio
The X-axis is (M+Z)/Z
–Z=1 implies that peak is at M+1
–Z=2 implies that peak is at (M+2)/2
–Example: M=1000, Z=2, peak position is at 501
–Suppose you see a peak at 501. Is the mass 500, or is it 1000?
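The ambiguity in the last question is easy to see numerically. The sketch below uses the slide's (M+Z)/Z convention, where each charge adds one mass unit; both function names are hypothetical.

```python
def peak_position(mass, charge):
    """Position on the m/z axis of a molecule of mass M carrying Z charges."""
    return (mass + charge) / charge

def candidate_masses(mz, max_charge=3):
    """Invert (M+Z)/Z for each charge state: M = mz*Z - Z."""
    return {z: mz * z - z for z in range(1, max_charge + 1)}

print(peak_position(1000, 2))   # 501.0
print(candidate_masses(501))    # {1: 500, 2: 1000, 3: 1500}
```

A single peak at 501 is therefore consistent with several masses; the charge has to be determined from other evidence, such as the spacing of isotope peaks.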

22 Fa 05CSE182 Isotopic peaks Ex: Consider peptide SAM, mass = 308.12802. You should see: [Figure: expected spectrum]. Instead, you see: [Figure: observed spectrum, with labels 308.13 and 310.13]

23 Fa 05CSE182 Isotopes
C-12 is the most common. Suppose C-13 occurs with probability 1%.
Ex: SAM
–Composition: C11 H22 N3 O5 S1
What is the probability that you will see a single C-13?
Note that C, S, O, N all have isotopes. Can you compute the isotopic distribution?

24 Fa 05CSE182 All atoms have isotopes
Isotopes of atoms
–O-16,18, C-12,13, S-32,34, ...
–Each isotope has a frequency of occurrence
If a molecule (peptide) has a single copy of C-13, that will shift its peak by 1 Da.
With multiple copies of a peptide, we have a distribution of intensities over a range of masses (isotopic profile).
How can you compute the isotopic profile of a peak?

25 Fa 05CSE182 Isotope Calculation
Denote:
–$N_c$: number of carbon atoms in the peptide
–$P_c$: probability of occurrence of C-13 (~1%)
–Then [equation not shown]
[Figure: isotope profiles for $N_c = 50$ and $N_c = 200$, with the +1 peak marked in each]
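One way to carry out this calculation, assuming each carbon atom is independently C-13 with probability P_c and ignoring every other element, is a binomial distribution over the number of C-13 atoms. The sketch below is written under that assumption; it is not necessarily the formula shown on the original slide, and `c13_profile` is a hypothetical name.

```python
from math import comb

def c13_profile(n_carbon, p_c13=0.01, max_shift=4):
    """Probability of exactly k C-13 atoms (peak at M+k) for k = 0..max_shift."""
    return [
        comb(n_carbon, k) * p_c13**k * (1 - p_c13) ** (n_carbon - k)
        for k in range(max_shift + 1)
    ]

# SAM has 11 carbons: chance of exactly one C-13 is about 10%.
print(round(c13_profile(11)[1], 4))

# With 50 carbons the monoisotopic peak dominates; with 200 the +1 peak is taller.
for n in (50, 200):
    print(n, [round(p, 3) for p in c13_profile(n)])
```

The crossover between the two cases illustrates why large peptides rarely show their tallest peak at the monoisotopic mass.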

26 Fa 05CSE182 Isotope Calculation Example
Suppose we consider Nitrogen and Carbon.
$N_N$: number of Nitrogen atoms
$P_N$: probability of occurrence of N-15
Pr(peak at M)? Pr(peak at M+1)? Pr(peak at M+2)?
How do we generalize? How can we handle Oxygen (O-16,18)?
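A possible answer to the three questions, assuming atoms are independent and that only C-13 and N-15 (each adding 1 Da) are considered, follows by enumerating how the extra mass can be split between carbon and nitrogen. Here $N_c$, $P_c$ are the carbon count and C-13 probability, and $N_N$, $P_N$ the nitrogen count and N-15 probability:

```latex
\begin{align*}
\Pr(\text{peak at } M)   &= (1-P_c)^{N_c}\,(1-P_N)^{N_N} \\
\Pr(\text{peak at } M+1) &= N_c P_c (1-P_c)^{N_c-1} (1-P_N)^{N_N}
                          + N_N P_N (1-P_N)^{N_N-1} (1-P_c)^{N_c} \\
\Pr(\text{peak at } M+2) &= \binom{N_c}{2} P_c^2 (1-P_c)^{N_c-2} (1-P_N)^{N_N} \\
                         &\quad + N_c P_c (1-P_c)^{N_c-1}\, N_N P_N (1-P_N)^{N_N-1} \\
                         &\quad + \binom{N_N}{2} P_N^2 (1-P_N)^{N_N-2} (1-P_c)^{N_c}
\end{align*}
```

Each line is a convolution of two binomial distributions, which is what makes a polynomial formulation natural once elements such as oxygen, whose heavy isotope adds 2 Da, are included.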

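27 Fa 05CSE182 General isotope computation
Definition:
–Let $p_{i,a}$ be the abundance of the isotope of element a with mass i Da above the least mass
–Ex: $p_{0,C}$: abundance of C-12, $p_{2,O}$: abundance of O-18, etc.
Characteristic polynomial: $\phi(x) = \prod_a \left(\sum_i p_{i,a} x^i\right)^{N_a}$, where $N_a$ is the number of atoms of element a
Prob{M+i}: coefficient of $x^i$ in $\phi(x)$ (a binomial convolution)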
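The coefficient extraction can be sketched as repeated polynomial multiplication: one factor per atom, with the coefficient list of each factor holding that element's isotope abundances. The abundance values below are rounded approximations of natural abundances and should be treated as illustrative assumptions; `poly_mul` and `isotopic_profile` are hypothetical names.

```python
# Approximate natural isotope abundances, indexed by extra mass in Da.
# Rounded values, included only for illustration.
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.99988, 0.00012],
    "N": [0.9963, 0.0037],
    "O": [0.9976, 0.0004, 0.0020],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def isotopic_profile(composition, abundance, max_shift=5):
    """Coefficients of x^0..x^max_shift in the product of per-atom polynomials."""
    profile = [1.0]
    for element, count in composition.items():
        for _ in range(count):
            # Truncating is safe: low-order coefficients never depend on higher ones.
            profile = poly_mul(profile, abundance[element])[: max_shift + 1]
    return profile

sam = {"C": 11, "H": 22, "N": 3, "O": 5, "S": 1}  # composition from slide 23
for shift, prob in enumerate(isotopic_profile(sam, ABUNDANCE)):
    print(f"M+{shift}: {prob:.4f}")
```

Multiplying one atom at a time keeps the code short; for large molecules, raising each element's polynomial to the power N_a by repeated squaring would be faster.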

28 Fa 05CSE182 Isotopic Profile Application
In DxMS, hydrogen atoms are exchanged with deuterium.
The rate of exchange indicates how buried the peptide is (in folded state).
Consider the observed characteristic polynomials of the isotope profile, $\phi_{t_1}, \phi_{t_2}, \ldots$, at various time points. Then [equation not shown].
The estimates of $p_{1,H}$ can be obtained by a deconvolution.
Such estimates at various time points should give the rate of incorporation of deuterium, and therefore the accessibility.

29 Fa 05CSE182 Quiz
How can you determine the charge on a peptide?
Difference between the first and second isotope peak is 1/Z.
Proposal:
–Given a mass, predict a composition, and the isotopic profile
–Do a 'goodness of fit' test to isolate the peaks corresponding to the isotope
–Compute the difference
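The last step of the proposal can be sketched as follows. The example assumes the isotope peaks of one peptide have already been isolated (the goodness-of-fit step is not implemented here) and that neighboring isotope peaks differ by about 1 Da in mass, so their m/z spacing is about 1/Z. `estimate_charge` is a hypothetical name.

```python
def estimate_charge(isotope_peaks_mz):
    """Estimate Z from the m/z spacing of the first two isotope peaks."""
    peaks = sorted(isotope_peaks_mz)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    spacing = peaks[1] - peaks[0]
    if spacing <= 0:
        raise ValueError("isotope peaks must be distinct")
    return round(1.0 / spacing)  # spacing of 1/Z implies Z = 1/spacing

print(estimate_charge([308.13, 309.13, 310.13]))  # 1
print(estimate_charge([501.0, 501.5, 502.0]))     # 2
```

In noisy spectra the spacing would normally be averaged over several peak pairs rather than read from the first two alone.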

30 Fa 05CSE182 Tandem MS summary The basics of peptide ID using tandem MS are simple. –Correlate experimental with theoretical spectra In practice, there might be many confounding problems. A toolkit that resolves some of these problems will be useful.

31 Fa 05CSE182 MS Quiz: Why aren’t all tandem MS peaks of the same intensity? Do the intensities for a peptide vary from spectrum to spectrum?

32 Fa 05CSE182 De novo interpretation of mass spectra The so-called de novo algorithms focus exclusively on the D module. There is no database (I/F). Limited scoring and validation.

33 Fa 05CSE182 Computing possible prefixes
We know the parent mass M=401.
Consider a mass value 88. Assume that it is a b-ion, or a y-ion.
If b-ion, it corresponds to a prefix of the peptide with residue mass 88-1 = 87.
If y-ion, y=M-P+19.
–Therefore the prefix has mass P=M-y+19 = 401-88+19 = 332
Compute all possible Prefix Residue Masses (PRM) for all ions.
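The PRM computation is a two-line formula per peak. The sketch below applies it to a small peak list, assuming only b and y interpretations and integer masses; `prefix_residue_masses` is a hypothetical name.

```python
def prefix_residue_masses(peaks, parent_mass):
    """Map each peak to its two candidate PRMs: (if b-ion, if y-ion)."""
    table = {}
    for peak in peaks:
        as_b = peak - 1                  # b = P + 1  =>  P = b - 1
        as_y = parent_mass - peak + 19   # y = M - P + 19  =>  P = M - y + 19
        table[peak] = (as_b, as_y)
    return table

for peak, (as_b, as_y) in prefix_residue_masses([88, 145, 147, 276], 401).items():
    print(peak, as_b, as_y)
# 88 87 332
# 145 144 275
# 147 146 273
# 276 275 144
```

Each peak yields two candidates, at most one of which is a true prefix mass; choosing between them is the job of the later graph-based steps.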

34 Fa 05CSE182 Putative Prefix Masses
Parent mass M=401
Peak   Prefix mass if b   Prefix mass if y
88            87                332
145          144                275
147          146                273
276          275                144
Only a subset of the prefix masses are correct. The correct mass values form a ladder of amino-acid residues: 0 -S- 87 -G- 144 -E- 273 -K- 401.

35 Fa 05CSE182 Spectral Graph Each prefix residue mass (PRM) corresponds to a node. Two nodes are connected by an edge if the mass difference is a residue mass. A path in the graph is a de novo interpretation of the spectrum. [Figure: nodes 87 and 144 joined by an edge labeled G]

36 Fa 05CSE182 Spectral Graph
Each peak, when assigned to a prefix/suffix ion type, generates a unique prefix residue mass.
Spectral graph:
–Each node u defines a putative prefix residue mass M(u).
–(u,v) in E if M(v)-M(u) is the residue mass of an a.a. (tag) or 0.
–Paths in the spectral graph correspond to an interpretation
[Figure: spectral graph on a mass axis from 0 to 401 with nodes 87, 144, 146, 273, 275, 332 and the path S, G, E, K]
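A spectral graph of this kind can be built and searched with a few lines of code. The sketch below assumes nominal integer residue masses, adds the end nodes 0 and M, and finds the path from 0 to M with the most nodes. It deliberately ignores the constraint that one peak may contribute only one node, and it omits the zero-mass edges mentioned above. All names are hypothetical.

```python
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

# Several residues share a nominal mass (L/I, K/Q), so labels can be ambiguous.
LABEL = {}
for aa, mass in RESIDUE_MASS.items():
    LABEL[mass] = LABEL[mass] + "/" + aa if mass in LABEL else aa

def spectral_graph(prms, parent_mass):
    """Nodes are PRMs plus 0 and M; an edge joins u < v if v-u is a residue mass."""
    nodes = sorted({m for m in prms if 0 < m < parent_mass} | {0, parent_mass})
    edges = {u: [(v, LABEL[v - u]) for v in nodes if (v - u) in LABEL] for u in nodes}
    return nodes, edges

def longest_interpretation(nodes, edges):
    """Residue labels along the 0 -> M path with the most edges, or None."""
    source, sink = nodes[0], nodes[-1]
    best = {source: (0, None, None)}  # node -> (edge count, predecessor, label)
    for u in nodes:                   # ascending mass is a topological order
        if u not in best:
            continue
        for v, label in edges[u]:
            if v not in best or best[u][0] + 1 > best[v][0]:
                best[v] = (best[u][0] + 1, u, label)
    if sink not in best:
        return None
    labels, v = [], sink
    while best[v][1] is not None:
        labels.append(best[v][2])
        v = best[v][1]
    return labels[::-1]

nodes, edges = spectral_graph([87, 332, 144, 275, 146, 273, 275, 144], 401)
print(longest_interpretation(nodes, edges))  # ['S', 'G', 'E', 'K/Q']
```

The graph also contains a shorter path S, W, K/Q (since 273 - 87 = 186), which shows why an objective function is needed to pick among interpretations.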

37 Fa 05CSE182 Re-defining de novo interpretation
Find a subset of nodes in the spectral graph s.t.
–0, M are included
–Each peak contributes at most one node (interpretation) (*)
–Each adjacent pair (when sorted by mass) is connected by an edge (valid residue mass)
–An appropriate objective function (ex: the number of peaks interpreted) is maximized
[Figure: spectral graph on a mass axis from 0 to 401 with nodes 87, 144, 146, 273, 275, 332 and the path S, G, E, K; inset showing the edge G between 87 and 144]

38 Fa 05CSE182 Two problems
Too many nodes.
–Only a small fraction correspond to b/y ions (leading to true PRMs) (learning problem)
–Even if the b/y ions were correctly predicted, each peak generates multiple possibilities, only one of which is correct. We need to find a path that uses each peak only once (algorithmic problem).
–In general, the forbidden pairs problem is NP-hard
[Figure: spectral graph on a mass axis from 0 to 401 with nodes 87, 144, 146, 273, 275, 332 and the path S, G, E, K]

39 Fa 05CSE182 However,.. The b,y ions have a special non-interleaving property. Consider pairs (b1, y1), (b2, y2) –If b1 < b2, then y1 > y2

40 Fa 05CSE182 Non-Intersecting Forbidden pairs If we consider only b,y ions, 'forbidden' node pairs are non-intersecting. The de novo problem can be solved efficiently using a dynamic programming technique. [Figure: mass axis from 0 to 400 with the path S, G, E, K and the forbidden pair 87, 332 marked]

41 Fa 05CSE182 The forbidden pairs method There may be many paths that avoid forbidden pairs. We choose a path that maximizes an objective function, –EX: the number of peaks interpreted

42 Fa 05CSE182 The forbidden pairs method Sort the PRMs according to increasing mass values. For each node u, f(u) represents its forbidden pair. Let m(u) denote the mass value of the PRM. [Figure: mass axis from 0 to 400 with nodes 87 and 332 labeled u and f(u)]

43 Fa 05CSE182 D.P. for forbidden pairs Consider all pairs u,v –m[u] <= M/2, m[v] > M/2 Define S(u,v) as the best score of a forbidden pair path from 0->u, v->M. Is it sufficient to compute S(u,v) for all u,v? [Figure: mass axis from 0 to 400 with nodes 87 and 332 labeled u and v]

44 Fa 05CSE182 D.P. for forbidden pairs Note that the best interpretation is given by the maximum of S(u,v) over all pairs (u,v) in E. [Figure: mass axis from 0 to 400 with nodes 87 and 332 labeled u and v]

45 Fa 05CSE182 D.P. for forbidden pairs
Note that we have one of two cases.
1. Either u < f(v) (and f(u) > v)
2. Or, u > f(v) (and f(u) < v)
Case 1.
–Extend u, do not touch f(v)
[Figure: mass axis from 0 to 400 with nodes u, f(u), and v marked]

46 Fa 05CSE182 The complete algorithm
for all u /* increasing mass values from 0 to M/2 */
  for all v /* decreasing mass values from M to M/2 */
    if (u > f[v]) [update of S[u,v] not shown]
    else if (u < f[v]) [update of S[u,v] not shown]
    if (u,v) ∈ E
      /* maxI is the score of the best interpretation */
      maxI = max {maxI, S[u,v]}
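The following is a sketch in the spirit of this dynamic program, not a transcription of the recurrences above. It assumes only b and y interpretations, nominal integer residue masses, and "number of peaks interpreted" as the objective. Because the two PRMs of a peak always sum to M+18, the pairs are nested; the code processes them from the outermost pair inward, growing a left path from 0 and a right path from M, and memoizes on (left end, right end, next pair). The name `forbidden_pairs_path` is hypothetical.

```python
from functools import lru_cache

# Distinct nominal residue masses of the 20 standard amino acids.
RESIDUE_MASSES = frozenset({57, 71, 87, 97, 99, 101, 103, 113, 114, 115,
                            128, 129, 131, 137, 147, 156, 163, 186})

def forbidden_pairs_path(peaks, parent_mass):
    """Return (peaks interpreted, chosen PRMs) for the best 0 -> M path that
    takes at most one PRM per peak; the score is -inf if no path exists."""
    # Each peak gives a forbidden pair (PRM if b-ion, PRM if y-ion).
    # Sorting by the smaller member lists the nested pairs outermost first.
    pairs = sorted(
        (min(p - 1, parent_mass - p + 19), max(p - 1, parent_mass - p + 19))
        for p in peaks
    )
    n = len(pairs)

    def is_edge(lo, hi):
        return (hi - lo) in RESIDUE_MASSES

    @lru_cache(maxsize=None)
    def best(u, v, k):
        # u: end of the left path, v: start of the right path,
        # k: index of the first pair not yet decided.
        result = (0, ()) if is_edge(u, v) else (float("-inf"), ())
        for j in range(k, n):          # pairs k..j-1 are skipped
            lo, hi = pairs[j]
            if is_edge(u, lo):         # extend the left path with this peak
                score, chosen = best(lo, v, j + 1)
                if score + 1 > result[0]:
                    result = (score + 1, (lo,) + chosen)
            if is_edge(hi, v):         # extend the right path with this peak
                score, chosen = best(u, hi, j + 1)
                if score + 1 > result[0]:
                    result = (score + 1, chosen + (hi,))
        return result

    return best(0, parent_mass, 0)

print(forbidden_pairs_path([88, 145, 147, 276], 401))  # (3, (87, 144, 273))
```

On the four-peak example the result is the ladder 0, 87, 144, 273, 401 (S, G, E, K): peaks 145 and 276 propose the same pair of PRMs, so only one of them can be counted along any valid path.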

47 Fa 05CSE182 De Novo: Second issue
Given only b,y ions, a forbidden pairs path will solve the problem.
However, recall that there are MANY other ion types.
–Typical length of peptide: 15
–Typical # peaks? 50-150?
–#b/y ions?
–Most ions are "Other": a ions, neutral losses, isotopic peaks, ...

48 Fa 05CSE182 De novo: Weighting nodes in Spectrum Graph Factors determining if the ion is b or y –Intensity –Support ions –Isotopic peaks (InsPecT’)

49 Fa 05CSE182 De novo: Weighting nodes A probabilistic network to model support ions (Pepnovo)

50 Fa 05CSE182 De Novo Interpretation Summary The main challenge is to separate b/y ions from everything else (weighting nodes), and to separate the prefix ions from the suffix ions (Forbidden Pairs). As always, the abstract idea must be supplemented with many details. –Noise peaks, incomplete fragmentation –In reality, a PRM is first scored on its likelihood of being correct, and the forbidden pair method is applied subsequently.

