Protein Identification by MALDI Mass Spectrometry. Karin Hjernø and Peter Roepstorff, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
Peptide Mass Fingerprint (PMF): 1) Separation of proteins, here by 2D gels (pI vs. Mw). 2) Enzymatic digestion (trypsin). 3) Micropurification (desalting and concentration). 4) Mass spectrometric analysis and data interpretation.
PMF: Protein of interest -> digestion -> MALDI-MS -> experimental peptide masses. Database of known sequences (...LIHGFYMNKPL..., ...LVCDERTFGHG..., ...HYIGFREWMKL..., ...LIYTSARDEFW) -> theoretical digestion -> theoretical peptide masses. Comparison of experimental and theoretical peptide masses -> protein ID.
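The comparison step can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular search engine: the function names (`tryptic_digest`, `mh_plus`, `match_peaks`), the 50 ppm default tolerance and the example peak value are assumptions made for this sketch, and missed cleavages and modifications are ignored. The sequence fragment is taken from the tryptophan synthetase sequence shown later in the slides.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # H+

def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except before P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

def match_peaks(observed, sequence, tol_ppm=50.0):
    """Pair observed m/z values with theoretical tryptic peptides."""
    hits = []
    for pep in tryptic_digest(sequence):
        theo = mh_plus(pep)
        for mz in observed:
            if abs(mz - theo) / theo * 1e6 <= tol_ppm:
                hits.append((pep, round(theo, 4), mz))
    return hits

# Hypothetical observed peak; sequence fragment from the slides.
print(match_peaks([1267.78], "HKINNALAQVLLAKR"))
```

A real search repeats this for every database entry and scores the number and quality of matches rather than reporting single hits.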
Enzymes for PMF: Trypsin. High specificity; peptides in a mass range compatible with MALDI; small enzyme.
MALDI-MS spectrum [Figure: spectrum with a 1 Da spacing marked]
Isotope distribution: three examples of a normal isotopic distribution at different m/z values.
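Why the isotope pattern changes with m/z can be illustrated with a rough estimate. The sketch below considers only carbon-13 (natural abundance about 1.07%) and estimates the number of carbon atoms from the peptide mass with an averagine-style ratio; both simplifications are assumptions of this example, and the function name `c13_pattern` is hypothetical. Contributions from H, N, O and S isotopes are ignored.

```python
from math import comb

P_C13 = 0.0107                      # natural abundance of carbon-13
CARBONS_PER_DA = 4.9384 / 111.1254  # averagine-style estimate (assumption)

def c13_pattern(mass, n_peaks=4):
    """Relative heights of the first isotope peaks, carbon-13 only."""
    n = round(mass * CARBONS_PER_DA)  # estimated number of carbon atoms
    probs = [comb(n, k) * P_C13**k * (1 - P_C13)**(n - k)
             for k in range(n_peaks)]
    top = max(probs)
    return [round(p / top, 2) for p in probs]

for mass in (1000, 2000, 3000):
    print(mass, c13_pattern(mass))
```

In this approximation the monoisotopic peak is the tallest at 1000 Da, while at 3000 Da the second isotope peak is taller than the monoisotopic one.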
Removal of contaminants. Common contaminants: keratin; tryptic autodigest products (842.5, …); matrix clusters. Overlapping matrix (871.94) and peptide (873.50) peaks.
Removal of contaminants: PeakErazor, GPMAW
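The filtering itself amounts to dropping peaks that fall within a tolerance of known contaminant masses. The following is a minimal sketch and not how PeakErazor or GPMAW are implemented; the function name, the 0.2 Da tolerance and the example peak list are assumptions. The two contaminant masses are the ones quoted on the previous slide.

```python
def remove_contaminants(peaks, contaminants, tol_da=0.2):
    """Return the peaks that are not within tol_da of any contaminant mass."""
    return [mz for mz in peaks
            if all(abs(mz - c) > tol_da for c in contaminants)]

# Contaminant masses quoted on the slide: trypsin autodigest, matrix cluster.
contaminants = [842.5, 871.94]
# Hypothetical peak list for illustration.
peaks = [842.51, 871.94, 873.50, 1267.77]
print(remove_contaminants(peaks, contaminants))
```

With a tolerance this narrow, the peptide peak at 873.50 is kept even though it sits close to the matrix cluster at 871.94.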
Taking advantage of contaminants! Identification of contaminants, then multipoint calibration on the contaminants: < 20 ppm. [Figure: mass error plot with an outlier marked]
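A multipoint calibration can be sketched as a least-squares fit of reference masses against observed masses. This is a simplification under stated assumptions: real TOF calibration is usually done on flight times, the linear correction on the m/z scale is chosen here only for brevity, and the reference values and simulated offset below are illustrative, not taken from the slides.

```python
def fit_linear(observed, reference):
    """Least-squares line: reference ~ slope * observed + intercept."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(observed, reference))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

# Illustrative reference masses for identified contaminant peaks.
reference = [842.51, 1475.75, 2211.10]
# Simulated miscalibrated readings (100 ppm scale error plus 0.01 Da offset).
observed = [r * (1 + 100e-6) + 0.01 for r in reference]

slope, intercept = fit_linear(observed, reference)
for obs, ref in zip(observed, reference):
    before = ppm_error(obs, ref)
    after = ppm_error(slope * obs + intercept, ref)
    print(f"{ref:9.2f}  before {before:7.1f} ppm  after {after:7.3f} ppm")
```

The fitted slope and intercept are then applied to every peak in the spectrum, not only to the calibrant peaks.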
What!?... No calibrants.. The mass defect: the difference between the monoisotopic mass and the integer mass value of a given amino acid residue. The mass defect of a peptide around 1000 Da is 0.5; around 2000 Da it is 1.0 (mass dependent). < 50 ppm
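The mass dependence of the defect can be checked directly from the residue masses. The sketch below computes the mass defect of a neutral peptide as its monoisotopic mass minus its nominal (integer) mass; the function name is hypothetical, and the peptides are taken from the tryptophan synthetase result list shown later in the slides.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def mass_defect(peptide):
    """Monoisotopic minus nominal mass of the neutral peptide, in Da."""
    mono = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    nominal = sum(round(RESIDUE_MASS[aa]) for aa in peptide) + 18
    return mono - nominal

for pep in ("IGWDLR", "DEFFAFQK", "INNALAQVLLAK", "AQFIAATDAQALLGFK"):
    mono = sum(RESIDUE_MASS[aa] for aa in pep) + WATER
    defect = mass_defect(pep)
    print(f"{pep:18s} {mono:10.4f}  defect {defect:.3f}"
          f"  ({defect / mono * 1000:.2f} per 1000 Da)")
```

Because peptide masses cluster around a predictable defect, peaks that fall far from it are unlikely to be peptides, and the cluster position can serve as a rough internal reference when no calibrants are available.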
Filtered and calibrated peak list, what then?!? Make a database-dependent search: open the preferred search program; choose the search database; choose the search parameters; start the search. Result: protein candidates.
Mascot Interface
Results List
1. gi| Mass: Score: 103
gi| tryptophan synthetase; Trp5p [Saccharomyces cerevisiae]
Columns: Observed, Mr(expt), Mr(calc), Delta, Start, End, Miss, Peptide
Peptides:
IGWDLR
QALNVFR 1 Pyro-glu (N-term Q)
QALNVFR
FWVTNLK
RQALNVFR
DEFFAFQK
INNALAQVLLAK
DTPLAVGFGVSTR
VLSKDEFFAFQK
GDKDVQSVAEVLPK
LTEHCQGAQIWLK
SLYSYIGRPSSLHK
AQFIAATDAQALLGFK
FGLTCTVFMGAEDVR
………...
Score plot labels: non-significant matches; significant matches (p < 0.05)
Sequence Coverage: 51%
1 MSEQLRQTFA NAKKENRNAL VTFMTAGYPT VKDTVPILKG FQDGGVDIIE
51 LGMPFSDPIA DGPTIQLSNT VALQNGVTLP QTLEMVSQAR NEGVTVPIIL
101 MGYYNPILNY GEERFIQDAA KAGANGFIIV DLPPEEALKV RNYINDNGLS
151 LIPLVAPSTT DERLELLSHI ADSFVYVVSR MGTTGVQSSV ASDLDELISR
201 VRKYTKDTPL AVGFGVSTRE HFQSVGSVAD GVVIGSKIVT LCGDAPEGKR
251 YDVAKEYVQG ILNGAKHKVL SKDEFFAFQK ESLKSANVKK EILDEFDENH
301 KHPIRFGDFG GQYVPEALHA CLRELEKGFD EAVADPTFWE DFKSLYSYIG
351 RPSSLHKAER LTEHCQGAQI WLKREDLNHT GSHKINNALA QVLLAKRLGK
401 KNVIAETGAG QHGVATATAC AKFGLTCTVF MGAEDVRRQA LNVFRMRILG
451 AKVIAVTNGT KTLRDATSEA FRFWVTNLKT TYYVVGSAIG PHPYPTLVRT
501 FQSVIGKETK EQFAAMNNGK LPDAVVACVG GGSNSTGMFS PFEHDTSVKL
551 LGVEAGGDGV DTKFHSATLT AGRPGVFHGV KTYVLQDSDG QVHDTHSVSA
601 GLDYPGVGPE LAYWKSTGRA QFIAATDAQA LLGFKLLSQL EGIIPALESS
651 HAVYGACELA KTMKPDQHLV INISGRGDKD VQSVAEVLPK LGPKIGWDLR
701 FEEDPSA
[Figure: MALDI spectrum, intensity vs. m/z, shown together with the tryptophan synthetase sequence; tryptic autodigest and contaminant peaks marked]
Why not 100% sequence coverage? MALDI mass range: from ~700 Da to ~3500 Da (a tryptic peptide of ~5500 Da in the tryptophan synthetase sequence falls outside it). Suppression effect (preferential ionization of some components at the expense of others). Post-translational modifications.
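Sequence coverage is the fraction of residues in the protein that are covered by at least one matched peptide. A minimal sketch, assuming the matched peptides are given as plain sequences (the function name is hypothetical); the fragment and the two overlapping peptides are taken from the tryptophan synthetase example.

```python
def sequence_coverage(sequence, peptides):
    """Percentage of residues covered by at least one matched peptide."""
    covered = [False] * len(sequence)
    for pep in peptides:
        start = sequence.find(pep)
        while start != -1:                      # mark every occurrence
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = sequence.find(pep, start + 1)
    return 100.0 * sum(covered) / len(sequence)

# Fragment of the protein sequence and two overlapping matched peptides.
fragment = "HKVLSKDEFFAFQKESLK"
matched = ["DEFFAFQK", "VLSKDEFFAFQK"]
print(f"{sequence_coverage(fragment, matched):.1f}%")
```

Overlapping peptides, such as a fully cleaved peptide and its missed-cleavage form, count each residue only once.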
Example – a classical example
Manual evaluation – what to look for? Likely/unlikely missed cleavage sites; overlapping peptides; partial modifications; mass accuracy; (intensity of the peaks); ……
Digestion using trypsin: missed cleavage sites [Figure: protein chain, N- to C-terminus, with the sites R, R, RP, KR and K marked]
Missed cleavages. Only missed cleavages of one of the following kinds are highly likely to be observed (relative to other sites):
R/K xxxxxxx R/K
xxxxxxx R/K R/K
x R/K xxxxxx R/K
xxxx E/D R/K xxxx R/K
xxxx E/D x R/K xxxx R/K
xxxx R/K E/D xxxx R/K
xxxx R/K x E/D xxxx R/K
xxxx R/K P xxxx R/K
That is, a basic or acidic residue close to the cleavage site in question.
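The rule of thumb above can be turned into a simple check when evaluating a reported missed cleavage. This is a rough sketch: the window of two residues on either side is an assumption standing in for "close to the cleavage site", and the function name and reason labels are hypothetical. The example peptide is a missed-cleavage match from the result list.

```python
def likely_missed_sites(sequence, window=2):
    """List K/R sites whose surroundings make a missed cleavage plausible.

    Returns (1-based position, residue, reasons) for each flagged site.
    """
    sites = []
    for i, aa in enumerate(sequence):
        if aa not in "KR":
            continue
        before = sequence[max(0, i - window):i]
        after = sequence[i + 1:i + 1 + window]
        reasons = []
        if after[:1] == "P":
            reasons.append("followed by P")
        if any(r in "KR" for r in before + after):
            reasons.append("basic residue nearby")
        if any(r in "DE" for r in before + after):
            reasons.append("acidic residue nearby")
        if reasons:
            sites.append((i + 1, aa, reasons))
    return sites

# Matched peptide with one missed cleavage (K4 is followed by D and E).
print(likely_missed_sites("VLSKDEFFAFQK"))
```

A matched peptide whose internal K or R is not flagged by any of these rules deserves a closer look before the match is accepted.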
Digestion using trypsin [Figure: protein chain, N- to C-terminus, with the sites R, R, RP, KR and K; missed cleavage contexts RxE, DR and KR marked]
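Digestion using trypsin [Figure: protein chain, N- to C-terminus, with the sites R, R, RP, KR and K; contexts DR and KR marked; methionine shown as M and M ox]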
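Oxidation of methionine: the oxidized peptide appears 16 Da above the non-oxidized peptide. A metastable ion is formed from the oxidized peptide by loss of methanesulfenic acid (-CH3SOH, 64 Da). Example peptide: SIVPSGASTGVHEALEMRDEDKSK.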
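Both signatures can be searched for in a peak list. A minimal sketch, assuming monoisotopic shifts of 15.9949 Da for the added oxygen and 63.9983 Da for CH3SOH; the function names, the tolerance and the peak values are hypothetical. Note that on a reflector instrument a metastable ion may not appear at exactly the calculated m/z, which this sketch does not model.

```python
OXIDATION = 15.9949     # mass of one oxygen atom, Da
CH3SOH_LOSS = 63.9983   # methanesulfenic acid, Da

def find_oxidation_pairs(peaks, tol=0.02):
    """Find peak pairs separated by one oxidation (partial modification)."""
    peaks = sorted(peaks)
    return [(a, b)
            for i, a in enumerate(peaks)
            for b in peaks[i + 1:]
            if abs((b - a) - OXIDATION) <= tol]

def neutral_loss_mz(oxidized_mz):
    """Calculated m/z after loss of CH3SOH from the oxidized peptide."""
    return oxidized_mz - CH3SOH_LOSS

# Hypothetical peak list: one non-oxidized/oxidized pair plus an unrelated peak.
peaks = [1000.50, 1016.495, 1200.60]
for plain, oxidized in find_oxidation_pairs(peaks):
    print(plain, oxidized, round(neutral_loss_mz(oxidized), 4))
```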
Partial modification [Figure: three spectra, intensity vs. m/z, each comparing the non-modified peak with a modified form: oxidized tryptophan (W), oxidized methionine, N-terminal pyroglutamate]
Example – a classical example
Example 2 – a false-positive
Contaminants and calibration
PeakErazor
Peak List Protein mixtures
Protein mixtures [Figure: MALDI spectrum, intensity vs. m/z; peaks assigned to human annexin VI and to trypsin autodigest]
Protein mixtures [Figure: MALDI spectrum, intensity vs. m/z; peaks assigned to human annexin VI, hypothetical protein XP_ (heat shock 70D) and trypsin autodigest]
When MS/MS is needed...
MS and MS/MS: Protein of interest -> digestion -> MALDI-MS -> (MALDI-)MS/MS -> protein ID. Used for verification of the protein ID and for analysis of unassigned peaks. [Figure: MS/MS spectrum with the residues R A W G Y V L E read out]
Tandem mass spectrometer: ion source -> mass analyzer (precursor selection) -> collision cell (CC; fragmentation of ions) -> mass analyzer (separation of fragments) -> detector.
MS of a peptide mixture
MS/MS of a peptide, 2+ ion (collision energy 10 eV)
MS/MS of a peptide (collision energy 15 eV)
MS/MS of a peptide (collision energy 18 eV)
Peptide fragmentation (Roepstorff and Fohlman, 1984; Biemann, 1988)
Formation of b- and y-ions (Paizs and Suhai, 2004)
Identification of the peptide LLQVVEEPQALAAFLR [Figure: MS/MS spectrum with the y-ion series y1 to y13 annotated and residues (E, E, V, V) read from the peak spacings. Jens Andersen, Odense, Denmark]
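The expected positions of the b- and y-ions of this peptide follow from the residue masses: a singly charged b-ion is the sum of the N-terminal residues plus a proton, and a singly charged y-ion is the sum of the C-terminal residues plus water plus a proton. A minimal sketch (the function name is hypothetical; modifications and multiply charged fragments are ignored):

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values; index 0 holds b1 and y1."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b = [sum(masses[:i]) + PROTON for i in range(1, len(peptide))]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(peptide))]
    return b, y

b_ions, y_ions = fragment_ions("LLQVVEEPQALAAFLR")
for n, mz in enumerate(y_ions, start=1):
    print(f"y{n:<2d} {mz:9.3f}")
```

The spacing between consecutive y-ions equals one residue mass, which is what allows the sequence to be read from the spectrum.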
Manual interpretation
Read the sequence from the spacing between neighbouring fragment peaks and compare each spacing with the monoisotopic residue masses:

Amino acid      3-letter code  1-letter code  Residue mass, monoisotopic (Da)
Glycine         Gly            G              57.02146
Alanine         Ala            A              71.03711
Serine          Ser            S              87.03203
Proline         Pro            P              97.05276
Valine          Val            V              99.06841
Threonine       Thr            T              101.04768
Cysteine        Cys            C              103.00919
Isoleucine      Ile            I              113.08406
Leucine         Leu            L              113.08406
Asparagine      Asn            N              114.04293
Aspartic acid   Asp            D              115.02694
Glutamine       Gln            Q              128.05858
Lysine          Lys            K              128.09496
Glutamic acid   Glu            E              129.04259
Methionine      Met            M              131.04049
Histidine       His            H              137.05891
Phenylalanine   Phe            F              147.06841
Arginine        Arg            R              156.10111
Tyrosine        Tyr            Y              163.06333
Tryptophan      Trp            W              186.07931

Step by step:
Spacing of 115: D
Next spacing ambiguous (??): Q/K
Next spacing of 172: no single residue fits (?!?); it splits into G followed by a spacing of 115, i.e. G and D
Read so far: D, Q/K, G, D
Complete series as labelled on the spectrum: F D Q/K G D D L
Second series: G D G
Immonium ions, diagnostic ions
Immonium ions, diagnostic ions [Figure: low-mass region with immonium ions labelled K/Q, I/L, T, F, K/Q, E]
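Immonium ion positions can be calculated from the residue masses: the immonium ion of a residue corresponds to the residue mass minus CO plus a proton. A minimal sketch for the residues labelled on this slide (the function name is hypothetical; secondary diagnostic ions, such as those formed by further loss of ammonia, are not included):

```python
# Monoisotopic residue masses in Da for the residues labelled on the slide.
RESIDUE_MASS = {
    "T": 101.04768, "I": 113.08406, "L": 113.08406, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "F": 147.06841,
}
CO = 27.99491      # carbon monoxide
PROTON = 1.00728

def immonium_mz(residue):
    """m/z of the immonium ion: residue mass - CO + H+."""
    return RESIDUE_MASS[residue] - CO + PROTON

for aa in "FTILEQK":
    print(aa, round(immonium_mz(aa), 3))
```

I and L give the same immonium ion, and the Q and K ions differ by only about 0.04, which is why they are labelled I/L and K/Q.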
Ion series 1: DGD(Q/K)DFL. Ion series 2: GDG. Immonium ions: F, T, (I/L), E, (Q/K).
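Reading an ion series amounts to matching the spacing between consecutive peaks against the residue masses. The sketch below does this for a synthetic peak ladder; the function name, the 0.05 Da tolerance and the starting m/z are assumptions, and the ladder is built from the residues of ion series 1 purely for illustration.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def read_ladder(peaks, tol=0.05):
    """Assign residues to the spacings between consecutive peaks."""
    peaks = sorted(peaks)
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(diff - mass) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls

# Synthetic ladder: hypothetical start value plus the residues of series 1.
ladder = [175.119]
for aa in "DGDQDFL":
    ladder.append(ladder[-1] + RESIDUE_MASS[aa])
print(read_ladder(ladder))
```

At this tolerance Q and K cannot be told apart, and I and L are isobaric, so both pairs are reported as ambiguous. A spacing that matches no residue (reported as "?") may span two residues, as with the 172 spacing in the manual interpretation.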
Proline-induced fragmentation (N-terminal to proline; Xaa|Pro). Most abundant when Xaa is Val, His, Ile, Leu or Asp; not abundant when Xaa is Pro or Gly (Breci, 2003; Kapp, 2003). Example: fragmentation of VVAASLNPVDFK, singly charged ion.
Fragmentation of VPTVDVSVVDLTVR, singly charged ion. No y13???
Proton mobility. In a peptide with the proton predominantly associated with a basic residue (B, e.g. Lys), the proton is mobilized by collisional activation (CID or PSD). If an arginine is present (the most basic residue), the proton is sequestered (non-mobile) and requires more energy for mobilization. If more protons than arginines are present, a mobile proton will be present. [Figure: peptide backbone structures with the proton located on B, Lys or Arg; panels labelled MALDI and ESI]
Acidic-residue-induced fragmentation (Paizs and Suhai) [Figure: mechanism involving Arg and Asp]. Also Glu, but to a lesser degree. Charge-remote fragmentation, in contrast to charge-induced fragmentation.
Fragmentation of VPTVDVSVVDLTVR, singly charged ion: y4 and y9 marked.
Dominating fragmentation-pathways for singly charged MALDI-peptide ions
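The fragmentation rules from the preceding slides can be combined into a crude predictor for singly charged ions. This is a deliberately simplified sketch under stated assumptions: a mobile proton is assumed whenever the charge exceeds the number of arginines, proline-directed cleavage is only predicted when a mobile proton is present, and cleavage C-terminal to Asp/Glu is only predicted when it is not. Real spectra are not this clear-cut, and the function names are hypothetical.

```python
def has_mobile_proton(peptide, charge=1):
    """More protons than arginines means a mobile proton is available."""
    return charge > peptide.count("R")

def predicted_intense_y(peptide, charge=1):
    """Predict y-ions expected to dominate, with the rule that applies."""
    n = len(peptide)
    mobile = has_mobile_proton(peptide, charge)
    ions = []
    for i in range(1, n):            # cleavage between residues i and i+1
        left, right = peptide[i - 1], peptide[i]
        if mobile and right == "P" and left not in "PG":
            ions.append((f"y{n - i}", "N-terminal to Pro"))
        if not mobile and left in "DE":
            ions.append((f"y{n - i}", f"C-terminal to {left}"))
    return ions

# The two singly charged example peptides from the slides.
print(predicted_intense_y("VVAASLNPVDFK"))
print(predicted_intense_y("VPTVDVSVVDLTVR"))
```

For VPTVDVSVVDLTVR the sketch predicts y9 and y4, the two ions marked on the slide, and no y13, because the single proton is held by the arginine and the proline-directed cleavage is not favoured.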
Verification of MALDI-TOF/TOF search result: LSM(ox)TNDPLEAAR [Figure: MS/MS spectra with y1, y3 and y6 annotated and a loss of 64 marked]
Prediction of intense ions
The human proteome. Strawberry (Fragaria ananassa; not sequenced). [Figure: 2D gel, isoelectric point (pI) vs. molecular weight (Mw)] Marked spots are differentially expressed between normal red strawberries and a white mutant (half up-regulated and half down-regulated). Hjernoe et al., 2005, Proteomics, in press.
Red versus white strawberries. Difference in protein expression visible by eye; difference in protein expression detected by the differential analysis software (here DeCyder).

Number of spots  Proteins found to be down-regulated  Protein identified based on homology to
1                Flavanone 3-hydroxylase              Onobrychis viciifolia
1                Dihydroflavonol reductase            Arabidopsis thaliana
4                O-methyltransferase                  Fragaria x ananassa
4                Chalcone synthase                    Fragaria x ananassa
....

All four proteins are known to be involved in the flavonoid biosynthesis pathway. One of the functions of flavonoids is to act as a pigment, giving colour to fruits.
Example of protein identification based on MS/MS spectra from a MALDI-TOF/TOF instrument: search combining both MS and MS/MS spectra, searched against all green plants. [Figure panels: MS, MS/MS]
Significant hit Details from the search result
SPITC derivatization, an N-terminal sulfonation. Reagent: 4-sulfophenyl isothiocyanate (SPITC). Marakov et al., J. Mass Spectrom. (2003) 38; Wang et al., Rapid Commun. Mass Spectrom. (2004) 18(1).
Derivatization of peptides: the peptide needs a net charge of +1 in order to be detected.
MS/MS fragmentation of NYELPDGQVITIGAER (Fragaria x ananassa), found in gi|21538, actin [Solanum tuberosum]. [Figure: MS/MS spectra, intensity vs. m/z, labelled SPITC and - SPITC (215 Da); annotated ions b2, b3 and y1 to y15; residues read from the y-ion ladder: R E A G I T I V Q G D P L E Y N] MALDI-TOF/TOF spectra obtained on an Applied Biosystems 4700 Proteomics Analyzer. Karin Hjernø.
Strawberry allergen-containing spots: the Bet v 1-homologous strawberry allergen, Fra a. [Figure: 2D gels (pI vs. Mw) of red and white strawberries; labelled spots: chalcone synthase, O-methyltransferase, dihydroflavonol reductase, flavanone 3-hydroxylase]
One spot, three distinct isoforms [Figure: three SPITC MS/MS spectra, intensity vs. m/z, with the residue read-outs K* L L T G G H P A S V L; K* L V T G H P A S V L; K* L V T G G H P A S V L]
The End....