Presentation is loading. Please wait.

Presentation is loading. Please wait.

Schedule 1.9.00Introduction 2.9.10Basic concepts of mass spectrometry (MS) and MALDI-TOF MS 3.9.30Protein identification by Peptide Mass Fingerprinting.

Similar presentations


Presentation on theme: "Schedule 1.9.00Introduction 2.9.10Basic concepts of mass spectrometry (MS) and MALDI-TOF MS 3.9.30Protein identification by Peptide Mass Fingerprinting."— Presentation transcript:

1 Schedule 1.9.00Introduction 2.9.10Basic concepts of mass spectrometry (MS) and MALDI-TOF MS 3.9.30Protein identification by Peptide Mass Fingerprinting (PMF) and db search; MASCOT score! 4.9.50Workflow example 1 : 2D-PAGE – MALDI-TOF 5.10.00Coffe break 6.10.15Electrospray MS, tandem MS, peptide fragmentation 7.10.45De novo sequencing exercise 8.11.20Discussion of exercise results 9.11.30LC-MS/MS, protein identification with MS/MS data 10.11.45Database search with MS/MS data 11.12.15 Lunch break 12.13.00Continuation of database search with MS/MS data 13.13.45Discussion of exercise results 14.14.15Shotgun proteomics (fractionation,complexity) 15.14.45Data interpretation and validation (homologous proteins, statistical evaluation) 16.15.00Example : typical experiment results : format and interpretation 17.15.30Advanced proteomics : PTMs and quantitation 18.16.00Test 19.16.30 End Red : exercises

2 Why do proteomics ? Proteins are the main effectors of all cellular functions Proteins have a complex life that cannot be studied by DNA- or RNA-based techniques –Regulation of translation –Maturation, processing –Sorting, targeting, secretion –PTM (Post Translational Modifications) –Protein-protein interactions –Degradation –… [ RNA ] [ protein ] ?

3 Challenges of in vivo proteomics Complexity : –35’000 genes (?) in H. sapiens, –15’000 expressed in a single cell ? – but 50’000-150’000 chemically different protein species ? Dynamic range : –10 5 x or 10 6 x between low and high-abundance proteins Plasticity : –continuous variation in protein expression pattern, PTM’s, degradation,… Proteins… –have vastly different physico-chemical properties (acidic, basic, hydrophilic, hydrophobic, …) –cannot be amplified …. 

4 www.plasmaproteome.org

5 Tools needed 1) Separate ABdyxzW1 a2Pn electrophoresis (1D, 2D), Isoelectric focusing liquid chromatography, affinity, … 2)Analyse mass spectrometry

6 Mass Spectrometry in proteomics

7 Mass spectrometry : essential functions SAMPLE ION GENERATION ION SEPARATION ION DETECTION ION SOURCEMASS ANALYZER DETECTOR ESI : Electrospray Ionisation MALDI : Matrix Assisted Laser Disorption/Ionization Quadrupoles Ion traps Time-of-flight with reflectron TOF/TOF FT-ICR (Fourier transform – Ion Cyclotron Resonance) Faraday cup Scintillation counter Electromultiplier High-energy dynodes with electronmultiplier Array (detector) FT-MS

8 Diagram of a mass spectrometerinletinstrumentcontrolsystem data storage vacuum system ionisationsourcemassanalyzerdetector

9 Masses and mass measurements All mass spectrometers function by measuring molecules in their ionized state All values determined by MS are relative to the m/z assumed by the molecule after the ionization process The relationship between the molecular mass (m) and the m/z value can be calculated as follows: m/z = (m + (mA * z )) / z mA is the mass of the adduct responsible for ionization (typically H+ for positive MS mode).

10 Amino acid residue masses and related theoretical immonium ions One letter Code Three-letter code Monoisotopic mass Immonium AAla:71.0371144.0495 CCys103.0091976.0215 DAsp115.0269488.0393 EGlu129.04259102.0550 FPhe147.06841120.0808 GGly57.0214630.0338 HHis137.05891110.0713 IIle113.0840686.0964 KLys128.09496101.1073 LLeu113.0840686.0964 MMet131.04049104.0528 NAsn114.0429387.0553 PPro97.0527670.0651 QGln128.05858101.0709 RArg156.10111129.1135 SSer87.0320360.0444 TThr101.0476874.0600 VVal99.0684172.0808 WTrp186.07931159.0917 YTyr163.06333136.0757 H R NHNH C O C OH H 117 n  m residues n + + m =

11 -> m/z (mono) of VATVSLPR 1+ : ( 841.502 + 1.008 ) /1 = 842.510 -> m/z (mono) of LGE…AAK 3+ : ( 2210.097 + (3 x 1.008 )) /3 = 737.707 SequenceCompositionMW(mono)MW(ave)m/z (mono) m/z(ave) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ VATVSLPRC37.H68.N11.O11841.5022841.999842.510843.014 LGEHNIDVLEGNEQFINAAKC96.H152.N27.O332210.0972211.3972211.1052211.105 Isotope Atomic mass% AbundanceElementMr(mono.)Mr(avg.) 1H 1.007899.985H1.00781.0080 2H 2.0141 0.015 ----------------------------------------------------------------------------------------------------------------------------------- 12C12.000098.93C 12.000012.0107 13C13.0034 1.07 ------------------------------------------------------------------------------------------------------------------------------------ 35Cl34.968975.78Cl34.968935.4525 37Cl24.2224.22 m/z = (m + (mA * z )) / z Mass distribution and mass measurements

12 Theoretical MS : the effect of resolution The average mass corresponds to the “top of the peak” of a measurement done at low resolution Avg. Mono.

13 839.0840.6842.2843.8845.4847.0 Mass (m/z) 00 10 20 30 40 50 60 70 80 90 100 842.502 (all 12C) 843.485 1x 13C 844.480 2x 13C % Intensity Average Monoisotopic Experimental : 1+ ion (MALDI) Example : Peptide VATVSLPR 1+ : ( 841.502 + 1.008 ) /1 = 842.510

14 MALDI (Matrix Assisted Laser Desoprtion Ionisation) MALDI TOF (Time Of Flight) MALDI IONISATION

15 Great for Peptide Mass Fingerprinting –Fast –Easy to measure –Sensitive –Salt-tolerant (to some extent) –Also good for larger MW (small proteins) –Sample on a stable support (no time constraints) –1+ ions  simpler data analysis High accuracy needs careful calibration Difficult (but possible) to do MS/MS by MALDI Signal suppression in complex mixtures Crystallisation conditions influence results disadvantages MALDI TOF

16 Protein identification by Peptide Mass Fingerprinting (PMF)

17 MS-based protein identification : concept Protein sample Protein fragments (5-30 AA peptides) Exact masses of peptides Fragmentation (MS/MS) spectrum of each peptide Protein sequence(s) Protein fragment sequences (same protease specificity) Calculated exact masses of peptides Calculated fragmentation spectrum of each peptide Specific protease e.g. trypsin MS software Best Match(es) experimentalIn silico

18 Software (MASCOT, SEQUEST,…) Protease digestion (Trypsin) Peptide extraction Theoretical MS data Sequence database Best matching Sequence(s) MS m/z Experimental MS data MS-based protein identification : concept

19 MALDI-TOF of a tryptic digest of a protein

20 Extracted peak list m/z 847.5896 869.0722 922.5712 923.5815 927.5904 1022.5551 1050.5533 1163.7695 1164.7531 1193.7393 1249.7705 1250.8103 1296.8556 1297.8499 1305.8668 1416.8929 1440.0008 1479.9773 1482.9583 1567.9417 1640.1635 1824.06 … Information contained in MS spectrum

21 Search form…

22

23 MALDI-TOF of a tryptic digest of BSA YLYEIAR LSQKFPK LVNELTEFAK FKDLGEEHFK HPEYAVSVLLR HLVDEPQNLIK LGEYGFQNALIVR KVPQVSTPTLVEVSR DAFLGSFLYEYSR ? ? ? ? ? ?

24 PMF example Dbtaxonomyerror tolscore#pepsthreshold 2nd hit -------------------------------------------------------------------------------------------------------------------------- ---------------- SPall50 ppm1082267 30 SPall500 ppm652367 44 SPA.thaliana500 ppm652350 28 SPA.thaliana50 ppm1082250 26 SPA.thaliana30 ppm1102250 20 1121.584 1277.698 1407.672 1415.738 1419.779 1547.878 1622.802 1658.941 1745.896 1815.038 1877.932 1893.92 1970.2 854.487 856.497 864.463 870.515 1045.564 1106.061 1106.563 1126.571 1143.570 1203.671 1384.788 1437.709 1442.767 1454.853 1464.699 1542.655 1607.827 1660.954 1671.840 1802.949 1940.971 2000.059 2211.106 2224.111 2225.115 2231.210 2233.104 2239.136 2249.064 2283.180 2299.177 Search Parameters : Type of search : Peptide Mass Fingerprint ----- Database : SwissProt 53.1 Enzyme : Trypsin Fixed modifications : Carbamidomethyl (C) ----- Variable modifications : Oxidation (M) Mass values : Monoisotopic - - - Protein Mass : Unrestricted Peptide Mass Tolerance : ± 30 ppm Peptide Charge State : 1+ Max Missed Cleavages : 1 Threshold = f (database size) ; score = f ( #matches, « uniqueness »)  Peak list

25 Factors affecting performance of PMF Sample purity : PMF cannot deal with mixtures of more than 2 proteins. Therefore this method is best coupled to high resolution protein purification techniques such as 2D-PAGE Specificity of the digestion : high purity trypsin (cleaves after K / R ) is the most commonly used enzyme Employing the highest mass accuracy available in the measurement of peptide masses. Less than 20 ppm error can be routinely achieved with modern instruments. Such a degree of precision is essential for identifying proteins from species with a large genome.

26 WORKFLOWS 1 : „classical“ 2D-PAGE + MALDI TOF

27 1 1 2 2 3A 3B 4 4 5 6 6 7 7 8 8 9 9 10 11 12 1 14 8 15 Workflow 1 : adaptation of bacteria to growth conditions Normal mediumLow Glucose Wick LM, et al, Environ Microbiol 3: 588-599, 2001 E.Coli adapts to a very low glucose medium by up- and downregulating a set of 15 proteins

28 M1M1 M2M2 M3M3 M4M4 M5M5 M6M6 M7M7 M8M8 Peptide Mass Fingerprinting Search with mass list M 1......M 8 Tryptic digestion Nrmw*pI*Acc. N.NameFunction 152.25.07P25553aldAcentral metabolism 260.36.21P23847dppApeptide transport 347.55.06P05313aceACentral metabolism 448.456.71P10904ugpBtransport 555.26.02P00822atpAproton transport/energy 6??P76108ydcStransport 741.37.03P02917livJamino acid transport 840.75.22P02928malEsugar transport 933.365..25P02927mglBsugar transport 1061.55.81P37192gatYcatabolism 1130.97.76P02925rbsBsugar transport 1225.85.22P09551argTamino acid transport Workflow 1 : adaptation of bacteria to growth conditions Metabolic enzymes and transport proteins affected Utilisation of alternative sources of energy

29 P. Kebarle, M. Peschke / Analytica Chimica Acta 406 (2000) 11–35 molecules compete for ionisation e.g. Na+ >> Peptide ELECTROSPRAY IONISATION

30 Great for MS/MS –Can be directly coupled to reversed phase LC (separation !) –Sensitive –Excellent for MS/MS due to 2+/3+ ions Sample introduction more complex Data analysis more difficult (2+/3+ ions) one-shot sample analysis (time constraints) Very low tolerance to contaminants disadvantages ELECTROSPRAY IONISATION

31 1+ versus 2+ ions

32 Modes of measurement : MS & MS/MS Ion production (ionisation) Ion separation Ion detection Ion production (ionisation) Ion separation – isolation of “parent” ion Ion fragmentation (CID) Ion separation – separate fragment ions Ion detection - measure fragment ions MS Tandem MS MS/MS

33 CID= collision induced dissociation Low energy ( > 100 eV) Precursor ion = parent ion : the one being fragmented Daughter ions = fragment ions produced by CID Tandem mass spectrometry = MS/MS - here : the combination of ion selection / CID / fragment analysis ESI of tryptic peptides typically generates doubly charged ions due to the presence of Lys or Arg at the C terminal end of the peptides y- and b- ion series fragments are usually observed in MS/MS fragmentation spectra. MS/MS Glossary and facts

34 Covalent bonds being broken  ion series

35

36 Method for partial interpretation of MS/MS spectra Determine charge state of parent ion, calculate MH+ mass, Clean up spectrum if necessary Inspect high mass end for b and y ions and residue mass loss: Last b ion is MH+ - 18 - residue mass; first y ion is MH+ - residue mass, Start at high mass end above precursor and look for descending ion series by sequentially subtracting residue masses OR Start at y1 ion (147 or 175 if tryptic) and look for ascending ion series If ambiguities arise: - check if any of the peaks of interest is a 2+ ion, - has a complementary peak that has already been used (thus belongs to another series), - is a -18 or -28 of another major peak. Check if any of the residues in the extracted tag is confirmed by immonium ions, Be careful about dipeptide masses that match residue masses, for example G+G=N Question :what are the main problems with this approach ?

37 ESI & Quadrupole-based instruments Triple Quadrupole Triple Quadrupole- Ion trap Quadrupole-Quadrupole TOF Ion Trap (3D trap) All these instruments can perform MS/MS fragmentation experiments

38 3D ion trap (Paul trap) functions : - Trap all ions / scan  MS (full scan) - Trap all ions / isolate precursor / scan  zoom scan - Trap all ions / isolate precursor / fragment / scan fragments  MS/MS scan - Trap all ions / isolate precursor / fragment / … repeat n times …/ scan fragments  MS n scan Example 1 : electrospray ion trap MS

39 REFLECTRON SKIMMER SAMPLING CONE ELECTROSPRAY NEEDLE HEXAPOLE GAS COLLISION CELL DETECTORPUSHER TOF RF HEXAPOLE Example 2 : Quadrupole Time of flight Mass Spectrometer (QTof) electrospray source

40 MS-based protein identification : concept Protein sample Protein fragments (5-30 AA peptides) Exact masses of peptides Fragmentation (MS/MS) spectrum of each peptide Protein sequence(s) Protein fragment sequences (same protease specificity) Calculated exact masses of peptides Calculated fragmentation spectrum of each peptide Specific protease e.g. trypsin MS software Best Match(es) experimentalIn silico

41 Experimental set-up : nanoLC-MS/MS database correlation C18 Column L = 10 cm ID = 50-100 µm Mass spectrometer T-splitter ~5 % HPLC pumps ~95% further analysis (waste) sample

42 510152025303540455055606570758085 Time, min Intensity, cps 1 2 Data collection Cycle : a) full scan (MS) b) detect peaks, choose ions to fragment (charge states, exclusion list) c) isolate ion, fragment, collect spectrum d) go back to a) 450500550600650700750800 m/z, amu 476.2484 554.2839634.6555 751.8398 514.2206 Full scan 100150200250300350400450500550600650700750800850900 m/z, amu 290.15 175.1259 86.1037 741.21 636.25 129.07 765.30476.22 405.18 272.07 310.13541.17 619.22872.31 MS/MS of (1) 15020025030035040045050055060065070080085090095010001050110011501200 m/z, amu 265.1288 247.11 275.11 404.16 538.21 136.0824 293.1229 796.31 667.27 378.19 909.38 201.0968 1010.43 558.23 422.1668 187.1512284.1341 475.191081.48 MS/MS of (2) An on-line LC-ESI-MS experiment with automatic data acquisition

43 For automatic identification with MS/MS data, raw data files are submitted for searching the database. An export software parses the LC-MS file and identifies all MS/MS spectra. Each spectrum is then extracted, together with associate information (precursor ion mass, elution time,etc). Individual spectra are then treated to remove noise and, importantly, all fragment peaks are centroided and deisotoped. It results, for each spectrum, a flat text file that consists of header information followed by a mass / intensity list. We show below a spectrum reduced in such a format for the MASCOT software. An input file for a database search can contain hundreds or thousands of these spectra. MASCOT input data – example (one spectrum) ; Mass / intensity list SEARCH=MIS REPTYPE=Peptide BEGIN IONS PEPMASS=488.475147958288 CHARGE=3+ TITLE=Elution from: 22.8 to 22.82 period: 0 experiment: 1 cycles: 2 74.05972.5 84.0392 1.5 119.1633 28.5 129.0928 3 143.8366 5.5 147.1138 4 171.1069 7 227.0473 2 233.6612 7 262.0674 11.5 …… ….. … END IONS Data export

44 +TOF Product (653.3) 1002003004005006007008009001000110012001300 m/z, amu 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 19.7 Intensity, counts 251.12 110.06 223.13 138.05 653.36 332.21 594.25 86.09 350.24712.46 1055.49 373.30 566.32 465.16 956.411168.56166.05841.53 206.16 a1 b1 I a2 b2 y10 I b3 y9 b4y8 a5 b5 y7 y6 b8 y3 I y2 y1 ResidueImmoniumaby ----------------------------------------------------------------------------------- H, His110.07110.07 138.061305.71 L, Leu86.09223.15 251.151168.65 V, Val72.08322.22 350.211055.57 D, Asp88.03437.25 465.24956.50 E, Glu102.05566.29 594.28841.47 P, Pro70.06663.34 691.34712.43 Q, Gln101.07791.40 819.39615.38 N, Asn87.05905.44 933.44487.32 L, Leu86.09 1018.53 1046.52373.28 I, Ile86.09 1131.61571159.61260.19 K, Lys 101.10 1259.71061287.70147.11 MH 2 2+ precursor Matching of MS / MS data Black : predicted Red : predicted and detected

45 Mascot search output Mascot Search Results Significant hits: ALBU_BOVINALBU_BOVIN (P02769) Serum albumin precursor (Allergen Bos d 6). ALBU_CANFAALBU_CANFA (P49822) Serum albumin precursor (Allergen Can f 3). VWF_PIGVWF_PIG (Q28833) Von Willebrand factor precursor (vWF) (Fragment). CIQ3_BOVINCIQ3_BOVIN (P58126) Voltage-gated potassium channel protein KQT-like 3 RYR2_RABITRYR2_RABIT (P30957) Ryanodine receptor 2 (Cardiac muscle-type ryanodin K2CA_BOVINK2CA_BOVIN (P04263) Keratin, type II cytoskeletal 68 kDa, component IA ALFB_RABITALFB_RABIT (P79226) Fructose-bisphosphate aldolase B (EC 4.1.2.13) (Li ALBU_BOVINALBU_BOVIN Mass: 71244 Total score: 711 Peptides matched: 12 (P02769) Serum albumin precursor (Allergen Bos d 6). Observed Mr(expt) Mr(calc) Delta Miss Score Rank Peptide 1515 461.80 921.58 921.48 0.10 0 55 1 AEFVEVTK 1919 501.80 1001.58 1001.58 0.01 0 31 1 LVVSTQTALA 3131 569.80 1137.58 1137.49 0.09 0 72 1 CCTESLVNR 3333 582.30 1162.58 1162.62 -0.04 0 76 1 LVNELTEFAK 4747 653.40 1304.78 1304.71 0.08 0 90 1 HLVDEPQNLIK 5858 722.40 1442.78 1442.63 0.15 0 87 1 YICDNQDTISSK 6060 739.80 1477.58 1477.52 0.07 0 43 1 ETYGDMADCCEK 6161 740.40 1478.78 1478.79 -0.00 0 61 1 LGEYGFQNALIVR 6464 751.90 1501.78 1501.61 0.18 0 37 1 EYEATLEECCAK 7474 547.30 1638.88 1638.93 -0.05 1 85 1 KVPQVSTPTLVEVSR 9797 627.70 1880.08 1879.91 0.16 0 34 1 RPCFSALTPDETYVPK 9898 636.70 1907.08 1906.91 0.16 0 42 1 LFTFHADICTLPDTEK

46 Orthogonal datasets and confidence levels Database : 100’000 sequences 500 spectra  Probability of one (any) spectrum “accidentally” matching a sequence (wrong match) : 1/100’000 x 500 = 5.10 -3 (0.005) Probability of 2 spectra “accidentally” matching the same sequence (wrong match) : 5.10 -3 x 5.10 -3 = 2.5 x 10 -5 Much higher confidence of identification with at least two peptides matching the same protein sequence Every peptide is unambiguously assigned to its “parent “ sequence, therefore many proteins can be identified in one sample during one run PMF : one MS spectrum  one dataset (peak list) MS/MS : n MS/MS spectra  n orthogonal datasets

47 m/z time Chromatographic Separation (reversed-phase) Tandem mass spectra of 50- 2000 peptides Q7Z5Y2Q7Z5Y2 Mass: 118789 Total score: 178 Peptides matched: 6 Rho-interacting protein 3. Mr(calc) ScorePeptide 930.48 42EGLTVQER 1032.54 11NWIQTIMK 1206.63 29FSLCILTPEK 1369.75 24LSTHELTSLLEK 1406.77 55FFILYEHGLLR 1775.88 16QVPIAPVHLSSEDGGDR Protease digestion Peptide extraction Nano-HPLC MS/MS Summary : Typical Analytical Workflow Database matches DHX9_HUMANDHX9_HUMAN ATP-dependent RNA helicase A NFM_HUMANNFM_HUMAN Neurofilament triplet M protein Q9BQG0Q9BQG0 Hypothetical protein MYO6_HUMANMYO6_HUMAN Myosin VI. TP2A_PIGTP2A_PIG DNA topoisomerase II, alpha isozyme Q7Z5Y2Q7Z5Y2 Rho-interacting protein 3. FLIH_HUMANFLIH_HUMAN Flightless-I protein homolog. TP2B_MOUSETP2B_MOUSE DNA topoisomerase II, beta isozyme S3B1_HUMANS3B1_HUMAN Splicing factor 3B subunit Q8VCW5Q8VCW5 Similar to alpha internexin neuronal Q8CHF9Q8CHF9MKIAA0376 protein (Fragment). Protein sequence database Database searching Software (MASCOT) Output : Protein identification in simple/complex mixtures Extensive sequence coverage and peptide mapping Analysis of modified peptides possible Biological question

48 DB search with MS/MS exercise http://www.matrixscience.com/cgi/search_form.pl?FORMVER=2&SEARCH=MIS ! Searches can be bookmarked for later accession !! ------------------------------------------------------------------------------------------------------------ Datasets : three different ones : (1)BSA spectra used for the novo ; file : 461_740_only.mgf (2)BSA LC run 136 spectra ; file : BSA2.mgf (3)Rafts complex sample partial 123 spectra ; file : rafts1_123spectra.mgf ------------------------------------------------------------------------------------------------------------ (1)Parameters Search : Database: SwissProt, Taxonomy: Mammalia, Enzyme: Trypsin, Instrument: ESI-QUAD-TOF, Report top 10 hits; i)search with Peptide tol. 0.3 Da, MS/MS tol. 0.6 Da ii)search with Peptide tol. 2.0 Da, MS/MS tol. 0.6 Da iii)search with Peptide tol. 2.0 Da, MS/MS tol. 2.0 Da Output : Peptide summary, Ions score cutoff = 0, Require bold red =0 Observe : peptide scores, score of first unrelated sequence, Mascot score threshold

49 (2)Parameters as in (1) with Search : Report top AUTO hits, Peptide tol. 0.3 Da, MS/MS tol. 0.6 Da, Fixed modification: Carbamidomethyl (C), Variable modification: Oxidation (M) i) search with Enzyme: Trypsin ii) search with Enzyme: semi-Trypsin Output : use Standard scoring and Ion cut-off of 14 Observe : number of matches for BSA, protein scores, species of matched proteins, mass error of peptides matched, compare spectra of high scored (> 60) and low scored (<35) peptides, for ii) look in particular at additional peptides

50 (3)Parameters as in (2) with i)search with Database: human, Variable modification: Oxidation (M) ii)search with Database: human, Variable modifications: Oxidation (M), Phospho (ST), Acetyl (N-term) iii)search with Database: Escherichia coli, Variable modification: Oxidation (M) In Mascot output, use Standard scoring, Ion cut-off of 14 and Require bold red  Observe : number of protein identified, homologous sequences (see bold red vs red peptides), Mascot score threshold, unassigned peptides for i) look in particular at the seven last hits: are they convincing? for ii) look in particular at additional peptides and protein identified: are they convincing ? Following discussion will highlight these aspects : - result format and what everything means - parameters for validation : mass error, sequence, fragment patterns, … - strong hits vs weak hits : relationship with protein amount - selectivity vs comprehensiveness : chosing the right error tolerance and other parameters - homologous proteins problem

51 click Results format Default Input (change according to exercise)

52 WORKFLOWS 2 : General shotgun protein identification techniques example : Affinity pull-down + 1D-PAGE + LC-MS/MS

53 Shotgun sequencing from complex mixtures Denaturation, Proteolytic digestion Db search Multiprotein complex List of identified proteins 1.P45218 2.P21543 3.Q12588 4.P32651 5.Q01245 6.…. Complex peptide mixture (1000-20000 species) Rp-LC-MSMS run

54 Alternative : MuDPIT (Multi Dimensional Protein Identification Technology) Denaturation, Proteolytic digestion Strong Cation Exchange (SCX) separation Db search Multiprotein mixture (complex) List of identified proteins 1.P45218 2.P21543 3.Q12588 4.P32651 5.Q01245 6.…. Complex peptide mixture (1000 - 2000000) Rp-LC-MSMS runs Post-digestion separation of peptides by two-dimensional liquid chromatography instead of separation of proteins

55 Peptide validation problem There are good hits, bad hits, and everything in between… ? True positives with weak spectra False positives Homologous sequences Many solutions proposed for validation : manual, semimanual, training sets, mass accuracy,…. statistical validation is now popular

56 Monoisotopic mass of neutral peptide Mr(calc): 1113.622 Fixed modifications: Carbamidomethyl (C) Ions Score: 69 Expect: 3.8e-07 Matches (Bold Red): 17/90 fragment ions using 32 most intense peaks A good hit (Mascot score=69) MS/MS Fragmentation of VMLAANIGTPK -- Found in Zurbriggen_Ttengcongensis Match to Query 525: 1113.616556 from(557.815554,2+) F020857.dat

57 Monoisotopic Mr(calc): 1478.7994 Fixed modifications: Carbamidomethyl (C) Ions Score: 13 Expect: 18 Matches (Bold Red): 10/110 fragment ions using 26 most intense peaks A bad hit (Mascot score=13) MS/MS Fragmentation of VSIALSSHWINPR Found in KLOT_MOUSE, Klotho precursor - Mus musculus (Mouse) Match to Query 2: 1478.838768 from(740.426660,2+) * * * * * * * Strong unmatched peaks *

58 Further processing of Mascot results ( shotgun experiment, one or more samples) Mascot Search Results Significant hits: ALBU_BOVINALBU_BOVIN (P02769) Serum albumin precursor (Allergen Bos d 6). ALBU_CANFAALBU_CANFA (P49822) Serum albumin precursor (Allergen Can f 3). VWF_PIGVWF_PIG (Q28833) Von Willebrand factor precursor (vWF) (Fragment). CIQ3_BOVINCIQ3_BOVIN (P58126) Voltage-gated potassium channel protein KQT-like 3 RYR2_RABITRYR2_RABIT (P30957) Ryanodine receptor 2 (Cardiac muscle-type ryanodin K2CA_BOVINK2CA_BOVIN (P04263) Keratin, type II cytoskeletal 68 kDa, component IA ALFB_RABITALFB_RABIT (P79226) Fructose-bisphosphate aldolase B (EC 4.1.2.13) (Li Mascot Search Results Significant hits: ALBU_BOVINALBU_BOVIN (P02769) Serum albumin precursor (Allergen Bos d 6). ALBU_CANFAALBU_CANFA (P49822) Serum albumin precursor (Allergen Can f 3). VWF_PIGVWF_PIG (Q28833) Von Willebrand factor precursor (vWF) (Fragment). CIQ3_BOVINCIQ3_BOVIN (P58126) Voltage-gated potassium channel protein KQT-like 3 RYR2_RABITRYR2_RABIT (P30957) Ryanodine receptor 2 (Cardiac muscle-type ryanodin K2CA_BOVINK2CA_BOVIN (P04263) Keratin, type II cytoskeletal 68 kDa, component IA ALFB_RABITALFB_RABIT (P79226) Fructose-bisphosphate aldolase B (EC 4.1.2.13) (Li Sample 1 Sample 2 Mascot -Statistical Validation -Protein assignment (parsimony) -Sample alignment Scaffold 1 2 Final list (.xls)

59 Protein validation problem ! We are not identifying proteins, but peptides ! 1) Apply principle of parsimony (Occam’s razor) : within a family, list protein which can explain the most of the identified peptides 2) To highlight the presence of a member of a protein family, at least one discriminating (unique) peptide must be present (what if it is a borderline hit ?) ??? Worst case : two distinct homologous members of the same family found in two samples, each with a weak discriminating peptide….what to say ?

60 Free Scaffold viewer : www.proteomesoftware.com Data analysis and distribution software : Scaffold Protein ID probability Nb unique peptides Nb identified spectra Nb unique spectra Percentage of total spectra

61 “Advanced” proteomics Quantitative proteomics –Isotope labelling : metabolic vs chemical : ICAT, iTRAQ, SILAC –Label-free Proteome subsets –Phosphoproteome –Ubiquitinated proteins –… Clinical proteomics (marker discovery) –Too vast to summarise Proteome imaging –MALDI of tissues

62 Relative quantification : Comparison of proteins from samples A vs B ? Which proteins change in amount and how much ? Applications : -Healthy vs. diseased tissues -Healthy vs. diseased body fluids -Drug treated / untreated cells -Stimulated / unstimulated cells -Mutants / wt cells -……..

63  m = 9 Da Peptide from sample A Peptide from sample B Relative quantitation : stable isotope labelling is very fashionable! Sample A : light isotopeSample B : heavy isotope mix, digest Quantitate and identify ( MS)

64 How to label ? -chemically, post protein synthesis  “specific” chemical modification of AA side chain (+) any sample can be done (-) side reactions -metabolically, during protein synthesis  Incorporation of one or more labelled amino acid (+) “native” proteins (-) need cultivable organism

65 d0- or d8-ICAT Biotin tag Linker (heavy or light) S N N O N O O O N I O O X X X X X X X X Thiolreactive (X= H or D) State 1 State 2 Transition states [protein 1 ] [protein 2 ]. [protein n ] [protein 1 ] [protein 2 ]. [protein n ] Isotope Coded Affinity Tag (ICAT) reagents

66 Cell State 1Cell State 2 Combine samples Intensity m/z A1 A2 A3 B1 B2 Identify proteins by MS/MS Intensity m/z aa1 aa2 aa3 aa4 Quantitate protein levels by H8 / D8 peak heigth ratios New Methods : ICAT:quantitation and identification Biotin NHNH O I HH HHHH HH NHNH O I DD DDDD DD Modify with (H8)-ICAT Modify with (d8)-ICAT Digest Trypsin Purify Cys-containing peptides on avidin column HS- -SH

67 Avidin Affinity Chromatography control treatment ICAT label Combine & proteolyze d0d0 d0d8 d0/d8 d8d8 Fraction # (Time (min)) KCl [M] 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.3 0.6 701020304050 60 Ion-Exchange OD 214 % AcN R.A. (%) 0 50 100 010050 Time (min) 10204050607030 LC-MS/MS Area = 1.21x10 9 Area = 1.01x10 9 d0 d8 d8/d0 d8/d0 = 1:0.83 A) Identify B) Quantify 1) 2) 3) 4) 5) 6) Pair wise ICAT with Multidimensional Chromatography

68 ICAT (+) and (-) - relative protein quantification by MS - simplification of complex mixtures by selecting a subset of peptides after digestion - eliminate analytical variability by mixing samples ~15 different isotope labelling methods developed in the last 5 years !! - protein quantification unreliable for weak signals - affinity purification (avidin) : losses for low amounts - multiple side reactions possible + -

69 Recent ICAT studies (R. Aebersold’s group) Wollscheid B, von Haller PD, Yi E, Donohoe S, Vaughn K, Keller A, Nesvizhskii AI, Eng J, Li XJ, Goodlett DR, Aebersold R, Watts JD. Lipid raft proteins and their identification in T lymphocytes. Subcell Biochem. 2004;37:121-52 Yan W, Lee H, Yi EC, Reiss D, Shannon P, Kwieciszewski BK, Coito C, Li XJ, Keller A, Eng J, Galitski T, Goodlett DR, Aebersold R, Katze MG. System-based proteomic analysis of the interferon response in human liver cells. Genome Biol. 2004;5(8):R54. Giglia-Mari G, Coin F, Ranish JA, Hoogstraten D, Theil A, Wijgers N, Jaspers NG, Raams A, Argentini M, van der Spek PJ, Botta E, Stefanini M, Egly JM, Aebersold R, Hoeijmakers JH, Vermeulen W. A new, tenth subunit of TFIIH is responsible for the DNA repair syndrome trichothiodystrophy group A. Nat Genet. 2004 Jul;36(7):714-9. Ranish JA, Hahn S, Lu Y, Yi EC, Li XJ, Eng J, Aebersold R. Identification of TFB5, a new component of general transcription and DNA repair factor IIH. Nat Genet. 2004 Jul;36(7):707-13. Hardwidge PR, Rodriguez-Escudero I, Goode D, Donohoe S, Eng J, Goodlett DR, Aebersold R, Finlay BB Proteomic analysis of the intestinal epithelial cell response to enteropathogenic Escherichia coli. J Biol Chem. 2004 May 7;279(19):20127-36. Zhang J, Goodlett DR, Peskind ER, Quinn JF, Zhou Y, Wang Q, Pan C, Yi E, Eng J, Aebersold RH, Montine TJ. Quantitative proteomic analysis of age-related changes in human cerebrospinal fluid. Neurobiol Aging. 2005 Feb;26(2):207-27. Marelli M, Smith JJ, Jung S, Yi E, Nesvizhskii AI, Christmas RH, Saleem RA, Tam YY, Fagarasanu A, Goodlett DR, Aebersold R, Rachubinski RA, Aitchison JD. Quantitative mass spectrometry reveals a role for the GTPase Rho1p in actin organization on the peroxisome membrane. J Cell Biol. 2004 Dec 20;167(6):1099-112. Epub 2004 Dec 13.

70 Label light / heavy cultures (Leu d0 / d3) Stimulate heavy cells Mix cells or lysates Purify fraction of interest Analyse by LC-MS/MS (->ID) Quantify signals of ion pairs SILAC Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics. 2002 May;1(5):376-86.

71 SILAC (+) and (-) relative protein quantification by MS eliminate preparative variability by mixing samples immediately after culture eliminate analytical variability peptides in native state (no side reactions) protein quantification unreliable for very weak signals mass shift variable (dependent on number of residues) only feasible with organisms in culture + -

72 Recent SILAC articles Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics. 2002 May;1(5):376-86. Blagoev B, Ong SE, Kratchmarova I, Mann M. Temporal analysis of phosphotyrosine-dependent signaling networks by quantitative proteomics. Nat Biotechnol. 2004 Sep;22(9):1139-45. Epub 2004 Aug 15. Gruhler A, Olsen JV, Mohammed S, Mortensen P, Faergeman NJ, Mann M, Jensen ON. Quantitative Phosphoproteomics Applied to the Yeast Pheromone Signaling Pathway. Mol Cell Proteomics. 2005 Mar;4(3):310-327. de Hoog CL, Foster LJ, Mann M. RNA and RNA binding proteins participate in early stages of cell spreading through spreading initiation centers. Cell. 2004 May 28;117(5):649-62. Ong SE, Kratchmarova I, Mann M. Properties of 13C-substituted arginine in stable isotope labeling by amino acids in cell culture (SILAC). J Proteome Res. 2003 Mar-Apr;2(2):173-81. Blagoev B, Kratchmarova I, Ong SE, Nielsen M, Foster LJ, Mann M. A proteomics strategy to elucidate functional protein-protein interactions applied to EGF signaling. Nat Biotechnol. 2003 Mar;21(3):315-8. Epub 2003 Feb 10. Foster LJ, De Hoog CL, Mann M. Unbiased quantitative proteomics of lipid rafts reveals high specificity for signaling factors. Proc Natl Acad Sci U S A. 2003 May 13;100(10):5813-8. Epub 2003 Apr 30.

73 Proteome subset : Phosphoproteome : S. Ficarro Brill LM, Salomon AR, Ficarro SB, Mukherji M, Stettler-Gill M, Peters EC. Robust phosphoproteomic profiling of tyrosine phosphorylation sites from human T cells using immobilized metal affinity chromatography and tandem mass spectrometry. Anal Chem. 2004 May 15;76(10):2763-72. Salomon AR, Ficarro SB, Brill LM, Brinker A, Phung QT, Ericson C, Sauer K, Brock A, Horn DM, Schultz PG, Peters EC. Profiling of tyrosine phosphorylation pathways in human cells using mass spectrometry. Proc Natl Acad Sci U S A. 2003 Jan 21;100(2):443-8. Epub 2003 Jan 9. P P  -P-Tyr antibody P P P P P P IMAC Immobilised Metal Affinity Chromatogra phy Esterify Asp, Glu Trypsin Digestion LC-MS/MS

74 Proteome imaging : tissue imaging by MALDI : R.Caprioli Caldwell RL, Caprioli RM. Tissue profiling by mass spectrometry: A review of methodology and applications. Mol Cell Proteomics. 2005 Jan 26; [Epub ahead of print] Reyzer ML, Caldwell RL, Dugger TC, Forbes JT, Ritter CA, Guix M, Arteaga CL, Caprioli RM. Early changes in protein expression detected by mass spectrometry predict tumor response to molecular therapeutics. Cancer Res. 2004 Dec 15;64(24):9093-100. Chaurand P, Sanders ME, Jensen RA, Caprioli RM. Proteomics in diagnostic pathology: profiling and imaging proteins directly in tissue sections. Am J Pathol. 2004 Oct;165(4):1057-68. Review. Pierson J, Norris JL, Aerni HR, Svenningsson P, Caprioli RM, Andren PE. Molecular profiling of experimental Parkinson's disease: direct analysis of peptides and proteins on brain tissue sections by MALDI mass spectrometry. J Proteome Res. 2004 Mar-Apr;3(2):289- 95. ….

75 References and web links REVIEWS AND TUTORIALS : 1) Molecular Biologist’s Guide to Proteomics, Paul R. Graves and Timothy A. J. Haystead MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, Mar. 2002, p. 39–63 Vol. 66, No. 1 2) Steen H, Mann M., The ABC's (and XYZ's) of peptide sequencing. Nat Rev Mol Cell Biol. 2004 Sep;5(9):699-711. Links : Matrix science help site : www.matrixscience.com/help_index.htmlwww.matrixscience.com/help_index.html Expasy knowledgebase:www.expasy.org/www.expasy.org/ Expasy proteomics portal :www.expasy.org/tools/www.expasy.org/tools/ MS info resource : www.ionsource.com/www.ionsource.com/ Our links : www.unil.ch/paf/links.htmlwww.unil.ch/paf/links.html Our tutorial filewww.unil.ch/paf/page19604.htmlwww.unil.ch/paf/page19604.html Human Protein Reference Databasewww.hprd.orgwww.hprd.org


Download ppt "Schedule 1.9.00Introduction 2.9.10Basic concepts of mass spectrometry (MS) and MALDI-TOF MS 3.9.30Protein identification by Peptide Mass Fingerprinting."

Similar presentations


Ads by Google