Proteomics and Glycoproteomics (Bio-)Informatics of Protein Isoforms Nathan Edwards Department of Biochemistry and Molecular & Cellular Biology Georgetown.


1 Proteomics and Glycoproteomics (Bio-)Informatics of Protein Isoforms Nathan Edwards Department of Biochemistry and Molecular & Cellular Biology Georgetown University Medical Center

2 Outline: Tandem mass-spectrometry of peptides; detection of alternative splicing protein isoforms; phyloproteomics using top-down mass-spec.; characterization of glycoprotein microheterogeneity by mass-spectrometry.

3 Mass Spectrometer. Sample (+/- ions) passes through three components. Ionizer: MALDI, Electrospray Ionization (ESI). Mass Analyzer: Time-of-Flight (TOF), Quadrupole, Ion Trap. Detector: Electron Multiplier (EM).

4 Mass Spectrum

5 Mass is fundamental

6 Sample Preparation for MS/MS: Enzymatic Digest and Fractionation
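The enzymatic digest step can be mimicked in silico. The slide does not name the enzyme, so this minimal sketch assumes trypsin (cleavage after K or R, but not before P), which is a common choice; the function name `tryptic_digest` and the toy sequence in the usage note are illustrative only.

```python
import re

def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from an assumed trypsin rule: cut after K/R unless P follows."""
    # Zero-width split points: preceded by K or R, not followed by P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for i in range(len(pieces)):
        # Join up to `missed_cleavages` additional adjacent pieces.
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides
```

For example, `tryptic_digest("AAKRPCCKDD")` splits after the first K and after the second K, but not between R and P.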

7 Single Stage MS

8 Tandem Mass Spectrometry (MS/MS): Precursor selection

9 Tandem Mass Spectrometry (MS/MS): Precursor selection + collision-induced dissociation (CID)
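CID of a selected peptide precursor mainly breaks backbone bonds, giving b-ions (N-terminal fragments) and y-ions (C-terminal fragments). As a minimal sketch, not taken from the slides, the singly charged b/y m/z values for an unmodified peptide can be computed from standard monoisotopic residue masses; `fragment_ions` is an illustrative name.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z lists for fragment lengths 1..n-1."""
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    # b_i: first i residues plus one proton.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y_i: last i residues plus water plus one proton.
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions
```

A single residue substitution shifts every b- or y-ion that contains the changed residue, which is why the fragment ladder is sensitive to small sequence variations.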

10 Why Tandem Mass Spectrometry? MS/MS spectra provide evidence for the amino-acid sequence of functional proteins. Key concepts: spectrum acquisition is unbiased; direct observation of amino-acid sequence; sensitive to small sequence variations.

11 Unannotated Splice Isoform. Human Jurkat leukemia cell-line; lipid-raft extraction protocol, targeting T cells (von Haller, et al. MCP 2003). LIME1 gene: LCK interacting transmembrane adaptor 1. LCK gene: Leukocyte-specific protein tyrosine kinase; proto-oncogene; chromosomal aberration involving LCK in leukemias. Multiple significant peptide identifications.

12 Unannotated Splice Isoform

13 Unannotated Splice Isoform

14 Splice Isoform Anomaly. Human erythroleukemia K562 cell-line; depth of coverage study (Resing et al. Anal. Chem. 2004); Peptide Atlas A8_IP. SULT1A2 gene: Sulfotransferase family, cytosolic, 1A. 2 ESTs, 1 mRNA; mRNA from lung, small-cell carcinoma sample. Single (significant) peptide identification: five agreeing search engines; PepArML FDR < 1%; all source engines have non-significant E-values.

15 Splice Isoform Anomaly

16 Splice Isoform Anomaly

17 Translation start-site correction. Halobacterium sp. NRC-1: extreme halophilic Archaeon; insoluble membrane and soluble cytoplasmic proteins (Goo, et al. MCP 2003). GdhA1 gene: Glutamate dehydrogenase A1. Multiple significant peptide identifications. Observed start is consistent with Glimmer 3.0 prediction(s).

18 Halobacterium sp. NRC-1 ORF: GdhA1. K-score E-value vs PepArML @ 10% FDR. Many peptides inconsistent with annotated translation start site of NP_279651.

19 Translation start-site correction

20 What if there is no "smoking gun" peptide…

21 What if there is no "smoking gun" peptide…

22 What if there is no "smoking gun" peptide…

23 HER2/Neu Mouse Model of Breast Cancer (Paulovich, et al. JPR, 2007). Study of normal and tumor mammary tissue by LC-MS/MS; 1.4 million MS/MS spectra. Peptide-spectrum assignments: normal samples (N_n): 161,286 (49.7%); tumor samples (N_t): 163,068 (50.3%). 4270 proteins identified in total; 2-unique generalized protein parsimony.
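With no isoform-specific peptide, one option is to compare how a protein's peptide-spectrum assignments split between normal and tumor samples against the overall split given above. The transcript does not say which statistic was actually used, so the following is only a sketch under the assumption of an exact two-sided binomial test; `binomial_two_sided_p` and any per-protein counts passed to it are hypothetical.

```python
from math import comb

# Totals taken from the slide: peptide-spectrum assignments per condition.
N_NORMAL = 161_286
N_TUMOR = 163_068

def binomial_two_sided_p(k_tumor, k_normal,
                         p_tumor=N_TUMOR / (N_NORMAL + N_TUMOR)):
    """Exact two-sided binomial p-value for a protein's tumor/normal spectral counts.

    Null hypothesis (an assumption of this sketch): each spectrum of the protein
    falls in the tumor samples with probability p_tumor, the overall tumor share.
    """
    n = k_tumor + k_normal
    pmf = [comb(n, k) * p_tumor**k * (1 - p_tumor)**(n - k) for k in range(n + 1)]
    observed = pmf[k_tumor]
    # Sum the probability of every outcome no more likely than the observed one.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)
```

A protein whose counts split roughly evenly gives a p-value near 1, while a strongly skewed split (for instance a hypothetical 30 tumor versus 2 normal spectra) gives a very small one. Testing thousands of proteins this way would also call for a multiple-testing correction.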

24 Nascent polypeptide-associated complex subunit alpha: 7.3 x 10^-8

25 Pyruvate kinase isozymes M1/M2: 2.5 x 10^-5

26 Phyloproteomics: fragment intact proteins (top-down MS); match the spectra to protein sequences; place the organism phylogenetically. Works even for unknown microorganisms without any available sequences.

27 CID Protein Fragmentation Spectrum from Y. rohdei

28 CID Protein Fragmentation Spectrum from Y. rohdei: match to Y. pestis 50S Ribosomal Protein L32

29 Exact match sequence…

30 Phylogeny: Protein vs DNA (panels: Protein Sequence; 16S-rRNA Sequence)

31 What about mixtures?

32 Shared Small Ribosomal Proteins

33 Shared Small Ribosomal Proteins

34 Identified E. herbicola proteins. DNA-binding protein HU-alpha: m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128. Eight proteins identified with "large" |Δ|.
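The reported m/z and charge determine the neutral mass of the intact protein ion, and Δ compares that mass with a database sequence. A minimal sketch, assuming protonated ions and assuming Δ means observed minus theoretical mass (the slides do not state the sign convention); both function names are illustrative.

```python
PROTON = 1.007276  # mass of a proton in Da

def neutral_mass(mz, z):
    """Neutral mass of an ion observed at m/z with z attached protons."""
    return z * (mz - PROTON)

def mass_delta(mz, z, theoretical_mass):
    """Observed neutral mass minus the theoretical (database) mass."""
    return neutral_mass(mz, z) - theoretical_mass
```

With the values on the slide, `neutral_mass(732.71, 13)` is about 9512.14 Da.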

35 Identified E. herbicola proteins. DNA-binding protein HU-alpha: m/z 732.71, z 13+, E-value 1.91e-58. Use "Sequence Gazer" to find mass shift. ΔM mode can "tolerate" one shift for free!

36 Identified E. herbicola proteins. DNA-binding protein HU-alpha: m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128. Extract N- and C-terminus sequence supported by at least 3 b- or y-ions.

37 E. herbicola protein sequences

38 E. herbicola sequences found in other species

39 Phylogenetic placement of E. herbicola (panels: Phylogram; Cladogram). phylogeny.fr "One-Click"

40 Glycoprotein Microheterogeneity. Glycosylation is important, but our analytic tools are rather rudimentary: detach glycans (PNGase-F) and analyze glycans, or detach glycans (PNGase-F) and analyze peptides. Get glycan structures, but no association with protein or protein site, or get glycosylation sites, but no association with glycan structures. We analyze glycopeptides directly… Challenges all facets of glycoproteomics.

41 Altered N-Glycosylation in Cancer (glycosyltransferase expression or glycan analyses): Fut-VIII (α1-6 Fuc), Comunale, 2010; GnT-V (β1-6 GlcNAc), Wang, 2007; ST-VI Gal1 (α2-6 NeuAc), Hedlund, 2008; Fut-VI (α1-3 Fuc), Higai, 2008. Figure: N-glycan on an N-X-S/T site of a peptide (NH3+ to COO-); legend: GalNAc, Sialic Acid, Gal, GlcNAc, Man. K. Chandler

42 The informatics challenge. Identify glycopeptides in large-scale tandem mass-spectrometry datasets: many glycopeptide-enriched fractions; many tandem mass-spectra / fraction. Good, but not great, instrumentation: QStar Elite (CID, good MS1/MS2 resolution). Strive for hypothesis-generating analysis: site-specific glycopeptide characterization; glycoform occupancy in differentiated samples.

43 CID Glycopeptide Spectrum

44 Observations. Oxonium ions (204, 366) help distinguish glycopeptides from peptides… …but do little to identify the glycopeptide. Few peptide b/y-ions to identify peptides… …but intact peptide fragments are common. If the peptide can be guessed, then… …the glycan's mass can be determined.
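These observations suggest a two-step check: look for the oxonium ions to flag a spectrum as a glycopeptide, then subtract a candidate peptide's mass from the precursor's neutral mass to get the glycan mass. The sketch below assumes the nominal values 204 and 366 correspond to the usual HexNAc (204.0867) and Hex-HexNAc (366.1395) oxonium ions; the function names and the tolerance are illustrative, not the authors' tool.

```python
PROTON = 1.007276

# Assumed exact m/z values for the oxonium ions quoted nominally as 204 and 366.
OXONIUM_MZ = {"HexNAc": 204.0867, "HexHexNAc": 366.1395}

def oxonium_hits(peaks, tol=0.05):
    """Return names of oxonium ions found among (m/z, intensity) peaks."""
    found = []
    for name, target in OXONIUM_MZ.items():
        if any(abs(mz - target) <= tol for mz, _intensity in peaks):
            found.append(name)
    return found

def glycan_mass(precursor_mz, charge, peptide_mass):
    """Glycan residue mass = glycopeptide neutral mass - guessed peptide neutral mass."""
    glycopeptide_mass = charge * (precursor_mz - PROTON)
    return glycopeptide_mass - peptide_mass
```

The resulting mass can then be compared against a list of plausible glycan compositions, which is where a glycan database constrains the search.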

45 Haptoglobin (HPT_HUMAN), Haptoglobin Standard. Glycopeptides: NLFLNHSE*NATAK; MVSHHNLTTGATLINE; VVLHPNYSQVDIGLIK. N-glycosylation motif: N-X-S/T. * marks a site of GluC cleavage. Pompach et al. Journal of Proteome Research 11.3 (2012): 1728–1740.
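The N-X-S/T motif is easy to locate with a regular expression. This sketch uses the standard convention that X is any residue except proline and a lookahead so that overlapping sites are all reported; `sequon_sites` is an illustrative name, and the GluC marker (*) is removed from the peptide before searching.

```python
import re

# N, then any residue except P, then S or T; lookahead allows overlapping matches.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_sites(peptide):
    """Return 1-based positions of the Asn in each N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide)]

for pep in ("NLFLNHSENATAK", "MVSHHNLTTGATLINE", "VVLHPNYSQVDIGLIK"):
    print(pep, sequon_sites(pep))
```

On the three haptoglobin peptides this finds two sites in the first peptide and one in each of the others.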

46 Tuning the filters… Oxonium ions: number & intensity, match tolerance. "Intact-peptide" fragments: number & intensity, match tolerance. Glycan composition: ICScore, constrain search space, match tolerance. Glycan database: constrain search space, match tolerance. Precursor ion: non-monoisotopic selection, sodium adducts, charge state. Peptide search space: semi-specific peptides, non-specific peptides, peptide MW range, variable modifications.

47 Tuning the filters… We estimate the number of false-positives… …so that the user can tune the search parameters.

48 Application of Exoglycosidases to locate Fucose at ITIH4 site N517: LPTQNITFQTE. K. Chandler

49 ITIH4 Glycopeptide: NVVFVIDK. K. Chandler

50 Similar Glycopeptide Spectra (mass Δ ~ +162 Da): MVSHHNLTTGATLINE ? +162 Da

51 Fragmented Glycopeptides (mass Δ ~ +162 Da): MVSHHNLTTGATLINE ? +162 Da MVSHHNLTTGATLINE
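A +162 Da difference matches the residue mass of one hexose (162.0528 Da), so an annotated glycopeptide can point to an unannotated neighbor one hexose heavier. As a minimal sketch with an illustrative function name and tolerance, candidate pairs can be found by scanning precursor neutral masses:

```python
HEXOSE = 162.0528  # monoisotopic residue mass of a hexose (Da)

def hexose_pairs(masses, tol=0.05):
    """Return (light, heavy) pairs of neutral masses differing by one hexose."""
    ordered = sorted(masses)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            diff = heavy - light
            if diff > HEXOSE + tol:
                break  # masses are sorted, so later ones are too heavy
            if abs(diff - HEXOSE) <= tol:
                pairs.append((light, heavy))
    return pairs
```

The same scan works for other monosaccharide differences by swapping the constant, and pairs found this way would still need their fragment spectra compared before an annotation is carried over.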

52 Propagating Annotations: MVS+A1G1, MVS+A2G2, VVL+A1G1, VVL+A2G2. G. Berry

53 Summary: Mass-spectrometry coupled with protein chemistry and good informatics can look beyond the obvious to the unexpected... …and there is plenty to find!

54 Acknowledgements. Edwards lab: Kevin Chandler, Gwenn Berry. Fenselau lab (UMD): Colin Wynne, Avantika Dhabaria. Goldman lab (GU): Kevin Chandler, Petr Pompach. NSF Graduate Fellowship (Chandler). Funding: NCI.

