Proteomics Informatics David Fenyő


1 Proteomics Informatics David Fenyő

2 Course Information

3 Protein Identification and Quantitation
Workflow: Samples, Peptides, Mass Spectrometry. Spectrum axes: intensity vs. m/z. Readouts: Quantity and Identity.

4 Central Dogma of Molecular Biology
Figure labels: Replication, Transcription, Translation, Modification (P).

5 Central Dogma of Molecular Biology
Figure labels: Replication, Transcription, Translation, Modification (P), Functional Gene Products.

6 Central Dogma of Molecular Biology
Figure labels: Replication, Transcription, Translation, Modification (P). Annotations: Easy to measure; Difficult to measure.

7 Central Dogma of Molecular Biology
Figure labels: Replication, Transcription, Translation, Modification (P). Annotations: Slow; Fast.

8 Central Dogma of Molecular Biology
Figure labels: Replication, Transcription, Translation, Modification (P), Degradation (shown twice). Several elements are marked with an X.

9 Central Dogma of Molecular Biology: ERBB2 in Breast Cancer
Figure labels: DNA, RNA, Transcription, Translation, Modification (P).

12 Central Dogma of Molecular Biology: ERBB2 in Breast Cancer and Ovarian Cancer
Figure labels: DNA, RNA, Transcription, Translation, Modification (P), shown for both cancer types.

13 Central Dogma of Molecular Biology: KRT5 in Breast, Ovarian and Colon Cancer
Figure labels: Transcription, Translation, Modification (P).

14 Correlations between copy number, transcript, protein and phosphoprotein quantities
Figure: distributions of correlation (axis from -0.5 to 1.0) for three pairings: Copy Number / Transcript, Transcript / Protein, and Protein / Phosphoprotein. Annotated values on the figure: ~0.8, ~0.5 and ~0.2.
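A correlation of this kind can be computed per gene across samples. The sketch below assumes a Pearson correlation, which the slide does not specify, and uses made-up abundance values purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant input gives zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log-abundances of one gene across five samples (made-up numbers).
transcript = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [1.2, 1.9, 3.5, 3.8, 5.1]
print(round(pearson(transcript, protein), 3))
```

Repeating this for every gene and plotting the resulting coefficients would give a distribution like the ones summarized on the slide.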

15 Correlations between different genes
Figure (Breast Cancer): GRB7 vs. ERBB2 after Transcription, Translation and Modification (P).

16 Correlations between different genes
Figure (Breast Cancer): GRB7 vs. ERBB2 and ERBB4 vs. ERBB2 after Transcription, Translation and Modification (P).

17 Protein-Protein Correlations: Both Positive and Negative
Breast Cancer

18 Motivating Example: Protein Complexes
Alber et al., Nature 2007

19 Motivating Example: Signaling
Choudhary & Mann, Nature Reviews Molecular Cell Biology 2010

20 Mass Spectrometry Based Proteomics
Sample processing: Lysis, Fractionation, Digestion, Mass spectrometry.
MS data processing: Peak Finding, Charge determination, De-isotoping, Integrating Peaks, Searching.
Result: Identified and Quantified Proteins.
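As a rough illustration of the charge determination step, the sketch below estimates the charge from the spacing of isotopic peaks (adjacent isotopes are about 1.00335/z apart in m/z) and converts between m/z and neutral mass. The function names are hypothetical, and real de-isotoping software is considerably more elaborate.

```python
PROTON = 1.007276           # proton mass in Da
ISOTOPE_SPACING = 1.00335   # approx. 13C - 12C mass difference in Da


def charge_from_spacing(mz_peaks):
    """Estimate the charge from the mean m/z spacing of adjacent isotopic peaks."""
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(ISOTOPE_SPACING / mean_spacing)


def neutral_mass(mz, z):
    """Neutral mass M of an [M+zH]z+ ion observed at the given m/z."""
    return (mz - PROTON) * z


def mz_from_mass(mass, z):
    """m/z at which a molecule of neutral mass M appears as [M+zH]z+."""
    return (mass + z * PROTON) / z


# Isotopic peaks about 0.5 apart in m/z indicate a doubly charged ion.
print(charge_from_spacing([500.0, 500.5017, 501.0034]))
```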

21 Mass Spectrometry
Ion Source, Mass Analyzer, Detector. Output: intensity vs. mass/charge.

22 Mass Spectrometry
Ion Source, Mass Analyzer 1, Fragmentation, Mass Analyzer 2, Detector. Fragment ion types: b and y.

23 Example data – ESI-LC-MS/MS
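Figure: LC-MS run (Time axis) with an MS spectrum and an MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000). The [M+2H]2+ ion is marked; labeled MS/MS peaks: 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080.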

24 Information Content in a Single Mass Measurement
Figure: average number of matching peptides vs. tryptic peptide mass [Da], shown in two panels: Human and S. cerevisiae.

25 Protein Identification by Mass Spectrometry
Workflow: Samples, Peptides, MS/MS. The MS/MS spectra are matched against a Protein DB: compare, score, test significance. Result: identified peptides and proteins.

26 Tandem MS – Database Search
Experiment: Lysis, Fractionation, Digestion, LC-MS, MS/MS.
Sequence DB: pick a protein, pick a peptide, compute all fragment masses; repeat for all peptides and for all proteins.
Matching: Compare, Score, Test Significance; repeat for all MS/MS spectra.
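The "pick protein, pick peptide" loop requires an in-silico digestion of each database sequence. The sketch below assumes trypsin (cleavage after K or R, but not before P), which is a common choice but is not stated on this slide; the function name and the `missed_cleavages` parameter are illustrative.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Return tryptic peptides: cleave after K or R unless the next residue is P."""
    if not protein:
        return []
    # Cleavage positions, including the start and end of the sequence.
    sites = [0]
    for i, aa in enumerate(protein[:-1]):
        if aa in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        # Allow up to `missed_cleavages` skipped cleavage sites per peptide.
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(sites):
                break
            peptides.append(protein[sites[start]:sites[end]])
    return peptides


# The K before P is not cleaved, so the first peptide is AAKPR.
print(tryptic_digest("AAKPRGGKCC"))
```

Each resulting peptide would then have its fragment masses computed and compared against the measured MS/MS spectra.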

27 Search Results

28 Search Results
Most proteins show very reproducible peptide patterns.

29 Search Results

30 Spectrum Library Search
Experiment: Lysis, Fractionation, Digestion, LC-MS/MS.
Spectrum Library: pick a spectrum; repeat for all spectra.
Matching: Compare, Score, Test Significance; repeat for all MS/MS spectra. Result: Identified Proteins.

31 Interpretation of Mass Spectra
Figure: MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000) with the sequence letters K L E D F G S.

32 Interpretation of Mass Spectra
Figure: MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000) with the b ion ladder.
b ions: 88 (S), 145 (G), 292 (F), 405, 534, 663, 778 (D), 907 (E), 1020 (L), 1166 (K)

33 Interpretation of Mass Spectra
Figure: MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000) with the y and b ion ladders.
y ions: 147, 260, 389, 504, 633, 762, 875, 1022, 1080
b ions: 88 (S), 145 (G), 292 (F), 405, 534, 663, 778 (D), 907 (E), 1020 (L), 1166 (K)
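The b and y ladders can be reproduced from residue masses. The sketch below assumes the peptide is SGFLEEDELK; that sequence is inferred here from the mass differences between the listed ions and is not stated on the slides. It uses monoisotopic residue masses and singly charged fragments, and the results agree with the listed values to within about 1 Da.

```python
# Monoisotopic residue masses in Da (only the residues needed here).
MONO = {"G": 57.02146, "S": 87.03203, "L": 113.08406, "D": 115.02694,
        "K": 128.09496, "E": 129.04259, "F": 147.06841}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return (b_ions, y_ions) m/z lists for singly charged fragments."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, running = [], 0.0
    for m in masses[:-1]:            # N-terminal prefixes: b1 .. b(n-1)
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):   # C-terminal suffixes: y1 .. y(n-1)
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


# Assumed sequence (inferred from the ion ladders, not given in the source).
b, y = fragment_ions("SGFLEEDELK")
print([round(v, 1) for v in b])
print([round(v, 1) for v in y])
```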

34 Interpretation of Mass Spectra
Figure: MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000) with the y and b ion ladders, the [M+2H]2+ ion marked, and labeled peaks at 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022 and 1080.

36 Interpretation of Mass Spectra
Figure: the same annotated MS/MS spectrum with y and b ion ladders, highlighting a mass difference of 113.

37 Interpretation of Mass Spectra
Figure: the same annotated MS/MS spectrum with y and b ion ladders, highlighting a mass difference of 129.

41 De Novo Sequencing
Figure: MS/MS spectrum (% Relative Abundance vs. m/z, 250 to 1000) with the [M+2H]2+ ion marked and peaks at 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022 and 1080.
Mass differences between peaks are compared with amino acid masses to find the sequences consistent with the spectrum.
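A minimal sketch of reading residues from mass differences, assuming nominal (integer) residue masses and a ladder of peaks that all belong to one ion series. The lookup table covers only a handful of residues and the function name is illustrative; note that L and I, and K and Q, cannot be told apart at this mass accuracy.

```python
# Nominal residue masses in Da (subset); isobaric residues share an entry.
NOMINAL = {57: "G", 87: "S", 113: "L/I", 115: "D", 128: "K/Q", 129: "E", 147: "F"}


def read_ladder(peaks, tolerance=1):
    """Map differences between consecutive peaks to the closest residue mass."""
    peaks = sorted(peaks)
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        closest = min(NOMINAL, key=lambda mass: abs(diff - mass))
        if abs(diff - closest) <= tolerance:
            residues.append(NOMINAL[closest])
        else:
            residues.append("?")  # no residue in the table explains this gap
    return residues


# The y ion values listed on the earlier slides, used as one ladder.
y_peaks = [147, 260, 389, 504, 633, 762, 875, 1022, 1080]
print(read_ladder(y_peaks))
```

Real de novo sequencing has to cope with missing peaks, noise peaks and mixed ion series, so several candidate sequences are usually consistent with one spectrum.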

42 Significance Testing
False protein identification is caused by random matching. An objective criterion for testing the significance of protein identification results is necessary. The significance of protein identifications can be tested once the distribution of scores for false results is known.
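Once a set of scores for false matches is available, one simple test is an empirical p-value: the fraction of false scores at least as high as the observed score. How the false scores are obtained (for example by searching randomized sequences) is an assumption here and is not described on the slide.

```python
def empirical_p_value(score, false_scores):
    """Fraction of false-match scores that are at least as good as `score`.

    The +1 in numerator and denominator keeps the estimate above zero when
    no false score reaches the observed one.
    """
    n_as_good = sum(1 for s in false_scores if s >= score)
    return (n_as_good + 1) / (len(false_scores) + 1)


# Hypothetical scores from known-false matches (made-up numbers).
null_scores = [1.0, 2.0, 3.0, 4.0]
print(empirical_p_value(10.0, null_scores))
print(empirical_p_value(2.5, null_scores))
```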

43 Protein Quantitation by Mass Spectrometry
Sample i, Protein j (quantity $C_{ij}$): Lysis, Fractionation, Digestion, LC-MS.
Peptide k: MS intensity $I_{ik}$.

44 Protein Quantitation by Mass Spectrometry

45 Protein Quantitation by Mass Spectrometry

46 Protein Quantitation by Mass Spectrometry

47 Protein Quantitation by Mass Spectrometry
Light and Heavy samples: Lysis, Fractionation, Digestion, LC-MS. Sample i, Protein j, Peptide k; the MS spectrum shows a heavy (H) and a light (L) peak.
Assumption: All losses after mixing are identical for the heavy and light isotopes.
Oda et al. PNAS 96 (1999) 6591; Ong et al. MCP 1 (2002) 376
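Under that assumption, the heavy/light intensity ratio of each peptide reflects the relative protein amount in the two samples. The sketch below summarizes a protein by the median of its peptide ratios; the use of the median, the function name and the intensity values are illustrative choices, not taken from the slide.

```python
from statistics import median


def protein_ratio(peptide_pairs):
    """Median heavy/light ratio over (heavy, light) peptide intensity pairs."""
    ratios = [h / l for h, l in peptide_pairs if h > 0 and l > 0]
    if not ratios:
        raise ValueError("no peptide with both heavy and light signal")
    return median(ratios)


# Hypothetical (heavy, light) intensities for three peptides of one protein.
pairs = [(200.0, 100.0), (410.0, 200.0), (90.0, 50.0)]
print(protein_ratio(pairs))
```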

48 Protein Quantitation
Workflow for both approaches: Lysis, Fractionation, Digestion, LC-MS, MS, MS/MS.
Shotgun proteomics, Data Dependent Acquisition (DDA):
1. Records m/z
2. Selects peptides based on abundance and fragments
3. Protein database search for peptide identification
Targeted MS, uses predefined set of peptides:
1. Select precursor ion
2. Precursor fragmentation
3. Use Precursor-Fragment pairs for identification

49

50 Protein Identification by Mass Spectrometry
Workflow: Samples, Peptides, MS/MS. The MS/MS spectra are matched against a Protein DB: compare, score, test significance. Result: identified peptides and proteins.

51 Tumor Specific Databases
Next-generation sequencing of the genome and transcriptome provides a sample-specific Protein DB.
Workflow: Samples, Peptides, MS/MS; compare against the sample-specific Protein DB, score, test significance. Result: identified peptides and proteins.

52 Proteogenomics
Non-Tumor Sample: Genome sequencing. Identify germline variants.
Tumor Sample: Genome sequencing and RNA-Seq. Identify alternative splicing, somatic variants and novel expression.
Figure panels: Variants (TCGAGAGCTG vs. TCGATAGCTG); Alt. Splicing (Exon 1, Exon 2, Exon 3); Novel Expression (Exon X); Fusion Genes (Gene X, Gene Y).
Databases: Tumor Specific Protein DB; Reference Human Database (Ensembl).

53 Posttranslational Modifications
Figure: a peptide with two possible modification sites and a matching MS/MS spectrum (Intensity vs. m/z).
Which assignment does the data support? 1, or 2, or 1 and 2?

54 Protein Interactions
Figure: proteins labeled A, B, E and F; two workflows, each with Digestion, Mass spectrometry, Identification.

55 Data Analysis - Normalization
Figure panels: Raw Data; Normalized (mean=0, std=1).
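Normalizing to mean 0 and standard deviation 1 is a z-score transformation. A minimal sketch, assuming it is applied to the values of one sample at a time and uses the population standard deviation (neither detail is stated on the slide):

```python
from statistics import mean, pstdev


def z_score(values):
    """Shift and scale values so that they have mean 0 and std 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("cannot normalize constant data")
    return [(v - mu) / sigma for v in values]


# Hypothetical log-intensities from one sample (made-up numbers).
raw = [1.0, 2.0, 3.0, 4.0]
print([round(v, 3) for v in z_score(raw)])
```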

56 Data Analysis - Normalization
Figure panels: normalized 3 replicates; normalized 3 replicates plus one more replicate a few months later.

57 Data Analysis

58 Molecular Markers
A molecular signature is a computational or mathematical model that links high-dimensional molecular information to phenotype or other response variable of interest. FDA calls them “in vitro diagnostic multivariate assays”.

59 Mass Spectrometry–Based Proteomics and Network Biology
A. Bensimon, A.J.R. Heck, R. Aebersold, "Mass Spectrometry–Based Proteomics and Network Biology", Annual Review of Biochemistry 81 (2012)

60 Spatial proteomics: a powerful discovery tool for cell biology
Lundberg E, Borner GHH. Spatial proteomics: a powerful discovery tool for cell biology. Nat Rev Mol Cell Biol. 2019

61 Spatial proteomics: a powerful discovery tool for cell biology
Lundberg E, Borner GHH. Spatial proteomics: a powerful discovery tool for cell biology. Nat Rev Mol Cell Biol. 2019

62

