Generalized Protein Parsimony and Spectral Counting for Functional Enrichment Analysis. Nathan Edwards, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center.

1 Generalized Protein Parsimony and Spectral Counting for Functional Enrichment Analysis. Nathan Edwards, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center.

2 Systems Biology: Structured High-Throughput Experiments; Knowledge Databases.

3 Systems Biology (diagram). Knowledge Databases: Localization, Function, Process, Interactions, Pathway, Mutation. Structured High-Throughput Experiments: Proteomics, Sequencing, Microarrays, Metabolomics. Labels: molecular biology ↕ phenotype; molecular biology ↕ biology.

4 Systems Biology (diagram). Mathematical Models. Knowledge Databases: Localization, Function, Process, Interactions, Pathway, Mutation. Structured High-Throughput Experiments: Proteomics, Sequencing, Microarrays, Metabolomics. Labels: molecular biology ↕ phenotype; molecular biology ↕ biology.

5 Systems Biology (diagram). Mathematical Models. Functional Annotation Enrichment. Knowledge Databases: Localization, Function, Process, Interactions, Pathway, Mutation. Structured High-Throughput Experiments: Proteomics, Sequencing, Microarrays, Metabolomics. Labels: molecular biology ↕ phenotype; molecular biology ↕ biology.

8 Functional Annotation Enrichment. Draw 10 of 50! In any draw, we expect ~5 "evens", ~2 "≤ 10", etc. Each ball is equally likely. Balls are independent. p-value is surprise! For transcriptomics: Genes ↔ Balls; Genome ↔ Tumbler; Diff. Expr. ↔ Draw; Annotation ↔ "evens", …
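
The tumbler analogy on this slide can be checked with a few lines of Python. The sketch below assumes the balls are numbered 1 to 50 and that the "surprise" is the upper tail of a hypergeometric distribution (drawing without replacement); the function name `hypergeom_tail` and the choice of 8 observed evens are illustrative, not taken from the slides.

```python
from math import comb

def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n balls are drawn without replacement from N balls,
    K of which carry the annotation of interest."""
    upper = min(n, K)
    favourable = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return favourable / comb(N, n)

N, n = 50, 10          # 50 balls in the tumbler, draw 10 (from the slide)
evens = 25             # balls 2, 4, ..., 50 (assumes numbering 1..50)
at_most_ten = 10       # balls 1..10

expected_evens = n * evens / N              # matches the slide's "~5 evens"
expected_at_most_ten = n * at_most_ten / N  # matches the slide's '~2 "<= 10"'

# Hypothetical observation: 8 of the 10 drawn balls are even.
p_surprise = hypergeom_tail(8, N, evens, n)
print(expected_evens, expected_at_most_ten, p_surprise)
```

In the transcriptomics mapping, `N` would be the number of genes in the genome, `n` the number of differentially expressed genes, and `K` the number of genes carrying a given annotation.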

9 Why not in proteomics? Double counting and false positives… due to traditional protein inference. Proteomics cannot see all proteins… proteins are not equally likely to be drawn. Good relative abundance is hard… extra chemistries, workflows, and software; missing values are particularly problematic.

10 In proteomics… Double counting and false positives: use generalized protein parsimony. Proteomics cannot see all proteins: use identified proteins as background. Good relative abundance is hard: model differential spectral counts directly.

11 Ignore some PSMs. FDR filtering leaves some false PSMs. Enforce strict protein inference criteria. Leave some PSMs uncovered. (Figure labels: Proteins, PSMs, 10%.)

12 Ignore some PSMs. FDR filtering leaves some false PSMs. Enforce strict protein inference criteria. Leave some PSMs uncovered. (Figure labels: Proteins, PSMs, 90%.)

13 Match uncovered PSMs to FDR
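
The idea on slides 11 to 13 (accept only well-supported proteins and deliberately leave some PSMs unexplained) can be illustrated with a small greedy sketch. The slides do not state the algorithm, so this is an assumed simplification rather than the author's method: a protein is accepted only if it explains at least `k` peptides not explained by proteins accepted before it. The names `protein_peptides`, `peptide_psms`, and `greedy_k_unique_parsimony` are hypothetical, and the toy data is invented for illustration.

```python
def greedy_k_unique_parsimony(protein_peptides, peptide_psms, k=2):
    """Greedy approximation: repeatedly accept the protein that explains the most
    still-uncovered peptides, stopping once no protein adds at least k of them.
    Returns the accepted proteins and the fraction of PSMs they cover."""
    uncovered = set(peptide_psms)
    selected = []
    while True:
        best, best_new = None, set()
        for protein in sorted(protein_peptides):      # sorted for deterministic ties
            new = protein_peptides[protein] & uncovered
            if len(new) > len(best_new):
                best, best_new = protein, new
        if best is None or len(best_new) < k:
            break                                      # remaining PSMs stay uncovered
        selected.append(best)
        uncovered -= best_new
    total = sum(peptide_psms.values())
    covered = total - sum(peptide_psms[p] for p in uncovered)
    return selected, covered / total

# Toy data (invented): protein -> peptides, peptide -> PSM count.
protein_peptides = {"P1": {"a", "b", "c"}, "P2": {"c", "d"}, "P3": {"e"}}
peptide_psms = {"a": 5, "b": 3, "c": 2, "d": 1, "e": 1}

selected, covered_fraction = greedy_k_unique_parsimony(protein_peptides, peptide_psms, k=2)
print(selected, covered_fraction)
```

One reading of "match uncovered PSMs to FDR" is to sweep `k` and keep the value whose uncovered PSM fraction is closest to the PSM-level FDR; that interpretation is an assumption about the slide, not something the transcript spells out.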

14 Plasma membrane enrichment. Pellicle enrichment of plasma membrane: Choksawangkarn et al., JPR 2013 (Fenselau Lab). Six replicate LC-MS/MS analyses each: cell lysate (44,861 MS/MS); Fe₃O₄-Al₂O₃ pellicle (21,871 MS/MS). 625 3-unique proteins to match 10% FDR: Lysate: 18,976 PSMs; Pellicle: 13,723 PSMs. 89 proteins with significantly (< 10⁻⁵) increased counts.

15 Plasma membrane enrichment. Na+/K+ ATPase subunit alpha-1 (P05023): Lysate: 1; Pellicle: 90; p-value: 5.2 × 10⁻³³. Transferrin receptor protein 1 (P02786): Lysate: 17; Pellicle: 63; p-value: 2.0 × 10⁻¹¹. DAVID Bioinformatics analysis (89/625): Plasma membrane (GO:0005886): 29 (5.2 × 10⁻⁵); Transmembrane (SwissProtKW): 24 (1.3 × 10⁻⁶). Transmembrane (SwissProtKW): Lysate: 524; Pellicle: 1335; p-value: 2.6 × 10⁻¹⁵⁸.
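
To make the spectral-count comparison concrete, here is a minimal sketch of a one-sided hypergeometric test on the counts quoted on slides 14 and 15. It assumes each PSM is a "ball" and asks how surprising it is that so many of a protein's PSMs fall in the pellicle sample, given the total PSMs in each sample. The transcript does not state the exact test or tail used, so the function `spectral_count_pvalue` is hypothetical and its output is not guaranteed to reproduce the slide's p-values exactly.

```python
from math import comb

def spectral_count_pvalue(count_a, count_b, total_a, total_b):
    """One-sided hypergeometric tail: probability that at least count_b of a
    protein's (count_a + count_b) PSMs come from sample B, when sample A holds
    total_a PSMs and sample B holds total_b PSMs overall."""
    n = count_a + count_b
    upper = min(n, total_b)
    favourable = sum(comb(total_b, x) * comb(total_a, n - x)
                     for x in range(count_b, upper + 1))
    return favourable / comb(total_a + total_b, n)

LYSATE_PSMS, PELLICLE_PSMS = 18_976, 13_723   # totals from slide 14

# Counts from slide 15 (lysate, pellicle).
p_atpase = spectral_count_pvalue(1, 90, LYSATE_PSMS, PELLICLE_PSMS)
p_tfr = spectral_count_pvalue(17, 63, LYSATE_PSMS, PELLICLE_PSMS)
print(f"P05023: {p_atpase:.2e}  P02786: {p_tfr:.2e}")
```

The same function applies to a whole annotation category by pooling the PSMs of its proteins, as in the Transmembrane row (524 vs. 1335).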

16 A protein's PSMs rise and fall together!

17 A protein's PSMs rise and fall together?

18 Anomalies indicate proteoforms

19 Nascent polypeptide-associated complex subunit alpha (7.3 × 10⁻⁸)

20 Pyruvate kinase isozymes M1/M2 (2.5 × 10⁻⁵)
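
One way to flag a peptide whose PSMs do not "rise and fall together" with the rest of its protein is an exact test on a 2×2 table: this peptide versus the protein's other peptides, across the two conditions. The sketch below is an assumed illustration using a two-sided Fisher-style test built on the hypergeometric distribution; the slides do not specify the test behind the reported values, and the function name `peptide_anomaly_pvalue` and the example counts are invented.

```python
from math import comb

def peptide_anomaly_pvalue(pep_a, pep_b, rest_a, rest_b):
    """Two-sided exact test: does one peptide's split between conditions A and B
    differ from the split of the protein's remaining PSMs? Sums the probabilities
    of all tables no more likely than the observed one (margins held fixed)."""
    row = pep_a + pep_b              # PSMs for this peptide
    col = pep_a + rest_a             # PSMs in condition A
    n = row + rest_a + rest_b        # all PSMs for the protein

    def pmf(x):
        return comb(col, x) * comb(n - col, row - x) / comb(n, row)

    lo, hi = max(0, row - (n - col)), min(row, col)
    p_obs = pmf(pep_a)
    # Small tolerance so ties with the observed table are included.
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))

# Invented counts: the peptide follows the protein (5:5 vs 50:50) ...
p_consistent = peptide_anomaly_pvalue(5, 5, 50, 50)
# ... versus a peptide seen only in condition A while the protein favours B.
p_anomalous = peptide_anomaly_pvalue(20, 0, 10, 40)
print(p_consistent, p_anomalous)
```

A small p-value for a single peptide, against a background of concordant peptides, is the kind of anomaly slide 18 associates with proteoforms.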

21 Summary. Functional annotation enrichment for proteomics too: careful counting (generalized parsimony); differential abundance by spectral counts. Use (multivariate-)hypergeometric model for: differential abundance by spectral counts; proteoform detection.

22 HER2/Neu Mouse Model of Breast Cancer. Paulovich et al., JPR, 2007. Study of normal and tumor mammary tissue by LC-MS/MS. 1.4 million MS/MS spectra. Peptide-spectrum assignments: Normal samples (N_n): 161,286 (49.7%); Tumor samples (N_t): 163,068 (50.3%). 4270 proteins identified in total; 2-unique generalized protein parsimony.

23 Distribution of p-values (Yeast)

