Paulo Costa Carvalho ::

1 A computational approach for analyzing proteomes of unsequenced organisms
Paulo Costa Carvalho :: Laboratory for Proteomics and Protein Engineering, Carlos Chagas Institute, Fiocruz – Paraná, Curitiba, Brazil

2 Computational Proteomics
Editorial
“There has been an unprecedented improvement in the quality and quantity of commercial proteomics data generation technologies, making data generation more accessible to many researchers. However, more and more discoveries will be led by researchers in command of the skills necessary to mine and extensively interpret the volumes of data. Already the ability to generate data vastly outpaces our ability to interpret it, and the lack of expertise in interpreting data is the current gating factor in the advancement of proteomics sciences. Proteomics scientists with training solely in data generation techniques will be shut out of more and more research opportunities.” Nuno Bandeira, July 2011
On that note, I thought it pertinent to bring up what Nuno Bandeira highlights in the editorial: in essence, he argues that the capacity to generate data has exceeded the capacity to interpret it. The bottleneck in proteomics therefore lies at this point.

3 Peptide Sequence Matching approach (e.g., SEQUEST, X!Tandem)
The Peptide Sequence Matching approach (e.g., SEQUEST, X!Tandem) requires the protein sequence to be found in the database for identification. Groundbreaking discoveries usually emerge from studying “non-canonical” organisms (e.g., Thermus aquaticus; Bothrops jararaca, with Captopril for hypertension).

4

5 Analyzing de novo results and writing the manuscript

6 Aim
Development of an integrated de novo sequencing post-processing tool tailored towards proteomic analysis that meets current standards and demands.
* Presents global FDR with solid statistics and state-of-the-art pattern recognition approaches
* Works with any de novo sequencing tool
* Works with any database
* Free for academic use
* Integrated into the PatternLab pipeline
* Multiplatform software capable of working in different OSs
* Customized and dynamic reports that can switch between protein, peptide, and MS levels for quick and easy interpretation
* Cloud computing and cluster compatible

7 Outline
Introduction
Materials and Methods: PepExplorer (Smith-Waterman; Radial Basis Function Neural Networks; parameter tuning with grid search); experimental datasets (the P. furiosus proof of concept; the B. jararaca plasma case); putting it all together
Results: the PFU proof of concept
Conclusions

8 New Analysis Methodology
Peptide Spectrum Match (SEQUEST, Mascot, ProLuCID, SIM) -> post PSM (SEPro, DTASelect, Percolator)
De novo sequencing (PepNovo, UniNovo, pNovo+, PEAKS, etc.) -> post de novo (PepExplorer)

9 PepExplorer Analysis Workflow
Post de novo sequencing: de novo results (PEAKS, PepNovo, pNovo, or your favorite de novo tool) -> PepExplorer (PAM30MS matrix, target and decoy sequences, RBF network) -> Report
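The target and decoy labels in the workflow suggest a target-decoy estimate of the false discovery rate. The following is a minimal Python sketch of that general idea, not PepExplorer's actual procedure: candidates are ranked by score, and the lowest score cutoff is kept at which the ratio of decoy to target hits stays within the requested FDR. The function name `fdr_threshold` and the toy scores are assumptions made for illustration.

```python
def fdr_threshold(scored, max_fdr=0.01):
    """Return the lowest score cutoff whose decoy/target ratio is <= max_fdr.

    scored: iterable of (score, is_decoy) pairs (hypothetical input format).
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted down to this score.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy example: four target hits and one decoy hit.
hits = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, False)]
strict_cutoff = fdr_threshold(hits, max_fdr=0.0)
relaxed_cutoff = fdr_threshold(hits, max_fdr=0.25)
```

With the toy input, a zero-FDR request stops just above the first decoy, while a 25% request accepts all five hits because one decoy against four targets meets the limit.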

10 Smith-Waterman algorithm for sequence alignment
Guaranteed to find the optimal alignment, at the cost of extra computational resources compared to BLAST.
Match: s(a,b) = 2
Mismatch: s(a,b) = -1
Gap penalty = 1
[Figure: example alignment score matrix for sequences containing the residues A, T, G, I, and E]
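The scoring scheme on this slide can be illustrated with a minimal Python sketch of Smith-Waterman local alignment. It uses the slide's values (match 2, mismatch -1, gap penalty 1) and assumes a linear gap penalty and a plain match/mismatch score rather than a substitution matrix such as PAM30MS. The function name `smith_waterman` and the example peptides are illustrative and are not taken from PepExplorer.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=1):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores are floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] - gap,     # gap in b
                          H[i][j - 1] - gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, aligned_a, aligned_b = smith_waterman("ATGE", "ATGIE")
```

For the toy pair above, the best local alignment places one gap opposite the extra residue, so the score is three matches plus one more match minus one gap penalty.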

11 PepExplorer’s Radial basis function neural network
[Figure: decision surfaces for Z <= 2 and Z >= 3. Axes: de novo score, alignment score, peptide length]
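To make the idea of a radial basis function network concrete, here is a minimal Python sketch that scores a candidate from the three features named on the slide (de novo score, alignment score, peptide length). Everything specific in it is an assumption for illustration: the feature values are made up and pre-scaled, the two Gaussian centers and the width `sigma` are chosen by hand, and the output weights are fitted with a simple delta rule. It does not reproduce PepExplorer's network or its training procedure.

```python
import math


def rbf_features(x, centers, sigma):
    """Gaussian activation of each hidden unit for input vector x."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * sigma ** 2))
        for c in centers
    ]


def train_rbf(samples, labels, centers, sigma, lr=0.5, epochs=200):
    """Fit the linear output weights with the delta rule (least mean squares)."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            phi = rbf_features(x, centers, sigma)
            error = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * error * p for w, p in zip(weights, phi)]
    return weights


def predict(x, centers, sigma, weights):
    """Network output: weighted sum of the hidden-unit activations."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, sigma)))


# Hypothetical, already-scaled features:
# (de novo score, alignment score, peptide length)
targets = [(0.90, 0.90, 0.50), (0.80, 0.85, 0.60)]
decoys = [(0.20, 0.30, 0.50), (0.30, 0.20, 0.40)]
samples = targets + decoys
labels = [1.0, 1.0, 0.0, 0.0]          # 1 = target hit, 0 = decoy hit
centers = [(0.85, 0.875, 0.55), (0.25, 0.25, 0.45)]
sigma = 0.3

weights = train_rbf(samples, labels, centers, sigma)
target_scores = [predict(x, centers, sigma, weights) for x in targets]
decoy_scores = [predict(x, centers, sigma, weights) for x in decoys]
```

One possible reading of the figure is that a separate model of this kind is kept for Z <= 2 and for Z >= 3, each with its own decision surface; the sketch shows a single model only.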

12 Dynamic Report Generation for easy report interpretation

13 Outline
Introduction
Materials and Methods: PepExplorer (Smith-Waterman; Radial Basis Function Neural Networks; parameter tuning with grid search); experimental datasets (the P. furiosus proof of concept; the B. jararaca plasma case); putting it all together
Results: the PFU proof of concept
Conclusions

14 The P. furiosus proof of concept
Use a previously published 2 h Orbitrap XL analysis of PFU acquired in the Yates lab.
Evaluate the effectiveness of the ProLuCID -> SEPro approach on this dataset.
Evaluate the effectiveness of the PEAKS (de novo only) -> PepExplorer approach on this dataset.
Insert mutations and gaps into the sequence database, respecting the probabilities in the PAM30MS matrix, to generate a modified DB.
Reanalyze the data with both approaches using the modified DB.
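The database modification step can be sketched in Python as follows. This is a hedged illustration only: `TOY_SUBSTITUTIONS` is a made-up table standing in for probabilities derived from PAM30MS (the real values are not reproduced here), gaps are modeled as single-residue deletions, and the function name `mutate_sequence`, the rates, and the example sequence are all assumptions.

```python
import random

# Hypothetical substitution weights, NOT the PAM30MS values.
TOY_SUBSTITUTIONS = {
    "I": {"L": 0.8, "V": 0.2},
    "L": {"I": 0.8, "M": 0.2},
    "K": {"Q": 0.6, "R": 0.4},
    "D": {"E": 0.7, "N": 0.3},
}


def mutate_sequence(seq, sub_rate, gap_rate, table, rng):
    """Return a copy of seq with random substitutions and deletions.

    Each residue is deleted with probability gap_rate; otherwise it is
    substituted with probability sub_rate, drawing the replacement from
    the weights in `table` (residues missing from the table are kept).
    """
    out = []
    for residue in seq:
        r = rng.random()
        if r < gap_rate:
            continue  # gap: drop this residue
        if r < gap_rate + sub_rate and residue in table:
            choices = table[residue]
            residue = rng.choices(list(choices), weights=list(choices.values()))[0]
        out.append(residue)
    return "".join(out)


rng = random.Random(42)  # fixed seed so the modified DB is reproducible
original = "MKDIL"
modified = mutate_sequence(original, sub_rate=0.1, gap_rate=0.02,
                           table=TOY_SUBSTITUTIONS, rng=rng)
```

Applying such a function to every entry of the original FASTA database would yield a modified DB in which exact-match search engines are expected to lose identifications, which is the situation the proof of concept is designed to probe.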

15 The B. jararaca plasma experiment
Three technical replicates of B. jararaca plasma were prepared for shotgun proteomics and analyzed with a 2 h RP gradient on an Orbitrap XL with HCD fragmentation. Mass spectra were analyzed using ProLuCID -> SEPro against the NCBI Reptilia + amphibian database. Mass spectra were also analyzed using PepExplorer against the same database.

16 Results

17 The P. furiosus proof of concept

18 The Proof of Concept: P. furiosus
A: ProLuCID -> SEPro in Original DB: 581 proteins
B: ProLuCID -> SEPro in Modified DB: 4 proteins*
* B is a subset of A

19 The Proof of Concept: P. furiosus
A: ProLuCID -> SEPro in Original DB: 581 proteins
B: PepExplorer x Original DB (232)

20 The Proof of Concept: P. furiosus
A: PepExplorer x Original DB (232)
B: PepExplorer x Modified DB (160)

21 The Proof of Concept: P. furiosus
A: ProLuCID -> SEPro in Modified DB (4)
B: PepExplorer in Modified DB (160)

22 The B. jararaca plasma analysis

23 Search Database: Reptilia plus Amphibians

24 ProLuCID -> SEPro search results in the Reptilia + amphibian database

25 PepExplorer results in the Reptilia + amphibian database

26 Phospholipase

27 Albumin

28

29 PepExplorer Interface
Multiplatform Free

30

31 PepExplorer Command Line
Runs in cluster environment.

32 PatternLab for proteomics: a one-stop shop for data analysis
Carvalho P.C. et al., 2008, 2010, 2012
* Quantitative proteomics
* Cluster analysis
* Differentially expressed proteins
* Trends in time-course experiments
* Venn diagrams
* Gene Ontology analysis

33 Pinpointing differentially expressed domains in complex protein mixtures with the cloud service of PatternLab for Proteomics
Felipe da Veiga Leprevost1, Diogo Borges Lima1, Juliana Crestani2, Yasset Perez-Riverol3,4, Nilson Zanchin1, Valmir C. Barbosa5, Paulo Costa Carvalho1,*
(Addressing reviewers after a minor revision request)

34 Identification -> Statistical Filtering and Organizing -> Quantitation

35 PepExplorer
Identification -> Statistical Filtering and Organizing -> Quantitation

36 Conclusions
* Presents global FDR with solid statistics by using state-of-the-art pattern recognition approaches
* Works with any de novo sequencing tool
* Works with any database
* Free for academic use
* Multiplatform software capable of working in different OSs
* Customized and dynamic reports that can switch between protein, peptide, and MS levels for quick and easy interpretation
* Cloud computing and cluster compatible
* Integrated in PatternLab for proteomics, offering an unprecedented arsenal of tools for quantitative proteomics
* Simplifies data analysis

37 Acknowledgments
Financial Support
http://proteomics.fiocruz.br
paulo@pcarvalho.com

