How to identify peptides October 2013 Gustavo de Souza IMM, OUS.

Peptides or Proteins?

Bottom-up Proteomics

2DE-based approach

Peptide Mass Fingerprinting: MALDI (Matrix-Assisted Laser Desorption/Ionization)

Peptide Mass Fingerprinting [Figure: mass spectrum; axes: m/z and intensity]

MS/MS [Diagram labels: ion source, mass analyzer, detector; for MS/MS: mass analyzer, collision cell, mass analyzer]

MS/MS

Fragmentation Nomenclature for peptide sequence ions:
Collision-Induced Dissociation (CID): $\mathrm{MH}_n^{\,n+\,*} + \mathrm{N}_2 \rightarrow b + y$
Electron Capture Dissociation (ECD): $\mathrm{MH}_n^{\,n+} + e^- \rightarrow \mathrm{MH}_n^{\,(n-1)+\bullet} \rightarrow c + z^{\bullet}$

Fragmentation [Figure: peptide backbone with side chains R1 to R8, showing the complementary fragment-ion pairs b1/y7, b2/y6, b3/y5, b4/y4, b5/y3, b6/y2 and b7/y1] Roepstorff-Fohlman-Biemann nomenclature

Fragmentation 12 aa …… b ions y ions

MS/MS of a peptide [Figure: annotated MS/MS spectrum of the peptide VPTVDVSVVDLTVK, with y ions y2 to y13, b ions b3 to b13 and the doubly charged y13++ ion labeled]
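The b and y labels in a spectrum like this one correspond to predictable masses. The sketch below is a minimal illustration, not the method of any particular search engine: it computes the theoretical b- and y-ion m/z values for the peptide on the slide from standard monoisotopic residue masses. The function name `fragment_ions` and the constants are introduced here for illustration only.

```python
# Standard monoisotopic residue masses in Da (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O, added to the y ions (C-terminal fragments)
PROTON = 1.00728   # mass of one charge-carrying proton


def fragment_ions(peptide, charge=1):
    """Return two lists of (label, m/z): the b ions and the y ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions, y_ions = [], []
    for i in range(1, n):
        b_mass = sum(masses[:i])               # first i residues
        y_mass = sum(masses[n - i:]) + WATER   # last i residues plus water
        b_ions.append((f"b{i}", (b_mass + charge * PROTON) / charge))
        y_ions.append((f"y{i}", (y_mass + charge * PROTON) / charge))
    return b_ions, y_ions


# Hypothetical usage: singly charged ions of the peptide from the slide.
b_ions, y_ions = fragment_ions("VPTVDVSVVDLTVK")
for label, mz in b_ions + y_ions:
    print(f"{label}\t{mz:.4f}")
```

Calling `fragment_ions("VPTVDVSVVDLTVK", charge=2)` gives the doubly charged series, which is where a label such as y13++ comes from.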

How to Identify MS/MS Steen and Mann, Peptide Sequence Tags Autocorrelation Probability-based matching

Submitting to Search

How does identification happen? Your data Protein database (FASTA) Step 1: Which theoretical peptides have the same mass as the observed ion? Step 2: Of those, which one has the most similar fragmentation pattern?
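The two steps on this slide can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how Mascot or any other search engine is implemented: step 1 filters candidates by precursor mass within a ppm tolerance, and step 2 ranks the survivors by a plain shared-peak count. The function names, the tolerances and the candidate list are hypothetical.

```python
# Standard monoisotopic residue masses in Da (only those needed below).
RESIDUE_MASS = {
    "E": 129.04259, "L": 113.08406, "I": 113.08406, "V": 99.06841,
    "Q": 128.05858, "P": 97.05276, "S": 87.03203, "K": 128.09496,
    "T": 101.04768, "D": 115.02694,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragments(seq):
    """Singly charged b- and y-ion m/z values of a peptide."""
    m = [RESIDUE_MASS[aa] for aa in seq]
    b = [sum(m[:i]) + PROTON for i in range(1, len(m))]
    y = [sum(m[-i:]) + WATER + PROTON for i in range(1, len(m))]
    return b + y


def search(precursor_mass, observed_peaks, candidates,
           ppm_tol=10.0, frag_tol=0.02):
    """Return (shared_peak_count, peptide) pairs, best match first."""
    # Step 1: keep theoretical peptides whose mass matches the observed ion.
    matching = [
        p for p in candidates
        if abs(peptide_mass(p) - precursor_mass) / peptide_mass(p) * 1e6 <= ppm_tol
    ]

    # Step 2: score each survivor by how many theoretical fragments
    # are found among the observed peaks.
    def shared_peaks(p):
        return sum(
            any(abs(t - o) <= frag_tol for o in observed_peaks)
            for t in fragments(p)
        )

    return sorted(((shared_peaks(p), p) for p in matching), reverse=True)


# Hypothetical example: the "observed" spectrum is simulated from ELEEIVQPIISK.
candidates = ["ELEEIVQPIISK", "ELEEIVQPISIK", "VPTVDVSVVDLTVK"]
results = search(peptide_mass("ELEEIVQPIISK"), fragments("ELEEIVQPIISK"), candidates)
for score, peptide in results:
    print(score, peptide)
```

The second candidate has the same composition, and therefore the same mass, as the first, so only the fragmentation pattern in step 2 can separate them.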

High mass accuracy – what is it good for? All theoretical tryptic peptide masses from the human IPI database. Example: tryptic HSP-70 peptide ELEEIVQPIISK (mass value not recoverable). [Table: number of tryptic peptides matching the m/z, by instrument (LTQ, QSTAR, LTQ-FT), calibration (Ext., Int., Ext-SIM) and mass accuracy (0.5 ppm to 20 ppm, plus an entry of 500). Recoverable rows: LTQ-FT, Ext., 2 ppm: 11 peptides; LTQ-FT, Int., 0.5 ppm: 3 peptides.]
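The effect of mass accuracy can be made concrete by converting a ppm tolerance into a mass window and counting how many candidates fall inside it. In the sketch below the candidate masses are invented for illustration and are not taken from the IPI database; the peptide mass of about 1396.78 Da is computed here from standard monoisotopic residue masses, not read from the slide. The function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def ppm_window(mass, ppm):
    """Return the (low, high) mass window for a tolerance given in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def count_candidates(target_mass, sorted_masses, ppm):
    """Count entries of a sorted mass list that fall inside the ppm window."""
    low, high = ppm_window(target_mass, ppm)
    return bisect_right(sorted_masses, high) - bisect_left(sorted_masses, low)


# Monoisotopic mass of ELEEIVQPIISK (computed, an assumption of this sketch).
target = 1396.7813

# Invented candidate masses scattered around the target, for illustration only.
offsets = (-0.5, -0.02, -0.005, -0.001, 0.0, 0.0008, 0.004, 0.015, 0.3)
masses = sorted(target + d for d in offsets)

for ppm in (500, 20, 2, 0.5):
    print(f"{ppm:>5} ppm: {count_candidates(target, masses, ppm)} candidates")
```

The tighter the window, the fewer theoretical peptides compete for the same observed ion, which is the point the slide makes with real database counts.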

Defining the “Search Space”

The “Search Space” [Figure: peptide combinations considered with 0, 1 and 2 missed cleavages (mcl); for example, with 2 mcl: 1/2/3, 2/3/4, 3/4/5, 4/5/6]
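How the search space grows with missed cleavages can be shown with a small in-silico digestion. The sketch assumes the usual trypsin rule (cleave after K or R, but not before P), which the slides do not spell out; the protein sequence and the function name `tryptic_digest` are made up for illustration.

```python
import re


def tryptic_digest(protein, missed_cleavages=0):
    """List all peptides with up to `missed_cleavages` missed cleavage sites."""
    # Zero-width split after K or R unless the next residue is P
    # (assumed trypsin rule; requires Python 3.7+ for empty-match splitting).
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for i in range(len(pieces)):
        # Join piece i with up to `missed_cleavages` following pieces.
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


# Hypothetical protein that yields six fully cleaved peptides (1 to 6).
protein = "AAKGGRCCKDDREEKFF"
for mcl in (0, 1, 2):
    peptides = tryptic_digest(protein, mcl)
    print(f"{mcl} mcl: {len(peptides)} peptides")
```

Each additional allowed missed cleavage adds another layer of longer peptides, so the number of theoretical candidates per protein increases even though the database itself has not changed.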

Importance of Search Space Size A search tool does not identify a peptide. It only reports the theoretical sequence that statistically best matches the experimental data. If you increase the size of the database or of the search space too much, the false-positive rate also increases.

Defining FDRs (Steen and Mann, 2004)

MOWSE
There is a chance that two peptides with different sequences have approximately the same Mr and share MS/MS similarities.
More variables included in the search → higher chance of random matches → higher MOWSE score threshold.
Parameters that can modify the MOWSE calculation:
- Database size;
- MMD (measured mass deviation);
- Number of PTMs chosen;
- Data quality.

Example of MMD issue
Mycoplasma sp. sample (Munich 2006):
- The database had ~700 entries;
- Data accuracy averaged 0.7 ppm;
- MMD used during the search: 3 ppm.
Probability Based Mowse Score: Ions score is -10*Log(P), where P is the probability that the observed match is a random event. Individual ions scores > 7 indicate identity or extensive homology (p<0.05). Protein scores are derived from ions scores as a non-probabilistic basis for ranking protein hits.
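The score formula quoted above is simple enough to convert by hand, and a short sketch makes the relationship between score and probability explicit. It only implements the stated formula, ions score = -10*Log(P), with a base-10 logarithm; it does not model how a search engine derives the threshold it reports, which also depends on parameters such as database size and MMD. The function names are hypothetical.

```python
import math


def ions_score(p):
    """Ions score for a match with random-match probability p."""
    return -10 * math.log10(p)


def random_match_probability(score):
    """Inverse of ions_score: the probability implied by a given score."""
    return 10 ** (-score / 10)


# Convert a few probabilities and scores in both directions.
for p in (0.05, 0.01, 0.001):
    print(f"P = {p}: score = {ions_score(p):.1f}")
for score in (7, 13, 30):
    print(f"score = {score}: P = {random_match_probability(score):.4f}")
```

Every 10 points of score correspond to a tenfold drop in the probability that the match is a random event.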

Strategies to Visualize FDRs Reversed database sequences. Peng et al. (2003). Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J Prot Res 2.

False-positive identification using a reversed database
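A reversed database is straightforward to build, and hits against it give a direct estimate of the false-positive rate. The sketch below is a minimal illustration under stated assumptions: decoy entries are marked with a hypothetical `REV_` prefix, and the FDR is estimated as decoy hits divided by target hits above a score threshold, which is one common convention and not necessarily the formula used in the cited paper. The sequences and hit list are invented.

```python
def reversed_database(proteins):
    """Return decoy entries made by reversing every protein sequence."""
    return {f"REV_{name}": seq[::-1] for name, seq in proteins.items()}


def estimate_fdr(hits, threshold, decoy_prefix="REV_"):
    """Estimate FDR as decoy hits / target hits at or above the threshold.

    `hits` is a list of (protein_name, score) pairs from a search against
    the concatenated target and decoy database.
    """
    accepted = [name for name, score in hits if score >= threshold]
    decoys = sum(name.startswith(decoy_prefix) for name in accepted)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


# Hypothetical two-protein database and its reversed counterpart.
targets = {"P1": "MKTAYIAKQR", "P2": "GSHMLEDPVK"}
database = {**targets, **reversed_database(targets)}
print(database)

# Invented search results for illustration.
hits = [("P1", 45.0), ("P2", 38.0), ("P1", 22.0), ("REV_P2", 21.0), ("REV_P1", 9.0)]
for threshold in (10, 25):
    print(threshold, estimate_fdr(hits, threshold))
```

Raising the score threshold removes decoy hits faster than target hits, which is how a threshold for an acceptable FDR can be chosen.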

Typical Result

How to Validate the Data
Are there any reversed-database protein hits with 2 peptides above the MOWSE score threshold?
- No: all proteins identified with 2 peptides scoring above the p<0.05 threshold are good.
- Yes: repeat the Mascot search with more stringent parameters.
What about 1-hit wonders (proteins identified with only 1 peptide)?
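The check described on this slide can be written as a small filter over the peptide hits. This is a sketch under assumptions: reversed entries carry a hypothetical `REV_` prefix, each hit is a (protein, peptide score) pair, and the threshold is whatever the search reports for p<0.05. The function name and the data are invented for illustration.

```python
from collections import Counter


def decoy_proteins_with_two_peptides(peptide_hits, threshold, decoy_prefix="REV_"):
    """Return reversed proteins supported by 2 or more peptides above threshold."""
    counts = Counter(
        protein for protein, score in peptide_hits if score >= threshold
    )
    return sorted(
        protein for protein, n in counts.items()
        if n >= 2 and protein.startswith(decoy_prefix)
    )


# Invented peptide-level results: (protein, ions score).
peptide_hits = [
    ("P1", 40.0), ("P1", 35.0),
    ("REV_P7", 31.0), ("REV_P7", 29.0),
    ("REV_P9", 33.0),
]

offenders = decoy_proteins_with_two_peptides(peptide_hits, threshold=28.0)
if offenders:
    print("Repeat the search with more stringent parameters:", offenders)
else:
    print("No reversed protein has 2 peptides above threshold.")
```

Proteins supported by a single peptide are deliberately left out of this filter, since the slide raises them as a separate question.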

How to Validate the Data Basically, the idea is to “play around” with the statistics to make your result more reliable.

Take home message
1. Data quality (mass accuracy) and a well-defined search space are key for reliable peptide identification.
2. Reliable identification is an interplay between asking enough without asking too much (be careful when trying to get “as many IDs as I can”!).

PTMs October 2013 Gustavo de Souza IMM, OUS

PTMs in biology

Complexity of Protein Samples in Eukaryotes Modifications are specific to a group of amino acids

What differences to expect at the MS level? Larsen MR et al, 2006.

Defining the “Search Space”

PTM abundance in a cell [Figure: total peptides in a sample versus modified peptides; axes: number of peptides and abundance level; differences from 10e2 to 10e4]

PTM abundance in a cell

Stable vs. Labile PTMs Larsen MR et al, 2006.

Neutral loss Boersema PJ et al, 2009.

Identifying Labile PTMs Larsen MR et al, 2006.

HCD fragmentation Larsen MR et al, 2006.

Status of PTM coverage Lemeer and Heck, 2009.

Status of PTM coverage Derouiche A et al, 2012.

Take home message
- PTM identification depends on the modification's stability under fragmentation and on its abundance in the sample.
- Improvements in identification have mostly come from improvements in instrumentation (sensitivity, etc.).
- Depending on the PTM, identification can be very easy or very hard.