CSE182 CSE182-L11 Protein sequencing and Mass Spectrometry.

CSE182 Course Summary Sequence Comparison (BLAST & other tools) Protein Motifs: –Profiles/Regular Expression/HMMs Discovering protein coding genes –Gene finding HMMs –DNA signals (splice signals) How is the genomic sequence itself obtained? –LW statistics –Sequencing and assembly Next topic: the dynamic aspects of the cell Protein sequence analysis ESTs Gene finding

CSE182 The Dynamic nature of the cell The molecules in the body, RNA and proteins, are constantly turning over. –New ones are ‘created’ through transcription, translation –Proteins are modified post-translationally –‘Old’ molecules are degraded

CSE182 Dynamic aspects of cellular function Expressed transcripts –Microarrays to ‘count’ the number of copies of RNA Expressed proteins –Mass spectrometry is used to ‘count’ the number of copies of a protein sequence. Protein-protein interactions (protein networks) Protein-DNA interactions Population studies

CSE182 The peptide backbone H-...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-...-OH (N-terminus on the left, C-terminus on the right; the three residues shown are AA residues i-1, i, and i+1, with side chains R_{i-1}, R_i, R_{i+1}). The peptide backbone breaks to form fragments with characteristic masses.

CSE182 Mass Spectrometry

CSE182 Nobel citation ’02

CSE182 The promise of mass spectrometry Mass spectrometry is coming of age as the tool of choice for proteomics –Protein sequencing, networks, quantitation, interactions, structure…. Computation has a big role to play in the interpretation of MS data. We will discuss algorithms for –Sequencing, Modifications, Interactions..

CSE182 Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation

CSE182 Single Stage MS Mass Spectrometry LC-MS: 1 MS spectrum / second

CSE182 Tandem MS Secondary Fragmentation Ionized parent peptide

CSE182 The peptide backbone H-...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-...-OH (N-terminus on the left, C-terminus on the right; the three residues shown are AA residues i-1, i, and i+1, with side chains R_{i-1}, R_i, R_{i+1}). The peptide backbone breaks to form fragments with characteristic masses.

CSE182 Ionization H-...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-...-OH (N-terminus on the left, C-terminus on the right; AA residues i-1, i, and i+1). The peptide backbone breaks to form fragments with characteristic masses. Ionized parent peptide: H+

CSE182 Fragment ion generation H-...-HN-CH(R_{i-1})-CO | NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-...-OH (backbone broken between the CO of residue i-1 and the NH of residue i; N-terminus on the left, C-terminus on the right). The peptide backbone breaks to form fragments with characteristic masses. Ionized peptide fragment: H+

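CSE182 Tandem MS for Peptide ID [Figure: tandem mass spectrum with b-ion and y-ion series; residue labels K L E D E E L F G S along the y-ion series from m/z 147 to 1166; precursor [M+2H]2+; axes: m/z vs. % intensity]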

CSE182 Peak Assignment [Figure: the same spectrum (residue labels K L E D E E L F G S, m/z 147 to 1166) with peaks assigned to y2 through y9 and b3 through b9, plus the precursor [M+2H]2+; axes: m/z vs. % intensity] Peak assignment implies Sequence (Residue tag) Reconstruction!

CSE182 Database Searching for peptide ID For every peptide from a database –Generate a hypothetical spectrum –Compute a correlation between the hypothetical and experimental spectra –Choose the best Database searching is very powerful and is the de facto standard for MS. –Sequest, Mascot, and many others
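
The three steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score is a simple shared-peak count standing in for the correlation scores used by real tools (it is not the Sequest or Mascot algorithm), only singly charged b and y ions are generated, and the function names and the toy database are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728   # mass of H+
WATER = 18.01056   # mass of H2O


def hypothetical_spectrum(peptide):
    """Singly charged b- and y-ion m/z values for every backbone break."""
    peaks = []
    for i in range(1, len(peptide)):
        prefix = sum(RESIDUE_MASS[aa] for aa in peptide[:i])
        suffix = sum(RESIDUE_MASS[aa] for aa in peptide[i:])
        peaks.append(prefix + PROTON)          # b ion
        peaks.append(suffix + WATER + PROTON)  # y ion
    return sorted(peaks)


def shared_peak_count(observed, theoretical, tolerance=0.5):
    """Number of theoretical peaks that have an observed peak within tolerance."""
    return sum(
        any(abs(o - t) <= tolerance for o in observed) for t in theoretical
    )


def best_match(observed, database):
    """Return the database peptide whose hypothetical spectrum scores highest."""
    return max(
        database,
        key=lambda pep: shared_peak_count(observed, hypothetical_spectrum(pep)),
    )


# Toy example: a noise-free "observed" spectrum and a three-peptide database.
observed = hypothetical_spectrum("SGFLEEDELK")
database = ["SAMPLEK", "SGFLEEDELK", "GELFK"]
winner = best_match(observed, database)
```

A practical search would first restrict candidates to peptides whose total mass matches the parent ion, and would use a score that accounts for peak intensity and noise.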

CSE182 Spectra: the real story Noise Peaks Ions, not prefixes & suffixes Mass to charge ratio, and not mass –Multiply charged ions Isotope patterns, not single peaks

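CSE182 Peptide fragmentation possibilities (ion types) [Figure: backbone segment -HN-CH(R_i)-CO-NH-CH(CHR'R'')-CO-NH- with cleavage sites labeled a_i, b_i, c_i (N-terminal fragments) and x_{n-i}, y_{n-i}, z_{n-i} (C-terminal fragments); further labels y_{n-i-1}, b_{i+1}, d_{i+1}, v_{n-i}, w_{n-i}; the labels are grouped into low energy fragments and high energy fragments]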

CSE182 Ion types, and offsets P = prefix residue mass S = Suffix residue mass b-ions = P+1 y-ions = S+19 a-ions = P-27
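
A short sketch of these offsets in Python, using exact monoisotopic values in place of the nominal +1, +19, and -27. The function names are hypothetical, and the example peptide SGFLEEDELK is an assumption chosen so that the first y ion lands near m/z 147.

```python
# Monoisotopic residue masses in Da for the residues used in the example.
RESIDUE_MASS = {
    "G": 57.02146, "S": 87.03203, "L": 113.08406, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "F": 147.06841,
}
PROTON = 1.00728   # H+, the nominal "+1"
WATER = 18.01056   # H2O; together with the proton, the nominal "+19"
CO = 27.99491      # carbonyl lost from a b ion to give an a ion


def b_ions(peptide):
    """b ions: prefix residue mass P plus a proton (P + 1)."""
    out, prefix = [], 0.0
    for aa in peptide[:-1]:
        prefix += RESIDUE_MASS[aa]
        out.append(prefix + PROTON)
    return out


def y_ions(peptide):
    """y ions: suffix residue mass S plus water and a proton (S + 19)."""
    out, suffix = [], 0.0
    for aa in reversed(peptide[1:]):
        suffix += RESIDUE_MASS[aa]
        out.append(suffix + WATER + PROTON)
    return out


def a_ions(peptide):
    """a ions: b ion minus CO, i.e. P + 1 - 28 = P - 27."""
    return [b - CO for b in b_ions(peptide)]


b = b_ions("SGFLEEDELK")
y = y_ions("SGFLEEDELK")
a = a_ions("SGFLEEDELK")
```

Here `y[0]` is the y1 ion (the C-terminal K plus water and a proton), and `b[0]` is the b1 ion (the N-terminal S plus a proton).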

CSE182 Mass-Charge ratio The X-axis is not mass, but (M+Z)/Z –Z=1 implies that peak is at M+1 –Z=2 implies that peak is at (M+2)/2 M=1000, Z=2, peak position is at 501 Quiz: Suppose you see a peak at 501. Is the mass 500, or is it 1000?
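
The quiz can be checked numerically. The sketch below uses the slide's simplification that each charge adds exactly 1 Da; the function names are hypothetical.

```python
def peak_position(mass, charge):
    """Peak position on the m/z axis: (M + Z) / Z, taking a proton as 1 Da."""
    return (mass + charge) / charge


def candidate_masses(peak, max_charge=3):
    """Invert (M + Z) / Z for each plausible charge: M = peak * Z - Z."""
    return {z: peak * z - z for z in range(1, max_charge + 1)}


position = peak_position(1000, 2)
candidates = candidate_masses(501)
```

A single peak at 501 is therefore consistent with M = 500 at Z = 1 and with M = 1000 at Z = 2; the peak position alone cannot tell them apart.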

CSE182 Isotopic peaks Ex: Consider peptide SAM Mass = You should see: Instead, you see

CSE182 Isotopes C-12 is the most common. Suppose C-13 occurs with probability 1% EX: SAM –Composition: C11 H22 N3 O5 S1 What is the probability that you will see a single C-13? Note that C,S,O,N all have isotopes. Can you compute the isotopic distribution?

CSE182 All atoms have isotopes Isotopes of atoms –O-16, O-18; C-12, C-13; S-32, S-34; ... –Each isotope has a frequency of occurrence If a molecule (peptide) has a single copy of C-13, that will shift its peak by 1 Da. With multiple copies of a peptide, we have a distribution of intensities over a range of masses (isotopic profile). How can you compute the isotopic profile of a peak?

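CSE182 Isotope Calculation Denote: –N_c: number of carbon atoms in the peptide –P_c: probability of occurrence of C-13 (~1%) –Then, considering carbon only, $\Pr[\text{peak at } M] = (1-P_c)^{N_c}$ and $\Pr[\text{peak at } M+1] = N_c P_c (1-P_c)^{N_c-1}$ [Figure: isotopic profiles for N_c = 50 and N_c = 200, with the +1 peak marked]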
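
If only carbon is considered and each carbon atom is independently C-13 with probability P_c, the number of C-13 atoms follows a binomial distribution. The sketch below assumes exactly that model with P_c = 0.01; the function name is hypothetical.

```python
from math import comb


def c13_peak_probability(n_carbons, k, p_c13=0.01):
    """Probability that exactly k of n_carbons atoms are C-13 (peak at M + k)."""
    return comb(n_carbons, k) * p_c13**k * (1 - p_c13) ** (n_carbons - k)


# Peptide SAM has 11 carbons: chance of seeing exactly one C-13.
sam_single_c13 = c13_peak_probability(11, 1)

# Relative height of the +1 peak for the two sizes on the slide.
ratio_50 = c13_peak_probability(50, 1) / c13_peak_probability(50, 0)
ratio_200 = c13_peak_probability(200, 1) / c13_peak_probability(200, 0)
```

Under this model the +1 peak is about half the height of the monoisotopic peak at N_c = 50 and about twice its height at N_c = 200, so for large peptides the monoisotopic peak is no longer the tallest.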

CSE182 Isotope Calculation Example Suppose we consider Nitrogen and Carbon. –N_N: number of Nitrogen atoms –P_N: probability of occurrence of N-15 Pr(peak at M)? Pr(peak at M+1)? Pr(peak at M+2)? How do we generalize? How can we handle Oxygen (O-16, O-18)?

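CSE182 General isotope computation Definition: –Let p_{i,a} be the abundance of the isotope of atom a with mass i Da above the least mass –Ex: p_{0,C}: abundance of C-12, p_{2,O}: abundance of O-18, etc. Characteristic polynomial, with N_a the number of atoms of type a: $\phi(x) = \prod_a \left(p_{0,a} + p_{1,a}x + p_{2,a}x^2 + \cdots\right)^{N_a}$ Prob{M+i}: coefficient of $x^i$ in $\phi(x)$ (a binomial convolution)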
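
The coefficients of the characteristic polynomial can be obtained by repeated polynomial multiplication, one factor per atom. The sketch below assumes each atom's isotope is chosen independently; the abundance values are rounded approximations included for illustration only, and the function names are hypothetical. The composition is the one given for SAM earlier in the lecture.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def isotopic_profile(composition, abundances, max_shift=5):
    """Coefficients of the product of one per-atom polynomial for every atom.

    Entry i approximates Prob{M + i}. Truncating to max_shift + 1 terms after
    each multiplication is safe because higher powers never feed lower ones.
    """
    profile = [1.0]
    for element, count in composition.items():
        for _ in range(count):
            profile = poly_mul(profile, abundances[element])[: max_shift + 1]
    return profile


# Approximate natural abundances; index i = isotope i Da above the lightest.
ABUNDANCE = {
    "C": [0.989, 0.011],
    "H": [0.99985, 0.00015],
    "N": [0.9963, 0.0037],
    "O": [0.9976, 0.0004, 0.0020],
    "S": [0.9502, 0.0075, 0.0421],
}
SAM = {"C": 11, "H": 22, "N": 3, "O": 5, "S": 1}

profile = isotopic_profile(SAM, ABUNDANCE)
```

For long peptides the inner loop can be replaced by repeated squaring or an FFT-based convolution, but the direct product is enough to see how oxygen's +2 isotope is handled by the same mechanism.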

CSE182 End of L11

CSE182 Isotopic Profile Application In DxMS, hydrogen atoms are exchanged with deuterium. The rate of exchange indicates how buried the peptide is (in folded state). Consider the observed characteristic polynomials of the isotope profile, φ_{t1}, φ_{t2}, at various time points. Then The estimates of p_{1,H} can be obtained by a deconvolution. Such estimates at various time points should give the rate of incorporation of Deuterium, and therefore, the accessibility.

CSE182 Quiz How can you determine the charge on a peptide? –Difference between the first and second isotope peak is 1/Z –Proposal: –Given a mass, predict a composition, and the isotopic profile –Do a ‘goodness of fit’ test to isolate the peaks corresponding to the isotope –Compute the difference
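
Only the last step of the proposal, computing the difference, is sketched here; it assumes the two isotope peaks have already been isolated and that adjacent isotopes differ by exactly 1 Da. The function name is hypothetical.

```python
def charge_from_spacing(mz_first, mz_second):
    """Estimate charge Z from the spacing 1/Z between adjacent isotope peaks."""
    spacing = mz_second - mz_first
    if spacing <= 0:
        raise ValueError("second isotope peak must lie above the first")
    return round(1.0 / spacing)


z_single = charge_from_spacing(501.0, 502.0)
z_double = charge_from_spacing(501.0, 501.5)
z_triple = charge_from_spacing(334.0, 334.333)
```

This also resolves the earlier ambiguity about a peak at 501: a neighbor at 501.5 indicates Z = 2, while a neighbor at 502 indicates Z = 1.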

CSE182 Tandem MS summary The basics of peptide ID using tandem MS are simple. –Correlate experimental with theoretical spectra In practice, there might be many confounding problems. –Isotope peaks, noise peaks, varying charges, post-translational modifications, no database. Recall that we discussed how peptides could be identified by scanning a database. What if the database did not contain the peptide of interest?

CSE182 De novo analysis basics Suppose all ions were prefix ions? Could you tell what the peptide was? Can post-translational modifications help?
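
If every peak were a prefix ion and none were missing, the peptide could be read off from the gaps between consecutive peaks. The sketch below assumes that idealized case: noise-free, unmodified prefix residue masses with no proton offset. The function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_prefix_ladder(prefix_masses, tolerance=0.02):
    """Turn a complete ladder of prefix masses into residue calls.

    Each gap between consecutive prefix masses is matched against the residue
    table; ambiguous gaps are reported as 'X/Y' and unmatched gaps as '?'.
    """
    ladder = [0.0] + sorted(prefix_masses)
    calls = []
    for low, high in zip(ladder, ladder[1:]):
        gap = high - low
        matches = [
            aa for aa, mass in RESIDUE_MASS.items()
            if abs(mass - gap) <= tolerance
        ]
        calls.append("/".join(matches) if matches else "?")
    return calls


# Prefix masses of S, SA, SAM.
sam_calls = read_prefix_ladder([87.03203, 158.06914, 289.10963])
# Leucine and isoleucine have identical mass and cannot be separated this way.
ambiguous = read_prefix_ladder([113.08406])
```

Real spectra mix prefix and suffix ions, miss peaks, and contain noise, which is why the general problem needs more than this single pass.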
