Mass Spectrometry in a drug discovery setting Claus Andersen Senior Scientist Sienabiotech Spa.

Presentation transcript:

Mass Spectrometry in a drug discovery setting Claus Andersen Senior Scientist Sienabiotech Spa

Bioinformatics and statistics in a drug discovery company (Claus Andersen)
Overview
From genes to phenotype
Proteins: an introduction
Mass Spec for protein identification, quantification and characterization
Mass Spec data
Mass Spec data analysis
Mass Spec database searching
Recent advances

From genes to phenotype
genes → proteins → functions → pathways → metabolites → phenotypes
Aspects shown along the chain: genome comparison, mRNA expression, regulation, degradation, activation/inactivation, interactions, kinematics, protein abundance, metabolite levels, ADME/Tox, structure, pharmacophore

Proteins as functional units
[Figure labels: Glucose, ATP, Myosin. Image credits: D.S. Goodsell, pdb.org; Vale and Milligan, Science 2000]

What affects the proteome
Cellular proteome
Production: genome, mRNA, ribosome (protein production)
Degradation: proteasome (protein degradation)
Influences: interactions, temperature, stress, environment, physiological role, pharmaceutical substances

Mass Spec on proteins
Samples: treated/sick versus control/healthy
Workflow: protein extraction and digestion, HPLC, mass spectrometer
Data: MS spectra; peptides and MS/MS spectra
Goals for proteins and peptides: identification, quantification, characterization
Characterization covers post-translational modifications (PTM) such as phosphorylation (P) and oxidation (O)
Example peptides shown: KKYAAELHLV, KAVQQPDGLA, QFHFHWGSLDQPDGLA

Mass Spec data
Sample: 5 µg
3000 MS spectra: 500 MB
400 MS/MS spectra: 200 MB
Total: 700 MB
Gygi et al. Mol. Cell Bio. (1999)

Mass Spec data analysis
Fourier transformation (noise filtering)
Gaussian peak fitting (peak detection)
Generation of theoretical spectra (sequence → spectra)
Large scale spectral comparison (DB searching)
Spectral deconvolution (de-novo sequencing)
Large scale sequence searching (DB searching)
Data fitting (quantitation)
Statistics and probability theory (reliability estimation)
Linear discriminant analysis (quality assessment)
… and lots more
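As an illustration of the "Gaussian peak fitting (peak detection)" step, here is a minimal Python sketch. It is not the method used in the talk: it assumes a single, noise-free, roughly Gaussian peak sampled on an evenly spaced m/z grid, and it fits a parabola to the logarithm of the three samples around the maximum. The function name and the synthetic data are hypothetical.

```python
import math

def gaussian_peak_from_three_points(xs, ys):
    """Estimate centre, width and height of one peak from the three
    samples around its maximum (assumes positive intensities and an
    evenly spaced x grid)."""
    # Index of the highest interior sample.
    i = max(range(1, len(ys) - 1), key=lambda j: ys[j])
    d = xs[i + 1] - xs[i]  # grid spacing
    l_minus, l_zero, l_plus = (math.log(ys[j]) for j in (i - 1, i, i + 1))
    # For a Gaussian, the second difference of log-intensity is -d^2 / sigma^2.
    curvature = l_minus - 2.0 * l_zero + l_plus
    centre = xs[i] + 0.5 * d * (l_minus - l_plus) / curvature
    sigma = math.sqrt(-d * d / curvature)
    height = math.exp(l_zero + (xs[i] - centre) ** 2 / (2.0 * sigma ** 2))
    return centre, sigma, height

# Synthetic peak (made-up numbers): centre 500.233, sigma 0.05, height 1000.
xs = [500.0 + 0.02 * i for i in range(26)]
ys = [1000.0 * math.exp(-(x - 500.233) ** 2 / (2 * 0.05 ** 2)) for x in xs]
centre, sigma, height = gaussian_peak_from_three_points(xs, ys)
```

On real spectra the three-point estimate would normally only seed a least-squares fit over more samples, since noise and overlapping peaks break the exact-Gaussian assumption.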

Large scale spectral comparison
Mass spec data: MS spectrum and MS/MS spectrum, with peptide mass (M_peptide + H)+ ± Δ
In-silico data: protein sequence DB (~2 million), protein peptides (~60 million), peptide fragments (~2000 million)
Protein sequence: FLIDSSRFSYPERPIIFLSMCYNIYSIAYIVRLTVGRERISCDFEEAAEPVLIQEGLKNT
Peptide: ERPIIFLSMCYNIYSIAYIV
Fragments shortened from one end: ERPIIFLSMCYNIYSIAYIV, ERPIIFLSMCYNIYSIAYI, ERPIIFLSMCYNIYSIAY, ERPIIFLSMCYNIYSIA, ERPIIFLSMCYNIYSI, ERPIIFLSMCYNIYS, ERPIIFLSMCYNIY, ERPIIFLSMCYNI, ERPIIFLSMCYN, ERPIIFLSMCY, ERPIIFLSMC, ERPIIFLSM, …
Fragments grown from the other end: V, IV, YIV, AYIV, IAYIV, SIAYIV, YSIAYIV, IYSIAYIV, NIYSIAYIV, …
For each candidate peptide i: N_i, K_i
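The fragment lists on this slide can be generated mechanically. The sketch below is an illustration under stated assumptions, not the speaker's code: it enumerates the prefix and suffix fragments of a peptide and computes singly charged m/z values using the usual b-ion (prefix) and y-ion (suffix) conventions, which the slide does not name explicitly. The residue masses are standard monoisotopic values supplied here, and all function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da (not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O, added for fragments that keep the C-terminus
PROTON = 1.00728   # charge carrier for a singly charged ion

def prefix_fragments(peptide):
    """N-terminal fragments, shortest first, excluding the full peptide."""
    return [peptide[:i] for i in range(1, len(peptide))]

def suffix_fragments(peptide):
    """C-terminal fragments, shortest first, excluding the full peptide."""
    return [peptide[-i:] for i in range(1, len(peptide))]

def fragment_mz(fragment, c_terminal):
    """Singly charged m/z: b-type if c_terminal is False, y-type if True."""
    mass = sum(RESIDUE_MASS[aa] for aa in fragment)
    return mass + PROTON + (WATER if c_terminal else 0.0)

peptide = "ERPIIFLSMCYNIYSIAYIV"
theoretical_spectrum = sorted(
    [fragment_mz(f, c_terminal=False) for f in prefix_fragments(peptide)]
    + [fragment_mz(f, c_terminal=True) for f in suffix_fragments(peptide)]
)
```

A 20-residue peptide yields 19 prefix and 19 suffix fragments, so the theoretical spectrum above has 38 peaks. Repeating this for every candidate peptide is what produces the very large fragment counts quoted on the slide.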

Large scale spectral comparison
PEP_PROBE by Sadygov and Yates, Anal. Chem.
Hypergeometric probability model, expressed in terms of binomial coefficients.
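The slide's formula did not survive in the transcript. For orientation only, the textbook hypergeometric probability can be written with the symbols N, K, N_1 and K_1 that appear on the yeast example slide. The interpretation given here is an assumption: N is the number of in-silico fragments from all candidate peptides, K the number of those that match peaks in the spectrum, N_1 the number of fragments of one candidate peptide, and K_1 the number of its fragments that match.

```latex
P(K_1 \mid N, K, N_1) =
  \frac{\binom{K}{K_1}\,\binom{N-K}{N_1-K_1}}{\binom{N}{N_1}},
\qquad
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
```

This is the probability that exactly K_1 of a peptide's N_1 fragments match if matches are drawn at random from the pool of N fragments.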

Large scale spectral comparison: expectation value (E-value)
Sadygov and Yates, Anal. Chem.
The E-value is computed from the cumulative distribution function given by the hypergeometric model and from the number of all peptides in the database matching the (M+H)+ mass value.
The E-value tells you how many peptides from the database are expected to have the same or better matches to the experimental spectrum by chance alone.
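A small Python sketch of this calculation follows. It is an illustration, not PEP_PROBE itself: it assumes the E-value is the number of candidate peptides multiplied by the hypergeometric upper-tail probability, which matches the verbal definition above. The function names, argument names and example numbers are hypothetical.

```python
from math import comb

def hypergeom_pmf(k1, n1, k, n):
    """Probability that exactly k1 of a peptide's n1 fragments match,
    when k of the n fragments in the pool match the spectrum."""
    return comb(k, k1) * comb(n - k, n1 - k1) / comb(n, n1)

def tail_probability(k1, n1, k, n):
    """Probability of k1 or more matches by chance (one minus the
    cumulative distribution function evaluated at k1 - 1)."""
    upper = min(n1, k)
    return sum(hypergeom_pmf(x, n1, k, n) for x in range(k1, upper + 1))

def e_value(k1, n1, k, n, n_peptides):
    """Expected number of database peptides matching at least this well
    by chance, assuming n_peptides candidates in the mass window."""
    return n_peptides * tail_probability(k1, n1, k, n)

# Toy numbers, chosen only to keep the arithmetic easy to follow.
example = e_value(k1=2, n1=3, k=4, n=10, n_peptides=30)
```

A small E-value (well below 1) means a match this good is unlikely to arise by chance among the candidates, so the identification is more credible.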

Large scale spectral comparison: an example from yeast (Saccharomyces cerevisiae)
Sadygov and Yates, Anal. Chem.
MS/MS spectrum: (M+H)+ = ± AMU
Yeast proteins: 6 200
Yeast peptides: ~
Peptide fragments: ~5 mil
N = , K =
Top candidate peptides (columns: Peptide, Protein name, K1, N1, E-value)
ATHILDFGPGGASGLGVLTHR, FAS1
LTPPQLPPQLENVILNKY, SIP2
Residual table values: 4034

Large scale spectral comparison: protein identification
FAS1
The protein FAS1 is part of the fatty acid biosynthesis of yeast. Its enzyme classification number is (EC ).
In general, several peptides (3-10) are found for each protein.

Large scale spectral comparison
Most recent advances
Inverted sequence DB used for background distribution estimation (PRISM): Emili's group, Mol. Cell Proteomics, 2(2), p96-106, 2003
Number of sibling peptides (ProteinProphet): Aebersold's group, Anal. Chem. 74, p , 2004
Suffix tree searching: Lu and Chen, Bioinformatics 19(2), p ii113-ii121, 2003
Bayesian approach: Chen, Biosilico, in press 2004
Other approaches
An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database: Yates' group, J. Am. Soc. Mass Spec. 5(11), 1994
ProbID: a probabilistic algorithm to identify peptides through sequence database searching using tandem mass spectral data: Aebersold's group, Proteomics 2(10), 2002