Computing fragmentation trees from tandem mass spectrometry data Florian Rasche1, Aleš Svatoš2, Ravi Kumar Maddula2, Christoph Böttcher3 & Sebastian Böcker1*

Computing fragmentation trees from tandem mass spectrometry data
Florian Rasche1, Aleš Svatoš2, Ravi Kumar Maddula2, Christoph Böttcher3 & Sebastian Böcker1*
1 Chair for Bioinformatics, Friedrich-Schiller-University Jena, Ernst-Abbe-Platz 2, D Jena, Germany

The Crux
Identifying small molecules by mass spec depends on spectral library search
What about unknown compounds?
Proposed solution
▫ At least annotate the MS2 peaks as something.

Data

instrument            ppm (a)   CID (eV)                  IP (b)
Orbitrap              5         35, 45, 55, 70            yes
API QSTAR (16)        20        15, 25, 45, 55, 90 (e)    yes
Micromass QTOF (11)   20        10, 20, 30, 40, 50        no

Further columns: compounds used; mass range (median, average).

Left: Fragmentation graph for (S,R)-noscapine (C22H23NO7) using Orbitrap data. Nodes of the same color correspond to annotations of one measured peak (m/z, intensity, and collision energies). Arcs correspond to potential neutral losses (NLs). The weight of an arc is encoded by its line type. An NL is computed by subtracting the molecular formula of the arc's end node from that of its start node.
Right: The corresponding hypothetical fragmentation tree of noscapine computed by our method. Nodes (blue) correspond to peaks in the tandem mass spectra and their annotated molecular formula (CE is the range of collision energies); arcs (red) correspond to hypothetical neutral losses.
Published in: Florian Rasche; Aleš Svatoš; Ravi Kumar Maddula; Christoph Böttcher; Sebastian Böcker; Anal. Chem. Article ASAP DOI: /ac101825k Copyright © 2011 American Chemical Society
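The formula subtraction mentioned in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the helper names `parse_formula` and `neutral_loss` are made up here, the parser only handles flat formulas such as C22H23NO7 (no brackets, charges, or isotopes), and the fragment formula in the usage line is a hypothetical example rather than a peak taken from the paper.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a flat molecular formula such as 'C22H23NO7'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def neutral_loss(parent, fragment):
    """Return parent minus fragment as a dict, or None if fragment is not contained in parent."""
    p, f = parse_formula(parent), parse_formula(fragment)
    if any(f[element] > p[element] for element in f):
        return None  # the fragment has atoms the parent lacks
    return {element: p[element] - f[element] for element in p if p[element] > f[element]}


# Hypothetical fragment of noscapine, chosen only to illustrate the arithmetic.
loss = neutral_loss("C22H23NO7", "C21H19NO6")
```

Here `loss` holds one carbon, four hydrogens, and one oxygen, the composition of methanol.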

Construction of graph
Properties
▫ Each vertex is a molecular formula associated with a peak.
▫ A vertex's color indicates its peak; vertices of the same color are alternative annotations of one peak.
▫ A directed edge (neutral loss) u -> v implies v is a fragment of u.
Weighting (real serious math here)
Goal:
▫ Find a “colorful” tree (each color used at most once) with maximal score.
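The graph construction can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: `composition`, `is_subformula`, and `build_fragmentation_graph` are hypothetical names, candidate formulas per peak are assumed to be given already, and the slide omits the arc weighting, so the `weight` argument is a placeholder the caller must supply.

```python
import re
from collections import Counter


def composition(formula):
    """Count atoms in a flat molecular formula such as 'C3H8O'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_subformula(child, parent):
    """True if child can result from parent by losing at least one atom."""
    c, p = composition(child), composition(parent)
    return c != p and all(p[element] >= n for element, n in c.items())


def build_fragmentation_graph(candidates, weight):
    """candidates maps a color (one per peak) to its candidate formulas.

    Returns the vertex list and a dict mapping arcs (u, v) to weights.
    Arcs only connect different colors and point from parent to fragment,
    so the graph is acyclic.
    """
    vertices = [(color, f) for color, formulas in candidates.items() for f in formulas]
    arcs = {}
    for u in vertices:
        for v in vertices:
            if u[0] != v[0] and is_subformula(v[1], u[1]):
                arcs[(u, v)] = weight(u[1], v[1])
    return vertices, arcs


# Toy input: three peaks, the last one with two competing annotations.
candidates = {0: ["C3H8O"], 1: ["C3H6"], 2: ["C2H4", "CH4O"]}
vertices, arcs = build_fragmentation_graph(candidates, weight=lambda u, v: 1.0)
```

With a constant placeholder weight every arc scores the same; a real scoring would reward plausible neutral losses and intense peaks.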

Generating Fragmentation Tree
Given a directed acyclic graph G = (V, E), a set of colors C where c(u) \in C, and edge weights w(u,v) where u,v \in V.
Output a directed tree that is “colorful” and has the maximum sum of edge weights.
▫ NP-Hard
▫ Heuristics were bad.
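Written as an optimization problem, the task on this slide reads roughly as below. The notation V(T) and E(T) for the vertices and arcs of the chosen tree T is an assumption added here for readability; the slide itself only gives G, C, c, and w.

```latex
\[
\max_{T \subseteq G,\; T \text{ a directed tree}} \;\; \sum_{(u,v) \in E(T)} w(u,v)
\qquad \text{subject to} \qquad
c(u) \neq c(v) \;\; \text{for all distinct } u, v \in V(T).
\]
```

The constraint is what "colorful" means: no two vertices of the tree share a color, so each peak receives at most one molecular formula and the tree has at most |C| vertices.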

Dynamic Programming Solution
Find the maximum score of the subtree rooted at v using the color set S, where S \subset C.
They don’t specify, but “efficient runtime” looks like
▫ O(|V| 2^|C|)?
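A dynamic program over (vertex, color set) pairs can be sketched as below. This is one possible formulation written for illustration, not necessarily the recurrence used in the paper: the function name `max_colorful_tree_score`, the bitmask encoding of color sets, and the toy graph are all assumptions. The memo table has at most |V| 2^|C| entries, in line with the slide's guess as a bound on table size, but this sketch also enumerates how the remaining colors are split between one child's subtree and the other children, so its running time is larger than that.

```python
from functools import lru_cache


def max_colorful_tree_score(root, color, arcs):
    """Maximum total arc weight of a colorful tree rooted at root.

    color maps each vertex to an int in range(k); arcs maps (u, v) to a weight.
    """
    children = {}
    for (u, v), w in arcs.items():
        children.setdefault(u, []).append((v, w))

    all_colors = 0
    for c in color.values():
        all_colors |= 1 << c

    @lru_cache(maxsize=None)
    def best(v, allowed):
        # allowed: bitmask of colors that vertices below v may still use
        score = 0.0  # option: v has no (further) children
        for u, w in children.get(v, []):
            bit = 1 << color[u]
            if not allowed & bit:
                continue
            rest = allowed & ~bit
            sub = rest
            while True:
                # give the colors in sub to u's subtree, the others to v's other children
                score = max(score, w + best(u, sub) + best(v, rest & ~sub))
                if sub == 0:
                    break
                sub = (sub - 1) & rest
        return score

    return best(root, all_colors & ~(1 << color[root]))


# Toy graph: a and b are competing annotations of the same peak (color 1).
color = {"r": 0, "a": 1, "b": 1, "c": 2}
arcs = {("r", "a"): 5, ("r", "b"): 4, ("a", "c"): 1, ("b", "c"): 4, ("r", "c"): 2}
score = max_colorful_tree_score("r", color, arcs)
```

In this toy graph the heaviest single arc is r -> a, but the best tree is r -> b -> c with score 8, while any tree containing r -> a scores at most 7. That is the kind of case where picking arcs greedily by weight falls short of the exact optimum.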

Results
MS1: mostly correct id of chemical formula
Evaluation against Expert Knowledge and MS^n
▫ Checked if the neutral losses were consistent with expert expectations
   ▫ Orbitrap: 76.9% “correct”, 12.4% “unsure”, 10.7% “wrong”
▫ Analyzed fragmentation trees generated by the greedy solution (pointless)
Evaluation against Mass Frontier (predicts spectrum based on molecular structure)
▫ FragTrees annotated 4x more
▫ 97% agreement of peak annotation overlap (p-value 10^-167)
Comparing Fragmentation Trees
▫ Eyeballing.

Critiques?
▫ Not very systematic in the analysis
▫ They describe useless bits in the paper
▫ Are fragmentation trees useful?