Rainer Breitling – Groningen Bioinformatics Centre


New algorithms for high-resolution metabolomics: a case study on trypanosome parasites
Rainer Breitling, Groningen Bioinformatics Centre, University of Groningen
Michael P. Barrett, Infection & Immunity Division, University of Glasgow
Breitling et al., Ab initio prediction of metabolomic networks using FT-ICR MS, Metabolomics, 2006, 2:155
Breitling et al., Precision mapping of the metabolome, Trends in Biotechnology, 2006, 24:543

The biological context: trypanosomiasis
Sleeping sickness is a major health problem in tropical Africa. Current drugs are becoming ineffective and are a health risk themselves (they kill up to 10% of patients, rather than healing them!). Metabolite profiling in drug-treated and mutant parasites may identify new drug targets. First pilot study: compare the metabolome of in vivo and in vitro parasites to the composition of the media, to identify metabolite scavenging.
To test the technology, a simple experiment was done on the pathogen that causes sleeping sickness (historically the most important disease of tropical Africa, because it limits successful animal husbandry, and still a major cause of human morbidity and mortality). The drugs are very old, resistance is developing, and the side effects are severe, so new drug targets are being sought. We wanted to find the metabolites that the parasite takes up (scavenges) from the medium, and also those that are produced specifically by the parasite. Both should lead to interesting drug targets: transporters in the first case, enzymes in the second.

FT-ICR mass spectrometry
High-resolution mass spectrometry. Detection and separation are done in an ion trap. Ions cycle in a magnetic field and are excited by radio-frequency pulses. The resonance frequency depends on the mass/charge ratio of an ion. The detected signal is a superposition of the signals from all ions; it is deconvoluted by Fourier transformation and then corresponds directly to a mass spectrum. The stronger the magnetic field, the better the resolution (similar to NMR), almost without limit.
Result: measurement of very small mass differences at very high accuracy in complex mixtures of biomolecules.
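The dependence of the resonance frequency on the mass/charge ratio can be made concrete with the standard cyclotron relation f = eB / (2π · m/z). The following is only a minimal sketch: the helper name `cyclotron_frequency_hz` is hypothetical, and the 7 T field strength in the usage line is an illustrative assumption, not a value from the slides.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulomb (exact SI value)
DALTON_KG = 1.66053906660e-27           # kilograms per dalton (CODATA 2018)


def cyclotron_frequency_hz(mz: float, field_tesla: float) -> float:
    """Unperturbed cyclotron frequency f = e*B / (2*pi*(m/z)).

    mz is the mass/charge ratio in Da per elementary charge.
    """
    if mz <= 0 or field_tesla <= 0:
        raise ValueError("mz and field_tesla must be positive")
    return ELEMENTARY_CHARGE_C * field_tesla / (2 * math.pi * mz * DALTON_KG)


if __name__ == "__main__":
    # Assumed 7 T magnet and an illustrative m/z value.
    print(f"{cyclotron_frequency_hz(307.0835, 7.0):.0f} Hz")
```

Because the frequency is inversely proportional to m/z, halving the mass/charge ratio doubles the frequency, which is why a precise frequency measurement translates into a precise mass measurement.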

The advantage of high resolution
The chemical composition of a metabolite can be estimated. Exact identification by mass may be possible (within limits).
CH6N2, methylhydrazine, Mw = 46.0718
C2H6O, ethanol, Mw = 46.0684
Only a limited number of molecular formulae can explain a given exact mass within acceptable limits of accuracy. The complexity of this task increases rapidly with the total mass (an interesting computational challenge). And of course, mass alone cannot discriminate between compounds with the same formula but different connectivity.
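The formula search described here can be sketched as a brute-force enumeration. This is an illustration under stated assumptions, not the method used in the study: it is restricted to the elements C, H, N and O, it uses monoisotopic masses (which is what a mass spectrometer measures, and which differ from the average molecular weights quoted above), and the function names and the simple valence filter are hypothetical choices.

```python
from itertools import product

# Monoisotopic masses of the most abundant isotopes, in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}


def format_formula(c: int, h: int, n: int, o: int) -> str:
    """Write element counts in the order C, H, N, O, omitting zeros and counts of 1."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count:
            parts.append(symbol if count == 1 else f"{symbol}{count}")
    return "".join(parts)


def candidate_formulas(target_mass: float, tol_ppm: float) -> list[str]:
    """Return all CHNO formulae whose monoisotopic mass lies within tol_ppm of target_mass."""
    limits = [int(target_mass // MONOISOTOPIC[e]) for e in ("C", "N", "O")]
    hits = []
    for c, n, o in product(*(range(limit + 1) for limit in limits)):
        heavy = c * MONOISOTOPIC["C"] + n * MONOISOTOPIC["N"] + o * MONOISOTOPIC["O"]
        # The hydrogen count is fixed by the mass the heavy atoms leave over.
        h = round((target_mass - heavy) / MONOISOTOPIC["H"])
        # Crude valence filter (assumption): at most 2C + N + 2 hydrogens.
        if h < 0 or h > 2 * c + n + 2 or c + h + n + o == 0:
            continue
        mass = heavy + h * MONOISOTOPIC["H"]
        if abs(mass - target_mass) / target_mass * 1e6 <= tol_ppm:
            hits.append(format_formula(c, h, n, o))
    return sorted(hits)
```

With these masses, a 5 ppm window around 46.04186 (the monoisotopic mass of ethanol) leaves only C2H6O, whereas methylhydrazine (CH6N2) lies roughly 240 ppm away and only appears once the window is widened. The nested enumeration also shows why the task grows quickly with total mass: the number of element combinations to test increases with every additional heavy atom that fits.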

High accuracy confirmed by standards
Compound | predicted mass | measured mass | ppm | average S/N
glutathione | 307.083807 | 307.0835 | 1 | 438
oxidized glutathione | 612.152 | 612.1516 | | 328
trypanothione | 723.3044 | 723.3036 | | 16
oxidized trypanothione | 721.2887 | 721.2889 | | 281
NADP | 743.075458 | 743.0766 | 2 | 442
NAD | 663.109125 | 663.1096 | | 1229
ATP | 506.99575 | 506.9945 | | 289
ADP | 427.029418 | 427.0293 | | 118
AMP | 347.063086 | 347.0633 | | 14
berenil | 281.138894 | 281.139 | | 9
pentamidine | 340.1899 | 340.1897 | | 67
DB75 | 304.132411 | 304.1325 | | 115
melarsen oxide | 292.00538 | 292.0053 | | 113
spermine | 202.215747 | [202.1721] | 216 | -
spermidine | 145.157898 | [141.1402] | 28466 |
putrescine | 88.100048 | [100] | 119000 |
ornithine | 132.089878 | [128.0479] | 31566 |
Strong signal-to-noise ratios (S/N) are detected for most compounds in the standard mixture, but some are not detectable at all (the polyamines). Those that are found are measured accurately to the third decimal place.
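The ppm column can be reproduced with a one-line calculation. The sketch below assumes that the deviation is taken relative to the measured mass; this is inferred from the polyamine rows (for example spermidine, 28466 ppm) and is not stated on the slide. The function name is hypothetical.

```python
def ppm_error(predicted: float, measured: float) -> float:
    """Absolute mass deviation in parts per million, relative to the measured mass."""
    return abs(predicted - measured) / measured * 1e6


if __name__ == "__main__":
    # Glutathione and spermidine values from the standards table.
    print(round(ppm_error(307.083807, 307.0835)))
    print(round(ppm_error(145.157898, 141.1402)))
```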

Overview of experimental results
1251 mass peaks were detected in total in the four sample types (Breitling et al., Metabolomics, 2006, 2:155).
The global results are displayed in the form of a Venn diagram. Usually this is done for 3 sets at most, but there are solutions for 4 sets (as here) and even for 5, although they become increasingly difficult to interpret. The message is that (A) there is a large set of ubiquitous metabolites, (B) a smaller set of parasite-specific metabolites, and (C) some metabolites that are restricted to a single sample type. Also, the in vivo samples are consistently more complex.

Can we use accuracy to get identities?
Searches against the PubChem database identify putative molecular identities. There are few useful hits, indicating that many metabolites are novel. But some hits reveal interesting clues: many are fatty-acid related, and this can be used to guide further, more targeted exploration.
The high accuracy limits the number of possible hits tremendously. Usually there is at most a single mass hit (the lists are still quite long, because there are usually many “isoforms”).
MetabolomeExplorer Classic (Breitling, unpubl.)

Phospholipids of regular structure
Possible variations:
length of the side chain, in steps of 2 C units (+C2H4)
degree of unsaturation (-H2)
type of headgroup (choline, ethanolamine, glycine…)
connection via ester or ether bond (acyl or alkyl lipids)
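Because two of these variations change the mass by fixed increments, the masses expected for a lipid family can be tabulated from a single identified member. The sketch below is a hedged illustration: `lipid_series` is a hypothetical helper, headgroup and ester/ether variations are ignored, and the increments are the C2H4 and H2 exact masses listed elsewhere in the slides.

```python
C2H4 = 28.03130015   # one chain-elongation step, in Da
H2 = 2.015650074     # one additional double bond removes H2, in Da


def lipid_series(base_mass, max_elongations=2, max_unsaturations=2):
    """Expected masses reachable from base_mass by adding C2H4 units and removing H2.

    Keys are (number of added C2H4 units, number of removed H2 units).
    """
    return {
        (i, j): round(base_mass + i * C2H4 - j * H2, 4)
        for i in range(max_elongations + 1)
        for j in range(max_unsaturations + 1)
    }


if __name__ == "__main__":
    # 809.5939 is the C38:4 mass used as a seed later in the slides.
    for key, mass in sorted(lipid_series(809.5939).items()):
        print(key, mass)
```

Each predicted mass can then be looked up in the peak list, which is how a small number of confident identifications can be extended to a whole family.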

The phospholipid metabolome of trypanosomes
Even a small number of good hits allows further exploration.

Do mass differences contain additional information?
[Figure: mass differences from all possible pairwise comparisons, showing a cluster of common distances. Breitling et al., Trends in Biotechnology, 2006, 24:543]

Do mass differences contain additional information?
Real masses (differences) | Frequency | Formula | Exact mass | RANDOM masses (differences) | Frequency
2.015950785 | 382 | H2 | 2.015650074 | 92.7097502 | 7
21.98312914 | 326 | Na-H | 21.98194466 | 205.304917 |
1.003209507 | 284 | 13C isotope | 1.00335484 | 52.82462466 |
24.00000115 | 260 | C2 | 24 | 193.6001474 | 6
26.01629789 | 237 | C2H2 | 26.01565007 | 243.2921378 |
28.03188991 | 218 | C2H4 | 28.03130015 | 254.7535545 |
4.032019289 | 197 | H4 | 4.031300148 | 6.467240667 |
1.012596951 | 164 | H2-13C isotope | 1.012295234 | 52.69339973 |
3.019108784 | 148 | H2+13C isotope | 3.019004914 | 21.98649217 |
22.99695714 | 140 | C2-13C isotope | 22.99664516 | 22.12482588 |
TOTAL | 25370 (in 2472 clusters of >5) | | | | 115 (+/-22) (in 19 +/- 4 clusters of >5)
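Counting common mass differences amounts to forming all pairwise differences and looking for values that recur. A minimal sketch follows; binning by rounding to a fixed number of decimals is an assumption made for brevity, since the slides do not describe how the clusters of common distances were actually formed, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def common_mass_differences(masses, decimals=3, top=10):
    """Count pairwise mass differences after rounding, most frequent first."""
    counts = Counter(
        round(abs(a - b), decimals) for a, b in combinations(masses, 2)
    )
    return counts.most_common(top)


if __name__ == "__main__":
    # Toy peak list in which several pairs differ by H2 (2.01565 Da).
    peaks = [100.0, 102.01565, 104.0313, 150.0, 152.01565]
    print(common_mass_differences(peaks, top=3))
```

Running the same count on randomly generated masses gives the baseline against which the real frequencies are judged. Fixed-width rounding can split a cluster that straddles a bin edge, so a real analysis would more likely cluster the differences directly.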

Biochemically expected transformations
Not all kinds of mass differences are equally interesting, but some are particularly important because they are expected:
(de)hydrogenation
(de)amination
(de)phosphorylation
…and many more (about 100 are really common)

Biochemically expected transformations
Transformation | Frequency | Formula | Exact mass | RANDOM
hydrogenation/dehydrogenation | 284 | H2 | 2.015650074 | Glycine 8
 | 211 | C2H2 | 26.01565007 | cytosine (-H)
ethyl addition (-H2O) | 191 | C2H4 | 28.03130015 | Threonine 7
hydroxylation (-H) | 84 | O | 15.99491464 | Serine
palmitoylation (-H2O) | 57 | C16H30O | 238.2296658 | isoprene addition (-H)
ketol group (-H2O) | | C2H2O | 42.01056471 |
methanol (-H2O) | 56 | CH2 | 14.01565007 | primary amine 6
condensation/dehydration | 40 | H2O | 18.01056471 | Leucine
Formic Acid (-H2O) | 28 | CO | 27.99491464 |
Carboxylation | 25 | CO2 | 43.98982928 | carbamoyl P transfer (-H2PO4)
TOTAL | 1438 | | | 271 (+/- 25)
If masses are randomly distributed, their differences are not enriched in interesting transformations (right-hand side), but in the real data there are many of them: for example, 284 pairs of metabolites differ by a mass of 2.01565 (+/- 1 ppm), corresponding to a hydrogenation/dehydrogenation reaction.
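Matching observed mass differences against a list of expected transformations could look like the sketch below. The exact masses are taken from this slide; the function name, the shortened list, and the choice to apply the ppm tolerance relative to the heavier mass of each pair are assumptions, since the slide does not say what the 1 ppm refers to.

```python
from itertools import combinations

# Exact masses (Da) of a few transformations listed on the slide.
TRANSFORMATIONS = {
    "hydrogenation/dehydrogenation (H2)": 2.015650074,
    "methanol (-H2O) (CH2)": 14.01565007,
    "hydroxylation (-H) (O)": 15.99491464,
    "condensation/dehydration (H2O)": 18.01056471,
    "ethyl addition (-H2O) (C2H4)": 28.03130015,
    "carboxylation (CO2)": 43.98982928,
}


def annotate_pairs(masses, tol_ppm=1.0):
    """Return (lighter, heavier, transformation) for every matching pair of masses."""
    hits = []
    for a, b in combinations(sorted(masses), 2):
        tolerance = b * tol_ppm * 1e-6  # assumption: ppm refers to the heavier mass
        for name, delta in TRANSFORMATIONS.items():
            if abs((b - a) - delta) <= tolerance:
                hits.append((a, b, name))
    return hits


if __name__ == "__main__":
    for hit in annotate_pairs([300.0, 302.015650074, 318.01056471, 400.0]):
        print(hit)
```

Counting the hits per transformation, once for the measured peak list and once for randomized masses, gives the kind of comparison summarized in the two frequency columns.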

Visualization of “common” metabolic relationships
Based on the common “textbook transformations” one can find the metabolic neighbors of a certain mass…

Visualization of “common” metabolic relationships
“Metabolic network” of masses that correlate with the amount of 809.5939 (C38:4) in trypanosome metabolism.
…and this can be repeated iteratively to build an entire network of interrelated metabolites. This corresponds to a biochemical pathway map, although not every step is necessarily catalyzed by an enzyme (some of the mass differences may link compounds with related formulae but without any metabolic relationship).
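The iterative neighbor search can be written as a breadth-first expansion from a seed mass. The sketch below is a simplified illustration: it links masses purely by mass difference and ignores the abundance correlation mentioned on the slide, and the function name and tolerance convention (ppm of the heavier mass) are assumptions.

```python
from collections import deque


def build_network(seed, masses, deltas, tol_ppm=1.0):
    """Breadth-first expansion from seed, linking masses that differ by a known delta.

    Returns the set of reached masses and the set of (lighter, heavier) edges.
    """
    edges = set()
    visited = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for other in masses:
            if other == current:
                continue
            diff = abs(other - current)
            tolerance = max(other, current) * tol_ppm * 1e-6
            if any(abs(diff - d) <= tolerance for d in deltas):
                edges.add((min(current, other), max(current, other)))
                if other not in visited:
                    visited.add(other)
                    queue.append(other)
    return visited, edges


if __name__ == "__main__":
    peaks = [809.5939, 811.60955, 813.6252, 837.6252, 500.0]
    nodes, links = build_network(809.5939, peaks, [2.015650074, 28.03130015])
    print(sorted(nodes), sorted(links))
```

Masses that cannot be reached from the seed through any listed transformation simply stay outside the network.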

de novo network generation
In the end, a huge graph results from the de novo network building process. It is difficult to visualize, navigate and analyze, which poses interesting challenges for bioinformatics.

de novo network generation
Does this network have a random structure, or are there certain patterns?

Degree distributions
[Figure: left panel, metabolites: exponential distribution, random net; right panel, transformations: power-law distribution, scale-free net.]
The distribution of “textbook transformations” in the trypanosome metabolome follows a power law (a straight line in a log-log plot) [right]. The distribution of “clusters of common distances” is closer to an exponential distribution [left]. The reason is simple: many reactions involve small molecules, which are not detectable in the FTMS machine. These compounds would be hubs in the network. They are missing here, but are implicitly considered in the “textbook transformations”.
Lesson: understand the limits of the data acquisition before trying an analysis.
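The distinction between the two shapes can be checked numerically: a power law is a straight line on log-log axes, while an exponential is a straight line on semi-log axes. The sketch below compares the two straight-line fits. It is a crude heuristic under stated assumptions, not a rigorous statistical test and not the analysis behind the figure, and the function names are hypothetical.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination of a least-squares straight-line fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)


def classify_degree_distribution(degree_counts):
    """Compare straight-line fits on log-log (power law) and semi-log (exponential) axes.

    degree_counts maps a node degree k to the number of nodes with that degree.
    """
    ks = sorted(k for k, c in degree_counts.items() if k > 0 and c > 0)
    log_p = [math.log(degree_counts[k]) for k in ks]
    r2_power = r_squared([math.log(k) for k in ks], log_p)
    r2_exponential = r_squared([float(k) for k in ks], log_p)
    return "power-law" if r2_power > r2_exponential else "exponential"
```

The input would be the degree histogram of the network, for example the number of metabolites taking part in 1, 2, 3, … relationships. At least three distinct degrees with differing counts are needed for the fits to be defined.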

Conclusions
FT-ICR MS provides highly accurate measurements of metabolites in complex mixtures
accuracy is sufficient to identify metabolites based on mass information
mass differences are particularly informative
de novo metabolic network construction and exploration are a distinct possibility
new analysis tools are necessary to make full use of the available information

MetabolomeExplorer platform
Scheltema et al., submitted