ChEBI and SABIO-RK: Association of chemical compound information and reaction kinetics data Ulrike Wittig Scientific Databases and Visualization Group.

Presentation transcript:

ChEBI and SABIO-RK: Association of chemical compound information and reaction kinetics data Ulrike Wittig Scientific Databases and Visualization Group EML Research, Heidelberg

Biochemical reaction
[Figure: substrates, products, enzyme, modifier (activator or inhibitor)]

Reaction kinetics
- Vmax: maximal enzyme velocity
- KM: Michaelis-Menten constant, (k_2 + k_-1) / k_1
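As a small illustration of how these two parameters are used, the sketch below computes KM from the elementary rate constants and evaluates the standard Michaelis-Menten rate law v = Vmax [S] / (KM + [S]). The rate law itself is not written on the slide; it is the textbook form and is assumed here. The function and argument names are illustrative only.

```python
def michaelis_constant(k1: float, k_minus1: float, k2: float) -> float:
    """KM = (k2 + k-1) / k1, as given on the slide."""
    return (k2 + k_minus1) / k1


def michaelis_menten_rate(substrate: float, vmax: float, km: float) -> float:
    """Textbook Michaelis-Menten rate law (assumed, not shown on the slide)."""
    return vmax * substrate / (km + substrate)


if __name__ == "__main__":
    # Arbitrary example values, not taken from SABIO-RK.
    km = michaelis_constant(k1=2.0, k_minus1=1.0, k2=3.0)
    # At a substrate concentration equal to KM the rate is half of Vmax.
    print(km, michaelis_menten_rate(substrate=km, vmax=10.0, km=km))
```

This also shows why a database needs the rate law equation and not just the parameter values: the same KM and Vmax mean nothing without the equation they belong to.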

Reaction kinetics
- Essential for simulations of biochemical reactions and pathways (not only parameter values but also rate law equations)
- Dependency on environmental conditions
- Dependency on biological location (organism, tissue, cellular location)
- Data must be easily accessible and interchangeable
→ Collection and standardization of reaction kinetics data
→ Links to biochemical, environmental and experimental context
→ Links to corresponding data of external resources
→ Complex search and export functions

SABIO-RK
SABIO-RK describes Reaction Kinetics and is an extension of SABIO (System for the Analysis of Biochemical Pathways).
- SABIO: Pathways, Reaction, Enzymes, Reactants, Organisms
- SABIO-RK: Concentrations, Kinetic Law, Environment, Reactants, Parameters

Data flow
- Automatic extraction of information from other databases (e.g. KEGG)
- Kinetic data manually extracted from literature
- Kinetic data directly obtained from laboratory experiments
- Manual curation, assisted by semi-automatic tools for standardization (unification, structuring, normalization, annotation)
- Access through a web-based user interface
- Access through web services
- Export in SBML (Systems Biology Markup Language) format

Automatic input (XML based)
[Figure: generation of experimental data; database of experimental raw data; SABIO-RK; storage of kinetic data (e.g. Km, Vmax); link to original data; data access]
Collaboration with Douglas Kell and Neil Swainston (Manchester Centre for Integrative Systems Biology)

Manual data input
- Kinetic data from publications: tables, formulas, graphs, pictures
- Some assay conditions or method descriptions are only noted as a reference
- No controlled vocabulary in the literature (different names of compounds, enzymes etc.)
- Missing or incomplete information
  - Incomplete reactions (e.g. products not mentioned)
  - Missing organism specification
  - Isoenzyme not specified (old papers)
→ Web-based input interface

Input interface

Curation
- Data first inserted in an intermediate database
- Curation process (search for errors and inconsistencies)
  - Manually by biological experts
  - Semi-automatically by consistency checks
  - Standardization
  - Unification
  - Annotation to controlled vocabularies and external databases
- Transfer data from intermediate to public SABIO-RK database

Curation of chemical compounds
→ Search for multiple entries for identical compounds

- KEGG IDs: C00201, C03802
- SABIO-RK ID: 1674
- ChEBI IDs: 37413, 37076

- KEGG IDs: C00677, C04283
- SABIO-RK ID: 1690
- ChEBI IDs: 16381, 16516
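The two examples above can be read as one SABIO-RK compound that corresponds to more than one entry in an external database. A minimal sketch of such a check follows, assuming the cross-references are available as simple (database, external ID, SABIO-RK ID) tuples. That flat layout and the function name are assumptions for illustration, not the actual SABIO-RK schema.

```python
from collections import defaultdict

# Cross-references copied from the two slide examples.
CROSS_REFS = [
    ("KEGG", "C00201", 1674), ("KEGG", "C03802", 1674),
    ("ChEBI", "37413", 1674), ("ChEBI", "37076", 1674),
    ("KEGG", "C00677", 1690), ("KEGG", "C04283", 1690),
    ("ChEBI", "16381", 1690), ("ChEBI", "16516", 1690),
]


def find_multiple_entries(cross_refs):
    """Return {(sabio_id, database): [external IDs]} where more than one
    external entry points to the same SABIO-RK compound."""
    grouped = defaultdict(list)
    for database, external_id, sabio_id in cross_refs:
        grouped[(sabio_id, database)].append(external_id)
    return {key: ids for key, ids in grouped.items() if len(ids) > 1}


if __name__ == "__main__":
    for (sabio_id, database), ids in find_multiple_entries(CROSS_REFS).items():
        print(f"SABIO-RK {sabio_id}: {database} entries {', '.join(ids)}")
```

Each reported group is a candidate for manual review by a curator; the sketch only flags the cases and does not decide which external entry is the correct one.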

Identification of identical chemical compounds
Goal:
→ Linking different databases by names of chemical compounds (e.g. ChEBI, KEGG, PubChem)
Methods developed to support curation:
→ Normalization of compound names
→ Analysis of compound names and generation of SMILES string
→ Generation of chemical structure based on SMILES string
→ Classification of chemical compounds

Normalization of compound names
Examples of normalization rules:
- Lowercase letters only
- Normalize different types of brackets to one type
- Remove spaces
- Replace ‘-p’ at end of name by ‘-phosphate’
- Replace all suffixes ‘-phosphate’ by prefixes ‘phospho-’
- Replace ‘ate’ followed by a delimiting character by ‘ic acid’ (acid-base pairs)
- Three-letter amino acid codes are replaced by full amino acid names
- Prefixes of chemical compound names are normalised by sorting them: 2-propanone-1-amino-3-(phosphono… → 1-amino-2-propanone-3-(phosphono…
- Identification of synonymous parts in different names: …valpro… and …2-propylpentano…
→ Generation of lists of normalized compound names for different databases (e.g. SABIO-RK, ChEBI, KEGG, PubChem)
→ Matching the lists from different databases to find synonyms of chemical compounds
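A few of the rules above can be sketched in Python as follows. This is only an illustration of the idea, not the SABIO-RK implementation: it covers four rules (lowercase, brackets, trailing ‘-p’, ‘ate’ to ‘ic acid’) plus space removal, and the rule order, the set of delimiting characters, the exclusion of "phosphate" from the ‘ate’ rule, and the database names in the usage example are all assumptions.

```python
import re
from collections import defaultdict

# Map square and curly brackets onto round ones.
_BRACKETS = str.maketrans("[]{}", "()()")


def normalize_name(name: str) -> str:
    """Build a matching key from a compound name (not meant for display)."""
    key = name.lower()
    key = key.translate(_BRACKETS)
    # '-p' at the end of a name stands for '-phosphate'.
    key = re.sub(r"-p$", "-phosphate", key)
    # 'ate' before a delimiter or the end becomes 'ic acid' (acid-base pairs).
    # 'phosphate' is skipped here because the slide treats it with its own rules.
    key = re.sub(r"(?<!phosph)ate(?=$|[\s\-,()])", "ic acid", key)
    # Spaces are removed last so that 'acetic acid' and 'acetate' share a key.
    return key.replace(" ", "")


def match_synonyms(names_by_db):
    """Return normalized keys that occur in more than one database."""
    index = defaultdict(dict)
    for database, names in names_by_db.items():
        for name in names:
            index[normalize_name(name)].setdefault(database, []).append(name)
    return {key: hits for key, hits in index.items() if len(hits) > 1}


if __name__ == "__main__":
    # Hypothetical name lists, not real database exports.
    example = {
        "db_a": ["Acetate", "D-Glucose 6-P"],
        "db_b": ["acetic acid", "d-glucose 6-phosphate", "Pyruvate"],
    }
    for key, hits in match_synonyms(example).items():
        print(key, hits)
```

The normalized strings are only used as lookup keys, so they do not need to be valid chemical names; what matters is that synonymous names collapse to the same key.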

Analysis of compound names

Classification of compounds
- Generation of structural formula, molecular formula and molecular weight based on SMILES string
- Rules for compound classifications
- Graph and sub-graph search for functional groups
- Classification using different criteria, for example D-Glucose is a:
  - Aldose (functional group aldehyde)
  - Hexose (number of C atoms = 6)
→ SABIO-RK search for general compounds and all sub-classes
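The search for a general compound and all of its sub-classes can be pictured as a traversal of a class hierarchy. The sketch below assumes a tiny hand-written hierarchy built around the D-Glucose example; the "monosaccharide" parent class, the dictionary layout and the function name are illustrative assumptions, not the SABIO-RK data model.

```python
# Parent class -> direct sub-classes or member compounds (toy example).
SUBCLASSES = {
    "monosaccharide": ["aldose", "hexose"],
    "aldose": ["D-glucose"],
    "hexose": ["D-glucose"],
}


def expand_with_subclasses(term, subclasses=SUBCLASSES):
    """Return the term plus everything reachable below it, without duplicates."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in found:
            continue  # a compound can belong to several classes
        found.append(current)
        stack.extend(subclasses.get(current, []))
    return found


if __name__ == "__main__":
    # A search for the general class also covers D-glucose.
    print(expand_with_subclasses("monosaccharide"))
```

Because D-Glucose is classified under several criteria at once, the hierarchy is a graph rather than a tree, which is why the traversal keeps track of terms it has already visited.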

Access to SABIO-RK
→ Web-based user interface for browsing and searching the data manually
→ Web services (API access) can be automatically called by external tools, e.g. by other databases or simulation programs for biochemical network models

SABIO-RK web interface
- Web access to search for general reaction information, kinetic laws, kinetic parameters, experimental conditions etc.
- Colour-coded representation of results
  - Kinetic data available matching search criteria
  - Kinetic data available but not matching search criteria
  - No kinetic data available
- Export of kinetic data in SBML (Systems Biology Markup Language)
- Annotations to external databases, ontologies and controlled vocabularies

Links to external databases
- Cellular location: Gene Ontology
- Protein: UniProt
- Literature source: PubMed
- Compound: KEGG, PubChem, ChEBI
- Enzyme: KEGG, ExPASy Enzyme, IntEnz, IUBMB, Reactome

Annotations in SBML

ChEBI links to SABIO-RK

SABIO-RK links to ChEBI

SABIO-RK Summary
→ Curated data based on literature information (ca papers with >25000 entries)
→ Offers a platform for kinetic data storage and exchange
→ Freely available for academic use
→ Uses existing standard data formats, controlled vocabularies and ontologies
→ Links to original data sources (literature, databases etc.)

Future Perspectives
- Kinetic information about reaction mechanism
  - Separate reactions for intermediate steps
- Kinetic data for signaling reactions
- Information about in vivo metabolite concentrations
- SABIO-RK as platform for experimental kinetic data
  - Scientists producing the data can directly enter it into SABIO-RK
- Extension of web interface (less reaction oriented)
  - Search for inhibitors/activators (drugs) etc.
  - Search for enzymes

SABIO-RK team
Renate Kania, Olga Krebs, Andreas Weidemann, Saqib Mir, Henriette Engelken, Martin Golebiewski, Ulrike Wittig, Isabel Rojas
[Logos: financial support; SABIO-RK is member of]