EBI is an Outstation of the European Molecular Biology Laboratory. ChEBI: an EBI chemistry reference.


1 EBI is an Outstation of the European Molecular Biology Laboratory. ChEBI: an EBI chemistry reference

2 ChEBI – Chemical Entities of Biological Interest 30.08.2015. Overview: Introduction to ChEBI; Searching and browsing; Understanding the ontology; Downloads and programmatic access.

3 Introduction to ChEBI. Block 1

4 Small Molecules within Bioinformatics. Diagram of data types: Literature, Nucleotide sequences, Genomes, Expressions, Protein sequences, Protein domains and families, 3D structures, Enzymes, Small molecules, Pathways, Systems.

5 Small Molecules within Bioinformatics. Same diagram as slide 4, with "Small molecules" highlighted.

6 Small molecules participate in all the processes of life

7 Signaling: γ-aminobutyric acid (GABA). Chief inhibitory neurotransmitter in the mammalian central nervous system; in humans, it also regulates muscle tone. Synthesized by neurons. Found mostly as a zwitterion, that is, with the carboxyl group deprotonated and the amino group protonated. The conformational flexibility of GABA is important for its biological function, as it has been found to bind to different receptors with different conformations. GABA deficiency is linked to anxiety disorder, depression, alcoholism, multiple sclerosis, action tremors and tardive dyskinesia.

8 Metabolism: adenosine 5'-triphosphate (ATP). The "molecular unit of currency" of intracellular energy transfer. Generated in the cell by energy-consuming processes and broken down by energy-releasing processes. Proteins that bind ATP do so in a characteristic protein fold known as the Rossmann fold, a general nucleotide-binding structural domain that can also bind the cofactor NAD.

9 Enzymes. Enzyme inhibitors are molecules that bind to enzymes and decrease their activity. Many drugs are enzyme inhibitors; they are also used as herbicides and pesticides. Enzyme activators bind to enzymes and increase their enzymatic activity; they are often involved in the allosteric regulation of enzymes in the control of metabolism. Example: clavulanic acid acts as a suicide inhibitor of bacterial β-lactamase enzymes.

10 Pathways http://www.genome.jp/kegg-bin/highlight_pathway?scale=1.0&map=map00231&keyword=tryptophan

11 Systems biology. BioModels: quantitative models of biochemical and cellular systems. Tryptophan: D-enantiomer: sweet; L-enantiomer: bitter.

12 Drug design. Ligand-based: relies on knowledge of other molecules that bind to the biological target of interest. Structure-based: relies on knowledge of the 3D structure of the biological target. A lead has: evidence that modulation of the target will have therapeutic value, e.g. disease linkage studies showing associations between mutations in the biological target and certain disease states; and evidence that the target is druggable, i.e. capable of binding to a small molecule, and that its activity can be modulated by the small molecule. The target is cloned and expressed, then libraries of potential drug compounds are screened using screening assays.

13 Drug types 2003 - 2009 'Small molecules' in various shades of blue (http://chembl.blogspot.com/)

14 Small molecule annotations. Often appear as free text in biological databases in which they are not the core data. Are frequently referred to by common names which may be chemically ambiguous, e.g. adrenaline = (S)-adrenaline? (R)-adrenaline? May be referred to by several different names: paracetamol, acetaminophen, 4-acetamidophenol, N-(4-hydroxyphenyl)acetamide, …

15 Getting the chemistry right: thalidomide, a non-barbiturate hypnotic. Thalidomide displays immunosuppressive and anti-angiogenic activity. It inhibits release of tumor necrosis factor-alpha from monocytes and modulates other cytokine action. Thalidomide is racemic: it contains both left- and right-handed isomers in equal amounts. One enantiomer is effective against morning sickness, and the other is teratogenic. The enantiomers are interconverted in vivo: if a human is given D-thalidomide or L-thalidomide, both isomers can be found in the serum. Hence, administering only one enantiomer does not prevent the teratogenic effect in humans. http://www.drugbank.ca/drugs/DB01041

16 Small molecule data sources
PubChem (http://pubchem.ncbi.nlm.nih.gov/): deposition-driven, publicly available compound repository containing more than 25 million unique structures.
ChemSpider (http://www.chemspider.com/): automatic aggregation of publicly available chemistry data with crowdsourced annotation.
ChEBI (http://www.ebi.ac.uk/chebi/): manually annotated database and ontology.
ChEMBL (http://www.ebi.ac.uk/chembldb/): small molecules and bioactivity.

17 Chemicals - ChEBI. Example entry: caffeine.
Visualisation: structure image.
Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine.
Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19.
Ontology: metabolite; CNS stimulant; trimethylxanthines.
Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528.
Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES CN1C(=O)N(C)c2ncn(C)c2C1=O

18 What is ChEBI? Chemical Entities of Biological Interest. Freely available. Focused on 'small' chemical entities (no proteins or nucleic acids). Illustrated dictionary of chemical nomenclature. High quality, manually annotated. Provides a chemical ontology. Access ChEBI at http://www.ebi.ac.uk/chebi/

19 ChEBI home page: http://www.ebi.ac.uk/chebi

20 How is ChEBI maintained? Automatic loading of preliminary data; automatic loading of 2 star annotated data; manual annotation; user requests via the Submission Tool. Public release: first Wednesday of every month.

21 ChEBI entries contain: a unique, unambiguous, recommended ChEBI name and an associated stable unique identifier; an illustration where appropriate (compounds and groups, but generally not classes); a definition where appropriate (mostly classes); a collection of synonyms, including the IUPAC recommended name for the entity where appropriate; a collection of cross-references to other databases; links to the ChEBI ontology.

22 ChEBI entry view

23 Automatic Cross-references

24 Chemical Structures. Chemical structure may be interactively explored using the MarvinView applet. Available in formats: Image; Molfile; InChI and InChIKey; SMILES.

25 Molfile format

26 Time for Exercises

27 Searching and browsing ChEBI. Block 2

28 Simple text search. Enter any text. Wildcard: *

29 Simple Text Search

30 Advanced Search

31 Advanced text search. Narrow to category. Boolean operators: AND, OR and BUT NOT.

32 Structure search. Search options. Structure drawing tools.

33 Search Results. Click to go to the entry page. Hover over for a zoomed-in image. Download your search results.

34 Fingerprints. Chemical substructure searching is computationally expensive…

35 Fingerprints [2]. … so heuristics must be used to decrease the number of search candidates. For example, C8H9NO2 cannot be a substructure of an entity which does not have at least 8 carbon atoms, 9 hydrogen atoms… Fingerprints are a generalized, abstract encoding of structural features which can be used as an effective screening device.
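The atom-count heuristic on this slide can be sketched in a few lines of Python. This is only an illustration of the screening idea, not ChEBI's actual search code: the function names `atom_counts` and `may_contain` are hypothetical, and the parser assumes simple formulas without brackets, charges or isotopes.

```python
import re
from collections import Counter

def atom_counts(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C8H9NO2'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def may_contain(candidate: str, query: str) -> bool:
    """Return False when the candidate has too few atoms of some element
    to contain the query as a substructure (a cheap pre-screen only)."""
    have = atom_counts(candidate)
    need = atom_counts(query)
    return all(have[element] >= n for element, n in need.items())

query = "C8H9NO2"
print(may_contain("C8H10N4O2", query))  # enough atoms: stays a candidate
print(may_contain("H2O", query))        # too few atoms: screened out
```

Passing the screen does not prove a substructure match; it only means the expensive full comparison still has to be run for that candidate.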

36 Fingerprints [3]. Encoding of structural patterns, shown for water (HOH):
0-bond paths: H, O, H
1-bond paths: HO, OH
2-bond paths: HOH
Each pattern is hashed to create a bit string, and the bit strings are combined (bitwise OR) to give the final fingerprint:
Pattern  Hashed bitmap
H        0000010000
O        0010000000
HO       1010000000
OH       0000100010
HOH      0000000101
Result:  1010110111
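The combination step for water can be checked with a short Python sketch. The hashed bitmaps are copied from the slide rather than computed, since the slide does not say which hash function is used; the helper name `fingerprint` is hypothetical.

```python
# Hashed bitmaps for each path pattern of water, copied from the slide.
PATTERN_BITS = {
    "H":   "0000010000",
    "O":   "0010000000",
    "HO":  "1010000000",
    "OH":  "0000100010",
    "HOH": "0000000101",
}

def fingerprint(bitmaps, width=10):
    """Combine pattern bitmaps with bitwise OR into one fingerprint string."""
    bits = 0
    for bitmap in bitmaps:
        bits |= int(bitmap, 2)
    # Pad with leading zeros so the result keeps the fixed fingerprint width.
    return format(bits, f"0{width}b")

water_fp = fingerprint(PATTERN_BITS.values())
print(water_fp)  # 1010110111
```

A bit that is set by more than one pattern simply stays set, which is why the combination behaves as an OR rather than an arithmetic sum.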

37 Types of structure search
Identity: based on InChI (e.g. InChI=1/H2O/h1H2).
Substructure: uses fingerprints to narrow the search range, then performs a full substructure search algorithm.
Similarity: based on the Tanimoto coefficient calculated between the fingerprints.
Example fingerprints: 1010110111 (7 bits set) and 0010110010 (4 bits set), with 4 bits in common.
Tanimoto(a,b) = c / (a+b-c) = 4 / (4+7-4) = 0.57, where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both.
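The similarity arithmetic and the fingerprint pre-screen for substructure search can be reproduced with the two example fingerprints from this slide. This is a minimal sketch on bit strings; the function names are hypothetical and the code is not taken from ChEBI.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient c / (a + b - c) for two bit-string fingerprints."""
    a = int(fp_a, 2)
    b = int(fp_b, 2)
    bits_a = bin(a).count("1")
    bits_b = bin(b).count("1")
    common = bin(a & b).count("1")
    denominator = bits_a + bits_b - common
    return common / denominator if denominator else 0.0

def passes_screen(query_fp: str, candidate_fp: str) -> bool:
    """A candidate can only contain the query if every bit set in the
    query fingerprint is also set in the candidate fingerprint."""
    q = int(query_fp, 2)
    return q & int(candidate_fp, 2) == q

score = tanimoto("1010110111", "0010110010")
print(round(score, 2))                             # 0.57
print(passes_screen("0010110010", "1010110111"))   # True
```

Candidates that pass the screen still need the full substructure algorithm, because different structural patterns can hash to the same bits.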

38 Browse via Periodic Table. Molecular entities / Elements

39 Navigate via links in ontology. Click to follow links.

40 Time for Exercises

41 Understanding the ChEBI ontology. Block 3

42 Annotation of bioinformatics data. Essential for capturing understanding and knowledge associated with core data. Often captured in free text, which is easier to read and better for conveying understanding to a human audience, but: difficult for computers to parse; quality varies from database to database; terminology used varies from annotator to annotator. Towards annotation using standard vocabularies: ontologies within bioinformatics.

43 The ChEBI ontology. Organised into three sub-ontologies, namely: Molecular structure ontology; Subatomic particle ontology; Role ontology. Illustrated with (R)-adrenaline.

44 Molecular structure ontology

45 Role ontology

46 ChEBI ontology relationships. Generic ontology relationships. Chemistry-specific relationships.

47 Viewing ChEBI ontology

48 Viewing ChEBI ontology [2]. Tree view

49 Browsing ChEBI ontology (OLS). Ontology Lookup Service (OLS): http://www.ebi.ac.uk/ontology-lookup/ Browse the ontology.

50 Ontology Lookup Service. Provides a centralised query interface for ontology and controlled vocabulary lookup. Can integrate any ontology available in OBO (Open Biomedical Ontologies) format. At last release, 58 ontologies integrated, including GO, ChEBI, Molecular interaction (PSI MI), Pathway ontology (PW), Human disease (DOID) and many more… Provides a search and a browse facility, as well as displaying a graph of terms and relationships.

51 OBO Foundry. “The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain.”

52 Time for Exercises

53 Download and programmatic access. Block 4

54 ChEBI domain model. Self-referencing: merging

55 Compound IDs and Merging. Compound accessions are maintained after merging, but only the main accession of a merged group is displayed. Navigated accession: CHEBI:5585. Main accession: CHEBI:15377.

56 Compound IDs and Merging [2]
First table (the row with ID 5585 is the additional accession; its PARENT_ID points to the main entry):
ID     STATUS  CHEBI_ACCN   SOURCE  PARENT_ID  NAME   DEFINITION
15377  C       CHEBI:15377  ChEBI   null       water  null
5585   C       CHEBI:5585   KEGG    15377      null
Second table (the COMPOUND column holds the compound ID of the additional accession):
ID     COMPOUND  ACCN_NUMBER  TYPE          STATUS  SOURCE  URL_ABBR
16213  5585      C00001       KEGG accn     C       KEGG
17314  5585      7732-18-5    CAS Registry  C       KEGG    null
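Resolving a merged accession to its main accession amounts to following PARENT_ID until it is empty. The Python sketch below uses the two compound rows shown on this slide, restricted to the columns it needs. The tab-delimited layout and the literal string "null" for a missing parent are assumptions about the export, and `resolve_main` is a hypothetical helper name.

```python
import csv
import io

# Two rows from the slide, reduced to the columns used here (assumed layout).
COMPOUNDS_TSV = (
    "ID\tSTATUS\tCHEBI_ACCN\tSOURCE\tPARENT_ID\tNAME\n"
    "15377\tC\tCHEBI:15377\tChEBI\tnull\twater\n"
    "5585\tC\tCHEBI:5585\tKEGG\t15377\tnull\n"
)

def load_rows(tsv_text):
    """Read tab-delimited text into a list of dictionaries keyed by column."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

def resolve_main(accession, rows):
    """Follow PARENT_ID links from an accession to its main accession."""
    by_accession = {row["CHEBI_ACCN"]: row for row in rows}
    by_id = {row["ID"]: row for row in rows}
    row = by_accession[accession]
    seen = set()  # guards against accidental cycles in the data
    while row["PARENT_ID"] != "null" and row["ID"] not in seen:
        seen.add(row["ID"])
        row = by_id[row["PARENT_ID"]]
    return row["CHEBI_ACCN"]

rows = load_rows(COMPOUNDS_TSV)
print(resolve_main("CHEBI:5585", rows))   # CHEBI:15377
print(resolve_main("CHEBI:15377", rows))  # CHEBI:15377
```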

57 Downloading ChEBI flavours. All downloads come in two flavours: 3 star only entries (manually annotated ChEBI entries); all stars entries (manually annotated ChEBI entries and user submissions).

58 Downloading ChEBI. OBO file: use in OBO-Edit. SDF file: compliant with chemistry software such as Bioclipse. Flat file, tab delimited: import all the data into Excel, or parse it into your own database structure. Oracle binary dumps: import into an Oracle database. Generic SQL insert statements: import into a MySQL or PostgreSQL database.

59 OBO File Format. File format defined specifically for capturing biological ontologies. Why use this format? Use it if you are primarily interested in the ontology. Don't use it if you are interested in chemical structural information. What can you do with it? Parse it directly using parsers such as OBO-Edit; upload and browse the ontology using OBO-Edit. Parts of the file annotated on the slide: general header information; synonym types used in terms; root terms; relationships to other terms.
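For readers who want to read the ontology without OBO-Edit, the stanza structure of an OBO file is simple enough to parse by hand. The Python sketch below handles only `[Term]` stanzas and tag-value lines; the sample text uses made-up `EX:` identifiers rather than real ChEBI terms, and `parse_terms` and `parents` are hypothetical helper names.

```python
# Made-up sample in OBO 1.2 syntax; the EX: identifiers are not ChEBI terms.
SAMPLE_OBO = "\n".join([
    "format-version: 1.2",
    "",
    "[Term]",
    "id: EX:0001",
    "name: example root class",
    "",
    "[Term]",
    "id: EX:0002",
    "name: example compound",
    'synonym: "example synonym" RELATED []',
    "is_a: EX:0001 ! example root class",
    "relationship: has_role EX:0003",
    "",
    "[Term]",
    "id: EX:0003",
    "name: example role",
])

def parse_terms(text):
    """Return {term id: {tag: [values]}} for every [Term] stanza."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            continue
        if current is None or ":" not in line:
            continue  # header lines, blank lines, other stanza types
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop trailing comments
        current.setdefault(tag, []).append(value)
        if tag == "id":
            terms[value] = current
    return terms

def parents(terms, term_id):
    """List the is_a parents of a term."""
    return terms[term_id].get("is_a", [])

terms = parse_terms(SAMPLE_OBO)
print(parents(terms, "EX:0002"))          # ['EX:0001']
print(terms["EX:0002"]["relationship"])   # ['has_role EX:0003']
```

A dedicated OBO library is a better choice for real work, since this sketch ignores escaping, typedefs and the many optional tags.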

60 SDF File: Lite format. Chemistry software compliant format. Why use this format? Use it to obtain the ChEBI entries with their chemical structural information. Don't use it for the ontology. What can I do with this format? Parse it using existing software libraries such as CDK; open it in standalone tools such as Bioclipse; copy and paste individual structures into JChemPaint. Entries are separated by $$$$.
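When a chemistry toolkit is not at hand, an SDF file can still be split into entries on the $$$$ separator and its data items read as plain text. The Python sketch below assumes data item names such as `ChEBI ID` and `ChEBI Name`, which should be checked against an actual download, and it builds stub records with an empty structure block purely for illustration. `make_record` and `read_sdf` are hypothetical names.

```python
def make_record(chebi_id, name):
    """Build a stub SDF record: empty structure block plus two data items."""
    return "\n".join([
        name, "  sketch", "",
        "  0  0  0  0  0  0  0  0  0  0999 V2000",
        "M  END",
        "> <ChEBI ID>", chebi_id, "",
        "> <ChEBI Name>", name, "",
        "$$$$",
    ]) + "\n"

# The second record is a placeholder, not a real ChEBI entry.
SAMPLE_SDF = make_record("CHEBI:15377", "water") + make_record("EXAMPLE:1", "placeholder entry")

def read_sdf(text):
    """Split SDF text on $$$$ and collect the data items of each entry."""
    records = []
    for chunk in text.split("$$$$"):
        if not chunk.strip():
            continue
        molblock, _, data = chunk.partition("M  END")
        fields = {}
        name = None
        for line in data.splitlines():
            stripped = line.strip()
            if stripped.startswith("> <") and stripped.endswith(">"):
                name = stripped[3:-1]      # text between '> <' and '>'
                fields[name] = []
            elif not stripped:
                name = None                # a blank line ends the data item
            elif name is not None:
                fields[name].append(stripped)
        records.append({"molblock": molblock + "M  END", "fields": fields})
    return records

records = read_sdf(SAMPLE_SDF)
for record in records:
    print(record["fields"]["ChEBI ID"], record["fields"]["ChEBI Name"])
```

For anything involving the structures themselves, a library such as CDK (mentioned on the slide) should parse the structure block instead.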

61 SDF File: complete format. Entries are separated by $$$$.

62 Flat-file, tab and comma delimited. Why use this format? Use it to obtain the entire ChEBI database structure. What can I do with this format? Open it using Excel; import it into a relevant database such as Oracle.

63 Table dumps. Similar structure to the flat-file tab delimited files. Why use this format? Use it to obtain the entire ChEBI database structure. Oracle binary dumps: import into an Oracle database. Generic SQL insert statements: import into a MySQL or PostgreSQL database.

64 Web services. Allow users to create their own applications to query data. Diagram: user application.

65 The ChEBI web service. Programmatic access to a ChEBI entry. SOAP based, Java implementation. Clients currently available in Java and Perl. Methods: getLiteEntity; getCompleteEntity and getCompleteEntityByList; getOntologyParents; getOntologyChildren and getAllOntologyChildrenInPath; getStructureSearch. Documented at http://www.ebi.ac.uk/chebi/webServices.do
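To give a feel for what a SOAP call to one of these methods involves, the Python sketch below builds a request envelope for getCompleteEntity without sending it. The SOAP 1.1 envelope namespace is standard, but the service namespace and the parameter element name `chebiId` are assumptions that should be confirmed against the WSDL linked from the documentation page; `build_request` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption: target namespace of the ChEBI web service; check the WSDL.
CHEBI_NS = "https://www.ebi.ac.uk/webservices/chebi"

def build_request(method, params):
    """Build a SOAP 1.1 envelope calling `method` with simple parameters."""
    ET.register_namespace("soapenv", SOAP_NS)
    ET.register_namespace("chebi", CHEBI_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{CHEBI_NS}}}{method}")
    for name, value in params.items():
        # Assumption: parameters are child elements in the service namespace.
        ET.SubElement(call, f"{{{CHEBI_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# "chebiId" is an assumed parameter name for getCompleteEntity.
request_xml = build_request("getCompleteEntity", {"chebiId": "CHEBI:15377"})
print(request_xml)
```

In practice a SOAP client library generated from the WSDL, such as the Java and Perl clients mentioned on the slide, takes care of the envelope and of parsing the response.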

66 Web service client object model. getLiteEntity; getCompleteEntity; getOntology (Parents and Children)

67 Methods and parameters (1)

68 Methods and parameters (2)

69 Methods and parameters (3)

70 Time for Exercises

71 For more information. Email: chebi-help@ebi.ac.uk; SourceForge: https://sourceforge.net/projects/chebi/; User Manual: http://www.ebi.ac.uk/chebi/userManualForward.do; RSS Feed

72 Acknowledgements. The ChEBI team: Adriano Dekker, Marcus Ennis, Janna Hastings, Amit Walijnkar, Steve Turner, Venkat Muthukrishnan (trainee), Gareth Owen (Project Manager), Paula de Matos (Group Coordinator), Christoph Steinbeck (Group Leader). Everyone at the EBI and elsewhere who uses or contributes to ChEBI. ChEBI is funded by the European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme; and by the BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund.

73 Thank you

