ChEBI Kirill Degtyarenko, EMBL-EBI / EPO. Rafael Alcántara Michael Ashburner * Volker Ast * Michael Darsow * Paula de Matos Marcus Ennis Janna Hastings.


1 ChEBI Kirill Degtyarenko, EMBL-EBI / EPO

2 The team: Rafael Alcántara, Michael Ashburner *, Volker Ast *, Michael Darsow *, Paula de Matos, Marcus Ennis, Janna Hastings, Alan McNaught *, Inma Spiteri, Christoph Steinbeck, Martin Zbinden *

3 ChEBI: What is it? Chemical Entities of Biological Interest, an EBI database/dictionary of ‘biochemical compounds’

4 What are the ‘biochemical compounds’? Can be defined as consisting of “molecules not directly encoded by the genome... that are either the products of nature or are synthetic products used... to intervene in the processes of living organisms” [Michael Ashburner]

5 Molecular entity: “Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity” [IUPAC “Gold Book”]

6 In fact, ChEBI contains: molecular entities (e.g. trans-vaccenic acid); groups (e.g. trans-vaccenoyl group); classes (e.g. fatty acids)

7 ‘Small molecules’? Yes, but big molecules as well: alumina, amylose, metaborate, poly(vinyl alcohol)

8 Current status (17.12.08)

9 1-D ChEBI: numeric ID; carefully checked terminology; unambiguous ChEBI name; IUPAC names; cross-references to free resources

10 Unambiguous ChEBI name: CHEBI:28918 is L-adrenaline, not just ‘adrenaline’

11 Systematic Name (IUPAC): 2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid [structure diagram; ring positions numbered 1 to 6 on each ring]

12 Common Name: flufenamic acid (INN English); acide flufénamique (INN French); ácido flufenámico (INN Spanish); acidum flufenamicum (INN Latin); Flufenaminsäure (German)

13 The Unpronounceables: CHEBI:48935 (E)-roxithromycin. IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one

14 What is the common name of roxithromycin? CHEBI:32109 (Z)-roxithromycin; CHEBI:48935 (E)-roxithromycin; INN: roxithromycin

15 Roxithromycin (2): CHEBI:48844 roxithromycin; (E)-roxithromycin; (Z)-roxithromycin

16 What is thiamine? CHEBI:18385 thiamine(1+), aka thiamine; CHEBI:33283 thiamine(1+) chloride, INN: thiamine; CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride

17 Need for 2-D: “Better to see the face than to hear the name” (Zen proverb); structures and identifiers based on structures offer new ways of crosslinking to other databases; structure search

18 Connection table:
  ChEBI
  9 10 0 0 0 0 999 V2000
  11.8219 -7.2713 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
  11.8219 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
  12.6074 -7.0165 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
  11.1072 -6.8574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
  12.6039 -8.3505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
  11.1072 -8.5027 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
  13.0886 -7.6818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
  10.3923 -7.2713 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
  10.3888 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
  1 2 2 0 0 0 0
  1 3 1 0 0 0 0
  1 4 1 0 0 0 0
  2 5 1 0 0 0 0
  2 6 1 0 0 0 0
  3 7 1 0 0 0 0
  4 8 2 0 0 0 0
  6 9 2 0 0 0 0
  5 7 2 0 0 0 0
  8 9 1 0 0 0 0
  M END
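To make the connection table easier to read, here is a minimal sketch that parses its atom and bond rows with plain Python. It is not a full molfile reader: the rows are transcribed from the slide with the trailing zero fields dropped, and the helper names `parse_atoms` and `parse_bonds` are made up for this example.

```python
from collections import Counter

# Atom rows from the slide: x, y, z, element (trailing zero fields omitted).
ATOM_LINES = """\
11.8219 -7.2713 0.0000 C
11.8219 -8.0922 0.0000 C
12.6074 -7.0165 0.0000 N
11.1072 -6.8574 0.0000 C
12.6039 -8.3505 0.0000 N
11.1072 -8.5027 0.0000 N
13.0886 -7.6818 0.0000 C
10.3923 -7.2713 0.0000 N
10.3888 -8.0922 0.0000 C
""".splitlines()

# Bond rows from the slide: first atom, second atom, bond order (1-based atom indices).
BOND_LINES = """\
1 2 2
1 3 1
1 4 1
2 5 1
2 6 1
3 7 1
4 8 2
6 9 2
5 7 2
8 9 1
""".splitlines()


def parse_atoms(lines):
    """Return a list of (x, y, z, element) tuples."""
    atoms = []
    for line in lines:
        x, y, z, element = line.split()[:4]
        atoms.append((float(x), float(y), float(z), element))
    return atoms


def parse_bonds(lines):
    """Return a list of (atom_a, atom_b, order) tuples."""
    return [tuple(int(tok) for tok in line.split()[:3]) for line in lines]


atoms = parse_atoms(ATOM_LINES)
bonds = parse_bonds(BOND_LINES)

composition = Counter(element for *_, element in atoms)
double_bonds = sum(1 for _, _, order in bonds if order == 2)
# For a connected graph, the number of independent rings is bonds - atoms + 1.
ring_count = len(bonds) - len(atoms) + 1

print(dict(composition), double_bonds, ring_count)
```

The counts agree with the header row of the table (9 atoms, 10 bonds): five carbons and four nitrogens arranged in two fused rings, with hydrogens left implicit.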

19 2-D ChEBI: one or more 2-D (or 3-D) connection tables; one is default; autogenerated images (PNG); default diagrams should be unambiguous

20 The Fine Art of chemical drawing

21 Linear forms of monosaccharides

22 Pyranose forms of monosaccharides

23 Fused systems: (R)-camphor [diagrams labelled ‘ambiguous’ and ‘unambiguous’]

24 Square planar geometry: cisplatin, transplatin

25 From 2-D back to 1-D: SMILES; InChI

26 SMILES (1): Simplified Molecular Input Line Entry Specification; developed by David Weininger in 1988; extended by others (e.g. Daylight); string of standard ASCII characters; a number of valid SMILES can be produced for the same molecule

27 SMILES (2): N1C=NC2=C1C=NC=N2; c1ncc2ncnc2n1; C=1N\C=N/C\2=N/C=N\C=1/2; c1ncnc2/N=C\Nc12; n1cc2c(nc1)ncn2; [H]c1nc([H])c2n([H])c([H])nc2n1
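As a rough check that the six strings on this slide can describe the same molecule, the sketch below counts the non-hydrogen atoms in each one. It is not canonicalization, which needs a cheminformatics toolkit; equal atom counts are only a necessary condition. The regular expression and the `heavy_atoms` helper are assumptions made for this example and cover only the small SMILES subset used here.

```python
import re
from collections import Counter

SMILES = [
    "N1C=NC2=C1C=NC=N2",
    "c1ncc2ncnc2n1",
    r"C=1N\C=N/C\2=N/C=N\C=1/2",
    r"c1ncnc2/N=C\Nc12",
    "n1cc2c(nc1)ncn2",
    "[H]c1nc([H])c2n([H])c([H])nc2n1",
]

# Bracket atoms first, then two-letter halogens, then one-letter organic-subset
# atoms (upper case = aliphatic, lower case = aromatic).
ATOM = re.compile(
    r"\[\d*(?P<bracket>[A-Z][a-z]?|[a-z])[^\]]*\]"
    r"|(?P<two>Cl|Br)"
    r"|(?P<one>[BCNOPSFIbcnops])"
)


def heavy_atoms(smiles):
    """Count non-hydrogen atoms per element, ignoring bonds and ring closures."""
    counts = Counter()
    for match in ATOM.finditer(smiles):
        symbol = match.group("bracket") or match.group("two") or match.group("one")
        symbol = symbol.capitalize()
        if symbol != "H":
            counts[symbol] += 1
    return counts


for smiles in SMILES:
    print(smiles, dict(heavy_atoms(smiles)))
```

All six strings give five carbons and four nitrogens, whether they are written in Kekulé form, aromatic form, with bond-direction marks, or with explicit hydrogens.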

28 InChI (1): IUPAC International Chemical Identifier, or InChI; open source; developed by Stein, Heller, Tchekhovskoi and McNaught; used by NIST, PubChem, CML… and ChEBI

29 InChI (2) InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H InChIKey=KDCGOANMDULRCW-QDQILVOLCG
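The layered structure of the identifier on this slide can be seen by splitting it on "/". The sketch below is only a string-level illustration, not an InChI parser; `split_inchi` and `element_counts` are hypothetical helpers written for this example.

```python
import re

EXAMPLE_INCHI = "InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H"


def split_inchi(inchi):
    """Split an InChI string into version, formula and (prefix, body) layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len(prefix):].split("/")
    # Every later layer starts with a one-letter prefix such as c, h or f.
    layers = [(layer[:1], layer[1:]) for layer in rest]
    return version, formula, layers


def element_counts(formula):
    """Turn a simple formula such as 'C5H4N4' into {'C': 5, 'H': 4, 'N': 4}."""
    return {
        symbol: int(count) if count else 1
        for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    }


version, formula, layers = split_inchi(EXAMPLE_INCHI)
print(version, formula, element_counts(formula))
for layer_prefix, body in layers:
    print(layer_prefix, body)
```

Here the formula layer is followed by a connectivity layer (c), a hydrogen layer (h), and a fixed-hydrogen section (f) with its own hydrogen layer.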

30 Limitations (1): stereochemistry other than sp³ tetrahedral and sp² trigonal planar; polymers; conformers; radicals/different spin states; topological isomers; mixtures; Markush structures

31 Limitations (2): InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2 (cisplatin, transplatin)

32 3-D ChEBI: cisplatin

33 Uncertainty and ambiguity in chemistry: compositional uncertainty; positional uncertainty; configurational uncertainty; conformational uncertainty

34 Compositional uncertainty. Examples: an alkali metal cation; vanadate(V) anion; [²H]ethanol

35 Positional uncertainty. Examples: L-bromohistidine residue; pteroic acid (several tautomers)

36 Configurational uncertainty. Examples: androstane; rel-(2R,3R)-2-amino-3-methylpentanoic acid; tetradec-11-enoic acid

37 Conformational uncertainty. Examples: cyclohexane (chair, boat, twist); protein secondary structure (…)

38 ChEBI ontology: molecular structure ontology; subatomic particle ontology; role ontology (biological role, application)

39 L-adrenaline: molecular structure ontology: catecholamines; biological role: hormone; application: antiglaucoma, bronchodilator, cardiostimulant

40 The family relations: L-cysteine; L-cysteine(); L-cysteinate(2–); L-cysteinate(1–); L-cysteinyl; L-cysteinium; L-cysteino; L-cystein-S-yl; L-cysteine residue; L-cysteinate residue; D-cysteine; cysteine; L-cysteine zwitterion

41 Relationships in ChEBI: ∆ Is A (generic); ⋄ Has Part (generic); ♯ Is Conjugate Acid Of (specific); ♭ Is Conjugate Base Of (specific); Is Enantiomer Of (specific); Is Tautomer Of (specific); ℛ Is Substituent Group From (specific); ℋ Has Parent Hydride (specific); ℱ Has Functional Parent (specific); Has Role (generic?)

42 Is A relationship (∆): L-cysteine is a cysteine

43 Is Enantiomer Of: L-cysteine is enantiomer of D-cysteine

44 Has Part (⋄): L-cysteine hydrochloride has part L-cysteinium; L-cysteinium is part of L-cysteine hydrochloride

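45 Is Conjugate Acid Of (♯): L-cysteinium is conjugate acid of L-cysteine; L-cysteine is conjugate acid of L-cysteinate(1–); L-cysteinate(1–) is conjugate acid of L-cysteinate(2–)

46 Is Conjugate Base Of (♭): L-cysteinate(2–) is conjugate base of L-cysteinate(1–); L-cysteinate(1–) is conjugate base of L-cysteine; L-cysteine is conjugate base of L-cysteinium

47 Acid/base relationships (♯, ♭): L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–); each neighbouring pair in this series is linked by ♯ in one direction and ♭ in the other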
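The paired acid/base relationships can be sketched as subject, relationship, object triples in which each "is conjugate acid of" statement yields the reciprocal "is conjugate base of" statement. This is an illustration only, not how ChEBI stores its ontology; the `CONJUGATE_ACID_OF` list and both helper functions are assumptions made for this example.

```python
# (acid, base) pairs for the L-cysteine protonation series shown on the slides.
CONJUGATE_ACID_OF = [
    ("L-cysteinium", "L-cysteine"),
    ("L-cysteine", "L-cysteinate(1–)"),
    ("L-cysteinate(1–)", "L-cysteinate(2–)"),
]


def build_triples(pairs):
    """Emit both directions of every acid/base pair as triples."""
    triples = []
    for acid, base in pairs:
        triples.append((acid, "is conjugate acid of", base))
        triples.append((base, "is conjugate base of", acid))
    return triples


def deprotonation_chain(start, pairs):
    """Follow 'is conjugate acid of' links from start until no base is left."""
    next_base = dict(pairs)
    chain = [start]
    while chain[-1] in next_base:
        chain.append(next_base[chain[-1]])
    return chain


triples = build_triples(CONJUGATE_ACID_OF)
for subject, relationship, obj in triples:
    print(subject, relationship, obj)

print(" > ".join(deprotonation_chain("L-cysteinium", CONJUGATE_ACID_OF)))
```

Storing only one direction and deriving the other keeps the two relationship types from drifting out of step.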

48 Is Tautomer Of: L-cysteine is tautomer of L-cysteine zwitterion

49 Is Tautomer Of: 3H-pyrrole, 2H-pyrrole, 1H-pyrrole

50 Has Parent Hydride (ℋ): salutaridinol has parent hydride morphinan; morphinan is parent hydride of salutaridinol

51 Has Functional Parent (ℱ): 7-O-acetylsalutaridinol has functional parent salutaridinol; salutaridinol is functional parent of 7-O-acetylsalutaridinol

52 Is Substituent Group From (ℛ): L-cysteinyl, L-cysteine residue and L-cysteino, each from L-cysteine

53 The family relations: L-cysteine; L-cysteine(); L-cysteinate(2–); L-cysteinate(1–); L-cysteinyl; L-cysteinium; L-cysteino; L-cystein-S-yl; L-cysteine residue; L-cysteinate residue; D-cysteine; cysteine; L-cysteine zwitterion [diagram linking these entities with ∆, ♯, ♭, ℛ and ℱ relationships]

54 Ontology of L-cysteine

55 Ontology of L-cysteine (1)

56 Ontology of L-cysteine (2)

57 Thank you

