1 Szabolcs Csepregi*, Szilárd Dóránt, Nóra Máté, Miklós Vargyas, Péter Kovács, György Pirok, Ferenc Csizmadia First presented at Applications of Cheminformatics.


Slide 1: Structural Search Using ChemAxon Tools
Szabolcs Csepregi*, Szilárd Dóránt, Nóra Máté, Miklós Vargyas, Péter Kovács, György Pirok, Ferenc Csizmadia
First presented at Applications of Cheminformatics and Chemical Modelling to Drug Discovery · 8-19 Nov 2004. Updated April 2005.

Slide 2: Contents
– Structural search in cheminformatics
– The JChem suite of tools
– Structural search in JChem
– Interfaces
– Database solutions: JChemBase, Cartridge
– Standardization
– Search features
– MCS/MCES and Library MCS
– R-group decomposition
– The Chemical Terms language
– Future plans
All examples were generated with ChemAxon's Marvin.

Slide 3: Structural search in cheminformatics
A few examples to highlight the diversity of applications:
– Compound registration – duplicate checking
– Database search, e.g. imidazole derivatives
– Pharmacophoric group identification (JChem Screen, JKlustor)
– Functional group identification
– Cleavage bond identification (JChem Fragmenter)
– Virtual reaction processing (JChem Reactor)
– Standardization (canonicalization of structures, JChem Standardizer)
– Toxic fragment identification (superstructure search)

Slide 4: Search types in JChem
ABAS (Atom By Atom Search) or structural search:
– Exact
– Substructure
– Superstructure
– MC(E)S – maximum common (edge) substructure
– R-group decomposition (identify ligands of a given scaffold)
Similarity search:
– Different descriptors
– Different metrics

Slide 5: ABAS search interfaces
JSP (Java Server Pages): web GUI for database
– Similarity & structural search
– Substructure highlighting
– Additional constraints
– Insert, modify, delete
Command line utility: jcsearch, for files and DB
Java API
– isMatching(): only to check matching
– findFirst(), findNext(), findAll(): enumerate all possible matchings
Cartridge: access all functionality from SQL
Chemical Terms
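The difference between isMatching() and the enumerating calls findFirst(), findNext() and findAll() can be illustrated with a toy atom-by-atom search. This is only a sketch in Python, not ChemAxon's implementation or API: the molecule representation and the names `make_mol`, `find_matches`, `is_matching` and `find_all` are assumptions made for illustration, and only element symbols and bond orders are compared.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

# Toy molecule: (atom symbols, {frozenset({i, j}): bond order}). Hypothetical representation.
Mol = Tuple[List[str], Dict[FrozenSet[int], int]]


def make_mol(atoms, bonds) -> Mol:
    """Build a toy molecule from atom symbols and (i, j, order) bond triples."""
    return list(atoms), {frozenset((i, j)): order for i, j, order in bonds}


def find_matches(query: Mol, target: Mol) -> Iterator[Tuple[int, ...]]:
    """Lazily yield every mapping (query atom index -> target atom index)."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target

    def extend(mapping: List[int]) -> Iterator[Tuple[int, ...]]:
        k = len(mapping)  # index of the next query atom to place
        if k == len(q_atoms):
            yield tuple(mapping)
            return
        for t in range(len(t_atoms)):
            if t in mapping or t_atoms[t] != q_atoms[k]:
                continue
            # Every query bond from atom k to an already placed atom must
            # exist in the target with the same bond order.
            consistent = all(
                t_bonds.get(frozenset((t, mapping[j]))) == q_bonds[frozenset((k, j))]
                for j in range(k)
                if frozenset((k, j)) in q_bonds
            )
            if consistent:
                yield from extend(mapping + [t])

    yield from extend([])


def is_matching(query: Mol, target: Mol) -> bool:
    """Stop at the first hit: analogous in spirit to a yes/no match check."""
    return next(find_matches(query, target), None) is not None


def find_all(query: Mol, target: Mol) -> List[Tuple[int, ...]]:
    """Exhaust the generator: analogous in spirit to enumerating all matchings."""
    return list(find_matches(query, target))


# Heavy atoms of acetic acid: C0-C1, C1=O2, C1-O3
acetic_acid = make_mol(["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
carbonyl = make_mol(["C", "O"], [(0, 1, 2)])
amine = make_mol(["N"], [])
```

Because `find_matches` is a generator, calling `next()` on it repeatedly plays the role of a find-first/find-next loop, while `is_matching` pays only for the first hit.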

Slide 6: ABAS options
General options:
– Order sensitive hits e.g. Pre-assignment of query and target atoms
– Consider stereo or not, absolute stereo (ignore chiral flag)
– Timeout limit
– Exact charge/radical/isotope/query features/bond/stereo matching
– Double bond stereo: no check/marked/all double bonds
– Chemical Terms filter expression
– etc.
Database search:
– Maximum search time/number of hits
– Additional SQL SELECT expression for prefiltering
– Output table
– Reverse hits mode

Slide 7: Structural search in database
Search: two-stage method:
– Rapid pre-screening based on chemical hashed fingerprints
– ABAS (isMatching)
Duplicate check at compound registration:
– Hash code: primary filter
– ABAS (isMatching)
Standardization
Caching of structures and fingerprints allows top performance
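The two-stage idea (a cheap fingerprint screen that can only produce false positives, followed by an exact check on the survivors) can be sketched as follows. This is a toy analogue under loud assumptions: plain strings stand in for structures, substring containment stands in for ABAS, and the fingerprint hashes short text fragments rather than chemical paths. `FP_BITS`, `fingerprint` and `two_stage_search` are hypothetical names, not JChem functions.

```python
import zlib

FP_BITS = 64  # toy fingerprint length (assumption for this sketch)


def fingerprint(text: str, max_len: int = 3) -> int:
    """Hash every fragment of up to max_len characters into a fixed-size bit set."""
    fp = 0
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            fragment = text[i:i + n]
            fp |= 1 << (zlib.crc32(fragment.encode("utf-8")) % FP_BITS)
    return fp


def two_stage_search(query: str, targets):
    """Return (hits, number of targets that reached the exact check)."""
    q_fp = fingerprint(query)
    hits, verified = [], 0
    for target in targets:
        # Stage 1: every bit set for the query must also be set for the target.
        # A target failing this test cannot contain the query, so it is skipped.
        if (q_fp & fingerprint(target)) != q_fp:
            continue
        # Stage 2: exact (expensive) check; substring test stands in for ABAS.
        verified += 1
        if query in target:
            hits.append(target)
    return hits, verified


library = ["CCO", "c1ccccc1O", "CCN", "OCCO", "CC(=O)O"]
hits, verified = two_stage_search("CO", library)
```

In a real database the target fingerprints would be computed once and cached rather than recomputed per query, which is the point of the caching remark on the slide.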

Slide 8: Import with JChem Base Manager

Slide 9: JChem Base molecular file formats and integration
Import formats: SMILES, MDL molfile (v2000 and v3000), MDL SDF, RXN, RDF, MRV, CML, PDB, Sybyl molfile, XYZ, Gaussian cube
Image formats for export: JPG, PNG, SVG
Database engines: Oracle, MySQL, MS SQL Server, PostgreSQL, MS Access, DB2, etc.
OS: any operating system running Java (Windows, Linux, Mac OS X, Solaris, etc.)

Slide 10: JChem Base performance (1)
Server parameters: Windows XP; 1 CPU: Intel P4 3.0GHz; 2GB RAM; Oracle 9i

Compound registration (elapsed time):
Number of compounds    Duplicates not checked    Duplicates checked
10,000                 32s                       45s
100,000                4min 11s                  6min 20s
200,000                8min 17s                  12min 26s

Substructure search in a table of 3 million compounds (columns: Search time (s), Number of hits, Query):
10.749740 1.20 0.9936 0.112

Slide 11: JChem Base performance (2)
Server parameters: Windows XP; 1 CPU: Intel P4 3.0GHz; 2GB RAM; Oracle 9i

Similarity search, Tanimoto > 0.8 (columns: Search time (s), Number of hits, Query):
1.3336 1.3156 1.524

Slide 12: JChem Cartridge for Oracle

Slide 13: JChem Cartridge for Oracle
Oracle is extended to support chemical database operations using the JChem Cartridge for Oracle.
Examples:
Substructure search displaying ID, SMILES codes, and molweight:
    SELECT cd_id, cd_smiles, cd_molweight
    FROM my_structures
    WHERE jc_contains(cd_smiles, 'CC(=O)Oc1ccccc1C(O)=O') = 1;
Similarity search filtered with predicted pKa values, which displays predicted logP and logD values:
    SELECT cd_id, jc_logP(cd_smiles), jc_logD(cd_smiles, 7.4)
    FROM my_structures
    WHERE jc_tanimoto(cd_smiles, 'CC(=O)Oc1ccccc1C(O)=O') >= 0.8
    AND jc_pKa(cd_smiles, 'acidic', 1) < 4;

Slide 14: JChem Cartridge for Oracle
Chemical Terms examples:
Number of compounds in table nci_10m containing benzene and conforming to Lipinski's rule of five:
    SELECT count(*) FROM nci_10m
    WHERE jc_compare(structure, 'c1ccccc1','sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)') = 1
Compounds in table nci_10m containing 3-bromoindole, with restrictions on TPSA, molecular weight, rotatable bond count and aromatic ring count:
    SELECT cd_structure FROM nci_10m
    WHERE jc_compare(structure, 'Brc1cnc2ccccc12','sep=! t:s!ctFilter:(PSA() <= 200) && (rotatableBondCount() <= 10) && (mass() <= 500) && (aromaticRingCount() <= 4) ') = 1
– New interface to ChemAxon API features from SQL, accessible from non-Java programs as well.
– Enhanced performance of certain SQL queries.
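The ctFilter expression in the first query is a conjunction of four property thresholds. A minimal Python sketch of the same predicate is given here, assuming the properties have already been calculated elsewhere; the `Properties` class and the two sets of values are hypothetical and purely illustrative, not computed or measured data.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed molecular properties (hypothetical container for this sketch)."""
    mass: float
    logp: float
    donor_count: int
    acceptor_count: int


def passes_rule_of_five(p: Properties) -> bool:
    """Mirror of: (mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)."""
    return (
        p.mass <= 500
        and p.logp <= 5
        and p.donor_count <= 5
        and p.acceptor_count <= 10
    )


# Illustrative values only (assumptions, not calculated properties of real compounds).
small_molecule = Properties(mass=180.2, logp=1.2, donor_count=1, acceptor_count=4)
large_molecule = Properties(mass=812.0, logp=6.3, donor_count=7, acceptor_count=12)
```

In the Cartridge query the filter is evaluated only on rows that already matched the substructure, so the property thresholds act as a second condition on top of the structural hit.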

Slide 15: Query features 1. Atomic features
– Query atom types: any, hetero, list, not list
– Pseudo atoms, e.g. Resin
– Explicit lone pairs (match implied lone pairs as well)
– Charge, isotope, radical
Query properties:
Symbol    Description
H         Total hydrogen count
a         Aromatic
A         Aliphatic
R         Ring count in SSSR
r         Ring size in SSSR
v         Valence
X         Connectivity

Slide 16: Query features 2. Atomic SMARTS features
SMARTS atoms: additional query properties:
Symbol      Description
D           Degree
h           Implicit H count
& ; , !     Logical operators
$( )        Recursive SMARTS
+0, -0      Zero charge
Example: carbonyl C, but not amide

Slide 17: Query features 3. Bond features & components
Query bond types: any, single or double, single or aromatic, double or aromatic
Bond topology: chain/ring
SMARTS bonds:
Symbol        Description
- = #         Single, double, triple
:             Aromatic
& ; , !       Logical operators
@             Ring bond
/ \ /? \?     Directional bond (cis/trans)
Component level grouping:
Symbol        Description
(C.C)         Same component
(C).(C)       Different component
C.C           No component restrictions

Slide 18: Stereo searching 1. Double bonds
Levels of check:
– All
– Only marked double bonds (MDL: stereo care flag)
– None
Depicted meanings: cis; trans; cis or trans (unknown); not trans; not cis

Slide 19: Stereo searching 2. Tetrahedral chirality
– Stereo bond types: up, down, up or down
– Relative stereo configuration
– Chiral flag model
– Enhanced stereo representation: AND, OR, ABS groups

Slide 20: Reaction search
– Reactants, agents, products
– Transformation recognition (mapping)
– Stereospecific reactions (inversion, retention)
– Reactant grouping

Slide 21: R-group search
– Scaffold, R-group definitions
– Monovalent, divalent R-groups
– R-logic: occurrence, if-then, RestH

Slide 22: Hydrogens
H representations:
– Explicit
– Implicit
– Query H count (total or implicit)
All are considered in ABAS.
Example (figure): columns Explicit H, Implicit H, Query H count; structures labelled Query, Target, Query.

Slide 23: Standardization
– Explicit hydrogen removal
– Aromatic bonds
– Mesomers
– Tautomers
– Counterions
– Stereo representation

Slide 24: Standardization - Aromaticity
Representations: Kekulé, aromatic
Example: two Kekulé representations of the same ring do not match each other
Two options available: ChemAxon & Daylight aromatization
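Why two Kekulé drawings of the same ring fail to match, and why aromatization repairs this, can be shown with a deliberately simplified Python sketch. It only checks for strict single/double alternation in a six-membered ring; real aromaticity perception (whether ChemAxon or Daylight style) also considers electron counts, heteroatoms and fused systems. The names `AROMATIC` and `aromatize_six_ring` are hypothetical.

```python
AROMATIC = "ar"  # toy marker for an aromatic bond (assumption for this sketch)


def aromatize_six_ring(bond_orders):
    """Replace a strictly alternating single/double six-membered ring by aromatic bonds.

    bond_orders lists the ring bonds in order around the ring, starting
    from a fixed atom numbering. Anything else is returned unchanged.
    """
    orders = tuple(bond_orders)
    alternating = (
        len(orders) == 6
        and set(orders) == {1, 2}
        and all(orders[i] != orders[(i + 1) % 6] for i in range(6))
    )
    return (AROMATIC,) * 6 if alternating else orders


# The same benzene ring drawn with its double bonds in the two possible positions.
kekule_a = (1, 2, 1, 2, 1, 2)
kekule_b = (2, 1, 2, 1, 2, 1)
cyclohexene = (2, 1, 1, 1, 1, 1)
```

Compared bond by bond, `kekule_a` and `kekule_b` differ at every position, whereas their aromatized forms are identical; this is the reason query and target are both standardized before searching.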

Slide 25: Standardization example (figure: before, after)

Slide 26: Similarity search
Descriptors:
– Chemical hashed fingerprint
– 2D (topological) pharmacophore fingerprint
– BCUT
– Structural keys
– Hypothesis fingerprints: minimum, average
Dissimilarity metrics:
– Tanimoto: standard, scaled, asymmetric
– Euclidean: standard, normalized, weighted, asymmetric
– Optimized for a set of actives
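For binary fingerprints, the standard Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in either, and the corresponding dissimilarity is one minus that value. A minimal Python sketch, assuming fingerprints are held as integers used as bit sets (the function names are illustrative, and the scaled, asymmetric and Euclidean variants listed on the slide are not covered):

```python
def popcount(bits: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto(a: int, b: int) -> float:
    """Standard Tanimoto coefficient: |A and B| / |A or B|."""
    union = popcount(a | b)
    if union == 0:
        # Two empty fingerprints: treated as identical here (a convention
        # chosen for this sketch).
        return 1.0
    return popcount(a & b) / union


def tanimoto_dissimilarity(a: int, b: int) -> float:
    """Dissimilarity form, 1 - Tanimoto."""
    return 1.0 - tanimoto(a, b)


fp_query = 0b1110
fp_target = 0b0111
```

Here the two fingerprints share two bits out of four set in total, so the coefficient is 0.5; a threshold such as Tanimoto > 0.8 would reject this pair.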

Slide 27: MC(E)S 1. Pairs of molecules
– The largest connected common subgraph
– Application: reaction automapping in Marvin

Slide 28: MCS 2. Library MCS
The LibMCS program rapidly creates a hierarchy of MCSs on a library.
Applications:
– Identification of the most frequently occurring MCS
– Focused set analysis
– Clustering based on common substructures

Slide 29: Hierarchy calculation performance
Columns: Library, Library size, Time (s), Clusters, Top level clusters, No. of levels
NCI (small molecules, random, diverse sets):
5006.8279145
1,00013.4440166
5,000141851425
d2 inhibitors (medium sized molecules, low diversity):
50010.824337
1,00025.649596
Thrombin inhibitors (medium sized molecules, medium diversity):
1,0006752886
3,0002421186126

Slide 30: R-group decomposition
JChem is able to identify the ligands of a given scaffold at specified substitution positions.
Figure labels: Query (scaffold), Library, Result

Slide 31: Applications of Chemical Terms (CT)
– Virtual synthesis: reaction and synthesis rules
– Pharmacophore analysis: pharmacophore definitions
– Drug design: goal functions
– Structural search: advanced query expressions, e.g. in the Cartridge

Slide 32: Chemical Terms
Chemical Terms examples:
– Searching:
    match("olefine.mol") && !match("c1ccncc1") && (atomCount(16) == 0) || (mass() < 300);
– Goal functions:
    inhibitor = inhibitor.mol;
    (similarity(inhibitor, pharmacophore_tanimoto) > 0.8) && (similarity(inhibitor, chemical_tanimoto) < 0.5);
– Filtering:
    (mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10);
Elements of the language:
– Structure matching functions (describing functional groups, reaction sites, similarity…)
– Property calculations (partial charge distribution, pKa, logP, electrophilicity, etc.)
– Arithmetic and logic operators

Slide 33: Chemical Terms: some available functions
– Structural search (match, matchcount)
– Partial charge distribution
– pKa, log P, log D, major microspecies
– Polarizability
– Topological Polar Surface Area
– Number of rotatable bonds, rings, aromatic rings, etc.
– Number of HB donors/acceptors
– Exact mass
– Arithmetic and logic operators
– Extensible: your own Java plugins can be easily added
– Etc.

Slide 34: Future plans
– More query features (e.g. link nodes, ring bond count, unsaturated atom)
– Flexible search options: tautomeric search, ignore bond types, salts, etc.
– Search targets having R-groups (Markush structures)
– etc.

Slide 35: Summary
– Structural search provides a useful set of tools for chemists and cheminformaticians.
– The ChemAxon JChem suite contains a broad range of chemical search facilities, and the presented benchmark results illustrate the high performance of JChem search.
– The new Chemical Terms language is a beneficial complement to structural searches and makes data mining easier.

Slide 36: Links
– Home page: www.chemaxon.com
– Forum: www.chemaxon.com/forum
– Animated demos and tutorials: www.chemaxon.com/demos
– Presentations and posters: www.chemaxon.com/conf

Slide 37: Thank you for your attention
Máramaros köz 3/a, Budapest, 1037 Hungary
info@chemaxon.com
www.chemaxon.com

