Scientific & technical presentation JChem Cartridge for Oracle

Presentation transcript:

Scientific & technical presentation: JChem Cartridge for Oracle, version 5.3, January 2010

Contents
  Purpose of JChem Cartridge
  Features of JChem Cartridge
  Constituents of the JChem Cartridge API
  Normal Tables vs. JChem Tables
  Architecture of JChem Cartridge

Purpose of JChem Cartridge
  Access JChem functionality using SQL:
    SELECT count(*) FROM nci WHERE jc_contains(structure, 'Brc1cnc2ccccc12') = 1
  Access JChem in any programming environment offering Oracle connectivity (.NET, Java, Perl, PHP, Python, Apache mod_plsql...)
  Execute SQL queries efficiently using extensible indexes
  Precompute chemical information on structures by creating jc_idxtype indexes:
    CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype
  The jc_idxtype implementation scans the indexed column for eligible structures in a single performance-optimized operation: the domain index scan
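
As an illustration of calling the cartridge from a client language, the following minimal Python sketch wraps the slide's jc_contains count query in a function. It assumes a Python DB-API cursor that is already connected to an Oracle schema with JChem Cartridge installed and a table nci with a structure column, as in the slide. The helper name count_substructure_hits is hypothetical, and passing the query structure as a bind variable is an assumption of this sketch, not something the slide shows.

```python
# Hypothetical helper; "cursor" is assumed to be a Python DB-API cursor
# connected to an Oracle schema where JChem Cartridge is installed.
SUBSTRUCTURE_COUNT_SQL = (
    "SELECT count(*) FROM nci "
    "WHERE jc_contains(structure, :query) = 1"
)


def count_substructure_hits(cursor, query_smiles):
    """Run the slide's jc_contains count query for one query structure."""
    # Assumption: the query structure is accepted as a named bind variable.
    cursor.execute(SUBSTRUCTURE_COUNT_SQL, {"query": query_smiles})
    row = cursor.fetchone()
    return row[0]
```

With a real connection, the call would look like count_substructure_hits(connection.cursor(), 'Brc1cnc2ccccc12').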

Features of JChem Cartridge
  Adds chemistry knowledge into the SQL language of Oracle (SELECT, INSERT, UPDATE, ...)
  Substructure, superstructure, full structure, similarity searching
  Complex chemical expressions using the Chemical Terms language that includes logP, pKa, ...
  Automatic property calculation during registration
  Standardization (canonicalization) during registration
  Structure format conversions (MRV, Molfile, SDfile, RDfile, SMILES, CML, etc.); 2D, 3D image generation
  Structure enumeration using reaction rules
  User-defined fingerprint columns
  Custom similarity search through molecular descriptors
  Interaction with Oracle optimizer

Structure search features
  Wide range of query atoms
  Query properties
  R-group queries
  Full SMARTS support
  Coordination compounds
  Link nodes
  Pseudo atoms, lone pairs
  Relative stereo
  Reaction search features
  Hit coloring, position variation
  Polymers
  See detailed information on structure search: www.chemaxon.com/conf/Structural_Search.ppt

Search options
  Stereo on/off
  Ignore charge/isotope/radical/valence/mixture brackets
  Vague bond matching options
  Chemical Terms filter
  Tautomer search
  Inverse hit list
  Maximum search time / number of hits
  Combine with non-structure conditions
  Ordering of results
  etc.

Searching in Markush structures
  Combinatorial Markush structure registration and search
  Markush features handled in search & enumeration:
    R-groups (nesting to any depth)
    Atom lists, bond lists
    Position variation bond
    Link nodes and repeating units
    Homology groups
  Compatible Markush enumeration plugin
  Detailed description:
    http://www.chemaxon.com/jchem/doc/user/Query.html#combinatorialMarkush
    http://www.jchem.com/doc/user/Query.html#explH

Standardization
  Default standardization includes:
    Hydrogen removal
    Aromatization
  Custom standardization can be specified for each table or JChem index
  http://www.jchem.com/doc/user/Standardizer.html

Custom Standardization Example
  [Figure: a structure before and after custom standardization]
  http://www.chemaxon.com/conf/Standardizer.ppt

Compatibility and integration
  File formats:
    SMILES
    MDL molfile (v2000 and v3000)
    MDL SDF
    RXN
    RDF
    MRV
    IUPAC name, InChI
    Markush DARC
    CDX
  Operating systems:
    Windows
    Linux
    Solaris
    HP-UX
    etc.
  DB engines:
    Oracle versions 9i R2 or above
    For alternative RDBMS systems, see the JChem Base presentation: http://www.chemaxon.com/JChem_Base.ppt

Elements of the JChem Cartridge API
  Operators (jc_...) for SQL and their functional forms (jcf package) for PL/SQL
  Parameters for index creation
  DML operators for JChem tables
  Support functions for user defined operators

Operators and functions I.
  Typical operator:
    jc_<some-operation>(<target-structure-column>, <some-operand>)
  Operator for substructure search:
    jc_contains(<target-structure-column>, <query-structure>)
  "Swiss-army-knife" search operator:
    jc_compare(<target-structure-column>, <query-structure>, <options>)
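
The jc_compare examples in this presentation pass their options as a single string such as 'sep=! t:s!ctFilter:...': a sep= declaration, a space, then name:value pairs joined by the declared separator. The sketch below assembles a string of that shape in Python. The format is inferred only from those slide examples, and build_options is a hypothetical helper, not part of the JChem API.

```python
def build_options(options, sep="!"):
    """Join name:value pairs into an option string shaped like the slide examples.

    Assumption: the layout 'sep=<sep> name:value<sep>name:value' is inferred
    from the jc_compare examples in the slides, not from the product manual.
    """
    for name, value in options.items():
        # A value containing the separator (for example "!=") would be split
        # in the wrong place, so reject it instead of emitting a broken string.
        if sep in name or sep in str(value):
            raise ValueError(f"separator {sep!r} occurs in option {name!r}")
    pairs = sep.join(f"{name}:{value}" for name, value in options.items())
    return f"sep={sep} {pairs}"


example = build_options({"t": "s", "ctFilter": "(mass() <= 500) && (logP() <= 5)"})
# example == 'sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5)'
```

A custom separator matters because option values such as Chemical Terms expressions or filter queries contain spaces themselves.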

Operators and functions II.
  Chemical Terms
    Over 100 built-in functions, including:
      elemental analysis
      topological descriptors
      property predictions (logP/D, pKa, PSA, H bond donors/acceptors, charge, etc.)
      tautomers, protonation forms
    User-defined functions
  Example: the Lipinski rule in Chemical Terms
    SELECT count(*) FROM nci_3m WHERE jc_compare(structure, 'O=C1ONC(N1c2ccccc2)-c3ccccc3','sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)') = 1
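
To make the ctFilter expression above concrete, here is a minimal Python sketch of the same four conditions applied to property values that are already known. Nothing is calculated here: in the cartridge the values come from the Chemical Terms functions mass(), logP(), donorCount() and acceptorCount(), while the Properties class, the passes_lipinski name and the sample numbers below are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed property values for one structure (illustrative only)."""
    mass: float
    logp: float
    donor_count: int
    acceptor_count: int


def passes_lipinski(p):
    """Apply the same four thresholds as the ctFilter expression on the slide."""
    return (
        p.mass <= 500
        and p.logp <= 5
        and p.donor_count <= 5
        and p.acceptor_count <= 10
    )


# Made-up values, used only to exercise the predicate.
small = Properties(mass=300.0, logp=2.5, donor_count=2, acceptor_count=4)
heavy = Properties(mass=650.0, logp=2.5, donor_count=2, acceptor_count=4)
```

Here passes_lipinski(small) is true and passes_lipinski(heavy) is false, because only the mass threshold differs between the two records.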

Operators and functions III.
  jc_compare: substructure/similarity/exact searching combined with Chemical Terms expressions
  jc_matchcount: number of occurrences of the query structure in the target
  jc_evaluate: Chemical Terms evaluation
  jc_molweight: molecular weight
  jc_formula: molecular formula
  jc_react: structure enumeration based on virtual reactions
  jc_standardize: structure canonicalization
  jc_molconvert: conversion to different formats (image generation is supported)
  jc_tanimoto: similarity search
  jcf.hitColorAndAlign: substructure coloring and alignment

Operators and functions IV.
  Similarity search example displaying ID, SMILES code, and molweight:
    SELECT cd_id, cd_smiles, cd_molweight FROM my_structures WHERE jc_tanimoto(cd_smiles, 'CC(=O)Oc1ccccc1C(O)=O') >= 0.8;
  Chemical Terms and query prefiltering:
    SELECT id, purchase_date FROM compounds_instock WHERE jc_compare(structure, 'C(=S)([N][N])[S]', 'sep=! t:t!simThreshold:0.9!ctFilter:logp()>1!filterQuery:SELECT rowid FROM compounds_instock WHERE purchase_date > DATE ''2002-01-01''') = 1
  Prefiltering restricts the search to a subset of rows, so it executes more efficiently.
  Dynamic generation of static images:
    SELECT jc_molconvertb(structure, 'png -2') FROM nci WHERE id = :1
  Available image formats: png, jpeg, svg, ...
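
The threshold in the similarity query compares a Tanimoto coefficient against 0.8. As a reminder of what that coefficient measures, the sketch below computes it for two fingerprints given as sets of on-bit positions. This is the textbook formula and not JChem's fingerprint code: the bit sets are made up, and returning 1.0 for two empty fingerprints is a convention chosen for this sketch.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 1.0  # convention chosen for this sketch
    common = len(a & b)
    # Shared bits divided by bits set in either fingerprint.
    return common / (len(a) + len(b) - common)


# Made-up fingerprints: 3 shared bits, 5 bits set in total.
fp_query = {1, 2, 3, 4}
fp_target = {2, 3, 4, 5}
similarity = tanimoto(fp_query, fp_target)  # 3 / 5 = 0.6
```

With these two fingerprints the pair would not pass a threshold of 0.8, since 3 of 5 bits are shared.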

Operators and functions V.
  Calculate logp:
    SELECT jc_evaluate('OC(=O)c1c2ccccc2nc3ccccc13', 'logp') FROM dual;
  Generate tautomers:
    SELECT jc_evaluate_x('NC1=C(CC=O)C=CCC1', 'chemTerms:tautomers() outFormat:smiles') FROM dual;
  Generate resonants:
    SELECT jc_evaluate_x('NC1=C(CC=O)C=CCC1', 'chemTerms:resonants() outFormat:smiles') FROM dual

Index parameters
  Index parameters affect:
    Fingerprint attributes
    Standardizer configuration
    Table space and storage options of the index table
  Examples:
    Standardization by stripping hydrogens and using basic aromatization:
      CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STD_CONFIG=removeexplicitH..aromatize:b')
    Add structural keys to the fingerprint for more efficient substructure searching (structural keys are defined in table stfp_keys):
      CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STRUCTURALFP_CONFIG=select structure from stfp_keys')
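
Why a richer fingerprint can speed up substructure searching is easiest to see in the generic screening step used by fingerprint-based search systems: a target can contain the query only if every bit set in the query fingerprint is also set in the target fingerprint, so most rows are rejected before any atom-by-atom matching. The Python sketch below shows that test on made-up integer bit masks. It illustrates the general technique under that assumption and does not reproduce JChem's internal implementation.

```python
def may_contain(target_fp, query_fp):
    """Screening test: every bit set in the query must also be set in the target."""
    return (target_fp & query_fp) == query_fp


def screen(targets, query_fp):
    """Return the ids of targets that survive screening, in insertion order."""
    return [tid for tid, fp in targets.items() if may_contain(fp, query_fp)]


# Made-up fingerprints written as bit masks.
targets = {"a": 0b1110, "b": 0b0110, "c": 0b1001}
candidates = screen(targets, 0b0110)  # "c" lacks both query bits
```

Survivors of the screen are only candidates; a full structure match is still needed to confirm each hit. Bits that separate structures more sharply leave fewer candidates for that step.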

Calls Not Using Indexes
  Using SQL statements for calling JChem operators on structures not stored in a table
  Sample SQL statement without index information:
    SELECT jc_contains('O=C1C=CNC=C1', 'n1ccccc1') FROM dual
  Setting default properties for calls not using indexes:
    CALL jc_set_default_property('standardizerConfig', 'aromatize:b')

Supported Column Types
  VARCHAR2: typically for short formats, e.g. SMILES
  CLOB, BLOB: for longer formats, e.g. MDL molfile, Marvin (mrv)

Supported Structure Table Types
  Regular table: nci_1k
    CREATE INDEX jcxnci_1k...
    Index table: jcxnci_1k_jcx (rowid of the base table nci_1k)
  JChem table (generated by jcman or API): jc_nci_1k
    CREATE INDEX jcxjc_nci_1k...

Regular Tables vs. JChem Tables
  Regular structure tables
    base table and index table are physically distinct
    index properties are specified as index parameters
  JChem structure tables
    base table and index table are physically the same
    most of the "index" properties are specified during table creation (jcman or Java API)
  Pros & Cons:
    inserts from outside the database are faster with JChem tables
    JChem tables require the Java API or the jcman command line tool for table creation, and the Java API or special cartridge functions for INSERTs, UPDATEs and DELETEs; standard SQL can be used with regular tables in all cases

JChem Cartridge Architecture
  Computation intensive operations are performed in a separate Sun JVM.
  Advantages:
    fast execution (optimized native code)
    flexibility in deployment
  [Diagram labels: Oracle, JChem Cartridge, JChem Streams, RMI, JChem Server, JChem Base, Search, Update, Cache, JDBC, JChem Core]

Performance
  Table containing 19,528,372 structures from PubChem, on a desktop PC with an Intel Quad CPU Q6600 2.40GHz and 8GB memory
  JChem 5.2
  Substructure search results:
    Query Structure                                  Hit Count    Time (ms)
    C1CN1c2cnnc3c(cncc23)C4=CSC=C4                   1487
    O=C1ONC(N1c2ccccc2)c3ccccc3                      129          823
    Oc1c(N=N)c(cc2cc(ccc12)S(O)(=O)=O)S(O)(=O)=O     93           764
    C(Sc1ncnc2ncnc12)c3ccccc3                        489          786
    NC1=CC=NC2=C1C=CC(Cl)=C2                         6,001        1,189
    c1ncc2ncnc2n1                                    146,256      6,665
    Clc1ccccc1                                       2,975,285    82,646

Future plans
  Flexible 3D pharmacophore search
  R-Group decomposition
  Clustering
  Maximum common substructure search type
  Extended fingerprint connectivity (EFPC)

Summary
  JChem Cartridge for Oracle provides flexible and efficient access to the rich functionality of JChem Base.
  JChem Cartridge for Oracle uses creative solutions to broaden the applicability of JChem's core functions while preserving key benefits of the Java platform.

Links
  Documentation:
    www.jchem.com/doc/admin/cartridge.html
    www.jchem.com/doc/guide/cartridge/index.html
  Forum:
    www.chemaxon.com/forum/
  Brochure:
    www.chemaxon.com/brochures/JChem_Cartridge.pdf

Visit other technical presentations
  MarvinSketch/View: http://www.chemaxon.com/MarvinSketch_View.ppt
  MarvinSpace: http://www.chemaxon.com/MarvinSpace.ppt
  Calculator Plugins: http://www.chemaxon.com/Calculator_Plugins.ppt
  JChem Base: http://www.chemaxon.com/JChem_Base.ppt
  JChem Cartridge: http://www.chemaxon.com/JChem_Cartridge.ppt
  Standardizer: http://www.chemaxon.com/Standardizer.ppt
  Screen: http://www.chemaxon.com/Screen.ppt
  JKlustor: http://www.chemaxon.com/JKlustor.ppt
  Fragmenter: http://www.chemaxon.com/Fragmenter.ppt
  Reactor: http://www.chemaxon.com/Reactor.ppt