ChemAxon's Chemical Fingerprints-Based Clustering to Assess AurSCOPE Databases Chemical Diversity


1 ChemAxon's Chemical Fingerprints-Based Clustering to Assess AurSCOPE Databases Chemical Diversity

2 The Aureus Pharma System: Knowledge Base; Integration Platform; Query Interface; Analysis/Display Applications

3 AurSCOPE Statistics: March 2006. Columns: Publications, Activities, Ligands. GPCR: publications including 3525 patents. Ion Channel: publications including patents. Kinase: publications including 1069 patents. ADME/Drug-Drug Interactions: publications; parent compound + metabolites. HERG: 800 publications.

4 AurQUEST: Query management software for AurSCOPE. Web-based application integrating ChemAxon technology. Powerful Query Builder: biological and chemical queries; structural search using ChemAxon tools. Efficient navigation. Different export formats (SDF, RDF, …).

5 Data Preprocessing: AurSCOPE database, filtered to 2D unique structures. Filters: Counterions; MW > 700; Inorg; NAS; Stereo-duplicates; Identical mol. but different salts; …

6 AurSCOPE Ion Channels: Retrieving Active Molecules. Protocols: Binding or Electrophysiology. Target: All. Target type: Wild. Parameter filter: Ki, EC50, IC50 < 300 nM. Result: molecules (*) (9897 unique). (*) November 2005
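
The parameter filter on this slide can be illustrated with a minimal sketch. The record layout (`parameter`, `value_nm`, `structure`) is hypothetical and is not the AurSCOPE schema; the protocol and target-type criteria are omitted for brevity. Only the 300 nM cutoff and the three parameter names come from the slide.

```python
# Hypothetical record layout; AurSCOPE's real schema is not shown in the slides.
ACTIVITY_PARAMETERS = {"Ki", "EC50", "IC50"}


def active_unique_structures(records, cutoff_nm=300.0):
    """Return unique structures with at least one Ki/EC50/IC50 below the cutoff."""
    seen = set()
    unique = []
    for rec in records:
        is_active = (
            rec["parameter"] in ACTIVITY_PARAMETERS
            and rec["value_nm"] < cutoff_nm
        )
        if is_active and rec["structure"] not in seen:
            seen.add(rec["structure"])
            unique.append(rec["structure"])
    return unique


records = [
    {"structure": "mol-A", "parameter": "Ki", "value_nm": 12.0},
    {"structure": "mol-A", "parameter": "IC50", "value_nm": 80.0},   # duplicate structure
    {"structure": "mol-B", "parameter": "EC50", "value_nm": 950.0},  # too weak
    {"structure": "mol-C", "parameter": "Kd", "value_nm": 5.0},      # parameter not in filter
    {"structure": "mol-D", "parameter": "IC50", "value_nm": 299.0},
]
print(active_unique_structures(records))
```

Deduplicating by a structure key mirrors the slide's distinction between retrieved molecules and unique ones.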

7 AurSCOPE Ion Channels: Activity Distribution

8 Encoding Chemical Space and Clustering: Standardization of molecules. Generating Chemical Fingerprints (CF). Optimization of different CF parameters. CF-based Jarvis-Patrick clustering with various adjusted parameters.

9 Parameters for Generating Hashed Chemical Fingerprints
Fingerprint length
- The number of bits in the bit string.
- A longer fingerprint increases the capacity for storing information on molecules.
Maximum pattern length
- The maximum length of the linear atom paths that are considered during the fragmentation of the molecule. (The length of cyclic patterns is not limited.)
- Longer and more numerous patterns hold more information on the molecule.
Bits to be set for patterns
- After a pattern is detected, some bits of the bit string are set to "1". The number of bits used to code each pattern is constant.
- A higher number of bits increases the coded information from a pattern.
Darkness of the fingerprint
- The percentage of "1" digits in the bit string. Fingerprints with more ones are considered "darker" than those with fewer ones.
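
To make the four parameters concrete, here is a minimal sketch of a hashed path fingerprint in plain Python. It is not ChemAxon's implementation: the molecule is a hypothetical toy graph (atom labels plus bond index pairs), bond orders and cyclic patterns are ignored, and SHA-256 is an arbitrary choice of hash. The defaults (2048 bits, 7 bonds, 5 bits per pattern) are the values quoted later in the deck.

```python
import hashlib


def linear_paths(atoms, bonds, max_bonds):
    """Enumerate linear atom paths containing at most max_bonds bonds."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    patterns = set()

    def walk(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        # A path and its reverse are the same pattern; keep one canonical form.
        patterns.add(min(forward, backward))
        if len(path) - 1 == max_bonds:
            return
        for nxt in neighbors[path[-1]]:
            if nxt not in path:  # linear paths only: never revisit an atom
                walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return patterns


def hashed_fingerprint(atoms, bonds, length=2048, max_bonds=7, bits_per_pattern=5):
    """Set up to bits_per_pattern bits per pattern in a bit list of given length."""
    fp = [0] * length
    for pattern in linear_paths(atoms, bonds, max_bonds):
        for k in range(bits_per_pattern):
            digest = hashlib.sha256(f"{pattern}#{k}".encode()).digest()
            fp[int.from_bytes(digest[:8], "big") % length] = 1
    return fp


def darkness(fp):
    """Fraction of bits set to 1."""
    return sum(fp) / len(fp)


# Toy example: ethanol heavy atoms, C-C-O.
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
fp = hashed_fingerprint(ethanol_atoms, ethanol_bonds)
print(sorted(linear_paths(ethanol_atoms, ethanol_bonds, 7)), darkness(fp))
```

Because different patterns can hash to the same position, the number of set bits is at most patterns × bits per pattern. Shorter fingerprints, longer paths, or more bits per pattern all push darkness up.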

10 Chemical Fingerprints: Effect of Parameters. Table columns: FP length, Max #bonds, Max #bits, Aver. Darkness, Max. Darkness

11 CF-based Jarvis-Patrick Clustering: 1. For each structure, collect the set of nearest neighbors whose dissimilarity (distance) is less than a threshold value T. 2. Two structures cluster together if they are in each other's list of nearest neighbors and they have at least Rmin of their nearest neighbors in common, where Rmin is a ratio of the length of the shorter list.
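
The two rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not JKlustor's code: fingerprints are given as sets of on-bit indices, the distance is taken to be Tanimoto dissimilarity, each structure is counted as a member of its own neighbor list, and qualifying pairs are merged transitively with a union-find. The names `tanimoto_distance` and `jarvis_patrick` are hypothetical.

```python
def tanimoto_distance(a, b):
    """Tanimoto dissimilarity between two sets of on-bit indices."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def jarvis_patrick(fps, threshold, r_min):
    """Cluster fingerprints with a threshold-based Jarvis-Patrick variant."""
    n = len(fps)
    # Rule 1: neighbor list = all structures closer than the threshold T.
    # Assumption: a structure belongs to its own list (distance 0 < T).
    neighbors = [
        {j for j in range(n) if tanimoto_distance(fps[i], fps[j]) < threshold}
        for i in range(n)
    ]

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Rule 2: merge mutual neighbors sharing at least r_min of the shorter list.
    for i in range(n):
        for j in neighbors[i]:
            if j <= i or i not in neighbors[j]:
                continue
            shorter = min(len(neighbors[i]), len(neighbors[j]))
            common = len(neighbors[i] & neighbors[j])
            if common >= r_min * shorter:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}, {10, 11, 12}, {20}]
clusters = jarvis_patrick(fps, threshold=0.5, r_min=0.5)
singletons = [c for c in clusters if len(c) == 1]
print(clusters, len(singletons))
```

Tightening the threshold T shrinks every neighbor list, so more structures end up as singletons; raising Rmin has a similar effect by demanding more shared neighbors before two structures merge.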

12 CF-based Jarvis-Patrick Clustering. Chemical fingerprint length in bits: 2048. Maximum number of bonds in patterns: 7. Maximum number of bits to set for each pattern: 5. Table columns: T, Rmin, # Clusters, # Singletons

13 Similarity threshold = 0.85 (*) (*) Martin Y.C. et al. Do structurally similar molecules have similar biological activity? J. Med. Chem. 2002, 45,

14 Most Populated Clusters

15 Jarvis-Patrick Clustering: Misclassifications?

16 Jarvis-Patrick Clustering: Diverse Singletons

17 Most Populated Clusters: Biological "Projection". Gamma aminobutyric acid A receptor; Voltage-gated calcium channel; Nicotinic acetylcholine receptor; Gamma aminobutyric acid A receptor; Nicotinic acetylcholine receptor; Gamma aminobutyric acid A receptor

18 Potassium channel; Gamma aminobutyric acid A receptor; Voltage-gated calcium channel; 5-HT3; Nicotinic acetylcholine receptor; Gamma aminobutyric acid A receptor

19 Conclusions: JKlustor integrates computationally rapid and efficient clustering tools. Shortcomings remain to be addressed in dealing with artificial singletons. Future work: combination with the Maximum Common Substructure approach (LibMCS); other algorithms (Ward, …).
