Published by Timothy Joyce. Modified over 2 years ago.

1
Topological, pH-dependent Fuzzy Pharmacophore Triplet Fingerprints (2D-FPT)
Dragos Horvath, Fanny Bonachera
UMR 8576 CNRS, Université des Sciences & Technologies de Lille

2
Pharmacophore Patterns
Ligand-site affinity ~ functional group complementarity.
Functional groups of similar physicochemical behavior represent pharmacophore types:
–Hydrophobic, Aromatic, Hydrogen Bond (HB) donors, Cations, HB Acceptors, Anions.
The pharmacophore pattern of a molecule characterizes the relative arrangement of all its pharmacophore types:
–What pharmacophore types are represented?
–How are they arranged (spatially, topologically) with respect to each other?
–How can these aspects be captured numerically to yield molecular descriptors of the pharmacophore pattern?

3
Exploiting pharmacophore patterns…
N-dimensional vector $D(M)=[D_1(M), D_2(M), \ldots, D_N(M)]$; each $D_i$ encodes an element of the pharmacophore pattern.
–Allows meaningful quantitative definitions of molecular similarity. Neighborhood Behavior: similar molecules, characterized by covariant vectors, are likely to display similar biological properties. As chemists do not easily perceive the pharmacophore pattern, such covariance may reveal hidden but real molecular relatedness…
–May serve as a starting point for searching a binding pharmacophore: the subset of features that really participate in binding to a receptor. Machine learning can select those elements $D_i$ that are systematically present in the actives, but not in the inactives, of a molecular learning set.

4
Some examples of "hidden similarity"

5
Tricentric Pharmacophore Fingerprints: monitoring feature arrangement
Topological: the distance between two features equals the (minimal) number of chemical bonds between them.
Spatial: if stable conformers are known, use the distance in Å between two features.
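The topological distance defined on this slide is a shortest-path length in the molecular graph. The sketch below shows one common way to compute it, a breadth-first search over an adjacency list. The toy graph and all names are hypothetical; this is not the authors' implementation, which the slides say relies on the ChemAxon ShortestPath class.

```python
from collections import deque

def topological_distances(adjacency, source):
    """Return {atom: minimal number of bonds from source} by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

# Hypothetical toy graph: a six-membered ring (atoms 0-5) with a
# substituent atom 6 attached to ring atom 0.
ring_with_substituent = {
    0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4],
    4: [3, 5], 5: [4, 0], 6: [0],
}
distances_from_substituent = topological_distances(ring_with_substituent, 6)
```

In the ring, the search takes the shorter way around, so the atom opposite the attachment point ends up four bonds from the substituent.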

6
Example: Binary Pharmacophore Triplets
Basis triplets: all possible feature combinations at a given series of distances…
Hp3-Hp3-Hp3, Hp3-Hp3-Hp4, Hp3-Hp3-Hp5, …, Ar4-Hp3-Hp4, Ar4-Hp3-Hp5, …, Hp7-Ar4-PC6, …
[Figure: binary fingerprint … 0 0 … 1 … 0 … 0 …; the bit of basis triplet Hp3-HA5-Ar is set, while the neighboring triplet Hp4-HA5-Ar is marked "?"]
Pickett, Mason & McLay, J. Chem. Inf. Comp. Sci. 36: (1996)

7
First key improvement: Fuzzy mapping of atom triplets onto basis triplets in 2D-FPT
[Figure: fuzzy fingerprint … 0 0 … +6 … +3 … 0 … over the basis triplets Hp3-Hp3-Hp3, Hp3-Hp3-Hp4, Hp3-Hp3-Hp5, …, Ar4-Hp3-Hp4, Ar4-Hp3-Hp5, …, Hp7-Ar4-PC6, …, Hp3-HA5-Ar5, Hp4-HA5-Ar5, …]
$D_i(m)$ = total occupancy of basis triplet $i$ in molecule $m$.

8
Combinatorial enumeration of basis triplets
Example: the basis triplets verifying triangle inequalities are enumerated for 6 pharmacophore types and 11 edge lengths from $E_{min}=3$ to $E_{max}=13$ with an increment of $E_{step}=1$: (3, 4, 5, …, 13)
–Canonical representation: $T_1 d_{23}$-$T_2 d_{13}$-$T_3 d_{12}$ with $T_3 \ge T_2 \ge T_1$ (alphabetically): Hp7-Ar4-PC6 → Ar4-Hp7-PC6
–Out of two corners of the same type, priority is given to the one opposed to the shorter edge: Ar4-Hp7-Hp6 Ar5-Hp6-Hp7
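The enumeration described on this slide can be sketched in a few lines. Each corner is written as a (type, opposite edge) pair, so sorting the three pairs gives both ordering rules at once: alphabetical by type, then shorter opposite edge first. Two assumptions are mine rather than the slide's: the triangle inequality is treated as non-strict (degenerate, collinear triplets are kept), and "HD" is a made-up label for HB donors. The function names are hypothetical.

```python
from itertools import product

def canonical(corners):
    """Canonical form of a triplet given as three (type, opposite_edge) pairs.

    Sorting the pairs orders corners alphabetically by type and, among corners
    of the same type, puts the one opposed to the shorter edge first.
    """
    return tuple(sorted(corners))

def enumerate_basis_triplets(types, e_min, e_max, e_step):
    """Return the sorted list of distinct canonical triplets."""
    edges = range(e_min, e_max + 1, e_step)
    basis = set()
    for corner_types in product(types, repeat=3):
        for opposite_edges in product(edges, repeat=3):
            # Assumed non-strict triangle inequality:
            # the longest edge may not exceed the sum of the other two.
            if 2 * max(opposite_edges) > sum(opposite_edges):
                continue
            basis.add(canonical(zip(corner_types, opposite_edges)))
    return sorted(basis)

def label(triplet):
    """Format a canonical triplet in the slide's notation, e.g. Ar4-Hp7-PC6."""
    return "-".join(f"{t}{d}" for t, d in triplet)

# "HD" is a hypothetical label; the other five labels appear on the slides.
TYPES = ["Ar", "HA", "HD", "Hp", "NC", "PC"]
basis = enumerate_basis_triplets(TYPES, e_min=3, e_max=13, e_step=1)
print(len(basis), "basis triplets under these assumptions")
```

The printed count depends on the assumptions above (in particular on whether degenerate triangles are kept), so it should not be read as the figure quoted in the original presentation.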

9
Triplet matching procedure
The triplet matching score represents the optimal degree of pharmacophore field overlap:
–if corner k of the triplet is of pharmacophore type T, i.e. F(k,T)=1, then it contributes to the total pharmacophore field of type T, observed at a point P of the plane.
Horvath, D., ComPharm, in "QSPR/QSAR Studies by Molecular Descriptors", Diudea, M., Editor, Nova Science Publishers, Inc., New York, 2001

10
The Gaussian trick and the overlay score…
The overlap integral of the pharmacophore fields throughout all the points P of the plane…
… can be evaluated analytically and is a function of the distances between matching triangle corners.
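The slide's own field expression was not preserved, so the following is only an illustration of why a Gaussian form makes the integral analytic. It assumes each corner contributes an isotropic Gaussian with a common width parameter $\alpha$ (a hypothetical symbol), centered at corner positions $A$ and $B$ in the plane:

```latex
\int_{\mathbb{R}^2} e^{-\alpha\,\lVert P-A\rVert^2}\, e^{-\alpha\,\lVert P-B\rVert^2}\,\mathrm{d}P
  \;=\; \frac{\pi}{2\alpha}\, e^{-\alpha\,\lVert A-B\rVert^2/2}
```

The identity follows from $\lVert P-A\rVert^2+\lVert P-B\rVert^2 = 2\lVert P-C\rVert^2 + \tfrac{1}{2}\lVert A-B\rVert^2$, with $C$ the midpoint of $A$ and $B$. Under this assumption the overlap of two corner fields depends only on the distance between the corners, which is consistent with the slide's statement.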

11
Control parameters for triplet enumeration & matching in two 2D-FPT versions.

12
Second key improvement: Proteolytic equilibrium dependence of 2D-FPT
[Figure: triplets Ar5-NC5-PC8 and Ar8-NC8-PC8; microspecies populations 12% and 88%]
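One plausible reading of the 12% / 88% figure is that each microspecies gets its own fingerprint and the molecule's fingerprint is their population-weighted average. The sketch below illustrates that reading. The assignment of triplets to microspecies and the occupancy value 6.0 are made up for illustration; only the 12% / 88% split comes from the slide.

```python
def ph_weighted_fingerprint(microspecies):
    """Average triplet occupancies over microspecies, weighted by population.

    microspecies: list of (population, {basis_triplet: occupancy}) pairs.
    """
    total_population = sum(population for population, _ in microspecies)
    averaged = {}
    for population, fingerprint in microspecies:
        weight = population / total_population
        for triplet, occupancy in fingerprint.items():
            averaged[triplet] = averaged.get(triplet, 0.0) + weight * occupancy
    return averaged

# Hypothetical input: which microspecies carries which triplet, and the
# occupancy 6.0, are assumptions; only the populations are from the slide.
microspecies = [
    (0.12, {"Ar5-NC5-PC8": 6.0}),
    (0.88, {"Ar8-NC8-PC8": 6.0}),
]
fingerprint = ph_weighted_fingerprint(microspecies)
```

With this weighting, a minor microspecies still contributes to the fingerprint in proportion to its population, rather than being ignored by an all-or-nothing ionization rule.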

13
Third key improvement: a novel similarity scoring scheme for 2D-FPT
Classical Euclidean and Hamming distances increase whenever the difference $|D_k(M)-D_k(m)| > 0$…
–pairs of small & simple molecules $(m,m')$, with $D_k(m)=D_k(m')=0$ for almost all the triplets $k$, have few non-zero contributions
–large & complex compounds $(M,M')$ with common, but slightly differently populated triplets, $D_k(M) \approx D_k(M')$, have many small contributions that may nevertheless sum up to higher Euclidean scores!
With correlation coefficients, the importance of common triplets, contributing to the cross-product $D_k(m) \times D_k(M)$, may be overemphasized…
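The size effect described on this slide can be reproduced with toy numbers. In the sketch below, "Hamming" is taken as the sum of absolute differences between occupancy vectors, which is my assumption about the slide's usage; the fingerprint length and all values are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    # Assumed meaning for count vectors: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

N_TRIPLETS = 100  # hypothetical fingerprint length

# Two small molecules, each populating a single triplet, nothing in common.
small_a = [5] + [0] * (N_TRIPLETS - 1)
small_b = [0, 5] + [0] * (N_TRIPLETS - 2)

# Two large molecules populating every triplet, differing by one unit each.
large_a = [10] * N_TRIPLETS
large_b = [11] * N_TRIPLETS

print(euclidean(small_a, small_b), hamming(small_a, small_b))
print(euclidean(large_a, large_b), hamming(large_a, large_b))
```

The large pair shares every triplet yet scores as more distant than the small pair, which shares none. This is the behavior the specific scoring scheme is meant to correct.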

14
Piecewise monitoring of the differences in the fingerprint…
A triplet $k$ may, with respect to a pair of molecules, be shared (++), null (--) or exclusive (+-):
–fuzzy levels of association $\tau_k^c$ to each category $c \in \{(++),(--),(+-)\}$, such that $\tau_k^{++}(M,m)+\tau_k^{+-}(M,m)+\tau_k^{--}(M,m)=1$
Specifically calculate, for each category $c$:
–fractions of triplets $f_c$ in that category,
–weighted, normed partial Hamming distances $W_c$:
$f_c(M,m)=\frac{1}{N_T}\sum_{k=1}^{N_T}\tau_k^c(M,m)$
$W_c(m,M)=\frac{\sum_{k=1}^{N_T} W_k\,\tau_k^c(m,M)\,|D_k(m)-D_k(M)|}{\sum_{k=1}^{N_T} W_k}$
The FPT-specific dissimilarity score for the pair $(M,m)$: the linear combination of fractions and partial Hamming distances with optimal Neighborhood Behavior with respect to a subset of training data.
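A crisp simplification of this scheme may help fix ideas. The sketch below assigns each triplet to exactly one category (shared, null, exclusive) instead of using fuzzy association levels, then computes the category fractions and the weighted, normed partial Hamming distances. The crisp assignment, the default unit weights and all names are assumptions; the authors' fuzzy levels and fitted coefficients are not given on the slide.

```python
def category_terms(fp_m, fp_M, weights=None):
    """Return (fractions, partial_distances), each keyed by '++', '--', '+-'.

    Crisp stand-in for the fuzzy scheme: a triplet is shared if populated in
    both molecules, null if populated in neither, exclusive otherwise.
    """
    n = len(fp_m)
    if weights is None:
        weights = [1.0] * n  # assumed default: all triplets weigh the same
    total_weight = sum(weights)
    counts = {"++": 0, "--": 0, "+-": 0}
    partial = {"++": 0.0, "--": 0.0, "+-": 0.0}
    for a, b, w in zip(fp_m, fp_M, weights):
        if a > 0 and b > 0:
            category = "++"
        elif a == 0 and b == 0:
            category = "--"
        else:
            category = "+-"
        counts[category] += 1
        partial[category] += w * abs(a - b)
    fractions = {c: counts[c] / n for c in counts}
    distances = {c: partial[c] / total_weight for c in partial}
    return fractions, distances

# Hypothetical four-triplet fingerprints.
fractions, distances = category_terms([3, 0, 0, 2], [1, 0, 4, 2])
```

A final score would then be a linear combination of these six numbers, with coefficients fitted for Neighborhood Behavior on training data, as the slide states.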

15
Molecular Similarity & Neighborhood Behavior…
In chemoinformatics, molecular dissimilarity is a metric (distance) between the points m and M representing the compounds in a descriptor space.
The concept of Neighborhood Behavior* (NB) of calculated similarity metrics is the quantitative equivalent (of statistical nature) of the Similarity Principle:
–If the probability of picking a pair of compounds with similar activity levels increases as the dissimilarity between m and M decreases, then this space and its metric are said to display a significant Neighborhood Behavior with respect to the considered biological activity.
* Patterson, D.E., Cramer, R.D., Ferguson, A.M., Clark, R.D., Weinberger, L.E., Neighborhood Behavior: A Useful Concept for Validation of Molecular Diversity Descriptors, J. Med. Chem. 1996, 39.

16
Neighborhood behavior: to what extent does structural similarity guarantee similar activities?
[Figure: compound pairs (M,m) plotted by structural dissimilarity (threshold s) against BioPrint® activity profile difference (threshold l). Quadrants: True Positives (TP), False Positives (FP), False (?) Negatives (FN), True Negatives (TN)]
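The four quadrants of this plot amount to a confusion table over compound pairs. The sketch below counts them for given thresholds. The reading of the quadrants (a "positive" is a structurally similar pair, and it is "true" when the activities are also similar) is my interpretation of the figure labels, and the function name and sample data are hypothetical.

```python
def neighborhood_confusion(pairs, s, l):
    """Count TP/FP/FN/TN over (structural_dissimilarity, activity_difference) pairs.

    s: structural dissimilarity threshold; l: activity difference threshold.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for structural, activity in pairs:
        similar_structure = structural <= s
        similar_activity = activity <= l
        if similar_structure and similar_activity:
            counts["TP"] += 1   # neighbors with similar activity
        elif similar_structure:
            counts["FP"] += 1   # neighbors with different activity
        elif similar_activity:
            counts["FN"] += 1   # similar activity, not recognized as neighbors
        else:
            counts["TN"] += 1   # dissimilar in both respects
    return counts

# Hypothetical pairs, one per quadrant.
sample_pairs = [(0.1, 0.2), (0.1, 0.9), (0.8, 0.2), (0.8, 0.9)]
confusion = neighborhood_confusion(sample_pairs, s=0.5, l=0.5)
```

Scanning s from small to large values and tracking how these counts evolve is one way to quantify how strongly a metric displays Neighborhood Behavior.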

17
Specific metric significantly improves the Neighborhood Behavior of 2D-FPT (v1)

18
Consistency inversion of specific FPT metric may be due to top ranking of complex pairs!

19
Proteolytic equilibrium dependence significantly improves the NB of 2D-FPT

20
Some activity cliffs in rule-based descriptor space are smoothed out in 2D-FPT-space
[Figure: example compounds with ionization-state labels: Neutral, Cation, Anion, 90% Cation, 50% Cation, 40% Cation, 70% Cation]

21
Neighborhood Behavior of 2D-FPT compares favorably to that of other descriptors/metrics

22
Successful Virtual Screening Simulations
[Figure panels: D2, TK]

23
Successful QSAR model construction with 2D-FPT: predicting c-Met TK activity
25 variables entering nonlinear model
153 molecules for training: RMSE=0.4 (log units), $R^2$ = …
… molecules for validation: RMSE=0.8 (log units), $R^2$ = …
… validation molecules out of 40 mispredicted by more than 1 log

24
ChemAxon Tools used for development…
Software written in Java, based on the ChemAxon API:
–molecule input and standardization tools
–ShortestPath class used to calculate topological distances
–pKaPlugin used to enumerate all microspecies and their relative concentrations at a given pH value
–PMapper used to set the pharmacophore flags in each microspecies, using a customized .xml setup file that relies on the actual formal charges seen in the microspecies to set flags
–JChem used for 2D-FPT storage
–Marvin visualizer adapted to display actual occurrences of triplets in molecules

25
In progress & on the wishlist…
3D FPT version under study
–does it pay off to generate conformers? How many would you need to get better results than with 2D-FPT? What's the best conformational sampler to use?
Accessibility-weighted fingerprints?
–class to return (topological and/or 3D) estimate of the solvent-accessible fraction of an atom?
Tautomer-dependent fingerprints?
–if tautomers and their percentage were enumerated like any other microspecies…
