Alexandre Varnek, Faculté de Chimie, ULP, Strasbourg, France. Virtual Screening (Criblage virtuel), Master Chemoinfo.

1 Alexandre Varnek, Faculté de Chimie, ULP, Strasbourg, France. Virtual Screening (Criblage virtuel), Master Chemoinfo

2 Target protein; large libraries of molecules. Experimental route: High-Throughput Screening, leading to hits. Computational route: Virtual Screening (filtering, QSAR, docking), leading to a small library of selected hits.

3 Molecules are considered as vectors in a multidimensional chemical space defined by the descriptors. Chemical universe: molecules, druglike molecules. Virtual screening must be fast and reliable.

4 Target; high-throughput screening (HTS); hits; lead; genomics; data analysis; optimization; development candidate

5 Drug discovery and ADME/Tox studies should be performed in parallel: idea, target, combichem/HTS, hit, lead, candidate, drug, with ADME/Tox studies running alongside.

6 Methodologies of virtual screening, from A.R. Leach, V.J. Gillet, “An Introduction to Chemoinformatics”, Kluwer Academic Publishers, 2003

7 Platform for Ligand-Based Virtual Screening: ~10^6 – 10^9 molecules pass through filters, similarity search and QSAR models, leaving ~ – 10^4 molecules as candidates for docking or experimental tests.

8 High-throughput screening (HTS). Keywords: - Combinatorial chemistry - High Throughput Screening (HTS) - Virtual screening - Drug-like aspect - Training sets of up to compounds

9 Virtual Screening: molecules available for screening. (1) Real molecules: millions in in-house archives of large pharma and agrochemical companies; millions of samples available commercially. (2) Hypothetical molecules: virtual combinatorial libraries (up to molecules).

10 Methods of virtual High-Throughput Screening: filters; similarity search; classification and regression structure-property models; docking

11 Filters to estimate “drug-likeness”

12 Lipinski rules for intestinal absorption (“Rule of 5”): H-bond donors < 5 (the sum of OH and NH groups); MWT < 500; LogP < 5; H-bond acceptors < 10 (the sum of N and O atoms without H attached).

13 Lipinski rules for drug-like molecules (“Rule of 5”)


15 Example of different filters: Rules for Absorbable compounds

16 Remove compounds containing too many rings

17 Remove compounds with toxic groups

18 Remove compounds with reactive groups

19 Remove False-Positive Hits

20 Remove poorly soluble compounds

21 Filter on inorganic and heteroatom compounds

22 Remove compounds with multiple chiral centers

23 Paclitaxel (Taxol): MW = 837, logP = 4.49, HD = 3, HA = 15. Violation of 2 rules.
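
The paclitaxel count can be checked with a minimal sketch of a Rule-of-5 filter. The thresholds are written exactly as on slide 12 (strict inequalities); the `Descriptors` class and the function name are illustrative and not taken from the slides, and the descriptor values are assumed to be precomputed by some other tool.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors (hypothetical container, not from the slides)."""
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(d):
    """Return the Rule-of-5 criteria (as stated on slide 12) that are violated."""
    checks = {
        "MW < 500": d.mol_weight < 500,
        "logP < 5": d.logp < 5,
        "H-bond donors < 5": d.h_donors < 5,
        "H-bond acceptors < 10": d.h_acceptors < 10,
    }
    return [name for name, passed in checks.items() if not passed]


# Values quoted on slide 23 for paclitaxel
paclitaxel = Descriptors(mol_weight=837, logp=4.49, h_donors=3, h_acceptors=15)
violations = lipinski_violations(paclitaxel)
print(violations)  # ['MW < 500', 'H-bond acceptors < 10']
```

The two failed criteria (molecular weight and H-bond acceptors) match the "violation of 2 rules" stated on the slide.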

24 logD vs logP. “The Rule of Five Revisited: Applying Log D in Place of Log P in Drug-Likeness Filters”, S. K. Bhal, K. Kassam, I. G. Peirson, and G. M. Pearl, Molecular Pharmaceutics, v. 4 (2007). Utilizing pH-dependent log D as a descriptor for lipophilicity in place of log P significantly increases the number of compounds correctly identified as drug-like using the drug-likeness filter log D (at pH 5.5) < 5. 95% of all drugs are ionizable: 75% are bases and 20% are acids.
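
To illustrate why log D can differ strongly from log P for ionizable compounds, here is a small sketch using the standard textbook relation for a monoprotic acid or base, assuming that only the neutral species partitions into octanol. The formula and the example compound are not taken from the slide; the logP and pKa values are hypothetical.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Estimate log D at a given pH for a monoprotic compound.

    Assumes only the neutral form partitions into the organic phase.
    kind is "acid" or "base".
    """
    if kind == "acid":
        return log_p - math.log10(1 + 10 ** (ph - pka))
    if kind == "base":
        return log_p - math.log10(1 + 10 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical base: fails logP < 5 but passes logD(5.5) < 5
value = log_d(log_p=5.8, pka=9.0, ph=5.5, kind="base")
print(round(value, 2))  # 2.3
```

A strongly basic compound that is rejected by logP < 5 can therefore pass the logD-based filter, because at pH 5.5 it is almost fully ionized.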

25 Synthetic Accessibility is proportional to fragment’s occurrence in the PubChem database Ertl and Schuffenhauer Journal of Cheminformatics :8

26 Synthetic Accessibility: frequency distribution of fragments. Altogether 605,864 different fragment types have been obtained by fragmenting the PubChem structures. Most of them (51%), however, are singletons (present only once in the whole set). Only a relatively small number of fragments, namely 3759 (0.62%), are frequent (i.e. present more than 1000 times in the database). Ertl and Schuffenhauer, Journal of Cheminformatics :8

27 Synthetic Accessibility: the most common fragments present in the million PubChem molecules. The "A" represents any non-hydrogen atom, a "dashed" double bond indicates an aromatic bond, and the yellow circle marks the central atom of the fragment. Ertl and Schuffenhauer, Journal of Cheminformatics :8

28 Distribution of (-SAscore) for natural products, bioactive molecules and molecules from catalogues. Correlation of calculated (-SAscore) and average chemist estimation for 40 molecules (r^2 = 0.890). Ertl and Schuffenhauer, Journal of Cheminformatics :8

29 Similarity Search: unsupervised and supervised approaches

30 2D (unsupervised) Similarity Search: Tanimoto coefficient, molecular fingerprints
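
As a minimal sketch of an unsupervised similarity search, fingerprints can be represented as sets of "on" bit positions and the database ranked by Tanimoto coefficient against a query. The fingerprints and molecule names below are made up for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, bits)) for name, bits in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints
query = {1, 4, 7, 9}
database = {
    "mol_a": {1, 4, 7, 9, 12},
    "mol_b": {2, 3, 5},
    "mol_c": {1, 4, 8},
}
ranking = rank_by_similarity(query, database)
print(ranking)  # [('mol_a', 0.8), ('mol_c', 0.4), ('mol_b', 0.0)]
```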


32 Continuous and Discontinuous SAR

33 Structural Spectrum of Thrombin Inhibitors: structural similarity “fading away” … reference compounds

34 Discontinuous SARs: small changes in structure have dramatic effects on activity (“cliffs” in activity landscapes). Continuous SARs: gradual changes in structure result in moderate changes in activity (“rolling hills”, G. Maggiora). Structure-Activity Landscape Index: $\mathrm{SALI}_{ij} = \Delta A_{ij} / \Delta S_{ij}$, where $\Delta A_{ij}$ ($\Delta S_{ij}$) is the difference between activities (similarities) of molecules i and j. R. Guha et al., J. Chem. Inf. Mod., 2008, 48, 646
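
A short sketch of the index for one pair of molecules follows. It assumes that the activity difference is taken as an absolute value and that the similarity term in the denominator is 1 minus the pairwise similarity, which is how the index is commonly written; the activity and similarity values are hypothetical.

```python
import math


def sali(activity_i, activity_j, similarity):
    """Structure-Activity Landscape Index for one pair of molecules.

    Assumption: denominator is (1 - similarity), so identical fingerprints
    with different activities give an infinite index (a perfect "cliff").
    """
    delta_a = abs(activity_i - activity_j)
    delta_s = 1.0 - similarity
    if delta_s == 0.0:
        return math.inf if delta_a > 0 else 0.0
    return delta_a / delta_s


# Hypothetical pair: pIC50 values 8.2 and 5.6, Tanimoto similarity 0.9
print(sali(8.2, 5.6, 0.9))   # about 26: a steep activity cliff
print(sali(8.2, 8.0, 0.5))   # about 0.4: smooth region of the landscape
```

Large values flag pairs where a small structural change causes a large activity change, which is the discontinuous-SAR situation described on the slide.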

35 VEGFR-2 tyrosine kinase inhibitors: bad news for molecular similarity analysis... MACCS Tc: 1.00; analog: 6 nM vs 2390 nM. Discontinuous SARs: small changes in structure have dramatic effects on activity (“cliffs” in activity landscapes); lead optimization, QSAR.

36 Example of a “Classical” Discontinuous SAR: adenosine deaminase inhibitors (MACCS Tanimoto similarity). Any similarity method must recognize these compounds as being “similar”...

37 Supervised Molecular Similarity Analysis

38 Dynamic Mapping of Consensus Positions
- Prototypic “mapping algorithm” for simplified binary-transformed* descriptor spaces
- Uses known active compounds to create activity-dependent consensus positions in chemical space
- Operates in descriptor spaces of step-wise increasing dimensionality (“dimension extension”)
- Selects preferred descriptors from large pools
* Median-based, i.e. assign “1” to a descriptor if its value is greater than (or equal to) its screening database median; assign “0” if it is smaller.
Godden et al. & Bajorath, J. Chem. Inf. Comput. Sci. 44, 21 (2004)

39 DMC Algorithm
1. Calculate and binary-transform descriptors.
2. Compare descriptor bit strings of reference molecules and determine consensus bits.
3. Select DB compounds matching consensus bits.
4. Re-generate bit strings permitting bit variability.
5. Select DB compounds matching extended bit strings.
6. Repeat until a small selection set is obtained.
Figure: descriptor bit strings for reference molecules and the calculated consensus bit string (white “0”, black “1”, gray: variably set bits) for dimension extension steps 0, 1 and 2. Consensus bits initially require a bit frequency of = 1.0 or = 0.0 (no variability); the threshold is then relaxed to 0.9 and to 0.8 as bit variability is permitted (e.g. 0%, 10%, 20% permitted bit variability). Longer bit strings, fewer matching DB compounds.
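
The DMC steps can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published implementation: a descriptor becomes a consensus bit when its bit frequency among the reference molecules is at least 1 - v (bit 1) or at most v (bit 0) for a permitted variability v, and the loop stops when the selection reaches a hypothetical `target_size`. All data below are invented.

```python
from statistics import median


def to_bits(values, medians):
    """Median-based transform: 1 if value >= database median, else 0."""
    return [1 if v >= m else 0 for v, m in zip(values, medians)]


def consensus_bits(ref_bits, variability):
    """Map descriptor index -> consensus bit at the given permitted variability."""
    n_refs = len(ref_bits)
    consensus = {}
    for index, column in enumerate(zip(*ref_bits)):
        frequency = sum(column) / n_refs
        if frequency >= 1 - variability:
            consensus[index] = 1
        elif frequency <= variability:
            consensus[index] = 0
    return consensus


def dmc(database, references, variabilities=(0.0, 0.1, 0.2), target_size=1):
    """Return indices of database compounds matching the consensus positions."""
    medians = [median(column) for column in zip(*database)]
    db_bits = [to_bits(row, medians) for row in database]
    ref_bits = [to_bits(row, medians) for row in references]
    selected = list(range(len(database)))
    for v in variabilities:  # each step may extend the consensus bit string
        consensus = consensus_bits(ref_bits, v)
        matching = [i for i in selected
                    if all(db_bits[i][k] == bit for k, bit in consensus.items())]
        if not matching:
            break  # keep the last non-empty selection
        selected = matching
        if len(selected) <= target_size:
            break
    return selected


# Invented data: 6 database compounds and 6 known actives, 3 descriptors each
database = [[1.0, 10, 0.1], [2.0, 20, 0.2], [3.0, 30, 0.9],
            [4.0, 40, 0.8], [5.0, 50, 0.3], [6.0, 60, 0.7]]
references = [[4.5, 45, 0.95], [5.5, 38, 0.6], [3.8, 55, 0.7],
              [4.2, 36, 0.85], [5.0, 41, 0.55], [3.9, 52, 0.2]]
hits = dmc(database, references, target_size=2)
print(hits)  # [3, 5]
```

In this toy run the first two descriptors are consensus bits without any variability and select three compounds; permitting 20% variability adds the third descriptor and narrows the selection to two, which mirrors the "longer bit strings, fewer matching DB compounds" behaviour.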

40 QSAR/QSPR models

41 Virtual Screening: database; QSPR model; screening and hits selection; hits; useless compounds; experimental tests

42 Libraries profiling: indexing a database by simultaneous assessment of various activities. Example: PASS software (Prediction of Activity Spectra for Substances)

43 For each fragment i

44 PASS: calculation of P(act) and P(inact) with a naïve Bayes estimator. A molecule is considered active if P(act) > P(inact) and/or P(act) > 0.7.
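
The decision rule on this slide can be written out as a small function. This sketch covers only the final classification step and takes P(act) and P(inact) as given; it does not reproduce how PASS estimates them. The `require_both` switch is an illustrative way to express the "or/and" on the slide.

```python
def is_active(p_act, p_inact, threshold=0.7, require_both=False):
    """Classify a molecule from its estimated P(act) and P(inact).

    require_both=False: active if P(act) > P(inact) OR P(act) > threshold.
    require_both=True:  active only if both conditions hold.
    """
    above_inactive = p_act > p_inact
    above_threshold = p_act > threshold
    if require_both:
        return above_inactive and above_threshold
    return above_inactive or above_threshold


# Hypothetical probabilities
print(is_active(0.50, 0.20))                     # True
print(is_active(0.50, 0.20, require_both=True))  # False
print(is_active(0.85, 0.05, require_both=True))  # True
```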

45 Quantitative Structure-Property Relationships (QSPR): Y = f(structure) = f(descriptors). QSPR gives reliable predictions only for compounds that are similar to those used to build the models. Similarity / pharmacophore search approaches therefore remain indispensable as complementary tools.
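
One simple way to check whether a compound is "similar to those used to build the model" is a descriptor-range (bounding box) applicability domain. The sketch below is one common choice, not necessarily the one used in the course, and the training descriptors are invented.

```python
def descriptor_ranges(training_set):
    """Return (min, max) of each descriptor over the training compounds."""
    return [(min(column), max(column)) for column in zip(*training_set)]


def in_domain(descriptors, ranges):
    """True if every descriptor lies within the range seen during training."""
    return all(low <= x <= high for x, (low, high) in zip(descriptors, ranges))


# Hypothetical training set: [logP, molecular weight] per compound
training_set = [[1.0, 200.0], [2.5, 350.0], [4.0, 480.0]]
ranges = descriptor_ranges(training_set)

print(in_domain([3.0, 300.0], ranges))  # True: prediction can be trusted more
print(in_domain([5.5, 300.0], ranges))  # False: logP outside the training range
```

Predictions for compounds outside the domain would be flagged or discarded rather than reported as reliable.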

46 Combinatorial Library Design

47 Virtual Screening... when target structure is unknown: virtual library; screening library; diverse subset; parallel synthesis or synthesis of single compounds; design of focussed library; screening (HTS); hits

48 Generation of Virtual Combinatorial Libraries. Fragment Marking approach: Markush structure; if R1, R2, R3 = and then

49 The types of variation in Markush structures:
1. Substituent variation (R1): R1 = Me, Et, Pr
2. Position variation (R2): R2 = NH2
3. Frequency variation: n = 1 – 3
4. Homology variation (R3): R3 = alkyl or heterocycle (only for patent search)
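
Enumerating a virtual library from a Markush structure amounts to taking the Cartesian product of the variation points. The sketch below uses the R1 options and the n range listed on the slide; the attachment positions for R2 are hypothetical, and homology variation (R3) is left out because "alkyl or heterocycle" is an open-ended class rather than an enumerable list.

```python
from itertools import product

r1_options = ["Me", "Et", "Pr"]   # substituent variation, from the slide
r2_positions = [2, 3, 4]          # position variation: hypothetical ring positions
n_options = [1, 2, 3]             # frequency variation, n = 1 - 3

# Each library member is described by one choice at every variation point
library = [
    {"R1": r1, "R2_position": position, "n": n}
    for r1, position, n in product(r1_options, r2_positions, n_options)
]

print(len(library))   # 27
print(library[0])     # {'R1': 'Me', 'R2_position': 2, 'n': 1}
```

The library size is the product of the option counts, which is why virtual combinatorial libraries grow so quickly with the number of variation points.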

50 Generation of Virtual Combinatorial Libraries: reaction transform approach, from A.R. Leach, V.J. Gillet, “An Introduction to Chemoinformatics”, Kluwer Academic Publishers, 2003

51 Issues and Concepts in Combinatorial Library Design: size of the library; coverage of properties (“chemical space”); diversity, similarity, redundancy; descriptor validation; subset selection from virtual libraries
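
As one possible illustration of diverse subset selection, the sketch below uses the MaxMin heuristic: starting from a seed compound, it repeatedly adds the compound whose nearest already-picked neighbour is farthest away. The slides do not name a selection algorithm, so MaxMin is a choice made here; distance is taken as 1 minus the Tanimoto coefficient and the fingerprints are invented.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def maxmin_select(fingerprints, n_pick, seed_index=0):
    """Pick up to n_pick diverse compounds; returns their indices in pick order."""
    n = len(fingerprints)
    if n == 0 or n_pick <= 0:
        return []
    picked = [seed_index]
    # Distance from every compound to its nearest picked compound
    min_dist = [1.0 - tanimoto(fp, fingerprints[seed_index]) for fp in fingerprints]
    while len(picked) < min(n_pick, n):
        candidates = [i for i in range(n) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        for i in range(n):
            d = 1.0 - tanimoto(fingerprints[i], fingerprints[nxt])
            min_dist[i] = min(min_dist[i], d)
    return picked


# Hypothetical fingerprints: compounds 0, 1 and 3 are close analogues
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
subset = maxmin_select(fingerprints, n_pick=3)
print(subset)  # [0, 2, 1]
```

The structurally unrelated compound 2 is picked before either close analogue of the seed, which reduces redundancy in the selected subset.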

52 Hot topics in chemoinformatics
- New approaches in structure-property modeling: descriptors, applicability domain, machine-learning methods (inductive learning transfer, semi-supervised learning, ...)
- QSAR of complex systems: multi-component synergistic mixtures, new materials, metabolic pathways, ...
- New techniques to mine chemical reactions
- Public availability of chemoinformatics tools
- Predictions vs interpretation

53 Nathan BROWN “Chemoinformatics—An Introduction for Computer Scientists” ACM Computing Surveys, Vol. 41, No. 2, Article 8, February 2009 Predictions vs interpretation

54 What do end users expect from QSAR models? Reliable estimation (prediction) of the given property. Problems: ensemble modeling; non-linear machine-learning methods (SVM, NN, …); descriptor correlations.

55 Public accessibility of models: Web-based platform for virtual screening

56 Some Screen Shots: Welcome Page…

57 ISIDA property prediction WEB server

58 ISIDA ScreenDB tools
- Only an Internet browser is required
- Different descriptors (ISIDA fragments, FPT, ChemAxon)
- Similarity search with different metrics (Tanimoto, Dice, …)
- Ensemble modeling approach (simultaneous application of several models)
- Models applicability domain (automatic detection of useless models)
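
The combination of ensemble modeling and applicability-domain checks can be sketched as a consensus prediction that averages only those models whose domain covers the compound. This is a generic illustration and not the ISIDA implementation; the models and their domain checks below are hypothetical stand-ins.

```python
def ensemble_predict(models, descriptors):
    """Average the predictions of all models applicable to this compound.

    models: list of (predict, in_domain) pairs of callables.
    Returns None when no model's applicability domain covers the compound.
    """
    predictions = [predict(descriptors)
                   for predict, in_domain in models
                   if in_domain(descriptors)]
    if not predictions:
        return None
    return sum(predictions) / len(predictions)


# Hypothetical models acting on a single descriptor value
models = [
    (lambda d: 2.0 * d[0], lambda d: 0.0 <= d[0] <= 5.0),
    (lambda d: 2.0 * d[0] + 1.0, lambda d: 0.0 <= d[0] <= 5.0),
    (lambda d: 100.0, lambda d: d[0] > 10.0),  # not applicable below 10
]

print(ensemble_predict(models, [2.0]))  # 4.5 (third model is skipped)
print(ensemble_predict(models, [7.0]))  # None (outside every domain)
```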

59 “The most fundamental and lasting objective of synthesis is not production of new compounds but production of properties.” George S. Hammond, Norris Award Lecture, 1968
