1 COSMO Polarization Charge Densities as Key Information for Solubility and Partitioning Property Prediction and 3D-QSAR. Andreas Klamt, COSMOlogic GmbH & Co. KG, Leverkusen, Germany & Inst. of Physical and Theor. Chemistry, Univ. of Regensburg
2 The molecular electrostatic potential MEP / ESP. Polarity, i.e. electrostatics, is generally accepted as most important for the interactions of molecules, and thus for the understanding and prediction of the behaviour of drug molecules. Hydrogen bonding is considered a special flavor of polarity. Molecular electrostatic potentials are the most widely used concept to quantify and visualize molecular polarity. But the MEP has a few nasty aspects: 1) It diverges at the positions of the atomic nuclei. 2) It decays only slowly with the distance from the molecule. 3) There is no clear recipe for the place or the surface at which the MEP should be considered. 4) On the same surface, MEPs of ions are on a different scale than those of neutral compounds; they are hardly comparable. Let us look for an alternative representation of molecular polarity.
3 The conductor polarization charge density. The conductor can be taken into account during the quantum chemical calculation using the Conductor-like Screening Model (COSMO, Klamt, Schüürmann, 1993). [Figure labels: electron density; conductor; polarization charge density σ = polarization charge / area.] Results: energy, geometry, polarization charge density and conformations in the conductor. The COSMO embedding already gives an approximate representation of a polar solvent. For less polar solvents it can be scaled by a simple function f(ε) = (ε-1)/(ε+0.5). COSMO has become one of the most popular continuum solvation models. But all dielectric continuum models are fundamentally wrong, and COSMO-RS follows a different concept!
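The scaling function quoted on this slide is easy to tabulate. The following is a minimal sketch; the function name and the example dielectric constants (approximate textbook values) are assumptions added for illustration and are not part of the presentation.

```python
def cosmo_scaling(eps: float) -> float:
    """Scaling factor f(eps) = (eps - 1) / (eps + 0.5) for a finite dielectric constant.

    f -> 0 for vacuum (eps = 1) and f -> 1 for an ideal conductor (eps -> infinity).
    """
    if eps < 1.0:
        raise ValueError("dielectric constant must be >= 1")
    return (eps - 1.0) / (eps + 0.5)


# Approximate dielectric constants, chosen only to illustrate the trend (assumed values)
for name, eps in [("vacuum", 1.0), ("hexane", 1.9), ("water", 78.4)]:
    print(f"{name:7s} eps = {eps:5.1f}   f(eps) = {cosmo_scaling(eps):.3f}")
```

For strongly polar solvents the factor is already close to 1, which is why the unscaled conductor result is a reasonable approximation for them.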
4 Interactions of molecules swimming in a conductor. There are no long-range interactions of molecules in a conductor!!! We only get an energy change ΔE if the two cavities contact each other. Basic idea of COSMO-RS: quantify interaction energies as local interactions of the COSMO polarization charge densities σ and σ': ΔE_contact = E(σ,σ')
5 Linear dependence of E_HB on σ. DFT hydrogen-bond cluster calculations for a weak and a strong donor. [Plots: DFT/COSMO hydrogen bond energies versus σ (e/nm²), for donor = HCN and donor = HF.] σ is the better local interaction descriptor! Polarization charge densities provide a predictive quantification of hydrogen bond energies. Klamt, Reinisch, Eckert, Hellweg, Diedenhofen, Phys. Chem. Chem. Phys., 2012, 14, 955ff
6 Exp. hydrogen bond enthalpies. Klamt, Reinisch, Eckert, Graton, Le Questel, Interpretation of experimental hydrogen-bond enthalpies and entropies from COSMO polarisation charge densities, Phys. Chem. Chem. Phys., 2013, 15
7 Interaction energy of individual contacts: In a dense liquid all surface pieces are bound in surface pairs, and the total interaction energy can be expressed as a sum of surface interactions E_int(σ,σ'). But for liquid phase properties we need free energies, i.e. contact probabilities of all possible contacts!
8 For efficient statistical thermodynamics we reduce the ensemble of molecules to an ensemble of pair-wise interacting surface segments. For handling this we need histograms of surface polarity. The screening charge distribution on the molecular surface reduces to the "σ-profile".
9 Screening charge distribution on molecular surface reduces to "σ-profile"
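The reduction of a charged surface to a histogram of surface polarity can be sketched as an area-weighted histogram. This is an illustrative sketch only: the function name, the bin range of -3 to 3 e/nm², the bin count and the toy segment data are assumptions and are not taken from the presentation or from any real COSMO file.

```python
def sigma_profile(segments, sigma_min=-3.0, sigma_max=3.0, n_bins=60):
    """Area-weighted histogram of segment polarization charge densities.

    segments: iterable of (sigma, area) pairs, sigma in e/nm^2, area in nm^2.
    Returns (bin_centers, profile), where profile[i] is the surface area in bin i.
    """
    width = (sigma_max - sigma_min) / n_bins
    centers = [sigma_min + (i + 0.5) * width for i in range(n_bins)]
    profile = [0.0] * n_bins
    for sigma, area in segments:
        i = int((sigma - sigma_min) / width)
        i = min(max(i, 0), n_bins - 1)  # clamp out-of-range segments into the edge bins
        profile[i] += area
    return centers, profile


# Hypothetical surface segments (sigma, area), for illustration only
toy_segments = [(-1.25, 1.5), (-1.15, 0.5), (0.05, 3.0), (1.55, 2.0)]
centers, profile = sigma_profile(toy_segments)
```

The total of the histogram equals the molecular surface area, so the profile of a mixture can be built as a mole-fraction weighted sum of the component profiles.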
10 Why does it get warm when you mix acetone and? Qualitative thermodynamics based on σ-profiles: because their σ-profiles are almost complementary!
11 Next we need to solve the statistical thermodynamics of an ensemble of surface pieces with a given composition, i.e. we need to calculate chemical potentials and contact probabilities of the pairwise interacting surface pieces. "Three weeks of sleepless nights" led to the exact equation. Annotations of its terms: the σ-potential, i.e. the "pseudo"-chemical potential of a segment of polarity σ in solvent S; the free energy costs of getting partner σ' available; the solvent σ-profile, i.e. the composition of solvent S with respect to σ; the interaction energy of σ and σ'. The equation requires an iterative solution (initial guess: µ_S(σ') = 0).
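The terms annotated on this slide match the self-consistent σ-potential equation as it is commonly written in the COSMO-RS literature, µ_S(σ) = -kT ln ∫ dσ' p_S(σ') exp{-[E_int(σ,σ') - µ_S(σ')]/kT}. The sketch below solves a discretised version of it by damped fixed-point iteration. The function name, the damping factor, the convergence settings and the two-bin toy system are assumptions of this sketch, not the author's implementation.

```python
import math


def solve_sigma_potential(profile, e_int, kT, tol=1e-10, max_iter=10000):
    """Self-consistent solution on a discrete sigma grid of

        mu[i] = -kT * ln( sum_j profile[j] * exp(-(e_int[i][j] - mu[j]) / kT) )

    profile must be normalised to sum 1; e_int[i][j] is the pair interaction energy.
    """
    n = len(profile)
    mu = [0.0] * n  # initial guess mu = 0, as stated on the slide
    for _ in range(max_iter):
        raw = [
            -kT * math.log(sum(profile[j] * math.exp(-(e_int[i][j] - mu[j]) / kT)
                               for j in range(n)))
            for i in range(n)
        ]
        # Damped update (an assumption of this sketch) to suppress oscillations
        new_mu = [0.5 * (old + r) for old, r in zip(mu, raw)]
        if max(abs(a - b) for a, b in zip(new_mu, mu)) < tol:
            return new_mu
        mu = new_mu
    raise RuntimeError("sigma-potential iteration did not converge")


# Toy two-bin "solvent": like contacts cost 1 kT, unlike contacts cost nothing
toy_profile = [0.5, 0.5]
toy_energy = [[1.0, 0.0], [0.0, 1.0]]
toy_mu = solve_sigma_potential(toy_profile, toy_energy, kT=1.0)
```

For this symmetric toy system the result can be checked by hand: both bins share the value -(kT/2) ln[(1 + e^(-1))/2].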
13 For getting the chemical potential of a solute X in a solvent S we project the solvent σ-potential back onto the solute surface. A "size correction" or combinatorial contribution ("borrowed from chem. eng.") is added; it depends on solute and solvent volumes and areas (e.g. from the COSMO cavity). This is the central equation of COSMO-RS, since knowing the chemical potential as a function of composition and temperature we have almost the entire liquid phase thermodynamics in our hands. But stop, I cheated you: Molecules can be flexible!
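Projecting the solvent σ-potential back onto the solute surface amounts to summing the σ-potential over the solute's σ-profile and adding the combinatorial term. The sketch below assumes a discretised profile given as a number of segments per polarity bin and a σ-potential per segment; the function name and the numbers are hypothetical, and the combinatorial term is simply passed in rather than modelled.

```python
def solute_chemical_potential(p_solute, mu_solvent, mu_comb=0.0):
    """mu_X_in_S = sum_k p_X[k] * mu_S[k] + mu_comb

    p_solute[k]  : amount of solute surface (segments) in polarity bin k
    mu_solvent[k]: sigma-potential of the solvent for polarity bin k
    mu_comb      : combinatorial ("size correction") contribution, supplied by the caller
    """
    if len(p_solute) != len(mu_solvent):
        raise ValueError("solute profile and sigma-potential must share one sigma grid")
    return sum(p * mu for p, mu in zip(p_solute, mu_solvent)) + mu_comb


# Hypothetical two-bin example: 2 segments at mu = 0.5 and 1 segment at mu = -1.0
mu_x = solute_chemical_potential([2.0, 1.0], [0.5, -1.0], mu_comb=0.25)
```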
14 For flexible molecules there are multiple local minima (conformations). COSMO-RS knows the internal energy (from DFT) and the individual free energy from the central COSMO-RS equation. At every temperature and composition COSMO-RS can calculate the total free energy of the compound (from the partition function) and the expectation values of all properties.
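The conformer treatment described here is standard Boltzmann statistics. As a minimal sketch, assume each conformer i has a free energy G_i in kJ/mol (internal energy plus its free energy in the mixture); the function names and the example values are illustrative assumptions.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def conformer_ensemble(free_energies, temperature):
    """Boltzmann weights and total free energy G = -RT ln sum_i exp(-G_i / RT)."""
    rt = R * temperature
    g_min = min(free_energies)  # shift by the minimum for numerical stability
    boltz = [math.exp(-(g - g_min) / rt) for g in free_energies]
    z = sum(boltz)
    weights = [b / z for b in boltz]
    g_total = g_min - rt * math.log(z)
    return weights, g_total


def expectation(values, weights):
    """Boltzmann-weighted expectation value of a per-conformer property."""
    return sum(v * w for v, w in zip(values, weights))


# Two degenerate conformers: equal weights, total free energy lowered by RT ln 2
weights, g_total = conformer_ensemble([0.0, 0.0], 298.15)
```

Because the per-conformer free energies depend on solvent and temperature, the weights have to be recomputed for every composition and temperature.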
15 Chemical potential of solute X in the gas phase: vapor pressures
16 Property Calculation. Inputs: chemical potential of solute X in S; chemical potential of solute X in the gas phase. Derived properties: vapor pressures, partition coefficients, activity coefficients, arbitrary liquid-liquid equilibria.
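How such properties follow from chemical potentials can be sketched with standard thermodynamic relations. The functions below assume pseudo-chemical potentials in kJ/mol (i.e. without the ideal mixing term), a mole-fraction basis for the partition coefficient, and an ideal gas phase with a user-supplied reference pressure; names, units and conventions are assumptions of this sketch rather than the COSMOtherm definitions.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def ln_activity_coefficient(mu_in_solvent, mu_in_pure, temperature):
    """ln(gamma) of X in S relative to pure liquid X: (mu_X_in_S - mu_X_in_X) / RT."""
    return (mu_in_solvent - mu_in_pure) / (R * temperature)


def log10_partition(mu_in_phase1, mu_in_phase2, temperature):
    """log10 of x1/x2 at equilibrium: the solute prefers the phase with the lower mu."""
    return (mu_in_phase2 - mu_in_phase1) / (R * temperature * math.log(10.0))


def vapor_pressure(mu_liquid, mu_gas_ref, temperature, p_ref=1.0):
    """p = p_ref * exp((mu_liquid - mu_gas_ref) / RT) for an ideal gas phase."""
    return p_ref * math.exp((mu_liquid - mu_gas_ref) / (R * temperature))
```

A difference of RT ln 10 (about 5.7 kJ/mol at room temperature) between two phases corresponds to one log unit in the partition coefficient.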
17 COSMO-RS flow chart of DFT/COSMO and COSMOtherm. DFT/COSMO part: chemical structure; quantum chemical calculation with COSMO (full optimization); ideally screened molecule: energy + screening charge distribution on surface; σ-profiles of compounds; database of COSMO-files (incl. all common solvents) providing the other compounds. COSMOtherm part: σ-profile of mixture; fast statistical thermodynamics; σ-potential of mixture; equilibrium data: activity coefficients, vapor pressure, solubility, partition coefficients; phase diagrams.
18 COSMOtherm currently is the most accurate tool for ΔG_solv prediction. Accuracy (kcal/mol) on 2343 data: COSMOtherm 0.48; SM (fitted on this data set!); PCM ~ (only 3 solvents). A. Klamt, B. Mennucci, J. Tomasi, V. Barone, C. Curutchet, M. Orozco and F. Javier Luque, "On the Performance of Continuum Solvation Methods. A Comment on 'Universal Approaches to Solvation Modeling'", Acc. Chem. Res., 2009, 42 (4), pp 489. SAMPL 2009 blind test for prediction of ΔG_solv of very demanding compounds, 45 entries from molecular dynamics, Monte Carlo, continuum solvation models, and other methods: the COSMO-RS error is about 0.5 kcal/mol smaller than that of the second best entry. Results of parametrization based on DFT (DMol3: BP91, DNP basis): 650 data, 17 parameters, rms = 0.41 kcal/mol; A. Klamt, V. Jonas, J. Lohrenz, T. Bürger, J. Phys. Chem. A, 102, 5074 (1998). Meanwhile: COSMOtherm2.1_0110 with Turbomole BP91/TZVP, rms = 0.29 kcal/mol. Residuals limited by accuracy of DFT! [Plot legend, compound classes: alkanes, alkenes, alkynes, alcohols, ethers, carbonyls, esters, aryls, diverse, amines, amides, N-aryls, nitriles, nitro, chloro, water.]
19 Applications to Phase Diagrams and Azeotropes. Winner of the 1st, 5th, 6th IFPSC (AIChE/NIST). [Figure label: miscibility gap]
20 COSMOtherm prediction of drug solubility in diverse solvents (blind test performed with Merck & Co., Inc., Rahway, NJ, USA). All predictions are relative to ethanol. Solvents: Water, 1-Propanol, 2-Propanol, DMF, Ethyl Acetate, Methanol, Heptane, Toluene, Chlorobenzene, Acetone, Ethanol, Acetonitrile, (Triethylamine), Butanol. [Plot labels: triethylamine, heptane]
21 Example: Absolute solvent screening with estimated ΔG_fus. Artemisinin: All data are simulated / measured at 20°C. Yellow points indicate alternative experimental measurements; the experimental range is additionally visualized by black lines. Data and ΔG_fus are extracted from Lapkin A., Peters M., Greiner L., Chemat S., Leonhard K., Liauw M., Leitner M., Screening of new solvents for artemisinin extraction process using ab initio methodology, Green Chem., 2010, DOI: /b922001a
22 "Conformational analysis of cyclic acidic α-amino acids in aqueous solution - an evaluation of different continuum hydration models" by Peter Aadal Nielsen, Per-Ola Norrby, Jerzy W. Jaroszewski, and Tommy Liljefors, for JACS. Table columns: method, solvent model, rms (kJ/mol), rms for 4 points (kJ/mol), max. deviation (kJ/mol). Method / solvent model pairs compared: AM / SM5.4A; PM / SM5.4P; AM / SM; HF/6-31+G* / C-PCM; HF/6-31+G* / PB-SCRF; AMBER* / GB/SA; MMFF / GB/SA; BP-DFT/TZVP / COSMO-RS. COSMO-RS was evaluated as a blind test!!!
23 Latest results for bases (pKb): similar rms. [Plot labels, acids: formic acid, acetic acid, chloroacetic acid, dichloroacetic acid, trichloroacetic acid, n-pentanoic acid, 2,2-dimethylpropanoic acid, benzoic acid, oxalic acid, maleic acid, fumaric acid, carbonic acid, phenol, pentachlorophenol, ethanol, 2,2,2-trichloroethanol, hypochlorous acid, hypobromous acid, hypoiodous acid, nitrous acid, sulfurous acid, phosphoric acid, boric acid, 5-fluorouracil, 5-nitrouracil, cis-5-formyluracil, thymine, trans-5-formyluracil, uracil, and others.]
24 COSMO-RSol has an rmse of 0.6 on the pesticide test set; the competing method shows 1.3 on that dataset.
25 σ-moment regressions enable a range of other ADME predictions: logBB, logK_IA, logK_HSA, logK_OC, …
26 COSMOflat: Simulation of molecules at a flat interface. Concept: just a flat interface between two arbitrary user-defined liquids (typically one is hydrophobic and the other is water); construct a total partition sum; get the probability to find the solute at a certain depth and orientation; get the total free energy change of the solute at this interface (conformations can be taken into account, e.g. stretched vs. collapsed). This gives access to surface activity and to surface and interfacial tension.
27 COSMOmic: Simulation of molecules in micelles and membranes (2). Andreas Klamt, Uwe Huniar, Simon Spycher, and Jörg Keldenich, COSMOmic: A Mechanistic Approach to the Calculation of Membrane-Water Partition Coefficients and Internal Distributions within Membranes and Micelles, J. Phys. Chem. B, 2008
28 Micelle and Interface Properties: COSMOmic. Example: Free energy profiles through DMP (dimethylphthalate) membrane, MD simulations*,** versus COSMOmic (5 minutes calc. time). * Simulation results of Daniele Bemporad and Jonathan W. Essex, University of Southampton, UK; ** see also Claude Luttmann, J. Phys. Chem. B 2004, 108, 4875ff
29 Cocrystal screening with COSMOtherm excess heat of mixing! Initial tests gave better results than the current state-of-the-art method (6 false positive results versus only 2 with COSMOtherm!). Fast screening against large databases of harmless food ingredients (EAFUS, GRAS) within minutes with COSMOfrag technology. Hunter et al., Chem. Sci. 2011, 2, 883–890.
31 So far all COSMO-RS application examples were relevant for drug development. Now to drug design: the interactions of ligands with receptors are of exactly the same type as those of solutes and solvents (electrostatics, hydrogen bonding, vdW, and all in the liquid state)! But position plays a far more important role than in liquids: the right polarity must be at the right place! Nevertheless, it is a necessary criterion that the molecules have the right polarities, i.e. a suitable σ-profile is a necessary, but not sufficient condition for strong binding. [Figure label: Retinal]
32 COSMOsim: bio-isoster search based on σ-profiles. If physiological distribution and drug-receptor binding are to a large degree determined by σ-surfaces and σ-profiles, it makes sense to screen drug candidates for σ-profile similarity: the search is based on surface polarity (σ) and not on structure => scaffold hopping; either search over the full COSMO-files of the COSMOfrag-DB (60000 compounds) or screen millions of candidate compounds using the COSMOfrag method. See also: Thormann M, Klamt A, Hornig M, Almstetter M, COSMOsim: bioisosteric similarity based on COSMO-RS sigma profiles, J Chem Inf Model. 2006, 46
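Screening by σ-profile similarity can be illustrated with a generic histogram comparison. The measure below (histogram intersection) is a stand-in chosen for this sketch and is not the similarity measure defined in the COSMOsim paper; the function names and the three-bin toy profiles are hypothetical.

```python
def profile_similarity(p, q):
    """Histogram intersection of two sigma-profiles on the same grid: sum(min) / sum(max).

    Returns 1.0 for identical profiles and 0.0 for profiles without any overlap.
    """
    if len(p) != len(q):
        raise ValueError("profiles must share the same sigma grid")
    denom = sum(max(a, b) for a, b in zip(p, q))
    if denom == 0.0:
        return 0.0
    return sum(min(a, b) for a, b in zip(p, q)) / denom


def rank_candidates(query, candidates):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, profile_similarity(query, prof)) for name, prof in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical 3-bin profiles (surface area per polarity bin), for illustration only
query = [1.0, 2.0, 0.0]
library = {"A": [1.0, 2.0, 0.0], "B": [0.0, 2.0, 1.0], "C": [0.0, 0.0, 3.0]}
ranking = rank_candidates(query, library)
```

Since only the polarity histogram enters, two molecules with different scaffolds but similar surface polarity score highly, which is the scaffold-hopping property mentioned on the slide.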
33 Reference compound: CCC(=O)O (ZFQCMUCKI). Ranked similarity list (rank, SMILES, key, similarity):
1   OC(=O)C=C       ITPZMBCLI   0.8169
2   CCCC(=O)O       IAVMXKDKI   0.7996
3   CC=CC(=O)O      RGQGEAHMI   0.791
4   CC(=C)C(=O)O    WCMTTAFLI   0.765
5   -               VGZSDPDLI   0.7584
6   CC(C)C(=O)O     DGWQYNDKI   0.7487
7   OCC1CO1         SDLNNSMIA   0.7269
8   CC(O)C#N        HTYYARCJZ   0.7233
9   Oc1nnns1        NBAKLRQLI   0.7171
10  CC(O)C(=O)O     WOJBMNDKV   0.7109
11  CC(=O)O         CZWYICCKI   0.7052
12  Clc1nnn[nH]1    JMAKWZALI   0.7041
13  CC(=NO)C        EZHYEWAJI   0.6983
14  OCCC(=O)O       FFBMJKDKI   0.6978
15  CC(=O)C=NO      HOMSZUGLI   0.6919
16  Oc1csnn1        UMBRJEKLI   0.6885
17  OC(=O)C1CCC1    CUOCJIGKI   0.6817
18  OCCS            HLKLSJLHI   0.6804
19  CC1CC1C(=O)O    GXSEIQGKP   0.6767
But COSMOsim was not widely used, since drug modelers do not want to waste 3D-information!
34 COSMOsim3D: bio-isoster search based on σ-surfaces. Idea and initial implementation by Dr. M. Thormann, Origenis AG, presented as CUBEsim at the COSMO-RS symposium 2009: places σ-surfaces of target and probe on a grid; aligns probe on target; different start orientations (first 21 systematic), ~ 5 s to 30 s; gives a 3D similarity measure (CS3D); search for molecules with maximum similarity of 3D σ-surfaces.
35 Separation of true and random bioisosteric pairs (M. Thormann, Origenis)
36Enrichment of active drugs in MDDR activity classes
37 Alignment performance: PharmBench. PharmBench, a benchmark for alignment/pharmacophore elucidation methods, was recently proposed in J. Chem. Inf. Model. The authors collected all available X-ray structures for 81 different protein targets, superimposed the proteins using their alpha carbons, and then extracted the co-crystallized ligands. The collection of ligands for each target constitutes a gold standard for alignment. In fact, judging the quality of an alignment of ligands is always a bit elusive in the absence of an experimental reference. By using the PharmBench collection, the quality of the alignments can be objectively evaluated by the RMSD from the crystallographic poses.
38 PharmBench performance: reproducing X-ray alignments (performed by Paolo Tosco, Univ. Torino). The same procedure as the one described by Cruciani in the PharmBench papers was used; namely, the crystallographic conformations were extracted from the respective PDB files, then for each dataset, each ligand was aligned to each of the others in their X-ray position. Then each of the aligned ligands was compared to its X-ray position, and the RMSD was computed. For each of the ligands, the template which gives the lowest RMSD from the crystallographic pose was taken into consideration (see page D of the PharmBench paper). Here the results for 5 of the 81 datasets are reported for Open3DALIGN (O3A) and COSMOsim3D (CS3D), and compared to the results obtained by the authors of the PharmBench publication with FLAP, namely their own software distributed by Molecular Discovery. While Open3DALIGN performs slightly worse than FLAP, COSMOsim3D achieves a significantly lower RMSD from the X-ray poses.
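The evaluation step described here (RMSD of each aligned pose from the X-ray pose, keeping the best template per ligand) can be sketched as follows. Coordinates are compared in place, without any additional superposition, and atoms are assumed to be listed in the same order in both poses; the function names and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two poses with identical atom ordering."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def best_template(aligned_poses, xray_pose):
    """Return (lowest RMSD, template name) over all templates the ligand was aligned to."""
    return min((rmsd(pose, xray_pose), name) for name, pose in aligned_poses.items())


# Toy two-atom ligand: template T2 reproduces the X-ray pose, T1 is shifted by 1 Å in z
xray = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
poses = {
    "T1": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    "T2": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
}
best = best_template(poses, xray)
```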
39 COSMOsar3D: COSMOsim3D based Molecular Field Analysis (with PLS). The most predictive and robust MFA method presented ever! COSMOsim3D: 3D-Similarity and Alignment Based on COSMO Polarization Charge Densities, J. Chem. Inf. Model., 2012, 52, 2149. COSMOsar3D: Molecular Field Analysis Based on Local COSMO σ-Profiles, J. Chem. Inf. Model., 2012, 52, 2157
40 COSMOsar3D: COSMOsim3D based Molecular Field Analysis: do the alignment with COSMOsim3D; use the grid of local σ-profiles as descriptor array. By the local σ-profiles you have high quality local information about electrostatics, hydrogen bonding, hydrophobic interactions and shape. The ~2 Å grid spacing represents some local flexibility and fuzziness, i.e. it mimics that the ligand and receptor can slightly adjust to each other. If we just assume that the virtual receptor provides a local σ-potential at each grid point, then the binding free energy (including desolvation) should be a linear functional of the local σ-profiles! Hence a multi-dimensional regression analysis (such as PLS) should have a very good chance to generate a predictive model for binding constants. COSMOsim3D based Molecular Field Analysis should be favorable compared to standard MFA methods! (patent application submitted)
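The linearity argument can be made concrete: if every grid point carries a receptor σ-potential, the binding free energy is a plain sum of products of local σ-profile entries and local σ-potential values. In a regression such as PLS, the fitted coefficients would then play the role of these unknown potentials. The sketch below only evaluates the linear functional; the names, array layout and numbers are assumptions, and no PLS fit is performed.

```python
def binding_free_energy(local_profiles, local_potentials, offset=0.0):
    """Linear functional dG = offset + sum_g sum_k p[g][k] * mu[g][k].

    local_profiles[g][k]  : ligand surface area at grid point g in polarity bin k
    local_potentials[g][k]: assumed receptor sigma-potential at grid point g for bin k
    """
    total = offset
    for p_g, mu_g in zip(local_profiles, local_potentials):
        total += sum(p * mu for p, mu in zip(p_g, mu_g))
    return total


# Hypothetical descriptor array: 2 grid points x 2 polarity bins
profiles = [[1.0, 0.0], [0.5, 2.0]]
potentials = [[-1.0, 0.5], [2.0, -0.25]]
dg = binding_free_energy(profiles, potentials)
```

Doubling every local σ-profile entry doubles the result, which is the property a linear regression model relies on.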
41 Out of the box, COSMOsar3D outperforms all 7 standard methods
42 Robustness of COSMOsar3D compared to standard MFA, and no cutoff values are required!
43 COSMOfrag: A fast shortcut of COSMOtherm suited for HTS-ADME prediction. 1) Large database of precalculated drug-like compounds (about , incl. ions). 2) For a new compound, find the most similar fragments in the database. 3) Compose an approximate COSMO surface from surface fragments. 4) Write a meta-cosmo-file or a full 3D fcos-file for 3D-QSAR. COSMOfrag requires less than 0.5 sec/compound, which suits HTS.
44 CF-COSMO files (.fcos): COSMOsim3D usually would require one quantum-chemical calculation for each conformation of a ligand which shall be tested. That is doable, but quite time-consuming for screening (~ 5 min. on AM1/SVP level). COSMOfrag was able to generate σ-profiles for new compounds in less than a second based on finding the most similar atoms in a big database of ~60000 pre-calculated drug-like compounds. NEW: COSMOfrag now can generate within a second an approximate 3D-COSMO file (*_cf.cosmo) by searching for the most similar atoms and transforming the respective COSMO segments into the new local atomic coordinate system. CF-cosmo files do not provide a nice closed COSMO surface, but have the polarity roughly in the right spatial region. But they are sufficient for COSMOsim3D screening.
45 Summary: The conductor polarization charge density σ provides rich information about molecular interactions! The COSMO-RS statistical thermodynamics converts σ into free energies in liquid phases and thus yields activity coefficients, solubility, … This enables many useful applications in drug development and formulation: solvent screening, co-crystal screening, reaction media selection and optimization, ADME predictions, … σ-profiles and local 3D σ-profiles are very useful for the description of ligand-receptor binding.
46 Information content of σ-surfaces. The electrostatic potential ESP calculated by DFT with (or without) COSMO is dominated by the dipole moment. Almost no local features (lone pair directions, structure on halogen atoms) can be seen. The conductor polarization charge density σ, calculated from DFT/COSMO, shows lone pair directions and many polarity details (e.g. on halogen atoms). It is a very good local measure of polarity. If σ is calculated from ESP-fitted point charges, the picture looks similar at first glance, but all details get lost, because a point charge representation is unable to reflect sub-atomic orbital features!