Conformational Sampling Dragos Horvath Laboratoire d’InfoChimie – UMR 7177


1 Conformational Sampling Dragos Horvath Laboratoire d’InfoChimie – UMR 7177 horvath@chimie.u-strasbg.fr

2 Presentation Outline –The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field –Sampling Methods: a brief overview –Molecular Casino: Monte Carlo Simulations –Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

3 Different « conformations » or « geometries » of a molecule. Stable conformation: minimum of energy. Unstable conformation: high potential energy. [Figure: molecule with two degrees of freedom]

4 The POTENTIAL ENERGY calculation is based on the EMPIRICAL FORCE FIELD APPROACH
–Quantum chemical calculations are too time-consuming: atoms and their interactions are approximated as “classical” objects
–Atoms need to be “parameterized” as a function of their chemical environment: a C atom in an alkane does not carry the same partial charge as the C of a carbonyl C=O!
–Covalent bonds are modeled as harmonic springs. The energy required to stretch or compress a bond by $\Delta b$ with respect to its natural length $b$ is expressed as $K_b \Delta b^2$
–Valence angle bending is modeled by a harmonic potential $K_\theta \Delta\theta^2$
–Atoms that are not directly bonded or do not form an angle interact “through space” by means of non-bonded interactions: Van der Waals interactions; Electrostatic interactions, based on partial charges; Continuum Solvent models
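The two harmonic terms on this slide can be illustrated with a minimal Python sketch. The function names, force constants and reference values below are made-up illustrations (assumptions), not parameters of any actual force field; `b0` and `theta0` stand for the natural bond length and natural valence angle.

```python
import math

def bond_energy(b, b0, k_b):
    """Harmonic bond stretching: K_b * (b - b0)**2."""
    return k_b * (b - b0) ** 2

def angle_energy(theta, theta0, k_theta):
    """Harmonic valence-angle bending: K_theta * (theta - theta0)**2, angles in radians."""
    return k_theta * (theta - theta0) ** 2

# Assumed, purely illustrative parameters (not taken from a real force field)
K_B, B0 = 300.0, 1.53                         # kcal/mol/A^2, A
K_THETA, THETA0 = 60.0, math.radians(111.0)   # kcal/mol/rad^2, rad

e_bond = bond_energy(1.63, B0, K_B)                            # bond stretched by 0.1 A
e_angle = angle_energy(math.radians(121.0), THETA0, K_THETA)   # angle opened by 10 degrees
print(f"bond: {e_bond:.3f} kcal/mol, angle: {e_angle:.3f} kcal/mol")
```

Both terms are zero at the natural geometry and grow quadratically with the distortion, which is why they are cheap to evaluate compared with a quantum chemical calculation.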

5 Non-bonded interactions: Coulomb; Van der Waals; Desolvation & Hydrophobic Term. [Figure: atoms a1 and a2 carrying partial charges − and +]
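The formulas of this slide did not survive in the transcript, so the sketch below is only an assumption about their general shape: the textbook Coulomb law (with the usual 332.06 conversion constant for kcal/mol, Å and elementary charges) and a 12-6 Lennard-Jones form for the Van der Waals term. The desolvation and hydrophobic term is left out because its form cannot be recovered from the slide. All names and numbers are hypothetical.

```python
COULOMB_K = 332.06  # kcal*A/(mol*e^2): conversion constant for charges in e, distances in A

def coulomb(q1, q2, r, dielectric=1.0):
    """Electrostatic energy between two partial charges at distance r."""
    return COULOMB_K * q1 * q2 / (dielectric * r)

def lennard_jones(r, well_depth, r_min):
    """12-6 Lennard-Jones energy: minimum of -well_depth at r = r_min (assumed form)."""
    x = (r_min / r) ** 6
    return well_depth * (x * x - 2.0 * x)

# Illustrative values only
e_elec = coulomb(+0.5, -0.5, 3.0)        # opposite partial charges attract
e_vdw_min = lennard_jones(3.8, 0.1, 3.8) # at the optimal contact distance
e_vdw_clash = lennard_jones(2.5, 0.1, 3.8)  # atoms pushed too close together
print(e_elec, e_vdw_min, e_vdw_clash)
```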

6 Global energy, including torsional correction terms: E = f(Geometry)

7 Torsions : the gateway to conformational sampling - Energy Profile with respect to a torsion....

8 Torsions : the gateway to conformational sampling - Energy Surface with respect to two torsions....

9 Torsions : the gateway to conformational sampling - Alternative Contour Plot representation

10 The Ramachandran Plot http://en.wikipedia.org/wiki/Ramachandran_plot  

11 Key points on the energy surface...

12 Computing a 2D torsion plot... Not that easy! [Figure: grid scan with torsion 1 = 0, 36, 72, …, 324° and torsion 2 = 0, 60, 120, …, 300°; energy E colored from low to high; some cells marked “?”]
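A grid scan like the one on this slide can be sketched as follows. The energy function is a made-up toy surface (an assumption), and the scan is "rigid": a real scan would typically relax the rest of the molecule at each grid point, which is part of why the task is not that easy.

```python
import math

def toy_energy(t1_deg, t2_deg):
    """Made-up two-torsion energy surface, arbitrary units (illustration only)."""
    t1, t2 = math.radians(t1_deg), math.radians(t2_deg)
    return (1.0 + math.cos(3 * t1)) + (1.0 + math.cos(3 * t2)) + 0.5 * math.cos(t1 - t2)

t1_values = range(0, 360, 36)   # 0, 36, ..., 324 degrees
t2_values = range(0, 360, 60)   # 0, 60, ..., 300 degrees

# Evaluate the energy on every cell of the 10 x 6 grid
grid = {(a, b): toy_energy(a, b) for a in t1_values for b in t2_values}

best = min(grid, key=grid.get)  # grid cell with the lowest energy
print(len(grid), "grid points; lowest energy at", best)
```

Note that the coarse grid can miss a narrow minimum lying between grid points, so the lowest cell is only a starting guess for a subsequent minimization.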

13 Energy Minimization is only [the easy] part of the problem –Given a starting geometry, deterministic algorithms allow the discovery of the adjacent local minimum –Descent methods follow the local gradient [Figure: energy E versus coordinate X]
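A minimal sketch of a descent method on a one-dimensional toy energy E(X) shows why minimization alone is not enough. The double-well function, step size and tolerance are assumptions chosen for illustration.

```python
def energy(x):
    """Toy double-well energy with two minima of different depth."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    """Analytic derivative dE/dx of the toy energy."""
    return 4.0 * x * (x * x - 1.0) + 0.3

def steepest_descent(x, step=0.01, tol=1e-8, max_iter=100000):
    """Follow the negative gradient until it (almost) vanishes."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

x_right = steepest_descent(1.5)    # starts in the right-hand well
x_left = steepest_descent(-1.5)    # starts in the left-hand well
print(x_right, energy(x_right))
print(x_left, energy(x_left))
```

Each start ends in the minimum adjacent to it: the run started at X = 1.5 never discovers the deeper well on the left.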

14 Bad news: most molecules have more than 2 torsions... - No visualization of the energy hypersurface is possible!

15 Why care about conformational sampling? –Because experimental properties of a molecule are given by the Boltzmann Average of the properties of its populated conformers –Boltzmann’s probability distribution: $p_i = e^{-E_i/kT} / \sum_j e^{-E_j/kT}$ –Boltzmann Averaging: $\langle P \rangle = \sum_i p_i P_i$ –Objective: finding the most probable solutions, that is, the relevant minima [Figure: Energy versus Geometry]
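Boltzmann averaging over a set of conformers can be sketched as below: each conformer is weighted by exp(-E/kT), normalized over all conformers. The conformer energies and property values are invented for illustration, and kT ≈ 0.593 kcal/mol assumes a temperature of about 298 K.

```python
import math

KT = 0.593  # kcal/mol, roughly room temperature (assumed)

def boltzmann_weights(energies, kT=KT):
    """Normalized Boltzmann probabilities of conformers with the given energies."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]

def boltzmann_average(values, energies, kT=KT):
    """Population-weighted average of a per-conformer property."""
    return sum(p * v for p, v in zip(boltzmann_weights(energies, kT), values))

# Hypothetical conformers: energies in kcal/mol and some property, e.g. a dipole moment
energies = [0.0, 0.5, 2.0, 6.0]
dipoles = [1.2, 2.0, 3.5, 0.4]

weights = boltzmann_weights(energies)
print([round(w, 3) for w in weights])
print(round(boltzmann_average(dipoles, energies), 3))
```

The conformer at 6 kcal/mol above the minimum contributes almost nothing, which is why sampling aims at the low-energy, populated minima.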

16 The Challenge… Energy = f(Geometry) is defined by the Empirical Force Field. [Figure labels: “Well”-docked (folded) zone; “Misdocked” (folded) conformers; ΔE; E#; PDB; Absolute Energy Minimum; Native-like: one local clash; Publisher’s Force Field: « Nice H bond »; My Force Field: « Bad Contact »; Microstates contributing to macroscopic property]

17 Presentation Outline –The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field –Sampling Methods: a brief overview –Molecular Casino: Monte Carlo Simulations –Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

18 Sampling methods
–Systematic: <3…4 torsions
–Molecular Dynamics: solve Newton’s equations of motion, given the atomic forces calculated by the force field: simulate “Brownian motion”
–Stochastic sampling: Monte Carlo simulations, Genetic Algorithms

19 Systematic sampling. [Figure: potential energy versus inter-atomic distance, showing the absolute minimum and a local minimum]

20 Presentation Outline –The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field –Sampling Methods: a brief overview –Molecular Casino: Monte Carlo Simulations –Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

21 The Monte Carlo Approach: win an Energy Optimum by Playing Dice!
–Take a random geometry
–Randomly choose a torsional axis
–Apply a random rotation around that axis
–Recalculate the energy of the resulting geometry. If lower (or, at least, not too (!) high), accept: make the new conformer the new “default” geometry. Otherwise, reject: restore the previous geometry
–Loop until no further energy drop is observed
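The loop on this slide can be sketched in Python as follows. Two things are assumptions rather than content of the slide: "not too high" is formalized with the Metropolis criterion (accept an uphill move with probability exp(-ΔE/kT)), and the loop runs for a fixed number of steps instead of testing for "no further energy drop". The energy function is a made-up torsional toy model.

```python
import math
import random

def toy_energy(torsions):
    """Made-up torsional energy: each torsion prefers 60, 180 or 300 degrees."""
    return sum(1.0 + math.cos(3.0 * math.radians(t)) for t in torsions)

def monte_carlo(n_torsions=4, steps=5000, kT=0.6, seed=0):
    rng = random.Random(seed)
    current = [rng.uniform(0, 360) for _ in range(n_torsions)]  # random geometry
    e_current = toy_energy(current)
    best, e_best = list(current), e_current
    for _ in range(steps):
        trial = list(current)
        axis = rng.randrange(n_torsions)                      # random torsional axis
        trial[axis] = (trial[axis] + rng.uniform(-180, 180)) % 360  # random rotation
        e_trial = toy_energy(trial)
        delta = e_trial - e_current
        # Accept if lower, or with Metropolis probability if "not too high"
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
        # Otherwise reject: 'current' is simply left unchanged
    return best, e_best

best, e_best = monte_carlo()
print([round(t) for t in best], round(e_best, 3))
```

Occasionally accepting uphill moves is what lets the walk leave a local minimum, which a pure descent method cannot do.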

22 Presentation Outline –The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field –Sampling Methods: a brief overview –Molecular Casino: Monte Carlo Simulations –Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

23 Genetic Algorithm –Applying a Darwinian Evolution Scenario to a population of vectors (“chromosomes”) encoding the solution to a problem –Solution Quality is the “Fitness” score, and the fittest survive… –Data representation: « individual » or « chromosome » = list of its torsional angles (torsion 1, torsion 2, …, torsion n-1, torsion n) –Population of individuals: a set of such chromosomes

24 Generation of new offspring. Crossover: parent1 and parent2 are cut between torsions i and i+1 and exchange the corresponding segments, so that child1 and child2 each combine torsion values from both parents. Mutation: one torsion i of the wild type is replaced by a new value, giving the mutant.
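The two operators of this slide, acting on chromosomes stored as lists of torsion angles, might look like this in Python. The function names and example values are hypothetical.

```python
def one_point_crossover(parent1, parent2, cut):
    """Exchange the tails of two chromosomes after position 'cut'."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def mutate(chromosome, position, new_value):
    """Return a copy of the chromosome with one torsion replaced."""
    mutant = list(chromosome)
    mutant[position] = new_value
    return mutant

parent1 = [60, 180, 300, 60]     # torsion angles in degrees
parent2 = [180, 60, 60, 300]

child1, child2 = one_point_crossover(parent1, parent2, cut=2)
mutant = mutate(parent1, position=1, new_value=300)
print(child1, child2, mutant)
```

Both functions return new lists, so parents and wild type stay available for the next selection step.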

25 [Figure: random initial population → sorted → intermediate population → … → sorted final population, with a selection threshold on the energies] Evolution of the average fitness and of the fitness of the best individual: the algorithm converges. Population Diversity Control is a Key Issue: > Discarding of redundant chromosomes (requires a metric defining how similar two encoded solutions are!) > Multiple ‘Island’ models – parallel simulations occasionally swapping solutions
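A compact sketch of such an evolution loop, including a simple redundancy filter, is given below. Everything specific in it is an assumption made for illustration: the toy energy used as fitness, the similarity metric (largest circular torsion difference), the 15° redundancy cutoff, keeping the better half as the selection threshold, and the 30% mutation rate. It is a single-population sketch and does not model the 'Island' variant.

```python
import math
import random

def energy(chrom):
    """Toy fitness: lower energy means fitter (made-up torsional model)."""
    return sum(1.0 + math.cos(3.0 * math.radians(t)) for t in chrom)

def torsion_distance(a, b):
    """Similarity metric: largest circular difference (degrees) over all torsions."""
    return max(min(abs(x - y) % 360, 360 - abs(x - y) % 360) for x, y in zip(a, b))

def remove_redundant(sorted_pop, min_dist=15.0):
    """Keep a chromosome only if it differs enough from all fitter ones kept so far."""
    kept = []
    for chrom in sorted_pop:
        if all(torsion_distance(chrom, k) >= min_dist for k in kept):
            kept.append(chrom)
    return kept

def evolve(n_torsions=5, pop_size=40, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(0, 360) for _ in range(n_torsions)] for _ in range(pop_size)]
    history = []                                  # best energy per generation
    for _ in range(generations):
        pop.sort(key=energy)                      # sorted population
        history.append(energy(pop[0]))
        survivors = remove_redundant(pop)[: pop_size // 2]   # selection threshold
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.choice(survivors), rng.choice(survivors)
            cut = rng.randrange(1, n_torsions)
            child = p1[:cut] + p2[cut:]           # one-point crossover
            if rng.random() < 0.3:                # occasional mutation
                child[rng.randrange(n_torsions)] = rng.uniform(0, 360)
            children.append(child)
        pop = survivors + children
    pop.sort(key=energy)
    history.append(energy(pop[0]))
    return pop[0], history

best, history = evolve()
print(round(history[0], 3), "->", round(history[-1], 3))
```

Because the fittest chromosome always survives the redundancy filter, the best energy can never get worse from one generation to the next.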

26 Genetic Algorithms: Chance, Selection & the CoinFlipper’s bet! Any problem admitting a vector as a solution may be coded by a “chromosome” and left in the hands of Darwin… or God?? I bet (1M€) I can find a person who won a coin-flipping challenge 10 times in a row, at his/her first attempt!! –In order to fulfill my promise, I need a total of 1024 coin flips to happen: 1024/10 ≈ 102 contenders, each with a chance of $(1/2)^{10}$ to score 10 successive winning coin flips: ~90% chance to lose 1M€! If you read “Darwin’s Dangerous Idea” by D.C. Dennett, you are not allowed to bet !!

27 Selection is the Key! 1024 candidates / 512 flips … 512 candidates / 256 flips …
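The arithmetic behind the bet, and the effect of selection, can be checked with a short Python sketch. The knockout pairing scheme is an assumption about how the "candidates / flips" sequence on the slide is meant; the names are hypothetical.

```python
import random

# Without selection: 102 independent contenders, 10 flips each
n_contenders = 1024 // 10
p_win_10 = 0.5 ** 10                               # one contender wins 10 in a row
p_lose_bet = (1.0 - p_win_10) ** n_contenders      # nobody manages it

def tournament(n_candidates=1024, seed=0):
    """Knockout: pair candidates, one flip per pair, winners advance."""
    rng = random.Random(seed)
    candidates = list(range(n_candidates))
    wins = {c: 0 for c in candidates}
    flips = 0
    while len(candidates) > 1:
        next_round = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            winner = a if rng.random() < 0.5 else b
            flips += 1
            wins[winner] += 1
            next_round.append(winner)
        candidates = next_round
    return candidates[0], wins[candidates[0]], flips

champion, champion_wins, total_flips = tournament()
print(f"no selection: {p_lose_bet:.1%} chance to lose the bet")
print(f"with selection: champion won {champion_wins} in a row after {total_flips} flips")
```

With roughly the same number of coin flips, selection turns a bet that is lost about nine times out of ten into a certainty: the knockout champion has, by construction, won ten flips in a row.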

28 Hybrid strategies (1): Selective Chromosome Initialization: - Knowledge-based: favoring locally stable torsions… - ‘Traditionalism’: favoring torsion values seen in previously visited samples

29 Hybrid Strategies (2): Directed or ‘Lamarckian’ Mutations. Evolution stalled in a local minimum, mutations will not help! Add a constraint term forcing torsion 1 to adopt its ‘mutant’ value. Gradient optimization, following the new energy landscape, gives a ‘Lamarckian’ move towards the next optimum. Process in parallel to the main GA stream in order to avoid halting evolution!

30 Hybrid Heuristics (3): The Taboo Search Dilemma. [Figure labels: Evolved Solution; “Taboo” phase space region; ? ?]

31 Search for Optimal Sampling Setups in the Strategy Parameter Space (parameters p1, p2, …, p15)
–Population management: population size; number of parallel processes; migration rate between ‘islands’
–Evolution management: crossover rate; mutation rate; one/two point crossover rate; selection pressure; dissimilarity limit; maximal age
–Convergence management: apocalypse (population reset) frequency; elitism; global stop condition

32 [Flowchart labels: Meta-algorithm defines parameter setup; The Island Model (Run 1, Run 2, …, Run n; 3-fold repeat) with Directed Mutations, « Tabus » and « Tradition »; Postprocessing…; Base of diverse conformers [sampled at current setup]; µ-Fitness!!; Global Base of Diverse Conformers; News ?? yes: Meta-GA picks next set of configurations; no: GAME OVER]

33 GRID 5000-based ‘Planetary’ Model (www.grid5000.fr). [Flowchart labels: If (free node) DEPLOY Island Model with: Executables, Molecule File, Constraint Files, Seeds List, Taboo List, Operational Pars; returned: Stablest Chromosomes, Sampling Success Score; Solution Merger & Clusterer; Conformer & Cluster Database; ‘Panspermia’ policy center: ‘recent’ clusters: seeds, ‘old’ clusters: taboo; Sampling Success vs. Operational Pars; Operational Pars Selector; Stop: max. ‘Mission Nr.’ reached, or no new clusters since N ‘missions’]

34 Trp cage 1L2Y. Ab initio folding of Trp cage 1L2Y: native structure (reproducibly) found and ranked as most stable. D&C Planetary model: 20 nodes for 24 hours. [Figure label: PDB]

35 Villin headpiece 1VII. Ab initio folding of the Villin headpiece 1VII: helical parts are seen to fold in a matter of days (40 nodes), although not properly oriented. [Figure label: PDB]

36 Chignolin. Good news for the β-hairpin of Chignolin: out of the top 10 best ranked conformers, 8 are native-like. Number one is not, but in this case that may not be a problem. [Figure labels: PDB; #1, #5]

37 The 1LE1 β-sheet is not the absolute energy minimum according to the current setup! However, proper folding of 1LE1 could be achieved (though not reproducibly!) with previous force field versions – is the current setup too helix-specific? [Figure label: PDB]

38 Casein Kinase 2 (3BQC). Docking simulations in presence of flexible loops, such as the hinge region of Casein Kinase 2 (3BQC): the pose of the ligand emodin and the loop geometry are correctly predicted (3BQC not in FF training set). [Figure labels: flexible hinge region; PDB, #1]

39 Furthermore, a crystallographic water molecule can be simultaneously docked, being considered as another ligand – and is correctly placed. [Figure labels: flexible hinge region; water location converges towards experimental position]

40 Docking into GPCRs: (1) Turkey β1-Adrenergic Receptor – Cyanopindolol complex 2VT4, 190 degrees of freedom (ligand and side chains) – 30 days/20 nodes**. Ligand RMS = 0.48 Å (best pose). ** total run time required to visit ~40000 phase space cells

41 Conclusions
–Conformational Sampling is the Key Element for Understanding Molecular Behavior
–It may range from very simple to extremely difficult, to impossible
–If you don’t do it well, better not to do it at all: empirical methods based on molecular topology only may be more accurate than 3D models based on wrong – or too few – conformations
–Two main sources of errors: A.) a wrongly calculated energy-geometry landscape (poor Force Field parameterization) and B.) insufficient sampling!
–Docking is just a specific case of conformational sampling, involving at least two molecules: a binding “site” and one or more “ligands”
–You will often hear that the knowledge of the “bioactive” conformer is paramount to understand binding. This is necessary, but sometimes not sufficient. Note: the “bioactive” conformer may sometimes be quite unstable and almost never populated in the free state.

