Published by Tracey Atkins. Modified over 9 years ago.
1
Conformational Sampling Dragos Horvath Laboratoire d’InfoChimie – UMR 7177 horvath@chimie.u-strasbg.fr
2
Presentation Outline
– The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field
– Sampling Methods: a brief overview
– Molecular Casino: Monte Carlo Simulations
– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing
3
Different « conformations » or « geometries » of a molecule
Stable conformation: minimum of energy
Unstable conformation: high potential energy
Two degrees of freedom
4
The POTENTIAL ENERGY calculation is based on the EMPIRICAL FORCE FIELD APPROACH
– Quantum chemical calculations are too time-consuming: atoms and their interactions are approximated as “classical” objects
– Atoms need to be “parameterized” as a function of their chemical environment: a C atom in an alkane does not carry the same partial charge as the carbon of a carbonyl C=O!
– Covalent bonds are modeled as harmonic springs. The energy required to stretch or compress a bond by $\Delta b$ with respect to its natural length $b_0$ is expressed as $K_b \, \Delta b^2$
– Valence angle bending is modeled by a harmonic potential $K_\theta \, \Delta\theta^2$
– Atoms that are not directly bonded or do not form an angle interact “through space” by means of non-bonded interactions:
  Van der Waals interactions
  Electrostatic interactions, based on partial charges
  Continuum solvent models
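The two harmonic terms above can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical parameters are hypothetical, and angle deviations are assumed to be taken in radians, which the slide does not specify.

```python
import math

def bond_energy(b, b0, k_b):
    """Harmonic bond stretching term: K_b * (b - b0)**2."""
    return k_b * (b - b0) ** 2

def angle_energy(theta_deg, theta0_deg, k_theta):
    """Harmonic valence angle term: K_theta * (theta - theta0)**2.

    Assumption: the deviation is converted to radians before squaring.
    """
    delta = math.radians(theta_deg - theta0_deg)
    return k_theta * delta ** 2

# Hypothetical parameters, chosen only to show the quadratic growth.
for stretch in (0.00, 0.05, 0.10):
    print(stretch, bond_energy(1.53 + stretch, 1.53, 300.0))
```

Doubling the deviation from the natural length quadruples the energy penalty, which is why bond lengths and valence angles stay close to their reference values while torsions remain the soft degrees of freedom.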
5
Non-bonded interactions: Coulomb; Van der Waals; Desolvation & Hydrophobic Term. [The equations and the two-atom diagram (atoms a1 and a2, charges − and +) are not preserved in this transcript.]
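As a rough illustration of the first two non-bonded terms, the sketch below uses the textbook Coulomb law and a 12-6 Lennard-Jones potential. These functional forms are assumptions: the slide's own equations did not survive extraction, and the desolvation and hydrophobic term is left out entirely. Function and parameter names are hypothetical.

```python
# Conversion factor so that charges in units of e and distances in Å
# give energies in kcal/mol.
COULOMB_CONSTANT = 332.0637

def coulomb(q1, q2, r, dielectric=1.0):
    """Electrostatic energy of two partial charges at distance r."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)

def lennard_jones(r, epsilon, r_min):
    """12-6 Van der Waals term with well depth epsilon at r = r_min."""
    ratio = (r_min / r) ** 6
    return epsilon * (ratio ** 2 - 2.0 * ratio)

# Hypothetical atom pair: opposite partial charges, 3.8 Å optimal contact.
for r in (3.0, 3.8, 6.0):
    print(r, coulomb(0.4, -0.4, r), lennard_jones(r, 0.1, 3.8))
```

The Lennard-Jones term is steeply repulsive below `r_min` and weakly attractive beyond it, whereas the Coulomb term decays only as 1/r.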
6
Global energy: E = f(Geometry). Torsional correction terms: [equations not preserved in this transcript]
7
Torsions : the gateway to conformational sampling - Energy Profile with respect to a torsion....
8
Torsions : the gateway to conformational sampling - Energy Surface with respect to two torsions....
9
Torsions : the gateway to conformational sampling - Alternative Contour Plot representation
10
The Ramachandran Plot http://en.wikipedia.org/wiki/Ramachandran_plot
11
Key points on the energy surface...
12
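Computing a 2D torsion plot... Not that easy! [Figure: energy E (scale from low to high) on a grid of torsion θ1 = 0, 36, 72, 108, 144, 180, 216, 252, 288, 324° and torsion θ2 = 0, 60, 120, 180, 240, 300°, with a “?” annotation.]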
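A rigid grid scan over two torsions, using the 36° and 60° steps from the figure, might look like the sketch below. `toy_energy` is a hypothetical stand-in for a real force field, and no relaxation of the other degrees of freedom is attempted at each grid point.

```python
import math
from itertools import product

def toy_energy(t1_deg, t2_deg):
    """Hypothetical energy surface: threefold torsion terms plus a coupling."""
    t1, t2 = math.radians(t1_deg), math.radians(t2_deg)
    return (1 + math.cos(3 * t1)) + (1 + math.cos(3 * t2)) + 0.5 * math.cos(t1 - t2)

def scan_grid(energy, step1=36, step2=60):
    """Evaluate the energy at every (torsion 1, torsion 2) grid point."""
    grid = {}
    for t1, t2 in product(range(0, 360, step1), range(0, 360, step2)):
        grid[(t1, t2)] = energy(t1, t2)
    return grid

grid = scan_grid(toy_energy)
best = min(grid, key=grid.get)
print(len(grid), "grid points; lowest energy at", best)
```

Even this coarse grid needs 10 × 6 = 60 energy evaluations, and the true minima may well fall between grid points.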
13
Energy Minimization is only [the easy] part of the problem
– Given a starting geometry, deterministic algorithms allow the discovery of the adjacent local minimum
– Descent methods follow the local gradient
[Figure: energy E as a function of the coordinate X]
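The point that a descent method only finds the adjacent local minimum can be illustrated with a minimal steepest-descent sketch on a one-torsion toy profile. The energy function, the fixed step size and the finite-difference gradient are all assumptions made for this example.

```python
import math

def toy_energy(x_deg):
    """Hypothetical threefold torsion profile with minima at 60, 180, 300 degrees."""
    return 1 + math.cos(3 * math.radians(x_deg))

def numerical_gradient(f, x, h=1e-4):
    """Central finite-difference estimate of dE/dX."""
    return (f(x + h) - f(x - h)) / (2 * h)

def steepest_descent(f, x0, step=50.0, tol=1e-8, max_iter=10000):
    """Follow the negative gradient until it (nearly) vanishes."""
    x = x0
    for _ in range(max_iter):
        g = numerical_gradient(f, x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

# Two starting geometries on either side of the barrier at 120 degrees.
print(steepest_descent(toy_energy, 100.0))
print(steepest_descent(toy_energy, 130.0))
```

Starting points on opposite sides of a barrier end up in different minima, so minimization alone cannot tell which minimum is the lowest.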
14
Bad news: most molecules have more than 2 torsions... - No visualization of the energy hypersurface is possible!
15
Why care about conformational sampling?
– Because experimental properties of a molecule are given by the Boltzmann Average of the properties of its populated conformers
Boltzmann’s probability distribution: [equation not preserved in this transcript]
Boltzmann Averaging: [equation not preserved in this transcript]
Objective: finding the most probable solutions, that is, the relevant minima
[Figure: Energy vs. Geometry]
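The slide's own equations are lost, so the sketch below uses the standard textbook forms: the probability of conformer i is exp(-E_i/kT) divided by the sum of that factor over all conformers, and the average property is the probability-weighted sum. The energy units (kcal/mol), the temperature and all names are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_weights(energies, temperature=298.15):
    """Population of each conformer from its energy (kcal/mol)."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies for numerical stability
    factors = [math.exp(-beta * (e - e_min)) for e in energies]
    partition = sum(factors)
    return [f / partition for f in factors]

def boltzmann_average(values, energies, temperature=298.15):
    """Population-weighted average of a per-conformer property."""
    weights = boltzmann_weights(energies, temperature)
    return sum(w * v for w, v in zip(weights, values))

# Hypothetical conformers: relative energies and some per-conformer property.
energies = [0.0, 0.5, 3.0]
values = [1.0, 2.0, 10.0]
print(boltzmann_weights(energies))
print(boltzmann_average(values, energies))
```

A conformer 3 kcal/mol above the minimum contributes well under 1% at room temperature, which is why only the low-energy minima matter for the average.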
16
The Challenge…
Energy = f(Geometry), as defined by the Empirical Force Field
[Figure: energy landscape. Labels: “Well”-docked (folded) zone; “Misdocked” (folded) conformers; PDB; Absolute Energy Minimum; Native-like: one local clash; Microstates contributing to macroscopic property]
Publisher’s Force Field: « Nice H bond »
My Force Field: « Bad Contact »
17
Presentation Outline
– The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field
– Sampling Methods: a brief overview
– Molecular Casino: Monte Carlo Simulations
– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing
18
Sampling methods
– Systematic: <3…4 torsions
– Molecular Dynamics: solve Newton’s equations of motion, given the atomic forces calculated by the force field: simulate “Brownian motion”
– Stochastic sampling:
  Monte Carlo simulations
  Genetic Algorithms
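One way to see why systematic sampling is limited to a handful of torsions is to count the grid points. The 30° step below is an assumption chosen for illustration, and the function name is hypothetical.

```python
def grid_size(n_torsions, step_deg=30):
    """Number of geometries in a full systematic scan of n torsions."""
    points_per_torsion = 360 // step_deg
    return points_per_torsion ** n_torsions

for n in (2, 4, 10):
    print(n, "torsions:", grid_size(n), "geometries")
```

The count grows exponentially with the number of torsions, so each added rotatable bond multiplies the cost by the number of grid points per torsion.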
19
Systematic sampling: [Figure: potential energy vs. inter-atomic distance, with the absolute minimum and a local minimum marked]
20
Presentation Outline
– The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field
– Sampling Methods: a brief overview
– Molecular Casino: Monte Carlo Simulations
– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing
21
The Monte Carlo Approach: win an Energy Optimum by Playing Dice!
– Take a random geometry
– Randomly choose a torsional axis
– Apply a random rotation around that axis
– Recalculate the energy of the resulting geometry
  If lower (or, at least, not too (!) high), accept: the new conformer becomes the new “default” geometry
  Otherwise, reject: restore the previous geometry
– Loop until no further energy drop is observed
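The loop above can be sketched as follows. Two choices here are assumptions rather than part of the slide: the “not too high” test is implemented with the Metropolis criterion (accept an uphill move with probability exp(-ΔE/kT)), and the loop runs for a fixed number of steps instead of monitoring the energy drop. `toy_energy` is a hypothetical stand-in for a force field.

```python
import math
import random

def toy_energy(torsions):
    """Hypothetical energy: one threefold term per torsion (degrees)."""
    return sum(1 + math.cos(3 * math.radians(t)) for t in torsions)

def monte_carlo(energy, n_torsions, n_steps=5000, kt=0.6,
                max_rotation=30.0, seed=0):
    rng = random.Random(seed)
    # Take a random geometry.
    current = [rng.uniform(0.0, 360.0) for _ in range(n_torsions)]
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        # Randomly choose a torsional axis and rotate around it.
        trial = list(current)
        i = rng.randrange(n_torsions)
        trial[i] = (trial[i] + rng.uniform(-max_rotation, max_rotation)) % 360.0
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Accept if lower, or with Metropolis probability if higher.
        if delta <= 0 or rng.random() < math.exp(-delta / kt):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
        # Otherwise the previous geometry is simply kept.
    return best, e_best

best, e_best = monte_carlo(toy_energy, n_torsions=3)
print(best, e_best)
```

Occasionally accepting uphill moves is what lets the walk escape from a local minimum, at the price of never being certain that the lowest minimum was reached.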
22
Presentation Outline
– The Basics: Molecules have Geometries! Intramolecular energy calculation: the Empirical Force Field
– Sampling Methods: a brief overview
– Molecular Casino: Monte Carlo Simulations
– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing
23
Genetic Algorithm
– Applying a Darwinian Evolution Scenario to a population of vectors (“chromosomes”) encoding the solution to a problem
– Solution Quality is the “Fitness” score, and the fittest survive…
Data representation: « individual » or « chromosome » = list of its torsional angles (θ1, θ2, …, θn-1, θn)
Population of individuals: [figure not preserved in this transcript]
24
Generation of new offspring:
Crossover:
  parent1: θ1 θ2 … θi θi+1 … θn
  parent2: θ’1 θ’2 … θ’i θ’i+1 … θ’n
  child1: θ1 θ2 … θi θ’i+1 … θ’n
  child2: θ’1 θ’2 … θ’i θi+1 … θn
Mutation:
  wild type: θ1 θ2 … θi θi+1 … θn
  mutant: θ1 θ2 … θ’i θi+1 … θn
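A minimal sketch of the two operators, assuming a chromosome is a plain Python list of torsional angles in degrees. The function names are hypothetical, and the mutation simply redraws one angle uniformly, which is only one of many possible mutation schemes.

```python
import random

def one_point_crossover(parent1, parent2, rng):
    """Swap the tails of two chromosomes after a random cut point."""
    point = rng.randrange(1, len(parent1))
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(chromosome, rng):
    """Return a copy in which one randomly chosen torsion is redrawn."""
    mutant = list(chromosome)
    i = rng.randrange(len(mutant))
    mutant[i] = rng.uniform(0.0, 360.0)
    return mutant

rng = random.Random(42)
parent1 = [60.0, 180.0, 300.0, 60.0, 180.0]
parent2 = [300.0, 60.0, 60.0, 180.0, 300.0]
print(one_point_crossover(parent1, parent2, rng))
print(mutate(parent1, rng))
```

Crossover recombines torsion values that already exist in the population, while mutation is the only operator that introduces new values.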
25
[Figure: random initial population → sorted → intermediate population → sorted final population, with a selection threshold on the energies. Plots: evolution of the average fitness and evolution of the fitness of the best; the algorithm converges.]
Population Diversity Control is a Key Issue
> Discarding of redundant chromosomes (requires a metric defining how similar two encoded solutions are!)
> Multiple ‘Island’ models: parallel simulations occasionally swapping solutions
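Discarding redundant chromosomes needs a similarity metric. The sketch below assumes one simple choice, the largest periodic difference between corresponding torsions, and keeps the fittest chromosomes that differ by at least a threshold from every survivor already kept. The metric, the 30° threshold and the function names are illustrative assumptions, not the presenter's implementation.

```python
def torsion_distance(a, b):
    """Largest periodic difference (degrees) between corresponding torsions."""
    diffs = []
    for x, y in zip(a, b):
        d = abs(x - y) % 360.0
        diffs.append(min(d, 360.0 - d))
    return max(diffs)

def select_diverse(population, energy, size, min_distance=30.0):
    """Keep the lowest-energy chromosomes, skipping near-duplicates."""
    survivors = []
    for chromosome in sorted(population, key=energy):
        if all(torsion_distance(chromosome, s) >= min_distance for s in survivors):
            survivors.append(chromosome)
        if len(survivors) == size:
            break
    return survivors

population = [[0.0, 0.0], [5.0, 5.0], [180.0, 180.0], [355.0, 2.0]]
print(select_diverse(population, energy=sum, size=2))
```

Here `sum` is used as a dummy energy function; in practice it would be the force field energy of the geometry encoded by the chromosome.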
26
Genetic Algorithms: Chance, Selection & the CoinFlipper’s bet!
Any problem admitting a vector as a solution may be coded by a “chromosome” and left in the hands of Darwin… or God??
I bet (1M€) I can find a person who won a coin-flipping challenge 10 times in a row, at his/her first attempt!!
– In order to fulfill my promise, I need a total of 1024 coin flips to happen: 1024/10 ≈ 102 contenders, each with a chance of (1/2)^10 to score 10 successive winning coin flips: ~90% chance to lose 1M€!
If you read “Darwin’s Dangerous Idea” by D.C. Dennett, you are not allowed to bet!!
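The ~90% figure can be checked with a short calculation, assuming the 102 contenders flip independently and each needs 10 wins in a row. Variable names are illustrative.

```python
flips_per_contender = 10
contenders = 1024 // flips_per_contender   # 102 contenders from 1024 flips

p_single = 0.5 ** flips_per_contender       # one contender wins 10 in a row
p_nobody = (1 - p_single) ** contenders     # every contender fails
p_somebody = 1 - p_nobody                   # at least one succeeds

print(f"chance that nobody succeeds: {p_nobody:.3f}")
print(f"chance that somebody succeeds: {p_somebody:.3f}")
```

Without selection, the bet is lost roughly nine times out of ten.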
27
Selection is the Key! 1024 candidates / 512 flips … 512 candidates / 256 flips …
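With selection, the same bet becomes a sure thing: pair up the candidates, let each pair flip a coin, and keep only the winners. The sketch below simulates such a knockout, assuming the number of candidates is a power of two; the function name is hypothetical.

```python
import random

def knockout(n_candidates, rng):
    """Halve the field with one coin flip per pair until one winner remains."""
    candidates = list(range(n_candidates))
    rounds = 0
    flips = 0
    while len(candidates) > 1:
        rng.shuffle(candidates)
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            winners.append(a if rng.random() < 0.5 else b)
            flips += 1
        candidates = winners
        rounds += 1
    return candidates[0], rounds, flips

winner, rounds, flips = knockout(1024, random.Random(0))
print(f"candidate {winner} won {rounds} flips in a row ({flips} flips in total)")
```

Whoever is left has necessarily won every round, so a budget of about a thousand flips always produces a ten-times winner.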
28
Hybrid strategies (1): Selective Chromosome Initialization
- Knowledge-based: favoring locally stable torsions…
- ‘Traditionalism’: favoring torsion values seen in previously visited samples
29
Hybrid Strategies (2): Directed or ‘Lamarckian’ Mutations
Evolution stalled in a local minimum: mutations will not help!
Add a constraint term forcing θ1 to adopt the ‘mutant’ value θ’1
Gradient optimization, following the new energy landscape…
‘Lamarckian’ move towards the next optimum
Process in parallel to the main GA stream in order to avoid halting evolution!
30
Hybrid Heuristics (3): The Taboo Search Dilemma [Figure: an evolved solution and a “Taboo” phase space region, with question marks]
31
Search for Optimal Sampling Setups in the Strategy Parameter Space (p1, p2, p3, p4, p5, p6, …, p14, p15)
Population management:
  Population size
  Number of parallel processes
  Migration rate between ‘islands’
Evolution management:
  Crossover rate
  Mutation rate
  One/two point crossover rate
  Selection pressure
  Dissimilarity limit
  Maximal age
Convergence management:
  Apocalypse (population reset) frequency
  Elitism
  Global stop condition
32
[Flowchart: Meta-algorithm defines parameter setup → The Island Model, with Directed Mutations, « Tabus » and « Tradition »: Run 1, Run 2, …, Run n (3-fold repeat) → Postprocessing… → Base of diverse conformers [sampled at current setup] → µ-Fitness!! → Global Base of Diverse Conformers → News?? If yes: Meta-GA picks next set of configurations. If no: GAME OVER.]
33
GRID 5000-based ‘Planetary’ Model (www.grid5000.fr)
[Flowchart labels:]
If (free node): DEPLOY Island Model
  Sent to the node: Executables, Molecule File, Constraint Files, Seeds List, Taboo List, Operational Pars
  Returned by the node: Stablest Chromosomes, Sampling Success Score
Solution Merger & Clusterer → Conformer & Cluster Database
‘Panspermia’ policy center: ‘recent’ clusters: seeds; ‘old’ clusters: taboo
Operational Pars Selector: Sampling Success vs. Operational Pars
Stop: max. ‘Mission Nr.’, or no new clusters since N ‘missions’
34
Ab initio folding of Trp cage 1L2Y: native structure (reproducibly) found and ranked as most stable. D&C Planetary model: 20 nodes for 24 hours. [Figure label: PDB]
35
Ab initio folding of the Villin headpiece 1VII: helical parts are seen to fold in a matter of days (40 nodes), although not properly oriented. [Figure label: PDB]
36
Good news for the β-hairpin of Chignolin: out of the top 10 best ranked conformers, 8 are native-like. Number one is not, but in this case, that may not be a problem. [Figure labels: PDB, #1, #5]
37
The 1LE1 β-sheet is not the absolute energy minimum according to the current setup! However, proper folding of 1LE1 could be achieved (though not reproducibly!) with previous force field versions: is the current setup too helix-specific? [Figure label: PDB]
38
Docking simulations in the presence of flexible loops, such as the hinge region of Casein Kinase 2 (3BQC): the pose of the ligand emodin and the loop geometry are correctly predicted (3BQC not in FF training set). [Figure labels: flexible hinge region; PDB, #1]
39
Furthermore, a crystallographic water molecule can be simultaneously docked, being considered as another ligand, and is correctly placed. [Figure labels: flexible hinge region; water location converges towards experimental position]
40
Docking into GPCRs: (1) Turkey β1-Adrenergic Receptor – Cyanopindolol complex 2VT4, 190 degrees of freedom (ligand and side chains): 30 days/20 nodes**. Ligand RMS = 0.48 Å (best pose). ** total run time required to visit ~40000 phase space cells
41
Conclusions
– Conformational Sampling is the Key Element for Understanding Molecular Behavior
– It may range from very simple to extremely difficult, to impossible
– If you don’t do it well, better not do it at all: empirical methods based on molecular topology only may be more accurate than 3D models based on wrong, or too few, conformations
– Two main sources of errors: A.) a wrongly calculated energy-geometry landscape (poor Force Field parameterization) and B.) insufficient sampling!
– Docking is just a specific case of conformational sampling, involving at least two molecules: a binding “site” and one or more “ligands”
– You will often hear that the knowledge of the “bioactive” conformer is paramount to understand binding. This is necessary, but sometimes not sufficient. Note: the “bioactive” conformer may sometimes be quite unstable and almost never populated in the free state.