
1 AUTODOCK An Automated Docking Software for Predicting Optimal Protein-Ligand Interaction
By Susan McClatchy, Milind Misra, Chandreyee Mukherjee, Indu Shrivastava

2 Introduction Chandreyee Mukherjee

3 Automated Docking: Importance
Interactions between biomolecules lie at the core of all metabolic processes and life activities. The number of solved protein structures available in the databases is expanding exponentially. To understand their functions, it is essential to elucidate the interaction mechanisms between the different molecules. Docking is of primary importance in rational drug design: depending on how well the docked molecules bind, the ligand may be redesigned or its structure further refined. It is also important in immunology, for studying antigen-antibody interactions.

4 Inhibitor bound to active site of HIVPR
Surface structure of HIVPR with bound inhibitor

5 What is docking?
Prediction of the optimal physical configuration and energy between two molecules. The docking problem optimizes: the binding between two molecules, such that their orientation maximizes the interaction; the total energy of interaction, such that the best binding configuration has the minimum binding energy; and the resultant structural changes brought about by the interaction.

6 Categories of docking
Protein-Protein Docking: Both molecules are rigid. Interaction produces no change in conformation. Similar to the lock-and-key model.
Protein-Ligand Docking: Ligand is flexible but the receptor protein is rigid. Interaction produces conformational changes in the ligand.

7 Protein-Protein Docking
Protein-Ligand Docking optimized

8 Docking uses a “search and score” method
It involves: finding useful ways of representing the molecules and molecular properties; exploring the configuration spaces available for interaction between ligand and receptor; and evaluating and ranking configurations using a scoring system, in this case the binding energy. However, since the binding energy is difficult to evaluate because the binding sites may not be easily accessible, it is modeled as follows:
$\Delta G_{bind} = \Delta G_{vdw} + \Delta G_{hbond} + \Delta G_{elect} + \Delta G_{conform} + \Delta G_{tor} + \Delta G_{sol}$
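The search-and-score idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_binding_energy` is a made-up stand-in for a real scoring function, and a pose is reduced to a bare (x, y, z) position with no rotation or torsions.

```python
import random

def toy_binding_energy(pose):
    """Hypothetical score (not AutoDock's): lower is better, minimum at the origin."""
    x, y, z = pose
    return x * x + y * y + z * z

def search_and_score(n_poses, seed=0):
    """Sample random poses, score each one, and return (energy, pose) pairs ranked best-first."""
    rng = random.Random(seed)
    poses = [tuple(rng.uniform(-5.0, 5.0) for _ in range(3)) for _ in range(n_poses)]
    scored = [(toy_binding_energy(p), p) for p in poses]
    scored.sort(key=lambda pair: pair[0])  # lowest energy first
    return scored

ranked = search_and_score(100)
best_energy, best_pose = ranked[0]
```

Random sampling is the crudest possible search; the algorithms on the following slides replace it with guided strategies while keeping the same score-and-rank step.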

9 The AutoDock Software
Developed by A. J. Olson’s group in 1990. AutoDock evaluates the free energy of the docking molecules using 3D potential grids and uses heuristic search to minimize the energy. Search algorithms used: Simulated Annealing, Genetic Algorithm, Lamarckian GA (GA+LS hybrid).

10 Algorithms Overview
Simulated Annealing: Based on temperature effects. Start with a high temperature and global search; lower the temperature for local search.
Genetic Algorithm: Charles Darwin’s theory of evolution. Genotype → Phenotype.
Lamarckian Algorithm (Jean-Baptiste de Lamarck): Phenotype → Genotype.

11 Project Goal
Study the algorithms used to perform the searches and to calculate the minimum energy. Discuss why the GA+LS hybrid is better than SA. Look at an example, i.e., dock a ligand to a protein molecule using the latest AutoDock version.

12 The Algorithms Sue McClatchy

13 Simulated Annealing
Algorithm modeled after the cooling of a solution to form glass, though it is better explained by crystal formation. Given a long enough cooling time, molecules will relax into their lowest energy state to form the largest crystals. Quick cooling gives a highly disordered system; slow cooling gives a highly ordered crystal, with each molecule in its lowest energy state. The algorithm simulates either linear or proportional slow cooling.

14 The SA Algorithm
Uses a neighborhood operator N(s) to generate a set of solutions according to a fixed distribution. Each new solution is compared to the preceding solution and is accepted if its energy is lower. If the new solution has higher energy, it is accepted probabilistically according to the Boltzmann distribution (see figure above). At high temperatures, many higher-energy solutions will be accepted; at low temperatures, the majority of probabilistic moves are rejected.
Boltzmann acceptance probability: $P = e^{-\Delta E / T}$, where $\Delta E$ = energy of the new solution minus energy of the preceding solution, and $T$ = temperature. The Boltzmann distribution gives the probability of finding a system with energy E at temperature T.
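The acceptance rule on this slide can be written as a small function. This is a minimal sketch; the function name and the convention that `delta_e` is the new energy minus the old energy are assumptions made for illustration.

```python
import math

def acceptance_probability(delta_e, temperature):
    """Probability of accepting a move whose energy change is delta_e (new minus old)."""
    if delta_e <= 0:
        return 1.0  # downhill or equal-energy moves are always accepted
    return math.exp(-delta_e / temperature)  # uphill moves: Boltzmann factor

# The same uphill move is far more likely to be accepted when the system is hot.
p_hot = acceptance_probability(5.0, 1000.0)
p_cold = acceptance_probability(5.0, 1.0)
```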

15 Pseudocode for SA
Compute a random initial state s
n = 0, x*n = s                          // initialize best solution to s and first state to 0
Repeat i = 1, 2, …                      // specify number of temperatures to try
    Repeat j = 1, 2, …, mi              // no. of steps to perform for each temp. Ti
        Compute a neighbor s’ = N(s)    // s’ = new solution from N(s)
        if (f(s’) <= f(s)) then         // if energy of s’ <= energy of s
            s = s’                      // accept new solution s’
            if (f(s) < f(x*n)) then     // if energy of new solution < energy of best solution
                n = n + 1, x*n = s      // advance to the next state and replace best with new
            endif
        else                            // otherwise replace s with s’ using
            s = s’ with probability e^((f(s) - f(s’))/Ti)   // Boltzmann dist.
        endif
    EndRepeat
EndRepeat
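A runnable version of this loop might look like the following sketch. The one-dimensional `energy` function, the `step` neighbor operator, and the proportional cooling schedule are all illustrative assumptions, not AutoDock's actual energy function or settings.

```python
import math
import random

def simulated_annealing(f, neighbor, s0, temperatures, steps_per_temp, seed=0):
    """Minimise f starting from s0, following the slide's accept/reject rule."""
    rng = random.Random(seed)
    s = s0
    best = s0
    for t in temperatures:                  # outer loop: one pass per temperature
        for _ in range(steps_per_temp):     # inner loop: steps at this temperature
            s_new = neighbor(s, rng)
            if f(s_new) <= f(s):
                s = s_new                   # downhill move: always accept
                if f(s) < f(best):
                    best = s                # remember the best state seen so far
            elif rng.random() < math.exp((f(s) - f(s_new)) / t):
                s = s_new                   # uphill move: accept with Boltzmann probability
    return best

def energy(x):
    """Toy energy with its minimum at x = 3 (assumption for the demo)."""
    return (x - 3.0) ** 2

def step(x, rng):
    """Toy neighborhood operator: a uniform random step of at most 1 unit."""
    return x + rng.uniform(-1.0, 1.0)

temps = [10.0 * 0.95 ** i for i in range(50)]   # proportional cooling, 50 temperatures
result = simulated_annealing(energy, step, 20.0, temps, 100)
```

Tracking `best` separately from the current state `s` matters because the walk is allowed to move uphill and may leave a good solution behind.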

16 How Genetic Algorithms Work - A Simple Example
Initial population of binary creatures having 6 “genes”. Each gene has two different alleles, either a 0 or a 1. Three operators: crossover, mutation and selection.

17 Selection
Selection is based on a fitness function f(x). This operator chooses those individuals with the lowest values; those with higher values are chosen with a very low probability. Scores in the figure: 20, 13, 48, 52.

18 Crossover

19 Mutation

20 Replacement
Lower-scoring individuals create more offspring; higher-scoring ones create fewer or none at all. Offspring replace the parental generation. An “elitism” function allows the best individual from the parent generation to persist if it is a better solution than the new individuals created. The cycle of selection, mutation, crossover and replacement is repeated. (Figure columns: # offspring, score.)

21 Pseudocode for GA
Select an initial population x^0 = {x_1^0, x_2^0, …, x_M^0}
Determine fitness values f(x_i^0) for each individual
Repeat for g = 1, 2, …, # of generations
    Perform selection
    Perform crossover with a given crossover probability
    Perform mutation with a given mutation probability
    Determine fitness f(x_i^g) for new individuals
    x_g* = argmin_{i=1,…,M} f(x_i^g) and y_g* = f(x_g*)
    Perform replacement
Until stopping criterion (# of generations) is reached
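The pseudocode can be made concrete for the 6-gene binary creatures of the simple example. This is a sketch under assumptions: the fitness function (number of 1-alleles, lower is better) is invented for illustration, and binary tournament selection is swapped in as one simple way to favor low scores; the slides do not specify either.

```python
import random

GENES = 6

def fitness(ind):
    """Hypothetical score: number of 1-alleles; lower is better."""
    return sum(ind)

def select(pop, rng):
    """Binary tournament: the lower-scoring of two random individuals wins."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) <= fitness(b) else b

def crossover(p1, p2, rng):
    """Single-point crossover: head of p1 joined to tail of p2."""
    point = rng.randint(1, GENES - 1)
    return p1[:point] + p2[point:]

def mutate(ind, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in ind]

def run_ga(pop_size=20, generations=30, p_cross=0.8, p_mut=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        children = [min(pop, key=fitness)]      # elitism: best parent persists
        while len(children) < pop_size:
            p1, p2 = select(pop, rng), select(pop, rng)
            child = crossover(p1, p2, rng) if rng.random() < p_cross else list(p1)
            children.append(mutate(child, p_mut, rng))
        pop = children                          # offspring replace the parents
        history.append(fitness(min(pop, key=fitness)))
    return min(pop, key=fitness), history

best, history = run_ga()
```

Because the best parent is always copied into the next generation, the best score recorded in `history` can never get worse from one generation to the next.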

22 How GA works in AutoDock
Ligand’s “genes” include its x, y and z coordinates. Orientation is encoded as a quaternion: a unit vector that is given a random rotation angle between 0° and 360°. Additional genes may represent torsion angles between bonds of the ligand.
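As an illustration of the orientation genes, the sketch below builds a random rotation from a unit vector and an angle. The (w, x, y, z) component order and the function names are assumptions chosen for this example; AutoDock's internal storage format may differ.

```python
import math
import random

def random_unit_vector(rng):
    """Draw a random direction by normalising a 3D Gaussian sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:                 # avoid dividing by (almost) zero
            return [c / norm for c in v]

def axis_angle_to_quaternion(axis, angle_deg):
    """Convert a unit axis and an angle in degrees to a unit quaternion (w, x, y, z)."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

rng = random.Random(0)
axis = random_unit_vector(rng)
angle = rng.uniform(0.0, 360.0)          # random rotation angle, as on the slide
q = axis_angle_to_quaternion(axis, angle)
```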

23 Mapping
In a standard GA, the genotype (x, y, z coordinates plus rotation and any torsion angles) is mapped to the fitness function f(x). The fitness function value corresponds to each individual’s phenotype. According to the right-hand side of the figure, genotypes of parents with high f(x) values are mutated to form genotypes of children with lower f(x) values.

24 Selection, Crossover & Mutation
Selection chooses ligands with the lowest fitness (energy) values. Crossover exchanges x, y, z coordinates, rotations or torsions between these ligands. Example: two ligands with xyz coordinates Abc and aBc; crossover results in new individuals with coordinates abc and ABc. The mutation operator changes coordinate or angle values by adding a random real number drawn from a Cauchy distribution, which is similar to a Gaussian but has thicker tails.
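The crossover example and the Cauchy mutation can be sketched as follows. The function names are hypothetical; the Cauchy deviate is drawn by inverse-CDF sampling, with `alpha` as the location and `beta` as the scale parameter.

```python
import math
import random

def swap_gene(p1, p2, index):
    """Exchange the gene at `index` between two parents and return both children."""
    c1, c2 = list(p1), list(p2)
    c1[index], c2[index] = c2[index], c1[index]
    return c1, c2

def cauchy_deviate(alpha, beta, rng):
    """Sample from a Cauchy distribution via its inverse CDF."""
    u = rng.random()
    return alpha + beta * math.tan(math.pi * (u - 0.5))

def mutate(genes, rate, rng, alpha=0.0, beta=1.0):
    """Add a Cauchy-distributed offset to each gene with probability `rate`."""
    return [g + cauchy_deviate(alpha, beta, rng) if rng.random() < rate else g
            for g in genes]

# The slide's example: parents Abc and aBc exchange their first coordinate.
child1, child2 = swap_gene(["A", "b", "c"], ["a", "B", "c"], 0)
```

The heavy tails mean most mutations are small, but an occasional large jump lets the search escape a local minimum.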

25 Replacement
Individuals with better-than-average fitness receive proportionally more offspring:
$n_o = \dfrac{f_w - f_i}{f_w - \langle f \rangle}, \quad f_w \neq \langle f \rangle$
where $n_o$ = number of offspring, $f_i$ = fitness of individual (energy of ligand), $f_w$ = fitness of worst individual in last g generations (typically 10), and $\langle f \rangle$ = mean fitness of population.
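A short worked example of the offspring formula, using made-up energies. As a simplification, the worst fitness is taken from the current population only rather than from a window of recent generations.

```python
def offspring_count(f_i, f_worst, f_mean):
    """Number of offspring allotted to an individual with fitness (energy) f_i."""
    if f_worst == f_mean:
        raise ValueError("formula is undefined when worst and mean fitness are equal")
    return (f_worst - f_i) / (f_worst - f_mean)

energies = [-8.0, -6.0, -4.0, -2.0]      # hypothetical ligand energies
f_worst = max(energies)                  # highest energy = worst individual
f_mean = sum(energies) / len(energies)
counts = [offspring_count(e, f_worst, f_mean) for e in energies]
```

With these numbers the best individual (energy -8) is allotted 2 offspring, the worst gets none, and the allotments add up to the population size of 4.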

26 Lamarckian Genetic Algorithm
According to the left-hand side of the figure, LGA finds the lowest fitness function (energy) values first, then maps these values to their respective genotypes. It combines a genetic algorithm with Solis and Wets local search, and performs better than either simulated annealing or the genetic algorithm alone.
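The Lamarckian step, in which the result of a local search is written back into the genotype, can be sketched as below. The toy `energy` function is an assumption, and a plain random hill-climb stands in for the Solis and Wets method to keep the example short.

```python
import random

def energy(genes):
    """Toy energy with its minimum where every gene equals 1 (assumption)."""
    return sum((g - 1.0) ** 2 for g in genes)

def local_search(genes, rng, iterations=50, step=0.1):
    """Simple random hill-climb; a stand-in for Solis and Wets local search."""
    best = list(genes)
    for _ in range(iterations):
        trial = [g + rng.uniform(-step, step) for g in best]
        if energy(trial) < energy(best):
            best = trial                 # keep only improving moves
    return best

def lamarckian_update(population, ls_frequency, rng):
    """Refine a random fraction of individuals and overwrite their genotypes."""
    updated = []
    for genes in population:
        if rng.random() < ls_frequency:
            genes = local_search(genes, rng)   # improved phenotype becomes the genotype
        updated.append(list(genes))
    return updated

rng = random.Random(0)
population = [[0.0, 0.0], [3.0, -2.0]]
refined = lamarckian_update(population, 1.0, rng)
```

In a purely Darwinian GA the refined coordinates would be discarded after scoring; here they replace the individual's genes, so offspring inherit the improvement.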

27 The Application Milind Misra

28 HIV-1 Protease and AHA006
HIV-1 Protease in complex with the cyclic sulfamide inhibitor, AHA006
Source: Protein Data Bank
Authors: K. Backbro, T. Unge
Exp. Method: X-ray Diffraction (2 Å res.)
Primary Citation: Backbro et al., J Med Chem 40, pp. 898 (1997)
Polymer Chains: A, B; Residues: 198; Atoms: 1632

29 Protein (HIV-1 Protease)
Ligand (AHA006) (Source: PDB)

30 HIV-1 Protease dimer (Rasmol)

31 Initial X-Ray crystallographic positions of protein and ligand
(SYBYL)

32 Docking Preparation – Ligand
Assign charges. Define rotatable bonds. Rename aromatic carbons. Merge non-polar hydrogens. Write .pdbq ligand file.

33 Docking Preparation – Protein
Add essential hydrogens. Load charges. Merge lone-pairs. Add solvation parameters. Write .pdbqs protein file.

34 Docking Preparation – Grid
AutoDock uses grid-based docking. Ligand-protein interaction energies are pre-calculated and then used as a look-up table during simulation. Grid maps are constructed based on atoms of interest in the ligand (here C, A, N, O, S, H).
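The look-up-table idea can be sketched as follows: energies are computed once on a regular lattice, and the energy at an arbitrary position is then read off by trilinear interpolation between the eight surrounding grid points. The `toy_potential`, grid size and spacing here are illustrative assumptions, not AutoDock's actual maps or defaults.

```python
import math

def build_grid(potential, origin, spacing, n):
    """Pre-calculate potential(x, y, z) on an n x n x n lattice."""
    return [[[potential(origin[0] + i * spacing,
                        origin[1] + j * spacing,
                        origin[2] + k * spacing)
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]

def lookup(grid, origin, spacing, point):
    """Trilinear interpolation of the stored values at `point`."""
    n = len(grid)
    cell, frac = [], []
    for axis in range(3):
        u = (point[axis] - origin[axis]) / spacing   # position in grid units
        if u < 0.0 or u > n - 1:
            raise ValueError("point lies outside the grid")
        i = min(int(math.floor(u)), n - 2)           # lower corner of the enclosing cell
        cell.append(i)
        frac.append(u - i)
    (i, j, k), (fx, fy, fz) = cell, frac
    value = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                weight = ((fx if di else 1.0 - fx)
                          * (fy if dj else 1.0 - fy)
                          * (fz if dk else 1.0 - fz))
                value += weight * grid[i + di][j + dj][k + dk]
    return value

def toy_potential(x, y, z):
    """Hypothetical linear potential, chosen so that interpolation is exact."""
    return x + 2.0 * y + 3.0 * z

ORIGIN, SPACING = (0.0, 0.0, 0.0), 0.5
grid = build_grid(toy_potential, ORIGIN, SPACING, 5)
value = lookup(grid, ORIGIN, SPACING, (0.3, 1.1, 1.9))
```

The expensive potential is evaluated only when the grid is built; each later energy query costs eight table reads and a handful of multiplications.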

35 (AutoDockTools)

36 Docking – Simulated Annealing
Parameters: Runs = 100; Cycles = 50; Initial Temp (RT) = 1,000; Temp reduction factor = 0.95; Linear temperature reduction; Translation reduction factor = 1; Quaternion reduction factor = 1; Torsional reduction factor = 1; # rotatable bonds = 12; Initial coordinates = Random; Initial quaternion = Random; Initial dihedrals = Random; Translation step = 2.0 Å; Quaternion step = 50 deg; Torsion step = 50 deg
Results: 100 different clusters
Energy range: to +64,000
Conformation #81: Conformation #67: Conformation #68:
Lowest energy conf not close to position but similar to original. Conf #67 closest to position and conformation of original ligand; higher energy. Conf #68 close to position but not conformation of original ligand; not as high energy.

37 Original ligand conf SA conformation #67 (SYBYL)

38 Original ligand conf SA conformation #67 Close-up of previous (SYBYL)

39 Original ligand conf SA conformation #67 (SYBYL)

40 100 Clustered SA Conformations
(gOpenMol)

41 Docking – Genetic Algorithm
Parameters: Runs = 50; # Evaluations = 250,000; Population size = 50; Elitism count = 1; Mutation rate = 0.02; Crossover rate = 0.8; Window size = 10; Cauchy alpha = 0; Cauchy beta = 1; # rotatable bonds = 12; Initial coordinates = Random; Initial quaternion = Random; Initial dihedrals = Random; Translation step = 2.0 Å; Quaternion step = 50 deg; Torsion step = 50 deg
Results: 50 different clusters
Energy range: to
Conformation #39: Conformation #9:
The lowest energy conformation overall was closest to the original ligand conformation. If only 10 runs had been used instead of 50, then conf #9 would have been the lowest energy conformation.

42 Docking – Local Search
Parameters: Runs = 50; Solis-Wets iterations = 300; Consecutive successes = 4; Consecutive failures = 4; Rho = 1; Lower bound on rho = 0.01; LS frequency = 0.06; # rotatable bonds = 12; Initial coordinates = Random; Initial quaternion = Random; Initial dihedrals = Random; Translation step = 2.0 Å; Quaternion step = 50 deg; Torsion step = 50 deg
Results: 18 different clusters
Energy range: to +215,200
Confs #20, 21, 22, 23:
The lowest energy conformation was most dissimilar to the original ligand conformation. Better results could have been obtained by reducing the step sizes.
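To show how the iteration count, consecutive successes and failures, rho and its lower bound interact, here is a sketch of a Solis and Wets style local search. It follows the commonly stated form of the method (try a biased random step, then its mirror image; widen the step after repeated successes, narrow it after repeated failures). The bias-update constants and the test function are assumptions, and this is not AutoDock's own implementation.

```python
import random

def solis_wets(f, x0, max_iter=300, max_succ=4, max_fail=4,
               rho=1.0, rho_min=0.01, seed=0):
    """Adaptive random local search that minimises f starting from x0."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    bias = [0.0] * len(x)
    successes = failures = 0
    for _ in range(max_iter):
        if rho < rho_min:
            break                                    # step size has collapsed: stop
        dev = [rng.gauss(b, rho) for b in bias]      # random step centred on the bias
        forward = [xi + d for xi, d in zip(x, dev)]
        backward = [xi - d for xi, d in zip(x, dev)]
        if f(forward) < fx:
            x, fx = forward, f(forward)
            bias = [0.2 * b + 0.4 * d for b, d in zip(bias, dev)]
            successes, failures = successes + 1, 0
        elif f(backward) < fx:
            x, fx = backward, f(backward)
            bias = [b - 0.4 * d for b, d in zip(bias, dev)]
            successes, failures = successes + 1, 0
        else:
            bias = [0.5 * b for b in bias]
            successes, failures = 0, failures + 1
        if successes >= max_succ:
            rho, successes = rho * 2.0, 0            # repeated success: widen the search
        elif failures >= max_fail:
            rho, failures = rho * 0.5, 0             # repeated failure: narrow the search
    return x, fx

def sphere(v):
    """Toy objective with its minimum at the origin (assumption)."""
    return sum(c * c for c in v)

x_best, f_best = solis_wets(sphere, [3.0, -2.0])
```

The default arguments mirror the values listed on this slide. Only improving moves are ever accepted, so the method refines a nearby minimum rather than exploring globally.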

43 Docking – Lamarckian GA
Parameters: Runs = 10; Max # Evaluations = 250,000; Max # Generations = 27,000; Population size = 50; Elitism count = 1; Mutation rate = 0.02; Crossover rate = 0.8; Window size = 10; Cauchy alpha = 0; Cauchy beta = 1; Solis-Wets iterations = 300; Consecutive successes = 4; Consecutive failures = 4; Rho = 1; Lower bound on rho = 0.01; LS frequency = 0.06
* Gray options *
Results: 10 different clusters
Energy range: to –8.38
Conformation #7:
The lowest energy conformation was fairly similar to the original ligand conformation. If the number of runs had been restricted to 10 for both GA and LGA, LGA would have generated the best structure.

44 Original ligand conf Best GA conf Best LGA conf Best SA conf
Best LS conf (SYBYL)

45 Original ligand conf Best GA conf Best LGA conf Best SA conf (SYBYL)

46 References
S. Kumar et al. “Protein Flexibility and Electrostatic Interactions.” IBM Journal of Research and Development, Vol. 45, No. 3/4 (2001).
G. Morris et al. “Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function.” Journal of Computational Chemistry, Vol. 19, No. 14 (1998).
C. Rosin et al. “A Comparison of Global and Local Search Methods in Drug Docking.” UCSD CSE Technical Report #CS (1997).
C. A. Sotriffer et al. “Automated Docking of Ligands to Antibodies: Methods and Applications.” Methods 20 (2000).
M. Vieth et al. “Assessing Search Strategies for Flexible Docking.”
Practical Handbook of Genetic Algorithms. Edited by Lance Chambers.
An Introduction to Genetic Algorithms. Melanie Mitchell.
Goodsell and Olson. Prot. Struct. Func. Genet. 8, 195 (1990).
Principles of Biochemistry. Lehninger.
R. Durbin, S. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis.
Wm. E. Hart. “A Theoretical Comparison of Genetic Algorithms and Simulated Annealing.” Sandia National Laboratories.

