Development and Validation of a Genetic Algorithm for Flexible Docking
Gareth Jones, Peter Willett, Robert C. Glen, Andrew R. Leach and Robin Taylor
J. Mol. Biol., 1997, 267

Bioinformatics Seminar 2005, Matthias Dietzen

Contents
- Introduction
- Docking
- Genetic Algorithm
- Development of GOLD
- Validation of GOLD
- Conclusions
- Discussion

Introduction

- Computer-aided design of therapeutic molecules is now the method of choice.
- Screening virtual libraries for novel chemical entities and predicting their binding modes for a given receptor would save both time and money.
- This satisfies the "fail fast, fail cheap" principle.

Docking
- Definition
- Problems

Docking: Definition
"Docking tries to find the energetically most feasible three-dimensional arrangement of two molecules in close contact with each other."
Uses of docking:
- Target validation
- Lead discovery
- Lead optimization

Docking: Problems
Three levels of complexity:
- Rigid (comparatively simple)
- Semi-flexible (hard)
- Flexible (infeasible)
Accounting for the flexibility of the ligand and/or the receptor causes a combinatorial explosion, which forces the development of highly sophisticated algorithms. One of these is the genetic algorithm.

Genetic Algorithm
- Definition
- Model
- Algorithm

Genetic Algorithm: Definition
"A Genetic Algorithm evolves the population of possible solutions through genetic operators to a final population, optimizing a predefined fitness function."
Underlying principle: Darwin's theory of evolution
- Population growth is limited by the food available.
- Individuals that use this food more efficiently produce more offspring.
- Less adapted individuals are displaced: "survival of the fittest".

Genetic Algorithm: Model
A genetic algorithm provides:
- Population(s) of individuals competing against each other
- Each individual represented as a set of chromosomes encoding the individual's features
- Genetic operators modelling processes of evolution
- A fitness function ranking the individuals of one generation

Genetic Algorithm: Algorithm
1. Select and initialize the set of genetic operators
2. Randomly create an initial population and rank it by fitness
3. Select parents according to their ranking
4. Breed children using the genetic operators
5. Evaluate the children's fitness
6. Replace the least fit members of the population
7. Go to step 3 until termination or convergence
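
The seven steps can be sketched as a small steady-state loop. This is a toy illustration, not GOLD's implementation: the bit-string individuals, the `run_ga` name, the rank weights, the fixed operation count as termination criterion, and the "count the ones" fitness are all assumptions made for the example.

```python
import random

def run_ga(fitness, n_bits=16, pop_size=20, n_ops=300, seed=0):
    """Toy steady-state GA on bit strings (hypothetical example, not GOLD)."""
    rng = random.Random(seed)
    # Step 2: random initial population, ranked by fitness (best first).
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    pop.sort(key=fitness, reverse=True)
    # Step 7: here "termination" is simply a fixed number of operations.
    for _ in range(n_ops):
        # Step 3: rank-biased parent selection (better rank, larger weight).
        weights = [pop_size - i for i in range(pop_size)]
        mum, dad = rng.choices(pop, weights=weights, k=2)
        # Step 4: one-point crossover followed by a single bit-flip mutation.
        cut = rng.randrange(1, n_bits)
        child = mum[:cut] + dad[cut:]
        child[rng.randrange(n_bits)] ^= 1
        # Steps 5 and 6: evaluate the child and let it replace the least
        # fit member of the population if it is better.
        if fitness(child) > fitness(pop[-1]):
            pop[-1] = child
            pop.sort(key=fitness, reverse=True)
    return pop[0]

# Toy fitness: the number of 1 bits in the chromosome.
best = run_ga(sum)
```

Because only the worst member is ever replaced, the best individual found so far is never lost between iterations.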

Development of GOLD
- Chromosomes
- Fitness Function
- Genetic Operators

Development of GOLD: Chromosomes
- Two binary strings holding conformation information for the ligand and the protein
  - 1 byte for each bond's rotation angle
- Two integer strings for the mapping of hydrogen bonds
  - Acceptor (ligand) -> Donor (receptor)
  - Donor (ligand) -> Acceptor (receptor)
- Least-squares fitting is used to form as many hydrogen bonds as possible
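
To make the "1 byte per rotation angle" encoding concrete, the sketch below decodes a conformation chromosome into torsion angles. The slide does not state how byte values map to angles, so the linear mapping onto [-180°, 180°) and the names `byte_to_angle` and `decode_conformation` are assumptions for illustration only.

```python
def byte_to_angle(value):
    """Map one chromosome byte (0..255) to a rotation angle in degrees.

    Assumed mapping: 256 equally spaced values covering [-180, 180),
    which gives a step of 360 / 256 = 1.40625 degrees.
    """
    if not 0 <= value <= 255:
        raise ValueError("a chromosome byte must be in 0..255")
    return -180.0 + value * 360.0 / 256.0

def decode_conformation(chromosome):
    """Decode a bytes object: one rotation angle per rotatable bond."""
    return [byte_to_angle(b) for b in chromosome]

# A hypothetical ligand with three rotatable bonds.
angles = decode_conformation(bytes([0, 128, 255]))
```

One consequence of this encoding is that the angular resolution is fixed by the byte size, whatever the mapping actually used.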

Development of GOLD: Fitness Function
- Three energy terms:
  - H_Bond_Energy: sum of the energies of all hydrogen bonds in the complex
  - Complex_Energy: steric energy of interaction between ligand and receptor
  - Internal_Energy: the ligand's steric and torsional energy, based on molecular mechanics
- Final fitness score: -(H_Bond_Energy + Internal_Energy + Complex_Energy)
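
The sign convention is easy to get wrong, so here is a minimal sketch of the final score. The function name and the numbers are hypothetical; only the formula comes from the slide.

```python
def fitness_score(h_bond_energy, internal_energy, complex_energy):
    """Negated sum of the three energy terms: lower energy, higher fitness."""
    return -(h_bond_energy + internal_energy + complex_energy)

# Hypothetical energies in arbitrary units: favourable terms are negative.
score = fitness_score(h_bond_energy=-10.0, internal_energy=2.0,
                      complex_energy=-5.0)
```

Negating the sum lets the GA maximize fitness while the underlying model minimizes energy.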

Development of GOLD: Fitness Function - H_Bond_Energy
Each hydrogen bond contributes $E_{pair} \times$ distance_weight $\times$ angle_weight:
- $E_{pair}$: hydrogen-bond energy between a donor and an acceptor
- distance_weight, angle_weight: geometrical arrangement of the donor hydrogen, the acceptor and any lone pairs

Development of GOLD: Fitness Function - H_Bond_Energy: $E_{pair}$
- Uses model fragments for donor (d) and acceptor (a)
- Accounts for the displacement of water (w)
- Initially, donor and acceptor are in solution; when they form a hydrogen bond, water is stripped off:
  $E_{pair} = (E_{da} + E_{ww}) - (E_{dw} + E_{aw})$
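
Putting the pair energy and the two geometric weights together, a sketch of H_Bond_Energy could look as follows. The function names, the tuple layout and all numeric values are hypothetical; the formulas are the ones given on the slides.

```python
def e_pair(e_da, e_ww, e_dw, e_aw):
    """Energy gained when donor-water and acceptor-water contacts are
    replaced by donor-acceptor and water-water contacts."""
    return (e_da + e_ww) - (e_dw + e_aw)

def h_bond_energy(bonds):
    """Sum E_pair * distance_weight * angle_weight over all hydrogen bonds.

    `bonds` is a list of (e_pair, distance_weight, angle_weight) tuples.
    """
    return sum(e * dw * aw for e, dw, aw in bonds)

# Hypothetical fragment energies in arbitrary units.
pair = e_pair(e_da=-6.0, e_ww=-5.0, e_dw=-4.5, e_aw=-4.0)
# One ideal bond (both weights 1) and one half-weighted bond.
total = h_bond_energy([(pair, 1.0, 1.0), (pair, 0.5, 1.0)])
```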

Development of GOLD: Genetic Operators
- Island model: isolated subpopulations instead of one large population
  - Improves efficiency, but not effectiveness
  - Five subpopulations, each with 100 individuals
- Four genetic operators are used:
  - Crossover
  - Mutation
  - Migration
  - Selection

Development of GOLD: Genetic Operators
- Crossover: the child inherits the parents' features through crossover of their chromosomes
- Mutation: randomly changes a single individual's chromosome (bit flipping)
- Migration: copies an individual from one island to a neighbouring one
- Selection: the selection pressure is the relative probability of choosing the fittest individual as a parent; pressure: 1.1
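
The four operators can be sketched on bit-string chromosomes as below. Several details are assumptions, not taken from the slides: linear ranking is used as one common way to realize a selection pressure of 1.1, crossover is one-point, the islands form a ring, and a migrant overwrites the least fit member of the destination island.

```python
import random

def linear_rank_weights(n, pressure=1.1):
    """Selection probabilities for n >= 2 individuals sorted worst-first.

    The best individual is `pressure` times as likely to be chosen as an
    average one (probability pressure / n); the weights sum to 1.
    """
    return [(2 - pressure + 2 * (pressure - 1) * i / (n - 1)) / n
            for i in range(n)]

def select_parent(island, fitness, rng, pressure=1.1):
    ranked = sorted(island, key=fitness)  # worst first, best last
    weights = linear_rank_weights(len(ranked), pressure)
    return rng.choices(ranked, weights=weights, k=1)[0]

def crossover(mum, dad, rng):
    cut = rng.randrange(1, len(mum))  # one-point crossover
    return mum[:cut] + dad[cut:]

def mutate(bits, rng):
    child = list(bits)
    child[rng.randrange(len(child))] ^= 1  # flip exactly one bit
    return child

def migrate(islands, fitness, rng):
    """Copy a random individual to the next island in a ring."""
    src = rng.randrange(len(islands))
    dst = (src + 1) % len(islands)
    migrant = list(rng.choice(islands[src]))
    worst = min(range(len(islands[dst])),
                key=lambda i: fitness(islands[dst][i]))
    islands[dst][worst] = migrant

rng = random.Random(1)
# Five islands of 100 individuals each, as on the slide; 8 bits per toy chromosome.
islands = [[[rng.randint(0, 1) for _ in range(8)] for _ in range(100)]
           for _ in range(5)]
parent_a = select_parent(islands[0], sum, rng)
parent_b = select_parent(islands[0], sum, rng)
child = mutate(crossover(parent_a, parent_b, rng), rng)
migrate(islands, sum, rng)
```

With a pressure of only 1.1 the selection is gentle: the best of 100 individuals is picked with probability 0.011 and the worst with 0.009.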

Validation of GOLD
- Data set
- Classification
- Results

Validation of GOLD: Data Set
- 100 protein-ligand complexes of pharmacological interest from the PDB
- High variance within the test set:
  - Heavy atoms: between 6 and 55
  - Rotatable bonds: between 0 and 30
  - Many functionally different protein types
  - Metalloenzymes
- Hand-curated with respect to charges, protonation and tautomeric states

Validation of GOLD: Classification
- 20 GA runs per complex, to ensure that the best solution is found
- Four subjective categories:
  - Good: binding mode, hydrogen bonds, close contacts and metal coordination correct
  - Close: result acceptable, but with some displacement of ligand groups from the experimental result
  - Errors: partially correct, but with significant errors
  - Wrong: completely incorrect
- These categories are preferred to rmsd, because a small rmsd may mask errors
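
For reference, the rmsd that the subjective categories are compared against can be computed as in the sketch below. It assumes that the predicted and experimental ligand atoms are listed in the same order and are already in the same coordinate frame (no superposition step); the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) atom positions, without any prior superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom ligand: one atom is in place, one is displaced by 5 Å.
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (4.5, 4.0, 0.0)]
value = rmsd(experimental, predicted)
```

The example hints at why a single number can mislead: averaging over all atoms dilutes a large displacement of one group.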

Validation of GOLD: Classification
[Figure: two example dockings. Left: good. Right: errors.]

Validation of GOLD: Results
- Prediction: 71/100 in the categories good and close
- Complexes predicted after 2, 5 and 10 runs:

  GA runs   Correctly predicted
  2         49/71
  5         63/71
  10        …/71

Validation of GOLD: Results
[Figure: ligand composition]

Validation of GOLD: Results
[Figure: problems in resolution]

Validation of GOLD: Results
Summary:
- 71% prediction accuracy
- In general, GOLD does not require 20 runs
- Fails for ligands with many heavy atoms or torsions, due to complexity
- Fails for ligands with few hydrogen bonds, due to the fitness score
- Prediction rate of 77% for structures with resolution ≤ 2.5 Å

Conclusions

Genetic algorithms in general:
- Random initialization (non-deterministic)
- Convergence towards the global minimum
- Solutions are suboptimal
- A local minimizer is needed
GOLD:
- Bit-vector mutation leads to solutions far from the original individual
- Problems docking large, flexible, hydrophobic ligands

Thank you for your attention!

Discussion

Validation of GOLD: Results
Ligand composition (good+close / errors+wrong):

        Heavy atoms   Torsions & free corners   % H-bonding
  Max   52/55         28/40                     66.7/53.9
  Avg   20.4/…        …/…                       …/25.1
  Min   6/9           0/0                       8.8/4.8

Development: The Fitness Function - H_Bond_Energy: distance_wt

$$\text{distance\_wt} = \begin{cases} 1, & d \le 0.25\,\text{Å} \\ \dfrac{d_{max} - d}{d_{max} - 0.25\,\text{Å}}, & d \in [0.25\,\text{Å}, d_{max}] \\ 0, & d \ge d_{max} \end{cases}$$

- $d_{max}$ varies linearly from 4.0 Å (when the GA starts) to 1.5 Å (after genetic operations)
- This allows long-range interactions in the beginning but only close contacts in the end
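
A minimal sketch of the distance weight and of the shrinking cutoff follows. The number of genetic operations over which d_max is reduced is not given in the transcript, so `n_anneal` is a hypothetical parameter, as are the function names.

```python
def d_max_at(op, n_anneal, start=4.0, end=1.5):
    """Cutoff in Å after `op` genetic operations: linear from `start` to
    `end` over `n_anneal` operations (hypothetical), then constant."""
    frac = min(max(op / n_anneal, 0.0), 1.0)
    return start + (end - start) * frac

def distance_wt(d, d_max):
    """Piecewise-linear weight: 1 up to 0.25 Å, 0 from d_max onwards."""
    if d <= 0.25:
        return 1.0
    if d >= d_max:
        return 0.0
    return (d_max - d) / (d_max - 0.25)

# The same 2.0 Å separation counts early in the run but not at the end.
early = distance_wt(2.0, d_max_at(0, n_anneal=1000))
late = distance_wt(2.0, d_max_at(1000, n_anneal=1000))
```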

Development: The Fitness Function - H_Bond_Energy: angle_wt
- Acceptors without lone-pair directional preference: angle_wt = 1
- Acceptors with directionality in the plane of the lone pairs:

$$\text{angle\_wt} = \begin{cases} 1, & \theta < 20^\circ \\ \left[\dfrac{60^\circ - \theta}{60^\circ - 20^\circ}\right]^2, & \theta \in [20^\circ, 60^\circ] \\ 0, & \theta > 60^\circ \end{cases}$$

- Acceptors with directionality along the lone pairs:

$$\text{angle\_wt} = \begin{cases} 1, & \theta > 160^\circ \\ \left[\dfrac{\theta - 60^\circ}{160^\circ - 60^\circ}\right]^2, & \theta \in [60^\circ, 160^\circ] \\ 0, & \theta < 60^\circ \end{cases}$$
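
The two directional weights can be sketched as follows. The function names are hypothetical, and the middle branch of the second weight is written so that it joins the two constant branches continuously (0 at 60°, 1 at 160°); treat that orientation as an assumption.

```python
def angle_wt_in_plane(theta):
    """Weight for acceptors with directionality in the lone-pair plane.

    theta in degrees: 1 below 20, 0 above 60, quadratic fall-off between.
    """
    if theta < 20.0:
        return 1.0
    if theta > 60.0:
        return 0.0
    return ((60.0 - theta) / (60.0 - 20.0)) ** 2

def angle_wt_along_lone_pair(theta):
    """Weight for acceptors with directionality along the lone pairs.

    theta in degrees: 1 above 160, 0 below 60, quadratic rise between
    (assumed orientation, chosen for continuity at both ends).
    """
    if theta > 160.0:
        return 1.0
    if theta < 60.0:
        return 0.0
    return ((theta - 60.0) / (160.0 - 60.0)) ** 2

def angle_wt_no_preference(theta):
    """Acceptors without lone-pair directional preference."""
    return 1.0
```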

Development: The Fitness Function - Complex_Energy

$$\text{Complex\_Energy} = \sum_{\text{atoms } i} \sum_{\text{atoms } j} E_{ij}, \qquad E_{ij} = \frac{A}{d_{ij}^{8}} - \frac{B}{d_{ij}^{4}} \quad \text{(8-4 potential)}$$

- Smoother than the standard Lennard-Jones 12-6 potential
- A and B are chosen to reproduce the minimum of the 12-6 potential
- Adjustments for hydrogen bonds:
  - $E_{ij} = 0$ for the interaction of donor-H and acceptor
  - The distance between donor and acceptor is scaled by 1.43, which reduces the vdW radii to 70%
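
To see what "reproducing the minimum" implies, the sketch below derives both coefficient pairs from a well depth k and a minimum position r0. Setting the derivative of A/d^8 - B/d^4 to zero at r0 and requiring an energy of -k there gives A = k r0^8 and B = 2 k r0^4; the 12-6 analogue is C = k r0^12 and D = 2 k r0^6. Reading "the minimum" as same depth and same position is an assumption, and the numbers are made up.

```python
def coefficients_8_4(k, r0):
    """A, B such that A/d**8 - B/d**4 has its minimum -k at d = r0."""
    return k * r0 ** 8, 2.0 * k * r0 ** 4

def coefficients_12_6(k, r0):
    """C, D such that C/d**12 - D/d**6 has its minimum -k at d = r0."""
    return k * r0 ** 12, 2.0 * k * r0 ** 6

def pair_energy(d, repulsive, attractive, m, n):
    """Generic m-n potential: repulsive / d**m - attractive / d**n."""
    return repulsive / d ** m - attractive / d ** n

k, r0 = 1.0, 2.0  # hypothetical well depth and minimum distance
a, b = coefficients_8_4(k, r0)
c, d_coef = coefficients_12_6(k, r0)

def e84(d):
    return pair_energy(d, a, b, 8, 4)

def e126(d):
    return pair_energy(d, c, d_coef, 12, 6)

# Both potentials share the minimum, but the 8-4 form is softer at short range.
clash_8_4, clash_12_6 = e84(1.5), e126(1.5)
```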

Development: The Fitness Function - Complex_Energy
- Let $-k_{ij}$ be the minimum energy of interaction between two atoms i and j
- If $E_{ij} > \text{scale} \times k_{ij}$, then $E_{ij}$ is set to $1.5 \times \text{scale} \times k_{ij}$
- scale varies logarithmically from 1.0 (when the GA starts) to … (after genetic operations)
- This encourages the formation of close contacts early in a GA run, while avoiding steric clashes in the end

Development: The Fitness Function - Internal_Energy
- Steric energy (for each pair of atoms i, j):
  $E_{ij} = \dfrac{C}{d_{ij}^{12}} - \dfrac{D}{d_{ij}^{6}}$, with C and D chosen such that $E_{ij}$ is minimal for $d_{ij} = r_i + r_j$
- Torsional energy (for four consecutively bonded atoms i, j, k, l):
  $E_{ijkl} = \tfrac{1}{2} V_{ijkl} \left[ 1 + \dfrac{\eta_{ijkl}}{|\eta_{ijkl}|} \cos\left(|\eta_{ijkl}| \, \omega_{ijkl}\right) \right]$
  with
  - $\omega$: torsional angle
  - $\eta$: periodicity (predefined)
  - $V$: barrier to rotation (predefined)
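
The torsional term is compact but the role of the sign of η is easy to miss, so here is a small sketch. The function name, the use of degrees and the parameter values are assumptions; the formula is the one on the slide.

```python
import math

def torsional_energy(v, eta, omega_deg):
    """E = 1/2 * V * [1 + sign(eta) * cos(|eta| * omega)].

    v: barrier to rotation, eta: signed non-zero periodicity,
    omega_deg: torsional angle in degrees.
    """
    sign = math.copysign(1.0, eta)
    omega = math.radians(omega_deg)
    return 0.5 * v * (1.0 + sign * math.cos(abs(eta) * omega))

# Hypothetical threefold torsion with barrier 2.0 (arbitrary units):
eclipsed = torsional_energy(2.0, 3, 0.0)    # maximum, equals the barrier
staggered = torsional_energy(2.0, 3, 60.0)  # minimum, close to zero
# A negative periodicity flips the phase: the minimum is now at 0 degrees.
flipped = torsional_energy(2.0, -2, 0.0)
```

The energy always lies between 0 and V, and the sign of η only decides whether ω = 0 is a maximum or a minimum.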