– A Web Portal for Massively Parallel Flexible Docking, using the ChemAxon toolkit. Dragos Horvath*, Kun Attila, Benjamin Parent*, Cyrielle.

Presentation transcript:

– A Web Portal for Massively Parallel Flexible Docking, using the ChemAxon toolkit.

Dragos Horvath*, Kun Attila, Benjamin Parent*, Cyrielle Boutroue #, Even Gaël #, Alexandru Tantar #, Nouredine Melab #, Sylvaine Roy & El-Ghazali Talbi #

* UMR 8576, CNRS – Univ. Lille 1, FR
Chemistry Dept, Univ. Babes-Bolyai, Cluj, RO
# LIFL CNRS/INRIA – Univ. Lille 1, FR
DSV/iRTSV - CEA, Grenoble, FR

Outline…
The goal: automated fully flexible docking on computer grids
– GRID5000,
– Specific conformational sampling & docking software based on hybrid genetic algorithms
– Upfront chemoinformatics tools to preprocess submitted ligands.
– Upfront tools to define the active site and its key degrees of freedom (!)
– Interface to start docking calculations & analyze results.

Genetic Algorithm-driven Conformational Sampling Tool
Based on a Genetic Algorithm, coding conformers as "chromosomes" in which each locus stands for a torsional angle value.
[Figure: chromosome diagram, loci numbered up to n]
The In Silico Darwinian Evolution, leading to fitter and fitter (lower energy) conformers, was enhanced by
– hybridization with various optimization heuristics
– fine-tuning of the parameters controlling the evolutionary strategy
Customized CVFF force field, employing:
– a 10 Å cutoff (with a termination function)
– a smoothing procedure to avoid interatomic clashes
– a continuum solvent model
[Figure: plot with axes labeled "Effective interatomic distance d0_ij" and "Smoothing distance d_ij"]
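The chromosome encoding described on this slide can be illustrated with a minimal sketch. It assumes a toy threefold torsional potential in place of the customized CVFF force field, plain one-point crossover with random mutation, and elitist selection; none of the hybrid heuristics of the actual software are reproduced. The class name `TorsionGaSketch`, its methods and all parameter values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Illustrative sketch only: not the authors' sampling engine.
class TorsionGaSketch {

    // Toy stand-in for a force-field energy (assumption): a threefold
    // torsional term per locus, with minima at +/-60 and 180 degrees.
    static double energy(double[] torsions) {
        double e = 0.0;
        for (double t : torsions) {
            e += 1.0 + Math.cos(3.0 * Math.toRadians(t));
        }
        return e;
    }

    // Lower energy means fitter, so sort ascending by energy.
    static final Comparator<double[]> BY_ENERGY =
            (x, y) -> Double.compare(energy(x), energy(y));

    // A chromosome: one torsional angle (degrees, in [-180, 180)) per locus.
    static double[] randomChromosome(int nLoci, Random rng) {
        double[] c = new double[nLoci];
        for (int i = 0; i < nLoci; i++) {
            c[i] = rng.nextDouble() * 360.0 - 180.0;
        }
        return c;
    }

    // One-point crossover of two parents, then per-locus random mutation.
    static double[] offspring(double[] a, double[] b, double mutationRate, Random rng) {
        int cut = rng.nextInt(a.length);
        double[] child = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            child[i] = (i < cut) ? a[i] : b[i];
            if (rng.nextDouble() < mutationRate) {
                child[i] = rng.nextDouble() * 360.0 - 180.0;
            }
        }
        return child;
    }

    // Evolves a population and returns its lowest-energy chromosome.
    // The fitter half survives unchanged (elitism); the rest is replaced
    // by offspring of randomly chosen survivors. Requires popSize >= 2.
    static double[] evolve(int nLoci, int popSize, int generations, long seed) {
        Random rng = new Random(seed);
        double[][] pop = new double[popSize][];
        for (int i = 0; i < popSize; i++) {
            pop[i] = randomChromosome(nLoci, rng);
        }
        int keep = popSize / 2;
        for (int g = 0; g < generations; g++) {
            Arrays.sort(pop, BY_ENERGY);
            for (int i = keep; i < popSize; i++) {
                double[] p1 = pop[rng.nextInt(keep)];
                double[] p2 = pop[rng.nextInt(keep)];
                pop[i] = offspring(p1, p2, 0.1, rng);
            }
        }
        Arrays.sort(pop, BY_ENERGY);
        return pop[0];
    }

    public static void main(String[] args) {
        double[] best = evolve(6, 20, 50, 42L);
        System.out.println("best energy: " + energy(best));
        System.out.println("torsions:    " + Arrays.toString(best));
    }
}
```

Because the fitter half of the population is carried over unchanged, the best energy can never get worse from one generation to the next; a real engine would replace `energy` with the force-field evaluation of the 3D geometry built from the torsions.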

GRID 5000-based Planetary Model
[Flowchart; boxes as extracted]
– If (free node): DEPLOY Island Model
– Deployed to each island: Executables, Molecule File, Constraint Files, Seeds List, Taboo List, Operational Pars
– Returned by each island: Stablest Chromosomes, Sampling Success Score
– Solution Merger & Clusterer
– Conformer & Cluster Database
– Panspermia policy center: recent clusters become seeds, old clusters become taboo
– Operational Pars Selector: Sampling Success vs. Operational Pars
– Stop: max. Mission Nr., or no new clusters since N missions
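The seed/taboo policy and the stop criteria of the flowchart can be sketched as follows. The slide does not say how "recent" and "old" clusters are told apart, so the sketch assumes a simple window counted in missions; the class `PlanetaryPolicySketch`, its methods and the `window` parameter are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names and the recency rule are assumptions.
class PlanetaryPolicySketch {

    // A conformer cluster, tagged with the mission at which it was first found.
    static final class Cluster {
        final int id;
        final int foundAtMission;

        Cluster(int id, int foundAtMission) {
            this.id = id;
            this.foundAtMission = foundAtMission;
        }
    }

    // Assumed rule: a cluster is "recent" if found within the last `window` missions.
    static boolean isRecent(Cluster c, int currentMission, int window) {
        return currentMission - c.foundAtMission <= window;
    }

    // Recent clusters are handed to new islands as seeds.
    static List<Integer> seedIds(List<Cluster> db, int currentMission, int window) {
        List<Integer> ids = new ArrayList<>();
        for (Cluster c : db) {
            if (isRecent(c, currentMission, window)) ids.add(c.id);
        }
        return ids;
    }

    // Old clusters are declared taboo, so that islands avoid revisiting them.
    static List<Integer> tabooIds(List<Cluster> db, int currentMission, int window) {
        List<Integer> ids = new ArrayList<>();
        for (Cluster c : db) {
            if (!isRecent(c, currentMission, window)) ids.add(c.id);
        }
        return ids;
    }

    // Stop when the mission budget is spent, or when no new cluster
    // has appeared during the last n missions.
    static boolean shouldStop(int mission, int maxMissions,
                              int lastMissionWithNewCluster, int n) {
        return mission >= maxMissions || mission - lastMissionWithNewCluster >= n;
    }

    public static void main(String[] args) {
        List<Cluster> db = new ArrayList<>();
        db.add(new Cluster(1, 1));
        db.add(new Cluster(2, 5));
        db.add(new Cluster(3, 9));
        System.out.println("seeds: " + seedIds(db, 10, 3));
        System.out.println("taboo: " + tabooIds(db, 10, 3));
        System.out.println("stop?  " + shouldStop(10, 100, 9, 5));
    }
}
```

Under such a rule every cluster eventually becomes taboo once it ages out of the window, which is exactly the mechanism behind the risk raised later in the talk: a nearly correct fold, once taboo, can block access to the correct one.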

Ab initio folding of Trp cage 1L2Y: native structure (reproducibly) found and ranked as most stable. Planetary model used max. 20 nodes for 4…5 days.
Conformer # 1, RMS~1.8 Å - good match to native structure

Ab initio folding of Trp zipper 1LE1: native structure found and ranked as most stable. Planetary model used max. 20 nodes for 4…5 days.
Conformer # 1, RMS~0.8 Å - perfect match to native structure

However, there is a high risk that almost well folded solutions, being declared taboo, block the access to the correct fold !!
Conformer # 79, RMS~2.4 Å - near-optimal fold closest to native structure
Conformer # 1, RMS~3.8 Å - is a poor match of the native structure

Ligand Preprocessor…
[Flowchart; boxes as extracted]
– Ligand File Upload
– Standardize
– User Toggle: Main Tautomer & Major µSpecies / Main Tautomer & Key µSpecies (occurrence > m%) / All Tautomers & Major µSpecies / All Tautomers & Key µSpecies
– Add Explicit H
– Force Field Typing (PMapper)
– Partial Charge Calculation
– Generate Conformer(s)
– If new… JChem DataBase: Canonical SMILES, Dockable Conformer Families

A selector of the top N most likely tautomeric forms would be of outstanding help here: many among the enumerated tautomers are chemically meaningless!

Potential problems with resonant structures in the ChargePlugin:

    try {
        ChgPlug.setTakeResonantStructure(true);
        chgMol=ChgPlug.setMolecule(currSpec,false,false);
        ChgPlug.run();
        …
    } catch (Exception ResonantStructureFailed) {
        try {
            ChgPlug.setTakeResonantStructure(false);
            …
        } catch (Exception WhateverYouDoItBreaks) {
            …
        }
    }

Using PMapper to assign CVFF types to ligand atoms required SMARTS encoding of the CVFF templates corresponding to the local neighborhoods that define each potential type.

Issues yet to be settled:
– Use the Conformer Plugin to generate several hundreds of geometries?
  Conformer diversity control?
  How many degrees of freedom can be handled without significant risk of missing key minima?
  Docking will use a different force field: how compatible are ConformerPlugin & CVFF energies?
– Or use the Conformer Plugin to generate a starting geometry, then use a ligand-specific GA-driven sampling engine to explore the phase space?
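The "Major" versus "Key" µSpecies choice offered by the user toggle can be made concrete with a small sketch. It assumes the occurrence percentages of each protonation state have already been computed elsewhere and are passed in as a plain map; no ChemAxon call is made, and the class `MicrospeciesFilterSketch`, its methods and the example labels and percentages are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: occurrence values are assumed to be precomputed.
class MicrospeciesFilterSketch {

    // "Key" microspecies: every form whose occurrence exceeds m (in percent).
    static List<String> keySpecies(Map<String, Double> occurrencePercent, double m) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> e : occurrencePercent.entrySet()) {
            if (e.getValue() > m) kept.add(e.getKey());
        }
        return kept;
    }

    // "Major" microspecies: the single most populated form (null for an empty map).
    static String majorSpecies(Map<String, Double> occurrencePercent) {
        String best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : occurrencePercent.entrySet()) {
            if (e.getValue() > bestValue) {
                bestValue = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up percentages for a hypothetical ligand.
        Map<String, Double> occ = new LinkedHashMap<>();
        occ.put("neutral", 62.0);
        occ.put("anion", 35.5);
        occ.put("zwitterion", 2.5);
        System.out.println("major: " + majorSpecies(occ));
        System.out.println("key (m = 5%): " + keySpecies(occ, 5.0));
    }
}
```

The threshold m trades completeness against cost: each retained form multiplies the number of conformer families that must be generated, stored and docked.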

Active Site Definition…
[Figure labels]
– Ligand
– Fixed protein residues
– Fixed backbone, mobile sidechains
– Flexible loop: backbone ( but not ) & sidechains
– This part of the backbone is a « frozen » part of the flexible loop: rigid body rototranslations
– Formally « break » bond to unlock degrees of freedom in loop

Protein Preprocessing Tools…
At this point, the user has to explicitly provide:
– A BioSym .car protein file, with correct protonation states, partial charges and force field types for all protein atoms
– A list of fixed atoms
– A list of explicitly broken bonds, to enable sampling of ring and fixed-end loop geometries
– A list of active torsional degrees of freedom (otherwise, all potentially rotatable exocyclic single bonds will be considered)
Will MarvinSpace evolve so as to allow graphical input of the above-mentioned information?
Would the Charge Plugin, the MicroSpecies Plugin and PMapper work upon input of a .pdb file?
JChem Database of defined active sites and their sampled unbound state geometries…

The Dock Manager
In an ideal world, an academic user may add their own molecule collections to the database, but should be allowed to try docking other people's molecules as well…
– Paranoia Manager: who's allowed to dock my compounds and use my active sites?
– Make use of JChem facilities to search the ligand database by canonical structures, and return all the conformers of the associated µSpecies/Tautomers. Chemoinformatic filters welcome, even based on the Holy Rule of Five!
Methodological progress on the docking algorithms is still required:
– Is rigid docking of each of ~10^2 ligand conformers into each one of the ~10^4 active site geometries feasible? Would it be comparable to flexible docking?
– How to score: free energy based on docked vs. unbound ensembles? What about µSpecies & Tautomer penalties?
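As an example of the kind of chemoinformatic filter mentioned above, the sketch below counts violations of Lipinski's Rule of Five (molecular weight above 500 Da, logP above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). It assumes the four descriptors have already been computed by some other tool; the class `RuleOfFiveSketch`, its methods and the `maxViolations` tolerance are hypothetical, and the descriptor values in `main` are made up.

```java
// Illustrative sketch only: descriptors are assumed to be precomputed.
class RuleOfFiveSketch {

    // Counts how many of the four Rule of Five criteria are violated.
    static int violations(double molWeight, double logP, int hDonors, int hAcceptors) {
        int v = 0;
        if (molWeight > 500.0) v++;
        if (logP > 5.0) v++;
        if (hDonors > 5) v++;
        if (hAcceptors > 10) v++;
        return v;
    }

    // Accepts a compound if it violates at most maxViolations criteria.
    // Tolerating one violation is common practice; 0 gives the strict filter.
    static boolean passes(double molWeight, double logP, int hDonors,
                          int hAcceptors, int maxViolations) {
        return violations(molWeight, logP, hDonors, hAcceptors) <= maxViolations;
    }

    public static void main(String[] args) {
        // Made-up descriptor values for two hypothetical compounds.
        System.out.println(violations(350.0, 2.1, 2, 5));
        System.out.println(passes(720.5, 6.3, 6, 12, 1));
    }
}
```

In a portal setting such a filter would run before conformer generation, so that compounds a user is not interested in never consume grid time.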

Docked Conformer Visualization

Conclusions & Perspectives
This is a long-term ANR-funded public research project:
The primary goal is developing efficient GRID-based conformational sampling & docking methodologies – to provide the core routines for parallel evolutionary computing
http://paradiseo.gforge.inria.fr/
However, chemically meaningful ligand and active site management is as important as the docking step!
– ChemAxon tools for ligand standardizing, protonation, charge & force field management, 3D-buildup, storage & retrieval, visualizing,…, are perfectly suited!
– Progress needed on macromolecule & active site management.
THANKS