Cheminformatics II Apr 2010 Postgrad course on Comp Chem Noel M. O’Boyle.

Slides:



Advertisements
Similar presentations
Dynamic web application for drug design research M. Chapman 1, N. MacCuish 1, J. MacCuish 1 J. Bradley 2, J. Blankley 3 1 Mesa Analytics & Computing, Inc.,
Advertisements

Scientific & technical presentation JChem Cartridge for Oracle
Scientific & technical presentation Calculator Plugins January 2011.
SOMA2 – Drug Design Environment. Drug design environment – SOMA2 The SOMA2 project Tekes (National Technology Agency of Finland) DRUG2000 program.
Analysis of High-Throughput Screening Data C371 Fall 2004.
Project JUNIOR: Drug Discovery using Azure Project JUNIOR: Drug Discovery using Azure April 6-7, 2010 Redmond, Washington.
3D Molecular Structures C371 Fall Morgan Algorithm (Leach & Gillet, p. 8)
Ionization and dissociation of drugs-1
Jürgen Sühnel Institute of Molecular Biotechnology, Jena Centre for Bioinformatics Jena / Germany Supplementary Material:
234th ACS National MeetingPAPER ID: Division of Chemical Information Herman Skolnik Award Symposium Bridging the gap between discovery data and.
Lipinski’s rule of five
6th lecture Modern Methods in Drug Discovery WS07/08 1 More QSAR Problems: Which descriptors to use How to test/validate QSAR equations (continued from.
Building bridges for chemical information Interoperability and the Blue Obelisk Noel M. O’Boyle, et a lot of al The Blue Obelisk is a group of people and.
A Study on Feature Selection for Toxicity Prediction*
Quantitative Structure-Activity Relationships (QSAR) Comparative Molecular Field Analysis (CoMFA) Gijs Schaftenaar.
Cloud Computing for Chemical Property Prediction Paul Watson School of Computing Science Newcastle University, UK Microsoft Cloud.
Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott.
Design of Small Molecule Drugs Targeted to RNA RNA Ontology Group May
In Silico Design of Selective Estrogen Receptor Modulators from Triazoles and Imines Joey Salisbury Dr. John C. Williams Small Molecules/Large Molecules.
Chemoinformatics in Drug Design
Doug Brutlag 2011 Genomics, Bioinformatics & Medicine Drug Development
Classification and Prediction: Regression Analysis
Application and Efficacy of Random Forest Method for QSAR Analysis
Chapter 9: Making new antibiotics. Model organisms are used to speed drug discovery A model organism is a non-human species that is extensively studied.
Large-scale computational design and selection of polymers for solar cells Dr Noel O’Boyle & Dr Geoffrey Hutchison ABCRF University College Cork Department.
Lecture 7: Computer aided drug design: Statistical approach. Lecture 7: Computer aided drug design: Statistical approach. Chen Yu Zong Department of Computational.
Extending that Line into the Future St. Louis CMG February 12, 2008 Wayne Bell – UniGroup, Inc.
Molecular Descriptors
Functional groups / Pharmacological Activity
Introduction to Chemoinformatics Irene Kouskoumvekaki Associate Professor December 12th, 2012 Biological Sequence Analysis course.
High Throughput Experimentation: Computational Requirements John M. Newsam Molecular Simulations Inc. (A Pharmacopeia subsidiary) “Workshop on Combinatorial.
Faculté de Chimie, ULP, Strasbourg, FRANCE
“Topological Index Calculator” A JavaScript application to introduce quantitative structure-property relationships (QSPR) in undergraduate organic chemistry.
ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams.
Développement "IN SILICO" de nouveaux extractants et complexants de métaux Alexandre Varnek Laboratoire d’Infochimie, Université Louis Pasteur, Strasbourg,
Use of Machine Learning in Chemoinformatics Irene Kouskoumvekaki Associate Professor December 12th, 2012 Biological Sequence Analysis course.
David Gallagher, Introduction for OpenTox, September David Gallagher Review of recent QSAR related experience for OpenTox August 2008.
3D- QSAR. QSAR A QSAR is a mathematical relationship between a biological activity of a molecular system and its physicochemical parameters. QSAR attempts.
Identifying Applicability Domains for Quantitative Structure Property Relationships Mordechai Shacham a, Neima Brauner b Georgi St. Cholakov c and Roumiana.
Ligand-based drug discovery No a priori knowledge of the receptor What information can we get from a few active compounds.
QSAR Study of HIV Protease Inhibitors Using Neural Network and Genetic Algorithm Akmal Aulia, 1 Sunil Kumar, 2 Rajni Garg, * 3 A. Srinivas Reddy, 4 1 Computational.
Virtual Screening C371 Fall INTRODUCTION Virtual screening – Computational or in silico analog of biological screening –Score, rank, and/or filter.
Selecting Diverse Sets of Compounds C371 Fall 2004.
McKim Conference on Predictive Toxicology
Log Koc = MW nNO – 0.19 nHA CIC MAXDP Ts s = 0.35 F 6, 134 = MW: molecular weight nNO: number of NO bonds.
Computer-aided drug discovery (CADD)/design methods have played a major role in the development of therapeutically important small molecules for several.
Catalyst TM What is Catalyst TM ? Structural databases Designing structural databases Generating conformational models Building multi-conformer databases.
Introduction to Chemoinformatics and Drug Discovery Irene Kouskoumvekaki Associate Professor February 15 th, 2013.
A molecular descriptor database for homologous series of hydrocarbons ( n - alkanes, 1-alkenes and n-alkylbenzenes) and oxygen containing organic compounds.
Use of Machine Learning in Chemoinformatics
Part 2. Physicochemical Properties 1.Rules ( 양혜란 ) 2.Liphophilicity ( 백아름 ) 3.pKa ( 박숙진 ) 4.Solubility ( 전종수, 최영재 ) 5.Permeability ( 김소연, 강경태 )
Identification of structurally diverse Growth Hormone Secretagogue (GHS) agonists by virtual screening and structure-activity relationship analysis of.
1 Prediction of Phase Equilibrium Related Properties by Correlations Based on Similarity of Molecular Structures N. Brauner a, M. Shacham b, R.P. Stateva.
Computational Approach for Combinatorial Library Design Journal club-1 Sushil Kumar Singh IBAB, Bangalore.
Julia Salas CS379a Aim of the Study To determine distinguishing features of orally administered drugs –Physical and structural features probed.
SMA5422: Special Topics in Biotechnology Lecture 11: Computer aided drug design: QSAR approach. SMA5422: Special Topics in Biotechnology Lecture 11: Computer.
Natural products from plants
Page 1 Computer-aided Drug Design —Profacgen. Page 2 The most fundamental goal in the drug design process is to determine whether a given compound will.
Toxicity vs CHEMICAL space
Lipinski’s rule of five
APPLICATIONS OF BIOINFORMATICS IN DRUG DISCOVERY
COMP61011 : Machine Learning Ensemble Models
Linear regression project
ADME/Tox PredictionTox Prediction. The characterization of Absorption, Distribution, Metabolism, and Excretion (also known as ADME) and Toxicity are essential.
Building Hypotheses and Searching Databases
Virtual Screening.
Rules for Rapid Property Profiling from Structure
Current Status at BioChemtek
Topological Index Calculator III
Cheminformatics Basics
Presentation transcript:

Cheminformatics II Apr 2010 Postgrad course on Comp Chem Noel M. O’Boyle

Substructure search using SMARTS SMARTS – an extension of SMILES for substructure searching –(“regular expressions for substructures”) Simple example –Ether: [OD2]([#6])[#6] Any oxygen with exactly two bonds each to a carbon Can get more complicated –Carbonic Acid or Carbonic Acid-Ester: [CX3](=[OX1])([OX2])[OX2H,OX1H0-1] Hits acid and conjugate base. Won't hit carbonic acid diester Example use of SMARTS –Create a list of SMARTS terms that identify functional groups that cause toxicological problems. –When considering what compounds to synthesise next in a medicinal chemistry program, search for hits to these SMARTS terms to avoid synthesising compounds with potential toxicological problems –FAF-Drugs2: Lagorce et al, BMC Bioinf, 2008, 9, 396.

FAF-Drugs2: Free ADME/tox filtering tool to assist drug discovery and chemical biology projects, Lagorce et al, BMC Bioinf, 2008, 9, 396.

Calculation of Topological Polar Surface Area TPSA –Ertl, Rohde, Selzer, J. Med. Chem., 2000, 43, –A fragment-based method for calculating the polar surface area

Quantitative Stucture-Activity Relationships (QSAR) Also QSPR (Structure-Property) –Exactly the same idea but with some physical property Create a mathematical model that links a molecule’s structure to a particular property or biological activity –Could be used to perceive the link between structure and function/property –Could be used to propose changes to a structure to increase activity –Could be used to predict the activity/property for an unknown molecule Problem: Activity = 2.4 * Does not compute! Need to replace the actual structure by some values that are a proxy for the structure - “Molecular descriptors” Numerical values that represent in some way some physico-chemical properties of the molecule We saw one already, the Polar Surface Area Others: molecular weight, number of hydrogen bond donors, LogP (octanol/water partition coefficient) It is usual to calculate 100 or more of these

Building and testing a predictive QSAR model Need dataset with known values for the property of interest Divide into 2/3 training set and 1/3 test set Choose a regression model –Linear regression, artificial neural network, support vector machine, random forest, etc. Train the model to predict the property values for the training set based on their descriptors Apply the model to the test set –Find the RMSEP and R 2 Root-mean squared error of prediction and correlation coefficient Practical Notes: –Descriptors can be calculated with the CDK or RDKit –Models can be built using R (r-project.org) –For a combination of the two, see rcdk

Lipinski’s Rule of Fives Took dataset of drug candidates that made it to Phase II Examined the distribution of particular descriptor values related to AMDE An orally active drug should not fail more than one of the following ‘rules’: –Molecular weight <= 500 –Number of H-bond donors <= 5 –Number of H-bond acceptors <= 10 –LogP <= 5 These rules are often applied as an pre-screening filter Chris Lipinski Rule of Fives Oral bioavailability Image: Note: Rule of thumb

Cheminformatics resources Programming toolkits: Open Source –OpenBabel (C++, Perl, Python,.NET, Java), RDKit (C++, Python), Chemistry Development Kit [CDK] (Java, Jython,...), PerlMol (Perl), MayaChemTools (Perl) –Cinfony (by me!) presents a simplified interface to all of these See for links to an online interactive tutorial and a talk Command-line interface: –OpenBabel (“babel”) See for information on filtering molecules by property or SMARTS See for similarity searching, –MayaChemTools GUI: –OpenBabel Specialized toolkits: –OSRA: image to structure –OPSIN: name to structure –OSCAR: Identify chemical terms in text Building models: R ( rcdk