QSAR Study of HIV Protease Inhibitors Using Neural Network and Genetic Algorithm


Akmal Aulia (1), Sunil Kumar (2), Rajni Garg* (3), A. Srinivas Reddy (4)
(1) Computational Science Research Center, San Diego State University, CA; (2) ECE Dept., San Diego State University, San Diego, CA; (3) Chem. Dept., California State University, San Marcos, CA; (4) Molecular Modeling Group, IICT, Hyderabad, India.

Introduction

Linear and non-linear regression techniques were employed to analyze a large dataset of 334 HIV protease inhibitors (Kempf et al.). The dataset was studied using MLR (Multiple Linear Regression) and ANN (Artificial Neural Network) techniques to develop QSAR (Quantitative Structure-Activity Relationship) models. Each ligand (inhibitor or drug molecule) was described by physico-chemical and structural descriptors (features) that encode constitutional, electrostatic, geometrical, quantum and topological properties. The capability of the descriptors to capture variations among the ligands was linked to the predictive power of the QSAR models. From a chem-informatics point of view, the combined information from these models helps in 'transforming data into information and information into knowledge'.

Materials and Methods

Reported dataset (Kempf et al.) with experimental biological activity (EC50 and IC50).
A low-energy conformation was obtained for each compound by molecular mechanics minimization.
A total of 277 descriptors were calculated.
Both MLR and FNN methods were implemented in WEKA.

Descriptor Thinning

[Figure panels: Total Descriptors; IC50 set: Final Descriptors; EC50 set: Final Descriptors]
Objective descriptors (Matlab): IC50 dataset reduced from 277 to 148; EC50 dataset reduced from 277 to 157.
Subjective descriptors (WEKA/GA): IC50 dataset reduced from 148 to 9; EC50 dataset reduced from 157 to 7.
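The two-stage descriptor thinning can be illustrated with a minimal sketch. The poster does not state the criteria used in Matlab or the WEKA evaluator paired with the genetic algorithm, so everything below is an assumption: the objective stage drops constant descriptors and descriptors highly correlated with one already kept (hypothetical threshold `corr_max`), and the subjective stage runs a small genetic algorithm over descriptor subsets scored by the correlation-based feature selection (CFS) merit. The function names, GA settings, and synthetic data are all hypothetical and are not the authors' code or data.

```python
import random
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def objective_filter(columns, corr_max=0.95):
    """Stage 1 (assumed criteria): uses descriptors only, never the activity."""
    kept = []
    for j, col in enumerate(columns):
        if max(col) == min(col):
            continue  # constant descriptor carries no information
        if all(abs(pearson(col, columns[k])) < corr_max for k in kept):
            kept.append(j)  # not redundant with any descriptor kept so far
    return kept

def cfs_merit(subset, r_cf, r_ff):
    """CFS merit: k*mean(r_cf) / sqrt(k + k*(k-1)*mean(r_ff))."""
    k = len(subset)
    if k == 0:
        return 0.0
    mean_cf = sum(r_cf[j] for j in subset) / k
    if k == 1:
        return mean_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    mean_ff = sum(r_ff[a][b] for a, b in pairs) / len(pairs)
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)

def ga_select(columns, activity, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Stage 2 (assumed setup): GA over bit masks, fitness = CFS merit."""
    rng = random.Random(seed)
    m = len(columns)  # one-point crossover below needs m >= 2
    r_cf = [abs(pearson(c, activity)) for c in columns]
    r_ff = [[abs(pearson(a, b)) for b in columns] for a in columns]

    def fitness(mask):
        return cfs_merit([j for j in range(m) if mask[j]], r_cf, r_ff)

    pop = [[rng.random() < 0.5 for _ in range(m)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best masks over unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, m)                  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return [j for j in range(m) if best[j]]

def make_demo(n=40, seed=1):
    """Synthetic data: activity depends on descriptors 0 and 1 only."""
    rng = random.Random(seed)
    x0 = [rng.gauss(0, 1) for _ in range(n)]
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(4)]
    activity = [a + b + 0.1 * rng.gauss(0, 1) for a, b in zip(x0, x1)]
    columns = [x0, x1, [2 * v for v in x0], [1.0] * n] + noise
    return columns, activity

columns, activity = make_demo()
kept = objective_filter(columns)  # drops the duplicate and the constant
picked = ga_select([columns[j] for j in kept], activity)
selected = [kept[i] for i in picked]  # map back to original column indices
print("after objective stage:", kept)
print("after GA stage:", selected)
```

Scoring subsets from correlations alone keeps the sketch dependency-free. A wrapper-style fitness, such as the cross-validated error of an MLR model built on each subset, would be a reasonable alternative at a higher computational cost.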
Results

[Figure panels: IC50 dataset; EC50 dataset]
For the IC50 dataset, the constitutional and topological properties make the largest contribution, while for the EC50 dataset, electrostatic and topological properties are significant.

Summary & Future Work

Non-linear models have better predictive capability. However, the linear models are easier to interpret mechanistically.
The presence of similar descriptors in both types of models validates our results.
Further studies using other statistical and ANN-based regression techniques are in progress, in order to find the best QSAR models and descriptors.
These models will serve as useful computational tools for predicting the biological activity of this class of HIV protease inhibitors.

[Figure: Research Design]

References

(1) Fernandez et al.; "Quantitative structure-activity relationship to predict differential inhibition of aldose reductase by flavonoid compounds", Bioorganic and Medicinal Chemistry, 2005, 13.
(2) (a) CODESSA software, Semichem Inc., USA; (b) MATLAB, The MathWorks Inc.; (c) WEKA software, the University of Waikato, New Zealand.
(3) Fernandez, M. and Caballero, J.; "Linear and nonlinear modeling of antifungal activity of some heterocyclic ring derivatives using multiple linear regression and Bayesian-regularized neural networks", J. Mol. Model., 2006, 12.
(4) Goldberg, D. E.; Genetic Algorithms in Search Optimization & Machine Learning; Addison-Wesley: Reading, MA.
(5) "Data Mining: Practical Machine Learning Tools and Techniques", 2nd Edition, Morgan Kaufmann, San Francisco.