University of Wisconsin ISMB 2002 Department of Biostatistics Department of Computer Science Mining Three-dimensional Chemical Structure Data Sean McIlwain.

Presentation transcript:

University of Wisconsin, Department of Biostatistics and Department of Computer Science. ISMB 2002.
Mining Three-dimensional Chemical Structure Data
Sean McIlwain & David Page, University of Wisconsin
Arno Spatola & David Vogel, University of Louisville
Sylvie Blondelle, Torrey Pines Research Institute for Molecular Studies

Advantages of ILP for Pharmacophore Discovery
Works with 3-dimensional databases without loss of information.
Multi-relational.
Searches the hypothesis space more comprehensively than the greedy or hill-climbing search typical of recursive partitioning (RP) or artificial neural networks (ANNs).
The search space can be pruned using appropriate scoring functions.

Methodology
Pick active/inactive molecules for the system under study.
Generate 3-dimensional structures via a conformational search (CHARMM).
Convert the 3-dimensional results into Datalog format.
Extract point groups from the Datalog data.
Perform an ILP search of the point groups for a pharmacophore clause.
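The conversion step can be pictured with a minimal Python sketch. It assumes each conformer has already been reduced to a list of labelled points with Cartesian coordinates in Angstroms. The function name `conformer_to_datalog`, the identifiers `m1`, `p1`, `p2`, and the exact fact layout are illustrative assumptions modelled on the predicates used in the slides, not the authors' actual converter.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def conformer_to_datalog(mol_id, points):
    """Emit Datalog-style facts for one conformer (hypothetical layout).

    points: list of (point_id, feature_type, (x, y, z)) tuples,
            with coordinates assumed to be in Angstroms.
    """
    facts = [f"molecule({mol_id})."]
    # One fact per typed point, e.g. positive(m1,p1).
    for point_id, feature_type, _ in points:
        facts.append(f"{feature_type}({mol_id},{point_id}).")
    # One distance fact per unordered pair of points.
    for (id1, _, xyz1), (id2, _, xyz2) in combinations(points, 2):
        facts.append(f"distance({mol_id},{id1},{id2},{dist(xyz1, xyz2):.2f}).")
    return facts


example_points = [
    ("p1", "positive", (0.0, 0.0, 0.0)),
    ("p2", "hydrophobic", (3.0, 4.0, 0.0)),
]
for fact in conformer_to_datalog("m1", example_points):
    print(fact)
```

Precomputing pairwise distances as facts keeps the later search purely relational: a clause can constrain geometry without touching raw coordinates.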

Overview of Aleph ILP Search (Top-down)
Saturates on the 1st uncovered positive example.
Performs a top-down admissible search of the subsumption lattice above this example.
Uses a compression function to limit the size of clauses and the number of negative examples a clause may cover.
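As a rough illustration of compression-based scoring, the Python sketch below assumes a score of the form P - N - L + 1, where P and N are the positive and negative examples a clause covers and L is its number of literals. The `Candidate` type, the `best_clause` helper, and the `max_literals` and `max_neg` limits are hypothetical names introduced here. This is not Aleph's code, only a picture of how such a score trades coverage against clause size.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    clause: str        # textual form of the clause (illustrative only)
    pos_covered: int   # P: positive examples covered
    neg_covered: int   # N: negative examples covered
    n_literals: int    # L: literals in the clause


def compression(c: Candidate) -> int:
    """Assumed compression score: P - N - L + 1."""
    return c.pos_covered - c.neg_covered - c.n_literals + 1


def best_clause(candidates: Sequence[Candidate],
                max_literals: int,
                max_neg: int) -> Optional[Candidate]:
    """Pick the highest-scoring clause that respects both limits."""
    admissible = [
        c for c in candidates
        if c.n_literals <= max_literals and c.neg_covered <= max_neg
    ]
    return max(admissible, key=compression, default=None)
```

Longer clauses must cover more positives to justify their extra literals, so the score discourages overly specific clauses.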

Lattice of Clauses for the Given Hypothesis Language

Pharmacophores Found in Pseudomonas Growth Inhibition
Active(A) ← molecule(A), positive(A,B), positive(A,C), hydrophobic(A,D), hydrogen-donor(A,E),
    distance(A,B,C,4.05), distance(A,B,D,7.45), distance(A,B,E,6.53),
    distance(A,C,D,6.93), distance(A,C,E,5.90), distance(A,D,E,7.88).
Cross-validation accuracy is 100%.
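To make the clause concrete, the Python sketch below tests whether a single conformer satisfies it. The point types and the six target distances come from the clause above. The tolerance of 1.0 Å, the name `hydrogen_donor` (written with an underscore so it is a plain identifier), and the input layout are assumptions, since the slides do not say how closely a distance must match.

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+

# Point types required by the clause (variables B, C, D, E).
POINT_TYPES = {"B": "positive", "C": "positive",
               "D": "hydrophobic", "E": "hydrogen_donor"}

# Target distances in Angstroms, taken from the clause.
TARGET_DISTANCES = {
    ("B", "C"): 4.05, ("B", "D"): 7.45, ("B", "E"): 6.53,
    ("C", "D"): 6.93, ("C", "E"): 5.90, ("D", "E"): 7.88,
}


def matches_pharmacophore(points, tolerance=1.0):
    """Return True if some assignment of points to B, C, D, E fits.

    points: list of (feature_type, (x, y, z)) for one conformer.
    tolerance: allowed deviation per distance (assumed, not from the slides).
    """
    names = list(POINT_TYPES)
    # Try every ordered choice of four distinct points.
    for combo in permutations(points, len(names)):
        binding = dict(zip(names, combo))
        if any(binding[n][0] != POINT_TYPES[n] for n in names):
            continue
        if all(abs(dist(binding[a][1], binding[b][1]) - target) <= tolerance
               for (a, b), target in TARGET_DISTANCES.items()):
            return True
    return False
```

The brute-force enumeration is adequate for the handful of typed points in a small molecule. An ILP system evaluates the same condition by unification against stored distance facts rather than by recomputing geometry.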

Example 4-point Pharmacophore Overlaid With an Active Molecule
Green – hydrophobic
Blue – positive charge
Orange – hydrogen donor
Distances are in Angstroms (Å)

Example Pharmacophore
Green – hydrophobic
Blue – positive charge
Orange – hydrogen donor

Example Pharmacophore
Green – hydrophobic
Blue – positive charge
Orange – hydrogen donor

Conclusion
The cross-validation results and the pharmacophore found show that ILP is well suited to mining 3-dimensional chemical structure data.
Directly mines relational data without the use of feature vectors.
Interacts well with scientists.
The approach was also repeated successfully by Marchand-Geneste, N. (2002).

Future Work
Testing the proposed pharmacophore with new molecules.
Application of ILP to related data (drug discovery, proteomics, SNPs, etc.).

Bibliography
Brooks, B. (1983). CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comp. Chem., 4.
Finn, P., Muggleton, S., Page, D., and Srinivasan, A. (1998). Discovery of pharmacophores using Inductive Logic Programming. Machine Learning, 30.
Marchand-Geneste, N. (2002). A new approach to pharmacophore mapping and QSAR analysis using inductive logic programming. Application to thermolysin inhibitors and glycogen phosphorylase b inhibitors. J. Med. Chem., 45.
Ramakrishnan, R. (1999). Database Management Systems. McGraw-Hill Higher Education, Columbus, OH, 2nd edition.

Aleph
Maintained by Ashwin Srinivasan; publicly available at eas/machlearn/Aleph/aleph.html.
Runs using the YAP Prolog compiler, maintained by Vítor Santos Costa, obtainable at