Classification Analysis of HIV RNase H Bioassay
Lianyi Han, Computational Biology Branch, NCBI/NLM/NIH
Rocky '07, December 1, 2007

Presentation transcript:


Introduction

• The need for new anti-HIV agents
    - Drug-resistant mutations
    - Side effects / toxicity
• The limits of virtual screening techniques
    - Huge chemical space
    - Structure and activities
• The challenge of generating new hypotheses
    - Noise reduction
    - Knowledge exploration

HIV-1 reverse transcriptase associated ribonuclease H assay

• Designed by Dr. Michael Parniak of the University of Pittsburgh
• PubChem, AID 565
• Of the compounds tested, 1250 are active
• Distribution of all compounds tested in the HIV-1 RT-RNase H assay

Figure: Associations among actives and inactives (Tanimoto ≥ 0.95).

Table (cell values truncated in the transcript): compounds collection; total number of compounds; total number of clusters; isolated clusters (only 1 member); non-isolated clusters (2 members and above). Rows: Active (1,…) and Inactive (63,…).
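The grouping by Tanimoto ≥ 0.95 on this slide can be illustrated with a small sketch. The slide does not say which clustering algorithm was used, so the single-linkage grouping below is an assumption, and the function names and toy fingerprints are hypothetical. Fingerprints are represented as sets of on-bit indices.

```python
from itertools import combinations

def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def cluster(fingerprints: dict, threshold: float = 0.95) -> list:
    """Group compounds so that any pair with similarity >= threshold shares a cluster.

    Assumption: single-linkage grouping via union-find; the slide does not
    name the actual clustering method.
    """
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values(), key=len, reverse=True)

# Toy, made-up fingerprints (not PubChem data).
toy = {
    "cmpd_a": set(range(40)),
    "cmpd_b": set(range(41)),        # differs from cmpd_a by one bit
    "cmpd_c": set(range(100, 120)),  # unrelated compound
}
clusters = cluster(toy)
isolated = [c for c in clusters if len(c) == 1]
non_isolated = [c for c in clusters if len(c) >= 2]
print(len(clusters), len(isolated), len(non_isolated))
```

The all-pairs comparison is quadratic in the number of compounds, so a screen of this size would likely need a faster neighbor search; the sketch only shows how isolated and non-isolated clusters are counted.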

A learning machine

• PubChem fingerprint: a numerical representation of molecular structures, e.g. 2-methylpentane → (1,1,…0)
• Probabilistic Neural Network: machine learning

Diagram: new compounds → fingerprint processing → hidden layer → summation layer → output layer
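The layered structure in the diagram can be sketched as a minimal probabilistic neural network. This is an illustration under stated assumptions, not the model from the talk: the Gaussian kernel, the smoothing parameter `sigma`, equal class priors, and the four-bit toy fingerprints are all hypothetical choices.

```python
import math

def pnn_predict(x, train_x, train_y, sigma=1.0):
    """Classify fingerprint x with a probabilistic neural network.

    Assumptions: Gaussian kernel, equal class priors, one shared sigma.
    Returns the predicted label and the per-class scores.
    """
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        # Hidden (pattern) layer: one kernel unit per training compound.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: accumulate activations separately for each class.
        sums[yi] = sums.get(yi, 0.0) + activation
        counts[yi] = counts.get(yi, 0) + 1
    # Averaging within each class gives a Parzen-window density estimate.
    scores = {label: sums[label] / counts[label] for label in sums}
    # Output layer: choose the class with the largest estimated density.
    return max(scores, key=scores.get), scores

# Toy, made-up 4-bit fingerprints (real PubChem fingerprints are much longer).
train_x = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1)]
train_y = ["active", "active", "inactive", "inactive"]

label, scores = pnn_predict((1, 1, 0, 1), train_x, train_y)
print(label, scores)
```

A PNN has no iterative training step: the training compounds themselves form the hidden layer, so the main tunable quantity in this sketch is `sigma`.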

Model evaluation

• 10-fold cross-validation
    - Sensitivity: 86.4%
    - Specificity: 92.0%
    - Matthews correlation coefficient: 0.26
• Receiver Operating Characteristic (ROC) curve analysis
    - Area Under Curve (AUC): 0.90
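The measures listed on this slide have standard definitions, sketched below. The fold assignment scheme, the function names, and all counts and scores are hypothetical examples; they are not the data behind the reported values.

```python
import math

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: fold i simply takes every k-th sample starting at i.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and Matthews correlation coefficient."""
    sensitivity = tp / (tp + fn)   # fraction of actives recovered
    specificity = tn / (tn + fp)   # fraction of inactives rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a random active outscores a random inactive."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))

# Made-up counts and scores, for illustration only.
sens, spec, mcc = confusion_metrics(tp=8, fn=2, tn=85, fp=5)
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
splits = list(k_fold_indices(25, k=10))
print(round(sens, 3), round(spec, 3), round(mcc, 3), round(auc, 3), len(splits))
```

Because the Matthews correlation coefficient uses all four cells of the confusion matrix, it can stay modest on a heavily imbalanced assay even when sensitivity and specificity are both high.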

Conclusions

• The bioactivity data of the HIV-1 RT-RNH assay can be learned to generate new hypotheses
• Machine learning on HTS data can be used to explore virtual hits

Acknowledgements

• Yanli Wang
• Steve Bryant
• This research was supported by the Intramural Research Program of the NIH/NLM