Classification Analysis of HIV RNase H Bioassay
Lianyi Han
Computational Biology Branch, NCBI/NLM/NIH
Rocky ’07, December 1
Introduction

The need for new anti-HIV agents
  Drug-resistant mutations
  Side effects / toxicity
The limits of virtual screening techniques
  Huge chemical space
  Structures and activities
The challenge of generating new hypotheses
  Noise reduction
  Knowledge exploration
HIV-1 reverse transcriptase associated ribonuclease H assay (HIV-1 RT-RNase H assay)

Designed by Dr. Michael Parniak of the University of Pittsburgh
PubChem, AID 565 compounds tested, 1250 of them are actives

Figure: Distributions of all compounds tested in the HIV-1 RT-RNase H assay
Figure: Associations among actives and inactives (Tanimoto ≥ 0.95)

Table columns: Compounds Collection | Total number of compounds | Total number of clusters | Isolated Clusters (only 1 member) | Non-Isolated Clusters (2 members and above)
Table rows: Active1, Inactive63,
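The Tanimoto ≥ 0.95 threshold used to associate actives and inactives can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a set of on-bit indices. The function names `tanimoto` and `similar_pairs` and the toy fingerprints are hypothetical, not taken from the slides, and the slides do not say how the clustering itself was carried out.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    if union == 0:
        # Assumed convention: two empty fingerprints are treated as dissimilar.
        return 0.0
    return len(fp_a & fp_b) / union


def similar_pairs(fingerprints, threshold=0.95):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    pairs = []
    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Toy fingerprints (hypothetical): the first two are identical, the third is unrelated.
toy = [{1, 2, 5}, {1, 2, 5}, {7, 9}]
linked = similar_pairs(toy)
```

The all-pairs loop is quadratic in the number of compounds, so it serves only to show the definition. A screening collection of realistic size would need a faster neighbor search.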
A learning machine

PubChem fingerprint: numerical understanding of molecular structures
  Example: 2-Methyl pentane (1,1,…0)
Probabilistic Neural Network: machine learning
  Diagram labels: New Compounds, Fingerprint processing, Hidden Layer, Summation Layer, Output Layer
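The hidden, summation, and output layers of a probabilistic neural network can be sketched as follows. This is a minimal illustration under stated assumptions, not the model used in the talk: it assumes the standard Gaussian (Parzen window) kernel, a single smoothing parameter `sigma`, and per-class averaging in the summation layer. The slides specify none of these. The function name `pnn_predict` and the toy bit vectors are hypothetical.

```python
import math


def pnn_predict(train_x, train_y, query, sigma=1.0):
    """Classify `query` with a basic probabilistic neural network.

    train_x: list of equal-length numeric vectors (e.g. fingerprint bits)
    train_y: list of class labels, one per training vector
    sigma:   Gaussian smoothing parameter (assumed; not given in the slides)
    """
    kernel_sums = {}
    class_counts = {}
    # Hidden layer: one Gaussian kernel per training compound.
    for vector, label in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(vector, query))
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        kernel_sums[label] = kernel_sums.get(label, 0.0) + activation
        class_counts[label] = class_counts.get(label, 0) + 1
    # Summation layer: average kernel activation per class.
    scores = {c: kernel_sums[c] / class_counts[c] for c in kernel_sums}
    # Output layer: pick the class with the largest score.
    return max(scores, key=scores.get), scores


# Toy training set (hypothetical bit vectors and labels).
train_x = [(1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1)]
train_y = ["active", "active", "inactive", "inactive"]
label, scores = pnn_predict(train_x, train_y, (1, 1, 0))
```

Averaging per class keeps the larger class from dominating simply because it has more training compounds, which matters when inactives greatly outnumber actives.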
Model evaluation

10-fold cross validation
  Sensitivity: 86.4%
  Specificity: 92.0%
  Matthews correlation coefficient: 0.26
Receiver Operating Characteristic (ROC) curve analysis
  Area Under Curve (AUC): 0.90
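The three cross-validation statistics above are all derived from a confusion matrix. The sketch below shows the standard definitions. The counts in the example are made up for illustration and are not the counts behind the reported 86.4%, 92.0%, and 0.26. The function name `confusion_metrics` is hypothetical.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of actives predicted active
    specificity = tn / (tn + fp)  # fraction of inactives predicted inactive
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts: 10 actives and 100 inactives.
sens, spec, mcc = confusion_metrics(tp=8, fn=2, tn=90, fp=10)
```

When inactives far outnumber actives, even a small false-positive rate yields many false positives relative to true positives. This is one way the Matthews correlation coefficient can stay modest while sensitivity and specificity are both high.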
Conclusions

The bioactivity data of the HIV-1 RT-RNH assay can be learned to generate new hypotheses
Machine learning of HTS data can be used for virtual hit exploration

Acknowledgements

Yanli Wang
Steve Bryant
This research was supported by the Intramural Research Program of the NIH/NLM