A Novel SAR-Driven Approach for Identifying True High-Throughput Screening Hits
S. Frank Yan, Hayk Asatryan, Jing Li, Kaisheng Chen, and Yingyao Zhou

Presentation transcript:

A Novel SAR-Driven Approach for Identifying True High-Throughput Screening Hits

S. Frank Yan, Hayk Asatryan, Jing Li, Kaisheng Chen, and Yingyao Zhou
Genomics Institute of the Novartis Research Foundation, John Jay Hopkins Drive, San Diego, CA 92121, USA
ChemAxon User Group Meeting, June 2006

Modern drug discovery relies heavily on large-scale high-throughput screening (HTS) to identify potential starting points for medicinal chemistry optimization. The typical top X activity cutoff method used to generate hits from large amount of raw HTS data is intrinsically error-prone due to the noisy nature of single-dose HTS, which oftentimes leads to a large number of false positives. Here we propose a novel knowledge-based, SAR-driven statistical approach for primary HTS hit generation using ChemAxon technology for clustering and chemical fingerprints. The method is also implemented with SciTegic Pipeline Pilot. In a proof-of-concept study for an in-house HTS campaign, the new approach proved to be more effective in identifying confirmed active compounds in diverse chemical scaffolds containing valuable SAR information, as demonstrated by a significantly improved confirmation rate compared to the traditional top X cutoff method.

A Proof-of-Concept Study
- HTS data from an internal project were used and results from secondary experiments were used as benchmark.
- The 50,000 most active compounds were selected for analysis (HTS activity < ~0.76).
- Compound clustering and fingerprinting were generated using ChemAxon software.
[Figure: OPI approach vs. Top X method]

Scaffold-based Probability Score Alone Is Sufficient to Prioritize Hits
[Figure: Confirmation rate for those selected compounds]

Significant Structural Diversity in the Selected Hits

Some Scaffolds Picked by OPI
[Figure: scaffold structures labeled SIDXXXX645, SIDXXX414, SIDXXX598, SIDXXXX000, with the following statistics; some values are missing from the transcript]
- 8 compounds selected, 5/6 confirmed active; mean = 0.05, stdev. = 0.46
- 8 compounds selected, 7/7 confirmed active; mean = 0.05, stdev. =
- compounds selected, 12/28 confirmed active; mean = 0.11, stdev. =
- compounds selected, 31/36 confirmed active; mean = 0.31, stdev. = 0.09

Great Improvement over the traditional Top X method

Advantages of OPI Hit-picking
- An individualized activity threshold for every cluster/scaffold instead of a one-fits-all cutoff
- Effective in eliminating experimental artifacts (particularly those in the high-activity region)
- Improved hit confirmation rate (85% vs. 55%)
- Hits are inherently analyzed on a cluster/scaffold basis and SAR information can be readily extracted, facilitating the hit-to-lead process
- Some level of library redundancy is required

Ontology-Based Pattern Identification* in Hit Selection
* Novel Statistical Approach for Primary High-Throughput Screening Hit Selection. S. Yan et al., J. Chem. Inf. Model. 45(6), 2005
* In silico gene function prediction using ontology-based pattern identification. Y. Zhou et al., Bioinformatics, vol. 21

Guilt by association: Structure–activity relationship

Goal: To automatically determine a subset of compounds for each cluster/scaffold, which not only share similar structure but also similar high HTS activity

1. Cluster all tested, QC-ed compounds (>1,000,000) from an HTS campaign and rank them by activity
2. For one given cluster, select more and more compounds by decreasing the activity cutoff and compute the corresponding hypergeometric P-value
3. The cutoff for this cluster is determined when P-value reaches minimum P0, and member compounds whose activities are higher than the cutoff are selected as potential hits and assigned a score P0
4. Repeat steps 2 and 3 for all clusters
5. Rank/select hits based on score P0 and HTS activity

[Figure: N compounds from HTS; a cluster of n compounds; m. Cluster probability score P0 = min P(N, n, m, m). Increasingly select m compounds by lowering the activity cutoff (lower activity, more compounds). m compounds (P = P0) are selected as potential hits for this compound cluster/scaffold.]

Implementation Using Pipeline Pilot

The Hit-to-Lead Paradigm
Two important milestones that have fundamental far-reaching effects
Bleicher et al. (2003) Nat. Rev. Drug Discov., 2, 369

Cherry-Pick the HTS Hits
A new approach to more effectively select primary hits is urgently needed!
[Figure: # of compounds vs. activity (low activity to high activity), with an arbitrary activity cutoff]
In many real cases, the confirmation rate is often low
~100 to ~5000

The HTS Approach
Initial HTS campaign, Quality control, Primary hit selection, Hit validation
[Figure: compound counts at each stage: >1,000,000; 1,000,000; 1, (truncated in the transcript)]

[Figure: HTS assay activity vs. compound group, showing per-scaffold cutoffs and the traditional cutoff. Labels: Highly active singletons; Scaffolds with good activity and good SAR; Scaffolds with good activity but okay SAR; Scaffolds with very bad SAR; Likely a false positive; Scaffolds with okay activity but good SAR]

Valuable SAR Is Immediately Caught for This Scaffold
[Figure: Imidazopyridine scaffold; legend: Selected hits, Not selected]
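
The per-cluster cutoff search described under "Ontology-Based Pattern Identification in Hit Selection" can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' ChemAxon or Pipeline Pilot implementation. It assumes that a higher score means a more active compound, that cluster assignments are already available, and that the "hypergeometric P-value" is the upper-tail probability of seeing at least k cluster members among the M most active compounds overall. The names `hypergeom_tail` and `opi_select` are hypothetical.

```python
from collections import defaultdict
from math import comb


def hypergeom_tail(N, n, M, k):
    """P(X >= k), where X counts members of a cluster of size n
    among the M top-ranked compounds out of N (assumed P-value form)."""
    upper = min(n, M)
    hits = sum(comb(n, i) * comb(N - n, M - i) for i in range(k, upper + 1))
    return hits / comb(N, M)


def opi_select(activity, cluster_of):
    """activity: compound id -> score (assumption: higher = more active).
    cluster_of: compound id -> cluster id.
    Returns cluster id -> (P0, selected compound ids)."""
    # Step 1: rank all compounds by activity, most active first.
    ranked = sorted(activity, key=activity.get, reverse=True)
    N = len(ranked)
    rank = {cid: i + 1 for i, cid in enumerate(ranked)}  # 1-based global rank
    members = defaultdict(list)
    for cid in ranked:
        members[cluster_of[cid]].append(cid)  # kept in ranked order

    result = {}
    for cluster, mem in members.items():  # Step 4: loop over all clusters
        n = len(mem)
        best_p, best_k = 1.0, 0
        # Step 2: lower the cutoff one cluster member at a time.
        for k, cid in enumerate(mem, start=1):
            M = rank[cid]  # compounds overall that pass this cutoff
            p = hypergeom_tail(N, n, M, k)
            if p < best_p:  # Step 3: keep the cutoff with minimum P
                best_p, best_k = p, k
        result[cluster] = (best_p, mem[:best_k])
    return result


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    activity = {"a1": 0.95, "a2": 0.90, "a3": 0.85, "a4": 0.10,
                "b1": 0.99, "b2": 0.50, "b3": 0.40,
                "c1": 0.30, "c2": 0.20, "c3": 0.15}
    cluster_of = {cid: cid[0].upper() for cid in activity}
    # Step 5: rank clusters by their score P0 (smaller is better).
    for cluster, (p0, hits) in sorted(opi_select(activity, cluster_of).items(),
                                      key=lambda kv: kv[1][0]):
        print(cluster, round(p0, 4), hits)
```

Exact binomial coefficients keep the sketch dependency-free, but they become slow for a campaign with more than a million compounds; at that scale a log-space computation or a library routine such as `scipy.stats.hypergeom` would be the more practical choice.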