Luddite: An Information Theoretic Library Design Tool Jennifer L. Miller, Erin K. Bradley, and Steven L. Teig July 18, 2002.


1 Luddite: An Information Theoretic Library Design Tool Jennifer L. Miller, Erin K. Bradley, and Steven L. Teig July 18, 2002

2 Outline
• Overview
• Search Strategy
• Cost Function
• Algorithms
• Algorithm Extensions
• Implementation Details
• Results

3 Overview
• Genomics and proteomics provide many novel targets
• Need to find drugs for targets
  • Which compound to screen? What target?
  • Methods to answer these questions have been debated for many years
    • QSAR
• Recently, combinatorial and parallel synthesis techniques have transformed the question of which single compound to analyze into one of which collection of compounds (library) to analyze

4 Overview
• Develop an algorithm for designing libraries
  • Discrete – a collection of individual compounds
  • Combinatorial – collections of compounds synthesized in a parallel or combinatorial fashion
• Based on information-theoretic techniques

5 Overview
• Idea – Use molecules to “interrogate” the target receptor about which chemical features are required for binding
• Objective – Compose a library that maximizes the conclusions that can be drawn from the “answers” across all possible experimental outcomes
• Goal – Design a library that allows discovery of the most information about the optimization target

6 Search Strategy
• Strategies used in “20 Questions” are applicable
  • Binary Search
    • With every guess, eliminate half the search space
  • Codeword Search
    • Every outcome corresponds to a single codeword
    • The optimal set of questions can be asked simultaneously
    • The same set of optimal questions can be used every time
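
To make the two strategies concrete, here is a minimal sketch for the number-guessing version of “20 Questions”. It is an illustration only and not taken from the slides: the function names and the choice of "is bit k set?" as the fixed question set are assumptions.

```python
def answer_all_questions(secret: int, n_questions: int) -> tuple[int, ...]:
    """Codeword search: question k asks 'is bit k of the number set?'.

    The questions do not depend on earlier answers, so they can all be
    asked at once; the tuple of answers is the codeword.
    """
    return tuple((secret >> k) & 1 for k in range(n_questions))


def decode(codeword: tuple[int, ...]) -> int:
    """Recover the number from its codeword of yes/no answers."""
    return sum(bit << k for k, bit in enumerate(codeword))


def binary_search_guesses(secret: int, n_candidates: int) -> int:
    """Binary search: halve the candidate range with every sequential guess."""
    low, high, guesses = 0, n_candidates - 1, 0
    while low < high:
        mid = (low + high) // 2
        guesses += 1
        if secret <= mid:
            high = mid
        else:
            low = mid + 1
    return guesses


n_questions = 4                       # enough for 2**4 = 16 candidates
candidates = range(2 ** n_questions)
codewords = {n: answer_all_questions(n, n_questions) for n in candidates}
```

Both strategies need four yes/no answers for 16 candidates, but only the codeword version lets the same four questions be asked in parallel and reused for every secret, which is the property that matters when each "question" is a compound to be assayed.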

7 Search Strategy

8
• Library design is analogous to “20 Questions”
  • Searching for features required for ligand binding, desired phenotype, and/or good pharmacokinetic properties instead of a number
• “Feature” – four-point pharmacophore

9 Search Strategy - Example

10 Search Strategy - Assumptions
• The “20 Questions” analogy is useful but assumes:
  1. Every compound tests half of the possible features
  2. Any compound in the design space can be synthesized
  3. Every assay value is accurate
  4. The goal is a single feature

11 Search Strategy - Remedies
• Eliminating the assumptions
  1. A minimum of log2(F) bits is needed to decode F outcomes
     • Loose upper bound on number of compounds
  2. The ability of a set of questions to decode a message is invariant to column reordering, so it is not necessary that every compound in the design space be obtainable in order to find a maximally efficient set of questions

12 Search Strategy - Remedies
  3. Error-correcting codes (ECC) based on Hamming distance
  4. Adjust the probability of features in an iterative process and prune unlikely features
     • Will probably lead to convergence
     • Enhances efficiency
     • Improves probability of success
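
Remedies 1 and 3 can be illustrated with a small sketch. The slides do not say which error-correcting code is used, so the four five-bit codewords below are a made-up example and all function names are assumptions. Each bit position stands for one compound, and a flipped bit stands for one wrong assay value.

```python
from itertools import combinations


def min_questions(n_outcomes: int) -> int:
    """Smallest whole number of yes/no answers, ceil(log2(F)), that can
    distinguish n_outcomes outcomes when every answer is reliable."""
    return (n_outcomes - 1).bit_length()


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(codewords: list[str]) -> int:
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming(a, b) for a, b in combinations(codewords, 2))


def correctable_errors(codewords: list[str]) -> int:
    """A code with minimum distance d corrects up to (d - 1) // 2 bit errors."""
    return (min_distance(codewords) - 1) // 2


def nearest_codeword(received: str, codewords: list[str]) -> str:
    """Decode a possibly corrupted answer pattern to the closest codeword."""
    return min(codewords, key=lambda c: hamming(received, c))


# Hypothetical codewords for four features, one bit per compound.
feature_codewords = ["00000", "01011", "10101", "11110"]

# "01011" with its third bit flipped by one inaccurate assay value.
decoded = nearest_codeword("01111", feature_codewords)
```

Four features need at least two reliable answers; spending five bits instead buys a minimum distance of 3, so a single wrong assay value still decodes to the right feature.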

13 Cost Function
• Given a set of features, search for a set of compounds that allows decoding of each individual feature
  • If that is not possible, seek to decode as many features as possible, with the flattest possible distribution of feature-class sizes
• Feature class – subset of features that all have the same codeword
• Entropy is well suited to this calculation

14 Cost Function - Entropy
• Entropy – measure of uncertainty
  • All codewords the same – no uncertainty -> minimal entropy
  • All codewords different -> maximum entropy
• Wish to optimize the following equation (not reproduced in this transcript), where
  • M is the library measure
  • H is the entropy of the feature classes
  • C is the number of distinct classes
  • ||c_i|| is the size of feature class i
  • F is the number of features
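
A minimal sketch of the entropy of the feature classes, assuming the standard Shannon form H = -sum over the C classes of (||c_i||/F) * log2(||c_i||/F). How the slide turns H into the library measure M is not recoverable from the transcript, so M is not computed here. The matrix layout and function names are assumptions.

```python
import math
from collections import Counter


def feature_class_sizes(library: list[list[int]]) -> list[int]:
    """library[r][j] is 1 if compound r contains feature j, else 0.

    The codeword of feature j is column j read down the compounds.
    Features with identical codewords fall into the same feature class.
    """
    n_features = len(library[0])
    codewords = [tuple(row[j] for row in library) for j in range(n_features)]
    return list(Counter(codewords).values())


def class_entropy(library: list[list[int]]) -> float:
    """Shannon entropy (in bits) of the feature-class size distribution."""
    n_features = len(library[0])
    sizes = feature_class_sizes(library)
    return sum((n / n_features) * math.log2(n_features / n) for n in sizes)


# Two compounds that give every one of four features its own codeword.
all_different = [[1, 1, 0, 0],
                 [1, 0, 1, 0]]

# One compound containing every feature: all codewords are identical.
all_same = [[1, 1, 1, 1]]
```

With four features, `class_entropy(all_different)` reaches the maximum of log2(4) = 2 bits, and `class_entropy(all_same)` is 0, matching the two limiting cases on the slide.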

15 Cost Function – Entropy Example

16 Algorithm - Overview
• Start with a list of synthesized compounds
• Goal - select a subset that maximizes entropy
• State - a set of compounds whose entropy can be calculated
• Note: the entropy of a state is a function of the feature classes, but our moves through state space are a function of the compounds. In general, the entropy can’t be calculated incrementally and must be completely reevaluated whenever the state changes
  • Stark contrast with other library design methods
  • Despite this seeming limitation, the method is very efficient

17 Algorithm - Details
• The approaches to discrete and combinatorial designs are very similar
• Both use a greedy build-up of the library to the desired number of compounds
  • Greedy – technique that makes the locally best choice at each step in the hope of reaching the global maximum (which is not guaranteed)
• Followed by a second phase that reevaluates each of the library components, looking for a better selection
• Repeat until there is no improvement
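
The two phases can be sketched for the discrete case as follows. This is an illustration under stated assumptions, not the authors' implementation: the slides give no selection or tie-breaking details, the entropy is the assumed Shannon form over feature-class sizes, and the pool, compound names, and function names are hypothetical. The score is recomputed from scratch after every trial move.

```python
import math
from collections import Counter


def class_entropy(rows: list[tuple[int, ...]]) -> float:
    """Entropy (bits) of feature classes; rows[r][j] = 1 if compound r has feature j."""
    if not rows:
        return 0.0
    n_features = len(rows[0])
    sizes = Counter(tuple(r[j] for r in rows) for j in range(n_features)).values()
    return sum((n / n_features) * math.log2(n_features / n) for n in sizes)


def design_library(pool: dict[str, tuple[int, ...]], size: int):
    """Greedy build-up followed by slot-by-slot refinement (size <= len(pool))."""
    def score(names):
        return class_entropy([pool[n] for n in names])

    # Phase 1: add, one at a time, the compound that raises the score most.
    library: list[str] = []
    while len(library) < size:
        candidates = [n for n in pool if n not in library]
        library.append(max(candidates, key=lambda n: score(library + [n])))

    # Phase 2: revisit each slot and swap in any compound that improves
    # the score; repeat until a full pass makes no improvement.
    improved = True
    while improved:
        improved = False
        for i in range(len(library)):
            for name in pool:
                if name in library:
                    continue
                trial = library[:i] + [name] + library[i + 1:]
                if score(trial) > score(library) + 1e-12:
                    library, improved = trial, True
    return library, score(library)


# Hypothetical source pool: four compounds described over four features.
pool = {
    "A": (1, 1, 0, 0),
    "B": (1, 0, 1, 0),
    "C": (1, 1, 1, 1),
    "D": (1, 1, 0, 0),
}
library, measure = design_library(pool, size=2)
```

On this toy pool the build-up already finds a pair that separates all four features, so the refinement pass makes no swaps; on larger pools the second phase is what repairs early greedy choices.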

18 Algorithm - Extensions
  1. Often desirable to guarantee that certain items are included in the library
  2. Ability to subsample the source pool during the build-up and optimization phases
     • Dramatically decreases run time
     • Only slightly impacts the quality of designs
  3. Define a minimum Tanimoto fingerprint similarity between any two compounds in a discrete library
• Extension 1 is implemented for both the discrete and combinatorial algorithms
• Extensions 2 and 3 are implemented only for the discrete algorithm
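
For extension 3, the Tanimoto similarity of two binary fingerprints is the number of shared on-bits divided by the number of on-bits in either. The sketch below only computes the similarity and the pairwise range for a library; how the slide's threshold is applied during selection is not specified, and the fingerprints, names, and the value returned for two empty fingerprints are assumptions.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def pairwise_similarity_range(fingerprints: dict[str, frozenset]) -> tuple[float, float]:
    """Smallest and largest Tanimoto similarity over all compound pairs."""
    sims = [tanimoto(a, b) for a, b in combinations(fingerprints.values(), 2)]
    return min(sims), max(sims)


# Hypothetical fingerprints for three compounds.
fingerprints = {
    "cpd1": frozenset({1, 2, 3}),
    "cpd2": frozenset({2, 3, 4}),
    "cpd3": frozenset({7, 8}),
}
lowest, highest = pairwise_similarity_range(fingerprints)
```

A selection loop could compare `lowest` or `highest` against the chosen threshold before accepting a candidate compound.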

19 Implementation Details
• C++
• Microsoft Windows NT
• 500 MHz Intel Pentium III
• 500 MB RAM

20 Results
• 9 different libraries selected with the algorithm
  • 273,373-compound source pool
  • 3-component reaction A+B+C->D
  • Monomer lists of length 33, 436, and 19
  • 4-point pharmacophore signatures calculated for all compounds in the source pool
• Compared final measures to the optimal result and a random result

21 Results

22 Results - Entropy
• The combinatorial algorithm lags behind the discrete one in performance
• A discrete library of 91 compounds has the same measure as the optimal combinatorial library of 250 compounds
  • Still possibly more cost-effective to synthesize the combinatorial library
• General rule – twice as many compounds are required in a combinatorial library to achieve the same information as a discrete library
• Iterative setting
  • Use the combinatorial algorithm early to discover
  • Use the discrete algorithm later to cherry-pick specific compounds

