A Multiobjective Approach to Combinatorial Library Design Val Gillet University of Sheffield, UK.


Outline
- SELECT
  - GA based program for combinatorial library design
  - Combinatorial subset selection in product-space
  - Multiobjective optimisation via weighted-sum fitness function
  - Limitations of a weighted-sum approach
- MoSELECT
  - Multiobjective optimisation via MOGA

Library Design is a Multiobjective Optimisation Problem
- Early HTS results disappointing
  - Low hit rates
  - Hits too lipophilic; too flexible; high molecular weights…
- Diverse libraries
  - Distance-based/cell-based diversity
  - Bioavailability; cost; ease of synthesis…
- Focused/targeted libraries
  - Similarity to known active; predicted active by QSAR model; fit to receptor site
  - Bioavailability; cost…

Product-Based Library Design
- A two-component combinatorial library can be represented by a 2D array
- A combinatorial subset can be defined by intersecting rows and columns of the array
- Exploring all combinatorial subsets is equivalent to testing all permutations of the rows and columns of the array
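
As a rough illustration of the array view described on this slide, the sketch below identifies each product by its (R1 index, R2 index) pair and builds a combinatorial subset from a choice of rows and columns. The function names and the toy 5 × 5 library are assumptions made for illustration; they are not taken from SELECT.

```python
from itertools import product
from math import comb


def combinatorial_subset(rows, cols):
    """Products at the intersection of the chosen rows (R1) and columns (R2)."""
    return [(r, c) for r, c in product(sorted(rows), sorted(cols))]


def count_subsets(n_r1, n_r2, k_r1, k_r2):
    """Number of distinct k_r1 x k_r2 subsets of an n_r1 x n_r2 library."""
    return comb(n_r1, k_r1) * comb(n_r2, k_r2)


# Toy 5 x 5 library (hypothetical sizes): pick 2 reactants from each pool.
subset = combinatorial_subset(rows={0, 3}, cols={1, 2})
print(subset)
print(count_subsets(5, 5, 2, 2))
```

The same count for a realistic case, for example `count_subsets(100, 100, 30, 30)`, is far too large to enumerate exhaustively, which is one motivation for a stochastic search such as a GA.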

Selecting Combinatorial Subsets Using a GA
(Figure: reactant pools R1 and R2; 6 × 4 subset)
- Chromosome encoding
  - each chromosome represents a combinatorial subset as an integer string
  - one partition for each reactant pool
  - the size of a partition equals the no. of reactants required from the corresponding pool
- Crossover, mutation and roulette wheel parent selection are used to evolve new potential solutions
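
The chromosome encoding can be sketched as follows, assuming each partition is stored as a list of distinct reactant indices (a flat integer string would work equally well). The helper names, the mutation rule and the pool sizes are hypothetical; the slides do not specify how SELECT implements its operators.

```python
import random


def random_chromosome(pool_sizes, picks, rng):
    """One partition per reactant pool, holding picks[i] distinct reactant indices."""
    return [rng.sample(range(n), k) for n, k in zip(pool_sizes, picks)]


def mutate(chromosome, pool_sizes, rng):
    """Swap one reactant in one partition for an unused reactant from the same pool."""
    child = [list(part) for part in chromosome]
    p = rng.randrange(len(child))
    unused = [r for r in range(pool_sizes[p]) if r not in child[p]]
    if unused and child[p]:
        child[p][rng.randrange(len(child[p]))] = rng.choice(unused)
    return child


rng = random.Random(42)
parent = random_chromosome(pool_sizes=[10, 8], picks=[6, 4], rng=rng)
child = mutate(parent, pool_sizes=[10, 8], rng=rng)
print(parent)
print(child)
```

Keeping the partition sizes fixed means every chromosome decodes to a valid combinatorial subset of the requested dimensions.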

Multiobjective Optimisation in SELECT
- Weighted-sum fitness function
  - enumerate the combinatorial library represented by a chromosome
  - calculate descriptors for molecules in the library
- Objectives are scaled and user defined weights are applied
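
A minimal sketch of a weighted-sum fitness function, assuming min-max scaling of each objective to [0, 1] before the weights are applied. The scaling scheme, the objective names and the bounds are assumptions; the slides state only that objectives are scaled and weighted.

```python
def weighted_sum_fitness(objectives, weights, bounds):
    """Scale each objective with its (min, max) bounds, then sum the weighted values."""
    total = 0.0
    for name, value in objectives.items():
        lo, hi = bounds[name]
        scaled = (value - lo) / (hi - lo) if hi > lo else 0.0
        total += weights[name] * scaled
    return total


# Hypothetical objective values for one candidate library.
fitness = weighted_sum_fitness(
    objectives={"a": 5.0, "b": 1.0},
    weights={"a": 2.0, "b": 4.0},
    bounds={"a": (0.0, 10.0), "b": (0.0, 4.0)},
)
print(fitness)
```

Changing the weights changes which single library the search converges to, so the result depends directly on values the user has to choose in advance.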

Multiobjective Optimisation in SELECT cont.
- Diversity indices
  - distance-based (e.g. sum of pairwise dissimilarities using Daylight fingerprints)
  - cell-based
- Physical property terms
  - minimise the difference between the distribution in the library and some reference distribution, e.g. “drug-like” profile derived from WDI
- Cost: £
  - minimise the cost of the library
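
The distance-based index can be illustrated with a short sketch. It assumes each fingerprint is a set of "on" bit positions and uses the Tanimoto coefficient as the similarity measure; the slides do not name the coefficient used in SELECT, so that choice is an assumption.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sum_pairwise_dissimilarity(fingerprints):
    """Sum of (1 - similarity) over all unordered pairs of molecules."""
    return sum(1.0 - tanimoto(a, b) for a, b in combinations(fingerprints, 2))


# Three toy fingerprints (hypothetical bit positions).
fps = [{1, 2}, {2, 3}, {1, 2}]
print(sum_pairwise_dissimilarity(fps))
```

The pairwise sum grows quadratically with library size, which is consistent with the slides' emphasis on calculating descriptors upfront and accessing subsets by fast lookup.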

Library Enumeration in SELECT
- Virtual library is enumerated upfront
  - ADEPT (A Daylight Enumeration and Profiling Tool)
    - Identify potential reactants
    - Filter out unwanted ones
    - Enumerate virtual library
  - Reaction Toolkit (reaction transforms; MTZ language)
- Descriptors are calculated upfront
- Combinatorial subset accessed via fast lookup

Example: Amide Library
- 10K virtual library: 100 amines × 100 carboxylic acids
- 30 × 30 amide subsets
- Reactant-based selection: diversity (Diversity )
- Product-based selection: diversity & molecular weight profile (Diversity 0.573)
- WDI: World Drug Index
(Figure: molecular weight vs. percentage of compounds for WDI, reactant-based and product-based selections)

Limitations of a Weighted-Sum Fitness Function
- Definition of fitness function difficult especially for different types of objectives
  - e.g. molecular weight profile and cost
- Setting of weights is non-intuitive
- Can result in regions of search space being obscured especially when objectives are in competition
- Difficult to monitor progress since >1 objective to follow simultaneously
- A single solution is found

Varying Weights in SELECT
- Objectives are in competition, resulting in trade-offs
- A family of alternative solutions exists that are all equivalent

Multiobjective Optimisation
- Evolutionary algorithms, e.g., GAs
  - operate with a population of individuals
  - well suited to search for multiple solutions in parallel
  - readily adapted to deal with multiobjective optimisation
- MOGA: MultiObjective Genetic Algorithm
  - Fonseca & Fleming. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 28(1), 1998

MOGA
- Multiple objectives are handled independently without summation and without weights
- A hyper-surface is mapped out in the search space
  - represents a continuum of solutions where all solutions are seen as equivalent
  - represents compromises or trade-offs between the various objectives
  - solutions are called non-dominated, or Pareto solutions
- A family of non-dominated solutions is sought rather than a single solution

Dominance & Pareto Ranking
- A non-dominated individual is one where an improvement in one objective results in a deterioration in one or more of the other objectives when compared with the other individuals in the population
- Pareto ranking: an individual’s rank corresponds to the number of individuals in the current population by which it is dominated
(Figure: individuals A and B plotted against objectives f1 and f2)
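
The ranking rule on this slide can be sketched directly: count, for each individual, how many others dominate it. The sketch assumes every objective is to be minimised and that individuals are plain tuples of objective values; both are assumptions made for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_ranks(population):
    """Rank of each individual = number of individuals that dominate it."""
    return [sum(dominates(other, ind) for other in population) for ind in population]


# Toy population of (f1, f2) values, both minimised.
population = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
print(pareto_ranks(population))  # [0, 0, 0, 1, 4]
```

Individuals with rank 0 are the non-dominated set. Here (3, 3) is dominated only by (2, 2), while (4, 4) is dominated by all four others.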

SELECT (single solution)
- Initialise population
- Select parents
- Apply genetic operators
- Calculate objectives: a, b, c, ...
- Apply fitness function $f = w_1 a + w_2 b + w_3 c + \dots$
- Rank based on fitness
- Test for convergence

MoSELECT* (family of solutions)
- Initialise population
- Apply genetic operators
- Calculate objectives: a, b, c, ...
- Calculate dominance: a, b, c
- Rank using Pareto ranking, based on dominance
- Test for convergence
- Select parents

* Patent applied for

MoSELECT: Search Progress
(Figure: panels at 0, 100, 1000 and 5000 iterations)

Family of Solutions
- Each run of MoSELECT results in a family of solutions
- Finding the same coverage of solutions using SELECT would require multiple runs using various combinations of weights
- One run of MoSELECT takes the same CPU time as one run of SELECT
(Figure: MW vs. Diversity at 5000 iterations)

Focused Library: Aminothiazoles
- α-bromoketones & thioureas extracted from ACD
- ADEPT used to
  - filter reactants (MW < 300; RB < 8)
  - enumerate virtual library => products (74 α-bromoketones & 170 thioureas)
- MoSELECT used to design 15 × 30 subsets optimised on
  - Similarity to a target compound (Daylight fingerprints)
  - Cost ($/g)

MoSELECT Solutions: 1
(Figure: panels at 0 iterations and 5000 iterations)

MoSELECT Solutions
(Figure: running MoSELECT with niching)

Moving to > 2 Objectives: Parallel Graph Representation
- Each objective is scaled using the Max and Min values achieved when the objective is optimised independently
(Figure: parallel graph of MW and Diversity at 5000 iterations)
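
The scaling step for a parallel graph can be sketched as min-max normalisation per objective, so that every axis runs from 0 to 1. The function name and the example bounds are hypothetical; in practice the bounds would come from optimising each objective on its own, as the slide describes.

```python
def scale_objectives(solutions, mins, maxs):
    """Min-max scale each objective so all axes of a parallel graph share [0, 1]."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(sol, mins, maxs)]
        for sol in solutions
    ]


# Two toy solutions with two objectives each (hypothetical values and bounds).
scaled = scale_objectives([[5, 10], [0, 20]], mins=[0, 10], maxs=[10, 20])
print(scaled)  # [[0.5, 0.0], [0.0, 1.0]]
```

Each scaled solution then becomes one polyline across the objective axes, which stays readable when there are more than two objectives.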

Focused Library: Amides
- 100 × 100 virtual library
- MoSELECT used to design 10 × 10 subsets
- Objectives
  - Similarity to a target
    - Sum of similarities using Daylight fps
  - Predicted bioavailability
    - Each compound rated from 1 to 4
    - Sum of ratings
  - Hydrogen bond profile
  - Rotatable bond profile

MoSELECT Solutions
- Population size 50
- Iterations 5000
- Niching 30%
- Number of solutions = 11
- CPU 53 s (R12K 360 MHz)

Conclusions
- Advantages of MoSELECT
  - a family of equivalent solutions is obtained in a single run with each solution representing one combinatorial library
  - this is achieved at vastly reduced computational cost compared to performing multiple runs of SELECT
  - no need to determine weights for objectives
  - optimisation of different types of objectives is readily achieved
  - visualisation of the search progress allows trade-offs between objectives to be observed
  - the user can make an informed choice on which solution(s) to explore

Acknowledgements
- Illy Khatib, Peter Willett; Information Studies, University of Sheffield
- Peter Fleming; Automatic Control and Systems Engineering, University of Sheffield
- Darren Green, Andrew Leach; GlaxoSmithKline, UK
- Funding by GlaxoSmithKline, UK
- John Bradshaw; Daylight
- Daylight for software support