From Analyzing the Tuberculosis Genome to Modeling the Milky Way Galaxy Using Volunteer Computing for Computational Science Travis Desell Department of.

Presentation transcript:

From Analyzing the Tuberculosis Genome to Modeling the Milky Way Galaxy: Using Volunteer Computing for Computational Science. Travis Desell, Department of Computer Science, Rensselaer Polytechnic Institute. November 29, 2010, University of North Dakota

1. Computational Science 2. A Case for Asynchronous Computing 3. Asynchronous Optimization Evolutionary Algorithms Asynchronous EAs Simulation Verification 4. Questions?

The Sagittarius Dwarf Tidal Stream: The Sagittarius Dwarf Galaxy is merging with the Milky Way. The dwarf is being tidally disrupted by the Milky Way, creating long “tails.” Mapping the Tidal Stream will: provide information on matter distribution in the Milky Way; provide constraints on the Galactic Halo. Image (above): [Ibata et al. 1997, AJ]. Image (below): David Martinez-Delgado (MPIA) & Gabriel Perez (IAC).

Sloan Digital Sky Survey: 230+ million objects. 8,400 square degrees in the sky. Large percentage of north galactic cap. Very little data in galactic plane (too much dust). Several hundred thousand stars. Image: sdss.org

The Milky Way Halo Bulge Thin Disk Thick Disk ~30 kiloparsecs (100,000 light-years) Sun Sagittarius Dwarf Galaxy Tidal Stream Data Wedge Image: Matthew Newby

Sagittarius Stream Model: Assume the stream is a cylinder, with radial drop-off given by a Gaussian distribution. Background distribution: 2 background parameters (new model has 4): r0, q. 6 parameters per stream: ε, μ, r, θ, φ, σ. A single stream with the old model has an 8-dimensional search space. Multiple streams are often fit, giving search spaces with more than 20 dimensions!

Further Reading: Nathan Cole, Heidi Newberg, Malik Magdon-Ismail, Travis Desell, Kristopher Dawsey, Warren Hayashi, Jonathan Purnell, Boleslaw Szymanski, Carlos A. Varela, Benjamin Willett, and James Wisniewski. Maximum Likelihood Fitting of Tidal Streams with Application to the Sagittarius Dwarf Tidal Tails. Astrophysical Journal, 683, 2008. Nathan Cole. Maximum Likelihood Fitting of Tidal Streams with Application to the Sagittarius Dwarf Tidal Tails. PhD thesis. Rensselaer Polytechnic Institute.

N-Body Simulation Density of stars along the Orphan Stream Can we simulate the formation of tidal streams?

N-Body Simulation Density of stars can be simulated with N-body simulations, and fitness to real data can be optimized to determine Orphan Stream progenitor parameters (mass, size, evolution time)

Further Reading: Travis Desell, Benjamin Willett, Matthew Arsenault, Heidi Newberg, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos A. Varela. Evolving N-Body Simulations to Determine the Origin and Structure of the Milky Way Galaxy's Halo using Volunteer Computing. International Parallel and Distributed Processing Symposium (IPDPS). To appear (hopefully).

Find protein binding sites using Gibbs sampling Use random walks (Markov chains) which result in sites distributed according to their actual probability of being the correct binding site Initial sequences: Mycobacterium tuberculosis Yersinia pestis (cause of the Bubonic plague)

What is a Binding Site? Alberts, Johnson, Lewis, Raff, Roberts, & Walter, Molecular Biology of the Cell, 4th Edition, 2002. Binding sites are sequences of DNA before a gene that proteins bind to. Different proteins will cause the gene to either ‘turn on’ or ‘turn off’.

Finding Binding Sites Biology is messy -- binding sites are not exact sequences. Multiple species with the same genes will have similar binding sites. We need to find ‘motifs’ which have the best probability of matching sequences of DNA across species.

Objective - Regulatory Circuits. Howard-Ashby, Materna, Brown, Tu, Oliveri, Cameron, & Davidson, Dev Biol, 2006. Turning a gene on causes new proteins to be produced; which binding sites will that activate? Turning a gene off stops production of proteins; which other binding sites will that activate?

A Case for Asynchronous Computing

Hosts

Statistics over 25,000 active users (from over 150 countries) over 36,000 active hosts ~850 teraflops (has reached 1.6 petaflops): most powerful BOINC project was 3rd most powerful computing system (behind and the fastest supercomputer) most of this from GPU computing

GPU Application: The first GPU implementation was user-contributed. Compared to a 3.0 GHz AMD Phenom(tm) II X4 940: ATI HD5870 GPU - 109x speedup; NVidia GeForce GTX 285 GPU - 17x speedup. Requires double-precision calculations: NVidia GPUs have less double-precision real estate. The application would be 6.2x faster on the ATI GPU and 7.8x faster on the NVidia GPU using single-precision math. Travis Desell, Anthony Waters, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos Varela, Matthew Newby, Heidi Newberg, Andreas Przystawik and Dave Anderson. Accelerating the volunteer computing project with GPUs. In the 8th International Conference on Parallel Processing and Applied Mathematics (PPAM 2009), Wroclaw, Poland, September 2009.

Data from Supercomputer Cores by Top 500 Rank (Circa November 2009) RPI CCNI BlueGene/L

A Case for Asynchronous Computing: Architectures are becoming heterogeneous (CPUs, GPUs). As hosts/cores increase, so does the chance of errors/failures.

A Case for Asynchronous Computing: New algorithms need to be efficient, scalable and reliable. Asynchronous Computing -- minimize synchronization/dependencies between computing hosts.

Asynchronous Optimization

What is Optimization? Take some function: $f(p_1, p_2, \ldots, p_n)$ = ? How can we find $p = (p_1, p_2, \ldots, p_n)$ such that $f$ is maximized (or minimized)?

Evolutionary Algorithms: Genetic Search: based on evolution; new populations are generated by selection, mutation and recombination (reproduction). Particle Swarm Optimization: individuals or particles ‘swarm’ around the search space, being attracted to the best particle position and their own previously best found position. Differential Evolution: individuals ‘evolve’ by recombination with other individuals and the differentials between other individuals.

Further Reading on EA Strategies: Travis Desell. Asynchronous Global Optimization for Massive-Scale Computing. PhD thesis. Rensselaer Polytechnic Institute. 2009. Travis Desell, Boleslaw Szymanski, and Carlos A. Varela. An Asynchronous Hybrid Genetic-Simplex Search for Modeling the Milky Way Galaxy using Volunteer Computing. In Genetic and Evolutionary Computation Conference (GECCO 2008), Atlanta, Georgia, July 2008. Extra slides at the end.

An Example

Minimize Sphere: $f(x) = \sum_{i=1}^{n} x_i^2$. Minimum at $f(0, \ldots, 0)$.

Genetic Search Example. Population: (-5, 2, -2), (-2, 4, -5), (-4, -2, 0), (2, -1, 0), (2, 3, -3), (3, 0, 1), (2, -3, 2), (-5, 0, 4), (-4, -2, -5), (2, 4, 1). 1. Set bounds: -5 to 5 for all parameters. 2. Create initial population.

Genetic Search Example: 3. Calculate fitnesses for individuals.

Genetic Search Example: 4. Sort the population (only for GS).

Genetic Search Example: 5. Generate new population via heuristics. GS uses Selection, Mutation, Recombination. The new population is built one individual at a time, beginning with (2, -1, 0), (3, 0, 1), (2, 4, -3), (1, -1, 0).

Genetic Search Example: 6. Go to step 3 using the new population.

Genetic Search Example: 3. Calculate fitnesses for individuals. 4. Sort the population. 5. Generate new population ... and so on. After one iteration the population has already improved quite a bit.
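The six steps of the genetic search example can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the talk: the choice of keeping the two best individuals, the mutation rate, and the averaging recombination are all assumptions made for illustration, as are the names `genetic_search`, `mutate` and `recombine`.

```python
import random

def sphere(x):
    """Sphere test function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def mutate(parent, lo, hi, rng):
    """Assumed mutation: replace one random parameter with a value in bounds."""
    child = list(parent)
    child[rng.randrange(len(child))] = rng.uniform(lo, hi)
    return child

def recombine(a, b):
    """Assumed recombination: average two parents parameter by parameter."""
    return [(u + v) / 2.0 for u, v in zip(a, b)]

def genetic_search(f, n_params=3, pop_size=10, generations=50,
                   lo=-5.0, hi=5.0, n_keep=2, mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: set bounds and create the initial population.
    population = [[rng.uniform(lo, hi) for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Steps 3-4: calculate fitnesses and sort (best first; we minimize).
        population.sort(key=f)
        # Step 5: selection keeps the best individuals; the rest come from
        # mutation or recombination of members of the current population.
        new_population = [list(p) for p in population[:n_keep]]
        while len(new_population) < pop_size:
            if rng.random() < mutation_rate:
                new_population.append(mutate(rng.choice(population), lo, hi, rng))
            else:
                a, b = rng.sample(population, 2)
                new_population.append(recombine(a, b))
        # Step 6: go back to step 3 using the new population.
        population = new_population
    return min(population, key=f)

best = genetic_search(sphere)
print(best, sphere(best))
```

Because the best individuals are always carried over, the best fitness in the population can never get worse from one generation to the next.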

Problems with Iterations/Generations: A popular parallel computing strategy is to divide fitness evaluations among worker processors. (Figure: a population of five unevaluated individuals is split across three workers; workers 1 and 2 receive two individuals each, worker 3 receives one.)

Problems with Iterations: Already we have a problem: what if we can’t divide the population evenly? (Figure: workers 1 and 2 are busy for two evaluations; worker 3 is busy for one and then idle.)

Problems with Iterations: What if a host fails?

Problems with Iterations: What if the fitness evaluation time is non-deterministic? What if the processors are heterogeneous? (Figure: worker 1 is fast and worker 2 is slow, so the fast workers sit idle.)

Problems with Iterations: You always have to wait for the slowest fitness evaluations before you can proceed to the next generation. Failures are worse: you need to resend and recalculate before you can proceed. This can cause a lot of idle time. Load balancing can help, but it is not perfect and doesn’t handle failures.
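The idle time in the uneven-division case can be worked out directly. The sketch below assumes evaluations are dealt round-robin to workers and that every worker waits for the slowest one before the next generation starts; the function name `synchronous_generation_time` is hypothetical.

```python
def synchronous_generation_time(eval_times, n_workers):
    """Return (wall_time, idle_time) for one synchronous generation.

    Evaluations are dealt round-robin; the generation ends when the
    busiest worker finishes, and everyone else idles until then.
    """
    busy = [0.0] * n_workers
    for i, t in enumerate(eval_times):
        busy[i % n_workers] += t
    wall = max(busy)
    idle = sum(wall - b for b in busy)
    return wall, idle

# Five individuals on three workers, one time unit per evaluation:
print(synchronous_generation_time([1, 1, 1, 1, 1], 3))  # (2.0, 1.0)
```

With five equal evaluations on three workers, the generation takes two time units and one worker wastes a full unit; slower or failing workers only make the gap larger.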

Asynchronous Optimization Strategy: The server holds a Population of Individuals (1) to (n) with their Fitness (1) to (n), and a queue of Unevaluated Individuals (1) to (n). Workers (which perform fitness evaluation) request work, the server sends work, and workers report results, which update the population. Generate individuals when the queue is low: select parents from the population to generate a new individual. On a report, add the new individual in order and remove the worst individual.

Asynchronous Optimization Example: 1. Generate a random initial population by sending out random parameter sets and waiting for the result. 2. Insert initial results in-order. Population: (-5, 2, -2), (-2, 4, -5), (-4, -2, 0), (2, -1, 0), (2, 3, -3), (3, 0, 1), (2, -3, 2), (-5, 0, 4), (-4, -2, -5), (2, 4, 1).

Asynchronous Optimization Example: If the population is not full, generate a new random individual. If a worker requests work, create a new individual using mutation or recombination (for GS). (For example, worker 3 receives (-5, 2, 5), created by mutation.)

Asynchronous Optimization Example: When a worker completes calculating the fitness and reports the result, insert it into the population.

Asynchronous Optimization Example: Perform an in-order insert and remove the worst member of the population.

Asynchronous Optimization Example: Selection is done by keeping a fixed size population and only inserting results that improve it.

Asynchronous Optimization Example: If a worker fails or leaves we can continue the optimization without stopping.

Asynchronous Optimization Example: Workers can join and leave at any time. (Worker 4 joins and receives (2, 0, -0.5), created by recombination; worker 5 joins and receives (2, -1, -3), created by mutation.)

Asynchronous Optimization Example: If a reported result will not improve the population, simply discard it (selection).

Asynchronous Optimization Example: Fast workers do not need to wait for slow workers and the search can continue to progress without them.
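The server side of the example can be sketched as a small class with two entry points, one for work requests and one for reported results. This is only an illustration under stated assumptions: the class name `AsyncPopulation`, the 30% mutation rate, and the averaging recombination are not from the slides, and lower fitness is treated as better because the example minimizes the sphere function.

```python
import bisect
import random

class AsyncPopulation:
    """Fixed-size population kept sorted by fitness (lower is better here)."""

    def __init__(self, size, n_params, lo, hi, seed=0):
        self.size, self.n_params, self.lo, self.hi = size, n_params, lo, hi
        self.rng = random.Random(seed)
        self.members = []  # sorted list of (fitness, individual)

    def request_work(self):
        """Answer a work request immediately; never wait on other workers."""
        if len(self.members) < self.size:
            # Population not full yet: hand out a random individual.
            return [self.rng.uniform(self.lo, self.hi)
                    for _ in range(self.n_params)]
        if self.rng.random() < 0.3:
            # Assumed mutation: change one parameter of a random member.
            child = list(self.rng.choice(self.members)[1])
            child[self.rng.randrange(self.n_params)] = \
                self.rng.uniform(self.lo, self.hi)
            return child
        # Assumed recombination: average two distinct members.
        (_, a), (_, b) = self.rng.sample(self.members, 2)
        return [(u + v) / 2.0 for u, v in zip(a, b)]

    def report_result(self, individual, fitness):
        """In-order insert; drop the worst member or discard the result."""
        if len(self.members) >= self.size and fitness >= self.members[-1][0]:
            return False  # would not improve the population: discard
        bisect.insort(self.members, (fitness, individual))
        if len(self.members) > self.size:
            self.members.pop()  # remove the worst member
        return True

def sphere(x):
    return sum(v * v for v in x)

pop = AsyncPopulation(size=10, n_params=3, lo=-5.0, hi=5.0)
outstanding = [pop.request_work() for _ in range(4)]  # four workers
order = random.Random(1)
for _ in range(500):
    # Whichever worker finishes next reports, then immediately gets new work.
    done = outstanding.pop(order.randrange(len(outstanding)))
    pop.report_result(done, sphere(done))
    outstanding.append(pop.request_work())
print(pop.members[0])
```

Nothing in the loop depends on a particular worker ever reporting back, which is why a failed or departed worker does not stall the search.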

Asynchronous optimization works in theory, but does it scale? How does it compare to an iterative approach? Asynchronous Optimization

Simulating Asynchronous Optimization

Simulation Architecture: As in the asynchronous strategy, there is a Population of Individuals (1) to (n) with Fitness (1) to (n), a queue of Unevaluated Individuals (1) to (n) that is refilled when low, and Workers (fitness evaluation) that request work, are sent work, and report results that update the population. The simulation adds a Min Heap of report times: initialize the heap with a number of results equal to the number of workers; remove the result with the minimum time and report it; request new work for each reported individual; generate a report time for it and insert it into the heap.
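The min-heap loop can be sketched with Python's `heapq`. The function `simulate` and its callback parameters are hypothetical names chosen for this sketch; the stand-in search at the bottom just hands out random individuals so the block runs on its own.

```python
import heapq
import random

def simulate(request_work, report_result, fitness, n_workers, n_reports,
             report_time, seed=0):
    """Event-driven simulation; the heap orders results by report time."""
    rng = random.Random(seed)
    heap = []
    counter = 0  # tie-breaker so the heap never has to compare individuals
    # Initialize the heap with one outstanding result per worker.
    for _ in range(n_workers):
        heapq.heappush(heap, (report_time(rng), counter, request_work()))
        counter += 1
    now = 0.0
    for _ in range(n_reports):
        # Remove the result with the minimum report time and report it.
        now, _tie, individual = heapq.heappop(heap)
        report_result(individual, fitness(individual))
        # Request new work for the reported individual and schedule it.
        heapq.heappush(heap, (now + report_time(rng), counter, request_work()))
        counter += 1
    return now

# Stand-in search for the demo: random individuals, results just recorded.
work_rng = random.Random(1)
results = []
elapsed = simulate(
    request_work=lambda: [work_rng.uniform(-5, 5) for _ in range(3)],
    report_result=lambda ind, fit: results.append(fit),
    fitness=lambda x: sum(v * v for v in x),
    n_workers=4, n_reports=100,
    report_time=lambda rng: 1.0,  # homogeneous: fixed report time of 1
)
print(elapsed, len(results))  # 25.0 100
```

A heterogeneous environment could be approximated by passing a random report time instead, for example `lambda rng: rng.uniform(0.5, 2.0)`; that distribution is an assumption, not the one used in the talk.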

Test Functions

Courtesy of Sphere Function

Courtesy of Ackley Function

Courtesy of Griewank Function

Courtesy of Rastrigin Function

Courtesy of Rosenbrock Function
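For reference, the five test functions can be written as below. These are the common textbook forms; the slides do not say which exact variants, bounds, or dimensions were used, so treat the definitions as an assumption.

```python
import math

def sphere(x):
    """Sum of squares; minimum 0 at the origin."""
    return sum(v * v for v in x)

def ackley(x):
    """Ackley function; minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2 * math.pi * v) for v in x) / n
    return (-20 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20 + math.e)

def griewank(x):
    """Griewank function; minimum 0 at the origin."""
    total = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i + 1))
                        for i, v in enumerate(x))
    return total - product + 1

def rastrigin(x):
    """Rastrigin function; minimum 0 at the origin."""
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v)
                             for v in x)

def rosenbrock(x):
    """Rosenbrock function; minimum 0 at (1, ..., 1)."""
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
               for i in range(len(x) - 1))

for f in (sphere, ackley, griewank, rastrigin):
    print(f.__name__, f([0.0, 0.0, 0.0]))
print("rosenbrock", rosenbrock([1.0, 1.0, 1.0]))
```

Sphere has a single minimum, while Ackley, Griewank and Rastrigin have many local minima, which makes them harder for a search to get right.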

Simulating Homogeneous Environments: The traditional EA population size has to match the number of processors. Asynchronous EAs can use a fixed population size (100). Use a fixed report time of 1, so all work is requested and sent out simultaneously.

Simulated Homogeneous Environments

Simulating

Simulated

AGS didn’t work! ADE/best worked? Simulated

Further Reading on Simulation: Travis Desell, David P. Anderson, Malik Magdon-Ismail, Heidi Newberg, Boleslaw Szymanski and Carlos A. Varela. An Analysis of Massively Distributed Evolutionary Algorithms. In the Proceedings of the 2010 IEEE Congress on Evolutionary Computation (IEEE CEC 2010), Barcelona, Spain, July. To appear. Travis Desell. Asynchronous Global Optimization for Massive-Scale Computing. PhD thesis. Rensselaer Polytechnic Institute. 2009.

BOINC vs. BlueGene

Iterative GS on BlueGene

BOINC vs. BlueGene: Asynchronous GS on BlueGene [1]. [1] Fitness evaluation was distributed over all 1000 cores -- only one worker.

BOINC vs. BlueGene: Asynchronous GS on BlueGene [1][2]. [1] Fitness evaluation was distributed over all 1000 cores -- only one worker. [2] Used better recombination heuristic.

BOINC vs. BlueGene: Asynchronous GS on [1]. [1] With approximately 5,000 workers.

Results: Asynchronous optimization is fault tolerant by design. Asynchronous optimization also scales where traditional methods fail. Asynchronous optimization can also be faster than traditional iterative searches.

Handling Malicious Results with Verification: BOINC verifies every work unit. Only results that will be inserted into the population need to be verified. Partial Verification: ignore false negatives (results that won’t be inserted); verify results which potentially improve the search.

Workers (fitness evaluation) request work and are sent work from the queue of Unevaluated Individuals (1) to (n); generate new individuals when the queue is low. A reported result is inserted into the Verification Queue (Individuals (1) to (n) with Fitness (1) to (n)) if it could improve the population. Individuals in the verification queue are resent at a specified verification rate. Verified results are removed from the queue and inserted into the Population (Individuals (1) to (n) with Fitness (1) to (n)).
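Partial verification can be sketched as a second holding area in front of the population. The sketch below assumes that one matching re-evaluation is enough to verify a result and that lower fitness is better; the class `VerifiedPopulation`, the tolerance `tol`, and the returned status strings are hypothetical and not taken from BOINC or the talk.

```python
import bisect

class VerifiedPopulation:
    """Population that only accepts results confirmed by a second report."""

    def __init__(self, size):
        self.size = size
        self.members = []  # sorted (fitness, individual); lower is better
        self.pending = {}  # individual -> first reported fitness
        self.resend = []   # individuals waiting to be re-evaluated

    def could_improve(self, fitness):
        return len(self.members) < self.size or fitness < self.members[-1][0]

    def report_result(self, individual, fitness, tol=1e-9):
        key = tuple(individual)
        if key in self.pending:
            # Second report for this individual: compare with the first.
            first = self.pending.pop(key)
            if abs(first - fitness) <= tol and self.could_improve(fitness):
                bisect.insort(self.members, (fitness, key))
                if len(self.members) > self.size:
                    self.members.pop()  # remove the worst member
                return "inserted"
            return "rejected"
        if not self.could_improve(fitness):
            # Would not be inserted anyway, so it is never verified.
            return "discarded"
        self.pending[key] = fitness
        self.resend.append(key)  # to be sent out again for verification
        return "queued"

vp = VerifiedPopulation(size=2)
print(vp.report_result([1.0, 2.0], 5.0))  # queued
print(vp.report_result([1.0, 2.0], 5.0))  # inserted
print(vp.report_result([0.0, 0.0], 0.0))  # queued
print(vp.report_result([0.0, 0.0], 3.0))  # rejected
```

Results that could not enter the population are dropped without any re-evaluation, which is where the savings over verifying every work unit come from.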

Further Reading on Verification: Travis Desell, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos A. Varela, Heidi Newberg and David P. Anderson. Validating Evolutionary Algorithms on Volunteer Computing Grids. In the Proceedings of the 10th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS 2010), Amsterdam, Netherlands, June 2010.

Questions?

Thanks!

November 19, 2009. Particle Swarm Optimization: Particles ‘fly’ around the search space. They move according to their previous velocity and are pulled towards the global best found position and their locally best found position. Analogies: cognitive intelligence (local best knowledge), social intelligence (global best knowledge).

Particle Swarm Optimization (Example): The figure shows a particle's previous position $p_i(t-1)$, current position $p_i(t)$, velocity $v_i(t)$, local best and global best. The possible new positions are determined by three terms: $w \cdot v_i(t)$, $c_1 (l_i - p_i(t))$ and $c_2 (g - p_i(t))$. Over successive frames: the particle finds a new local best position; the particle finds the global best position; another particle finds the global best position.

Particle Swarm Optimization (details). PSO: $v_i(t+1) = w \cdot v_i(t) + c_1 r_1 (l_i - p_i(t)) + c_2 r_2 (g - p_i(t))$ and $p_i(t+1) = p_i(t) + v_i(t+1)$, where $w, c_1, c_2$ = constants; $r_1, r_2$ = random floats between 0 and 1; $v_i(t)$ = velocity of particle $i$ at iteration $t$; $p_i(t)$ = position of particle $i$ at iteration $t$; $l_i$ = best position found by particle $i$; $g$ = global best position found by all particles.
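The two update equations translate almost line for line into code. In this sketch the values of `w`, `c1`, `c2`, the zero initial velocities, and drawing fresh random numbers for every parameter are assumptions; the slides only say that `r1` and `r2` are random floats between 0 and 1.

```python
import random

def pso(f, n_params=3, n_particles=10, iterations=100, lo=-5.0, hi=5.0,
        w=0.5, c1=1.5, c2=1.5, seed=0):
    """Minimize f with a basic (synchronous) particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(n_params)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_params for _ in range(n_particles)]
    local_best = [list(p) for p in pos]
    global_best = list(min(pos, key=f))
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(n_params):
                r1, r2 = rng.random(), rng.random()
                # v_i(t+1) = w*v_i(t) + c1*r1*(l_i - p_i(t)) + c2*r2*(g - p_i(t))
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (local_best[i][j] - pos[i][j])
                             + c2 * r2 * (global_best[j] - pos[i][j]))
                # p_i(t+1) = p_i(t) + v_i(t+1)
                pos[i][j] += vel[i][j]
            if f(pos[i]) < f(local_best[i]):
                local_best[i] = list(pos[i])  # new local best position
                if f(pos[i]) < f(global_best):
                    global_best = list(pos[i])  # new global best position
    return global_best

def sphere(x):
    return sum(v * v for v in x)

best = pso(sphere)
print(best, sphere(best))
```

The global best is only replaced by a strictly better position, so the returned fitness is never worse than the best of the initial swarm.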

Differential Evolution (In Brief): In general: perform binary or exponential recombination between the current individual and another individual modified by a scaled difference between n pairs of other individuals. Many variations, named Parent Selection / Number of Pairs / Recombination: best/n/bin, best/n/exp, random/n/bin, random/n/exp, current/n/bin, current/n/exp.

Differential Evolution (Example): current: $p_i(t)$; parent: $r_0$; pair 1: $r_1$; pair 2: $r_2$; scaled differential: $c(r_1 - r_2)$; target: $r_0 + c(r_1 - r_2)$; result: recombine(current, target).

Differential Evolution (Details). DE (best/1/bin): $p_{i,j}(t+1) = g_j(t) + c \, (p_{r_1,j}(t) - p_{r_2,j}(t))$ if $r_3 = j$ or $r_4 < cr$; $p_{i,j}(t+1) = p_{i,j}(t)$ otherwise. If $f(p(t+1)) < f(p(t))$ then $p(t+1) = p(t)$. Here $p_{i,j}(t)$ = $j$th parameter of the $i$th member of the population at iteration $t$; $g_j$ = $j$th parameter of the global best member at iteration $t$; $c$ = scaling factor; $r_1, r_2$ = random ints between 0 and the population size, $r_1 \neq r_2$; $r_3$ = random int between 0 and the number of parameters; $r_4$ = random float between 0 and 1; $cr$ = crossover rate.
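A best/1/bin step can be sketched as follows. Two things differ from the slide and are assumptions of this sketch: the random float compared against `cr` is drawn once per parameter, and the code minimizes `f`, so a trial replaces the current member only when its fitness is lower (the slide's rule reads as the maximization form of the same selection). The values of `c` and `cr` are illustrative.

```python
import random

def de_best_1_bin(f, n_params=3, pop_size=10, iterations=100,
                  lo=-5.0, hi=5.0, c=0.5, cr=0.5, seed=0):
    """Minimize f with differential evolution, best/1/bin variant."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(iterations):
        best = min(pop, key=f)  # g: global best member at this iteration
        new_pop = []
        for i in range(pop_size):
            r1, r2 = rng.sample(range(pop_size), 2)  # guarantees r1 != r2
            r3 = rng.randrange(n_params)  # this parameter always crosses over
            trial = []
            for j in range(n_params):
                if j == r3 or rng.random() < cr:
                    # g_j + c * (p_{r1,j} - p_{r2,j})
                    trial.append(best[j] + c * (pop[r1][j] - pop[r2][j]))
                else:
                    trial.append(pop[i][j])
            # Selection: keep the trial only if it improves on the current member.
            new_pop.append(trial if f(trial) < f(pop[i]) else pop[i])
        pop = new_pop
    return min(pop, key=f)

def sphere(x):
    return sum(v * v for v in x)

best = de_best_1_bin(sphere)
print(best, sphere(best))
```

Forcing parameter `r3` to cross over ensures every trial differs from its current member in at least one position.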