CSE298 CSE300 DGA-1 CSE300 Agent-based Distributed Genetic Algorithms Rodrigo E. Caballero Computer Science & Engineering Department University of Connecticut email@example.com Mingjun Song Department of Natural Resources Management and Engineering University of Connecticut firstname.lastname@example.org December 12, 2000
CSE298 CSE300 DGA-2 CSE300 Topics
o Introduction to Genetic Algorithms
o Parallel Genetic Algorithm Approaches
o Agent-based Distributed Genetic Algorithm
o Experiments and Results
o Conclusions and Future Directions
CSE298 CSE300 DGA-3 CSE300 Why are Some Problems Difficult to Solve?
o Real-world problems are characterized by:
  o Mathematical intractability
  o Large search spaces
  o Time variation
  o Noisy observations
  o Competitors
o Limitations of classical methods:
  o They often fail to address the real problem at hand
  o Simplification leads to the right answer to the wrong problem
CSE298 CSE300 DGA-4 CSE300 Evolution Addresses These Problems
o Evolution is a two-step process of random variation and natural selection
  o Variation creates diversity
  o Diversity is heritable
  o Selection eliminates inappropriate individuals
o The process of evolution follows the process of the scientific method
  o Individuals serve as hypotheses tested in light of an environment
  o Worthwhile “ideas” are retained and extended
  o Unsuitable “ideas” are purged
CSE298 CSE300 DGA-5 CSE300 Genetic Algorithm
o Stochastic search method inspired by natural evolution
o Population-based random variation and selection applied to data structures in light of a goal and initialization
o Characteristics:
  o Genetic operators (selection, crossover, mutation)
  o A population of chromosomes (bit strings) encodes candidate solutions in the search space
  o Selection is based on the fitness of each chromosome
CSE298 CSE300 DGA-8 CSE300 Representation
o The efficiency and complexity of the search depend on the representation and its relation to the search operators
o GAs usually represent the decision variables as binary strings (chromosomes)
o Gray code represents adjacent integer values by bit strings with Hamming distance 1
o Example: the chromosome 101001, read as two 3-bit Gray-coded values, encodes x = (6,1)
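The Gray-code example on this slide can be checked with a short sketch. The function names and the 3-bits-per-variable split are assumptions made for illustration; the slides do not show the decoding routine that was actually used.

```python
def gray_to_int(bits: str) -> int:
    """Decode a Gray-coded bit string (most significant bit first)."""
    value = 0
    for ch in bits:
        # Next binary bit = previous binary bit XOR current Gray bit.
        value = (value << 1) | ((value & 1) ^ int(ch))
    return value


def int_to_gray(n: int) -> int:
    """Encode a non-negative integer as its Gray-code integer."""
    return n ^ (n >> 1)


def decode_chromosome(chromosome: str, bits_per_variable: int) -> tuple:
    """Split a chromosome into equal-width fields and Gray-decode each one."""
    return tuple(
        gray_to_int(chromosome[i:i + bits_per_variable])
        for i in range(0, len(chromosome), bits_per_variable)
    )


print(decode_chromosome("101001", 3))  # (6, 1)
```

Adjacent integers differ in exactly one bit under this encoding, which is the Hamming-distance-1 property the slide mentions.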
CSE298 CSE300 DGA-9 CSE300 Mutation
o Maintains diversity
o The mutation probability per bit is usually very small
o Example bit strings from the slide: 101001, 100010
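A minimal sketch of per-bit mutation, assuming chromosomes are stored as strings of "0" and "1". The function name and the explicit random-number-generator argument are illustrative choices, not taken from the presentation.

```python
import random


def mutate(chromosome: str, p_mutation: float, rng: random.Random) -> str:
    """Flip each bit independently with probability p_mutation."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < p_mutation else bit
        for bit in chromosome
    )


rng = random.Random(0)
child = mutate("101001", 0.05, rng)  # most bits survive at such a low rate
```

Passing the generator explicitly keeps runs reproducible, which helps when averaging results over several experiments.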
CSE298 CSE300 DGA-10 CSE300 Crossover
o Exploits the useful information contained in a pair of parents
o [Figure: two parent bit strings are cut at the crossover point and exchange segments to produce two offspring]
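One-point crossover, the variant suggested by the single "crossover point" on this slide, might be sketched as follows. The string representation and the function name are assumptions for illustration.

```python
def one_point_crossover(parent_a: str, parent_b: str, point: int) -> tuple:
    """Swap the tails of two equal-length parents after position `point`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("crossover point must fall strictly inside the string")
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


print(one_point_crossover("000000", "111111", 2))  # ('001111', '110000')
```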
CSE298 CSE300 DGA-12 CSE300 Parallel Genetic Algorithm – Global Parallelization
o Each processor runs an independent GA on a local computer
o Subpopulations never interact
o Relatively easy to implement
o Highly redundant
CSE298 CSE300 DGA-13 CSE300 Parallel Genetic Algorithm – Fine Grained Parallel GAs
o Partition the population into a large number of very small subpopulations
o Calls for massively parallel computers
CSE298 CSE300 DGA-14 CSE300 Parallel Genetic Algorithm – Coarse Grained parallelism
o Population is divided into a few subpopulations
o A migration operator is introduced
o Migration is used to send some individuals from one subpopulation to another
CSE298 CSE300 DGA-15 CSE300 Parallel Genetic Algorithm – Coarse Grained parallelism
o Two population-genetics models:
  o Island model. Individuals can migrate to any other subpopulation.
  o Stepping stone model. Migration is restricted to neighboring subpopulations.
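The difference between the two migration models comes down to which destinations a subpopulation may send migrants to. The sketch below numbers the subpopulations 0..n-1 and, as an assumption not stated on the slides, arranges them in a ring for the stepping stone case; other neighborhood shapes (for example a grid) are equally possible.

```python
def island_destinations(source: int, n_demes: int) -> list:
    """Island model: every other subpopulation is a valid destination."""
    return [d for d in range(n_demes) if d != source]


def stepping_stone_destinations(source: int, n_demes: int) -> list:
    """Stepping stone model on an assumed ring: only the two neighbors."""
    neighbors = {(source - 1) % n_demes, (source + 1) % n_demes}
    neighbors.discard(source)  # a single deme has no neighbor but itself
    return sorted(neighbors)


print(island_destinations(0, 5))          # [1, 2, 3, 4]
print(stepping_stone_destinations(0, 5))  # [1, 4]
```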
CSE298 CSE300 DGA-16 CSE300 Agent-Based Distributed GA
o Multiple-population, coarse-grained
o Synchronous migration
  o Migrations occur at a predetermined constant interval
o Dynamic migration topology
  o The destination of the migrants is chosen at random among all the nodes involved in the computation
o Tournament-based selection
o Deterministic crowding
  o Allows finding multiple optima in multimodal search spaces by reducing the selection pressure between distant individuals
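The replacement step of deterministic crowding can be sketched as below. This follows the commonly described form of the scheme (each child competes with the parent it most resembles) and assumes bit-string individuals, Hamming distance as the similarity measure, and a cost function to be minimized; the presentation does not show its own implementation, so all names here are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def crowding_replacement(p1: str, p2: str, c1: str, c2: str, cost) -> list:
    """Pair each child with its closest parent; keep the better of each pair."""
    if hamming(p1, c1) + hamming(p2, c2) <= hamming(p1, c2) + hamming(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    # Lower cost wins; ties go to the child so the population keeps moving.
    return [child if cost(child) <= cost(parent) else parent
            for parent, child in pairs]


# Toy cost: number of 1-bits (hypothetical, for demonstration only).
survivors = crowding_replacement("0000", "1111", "1110", "0001",
                                 lambda s: s.count("1"))
print(survivors)  # ['0000', '1110']
```

Because a child only ever displaces a similar parent, individuals sitting on different optima do not compete directly, which is how the scheme reduces selection pressure between distant individuals.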
CSE298 CSE300 DGA-17 CSE300 System Overview
[Figure: three GA populations connected through a local area network]
CSE298 CSE300 DGA-23 CSE300 Sequence Diagram Part
CSE298 CSE300 DGA-24 CSE300 AgentBasedGA class
Implements the main() method, which:
o Starts up Voyager
o Assigns the IP address and constructs the Directory
o Creates the GAProxy remote objects
o Instantiates the GeneticAlgorithm agent
o Moves the GeneticAlgorithm agent
CSE298 CSE300 DGA-25 CSE300 GAProxy class
Responsible for registering the GA, registering and checking out SubPopulation objects, and locating the remote host.
o Directory directory. Provides the service used to look up the remote host
o Vector queueSubPopulation. Stores the SubPopulation objects arriving from other GeneticAlgorithm agents
o Boolean isAlive. State of the GAProxy; controls whether a subpopulation should move
o Methods Register(SubPopulation), SubPopulationQueueSize(), checkOutSubPopulation(int), clearSubPopulationQueue(). Manage the SubPopulation objects
o Method unregisterGA(). Lets the GeneticAlgorithm agent be reclaimed by garbage collection
CSE298 CSE300 DGA-26 CSE300 SubPopulation class
Represents the migrants exchanged between two GeneticAlgorithm agents.
o Method moveTo(String). Uses Agent.of(this).moveTo( String url, String callback [, Object args ] )
o Method onArrival(). One-way callback invoked after the SubPopulation agent moves to the remote host; it calls the Register method of the remote GAProxy object to add this SubPopulation to the queue
CSE298 CSE300 DGA-27 CSE300 GAState class
Represents the internal state of the GA and is used to report the results of the GA to the user.
o Vector lnkBestFitness. Stores the best fitness of every generation
o Method moveTo(String). Moves the GAState object to the local host
o Method onArrival(). One-way callback; it invokes reportResult() to report the result
o Method save(). Called by the GeneticAlgorithm to save the best fitness of the population
CSE298 CSE300 DGA-28 CSE300 Directory class
Implements a lookup table for the processing nodes involved in the computation. Contained in each GAProxy object.
o Vector vecHost. Contains the list of IP addresses
o Method addHost(String)
o Method getRndHost(String). Returns a random host other than the one passed as the parameter
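The behavior described for the Directory class can be illustrated with a small Python analogue. This is not the original Java/Voyager code: the method names are adapted to Python style, and skipping duplicate hosts and raising an error when no other host exists are assumptions about behavior the slide leaves unspecified.

```python
import random


class Directory:
    """Lookup table of processing nodes (Python analogue, for illustration)."""

    def __init__(self, rng=None):
        self.hosts = []                       # plays the role of vecHost
        self._rng = rng or random.Random()

    def add_host(self, address: str) -> None:
        # Assumption: registering the same address twice has no effect.
        if address not in self.hosts:
            self.hosts.append(address)

    def get_random_host(self, exclude: str) -> str:
        """Return a random registered host other than `exclude`."""
        candidates = [h for h in self.hosts if h != exclude]
        if not candidates:
            raise LookupError("no other host is registered")
        return self._rng.choice(candidates)


directory = Directory(random.Random(1))
for address in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):  # hypothetical addresses
    directory.add_host(address)
destination = directory.get_random_host("10.0.0.1")   # never "10.0.0.1"
```

Excluding the caller's own address is what makes the migration topology dynamic while still guaranteeing that migrants leave their node.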
CSE298 CSE300 DGA-29 CSE300 Interface
o IGAProxy, ISubPopulation and IGAState are the interfaces of the classes GAProxy, SubPopulation and GAState, respectively.
o They contain no code and only declare the method signatures that are defined in their classes.
o In Voyager, a remote object is represented by a special proxy object that implements the same interfaces as its remote counterpart.
CSE298 CSE300 DGA-30 CSE300 Rosenbrock’s Valley
CSE298 CSE300 DGA-31 CSE300 Rosenbrock’s Valley Global minimum: f(x) = 0 at x_i = 1, i = 1, 2
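The transcript does not reproduce the formula itself. The sketch below uses the standard two-variable Rosenbrock definition, f(x1, x2) = 100·(x2 - x1^2)^2 + (1 - x1)^2, which is consistent with the global minimum stated on the slide; whether the presentation used exactly these constants is an assumption.

```python
def rosenbrock(x1: float, x2: float) -> float:
    """Standard two-variable Rosenbrock function (to be minimized)."""
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


print(rosenbrock(1.0, 1.0))   # 0.0, the global minimum
print(rosenbrock(-1.0, 1.0))  # 4.0, on the valley floor but far from the minimum
```

Points with x2 = x1^2 lie on the floor of the curved valley, where the function changes slowly; that flat stretch is what makes the minimum hard to locate.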
CSE298 CSE300 DGA-32 CSE300 Assumptions
o 10-bit binary representation mapped to [-2.048, 2.048]
o Population size
o Migration size
o Migration interval: 10 generations
o Number of generations: 100
o Crossover (Prob.=1.0) / Mutation (Prob.=0.0)
o Methodology: results are the average of 5 experiments
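One way to map a 10-bit string onto [-2.048, 2.048] is linear scaling of the integer value, sketched below. Plain binary decoding with both endpoints included is an assumption; the slides do not say whether the experiments used plain binary or Gray code for this step.

```python
def decode(bits: str, lower: float = -2.048, upper: float = 2.048) -> float:
    """Map a bit string linearly onto the closed interval [lower, upper]."""
    max_int = 2 ** len(bits) - 1            # 1023 for 10 bits
    return lower + int(bits, 2) * (upper - lower) / max_int


print(decode("0000000000"))  # -2.048
step = (2.048 - (-2.048)) / (2 ** 10 - 1)   # resolution, about 0.004
```

With 10 bits per variable the search grid has a spacing of roughly 0.004, so the GA can only approach the optimum to within that resolution.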
CSE298 CSE300 DGA-33 CSE300 Execution Time vs. Number of Nodes
CSE298 CSE300 DGA-34 CSE300 Time vs. Generation
CSE298 CSE300 DGA-35 CSE300 Execution Time vs. Generation
CSE298 CSE300 DGA-36 CSE300 Computation and Communication Times
CSE298 CSE300 DGA-37 CSE300 Observation
o The parallel GA had worse performance
  o The communication overhead is too large compared with the time spent searching for the solution
o What happens if the objective function is more complex?
  o The experiments were repeated with a 5-second delay added to each fitness calculation
CSE298 CSE300 DGA-38 CSE300 Execution Time vs. Number of Nodes
CSE298 CSE300 DGA-39 CSE300 Computation and Communication Times
CSE298 CSE300 DGA-40 CSE300 Communication/Computation Ratio
CSE298 CSE300 DGA-41 CSE300 Average Fitness vs. Generation
CSE298 CSE300 DGA-43 CSE300 Error vs. Number of Nodes
CSE298 CSE300 DGA-44 CSE300 Experiment: Ackley’s Path Function
Ackley's Path is a widely used multimodal test function.
o Function definition: f10(x) = -a·exp(-b·sqrt(1/n·sum(x(i)^2))) - exp(1/n·sum(cos(c·x(i)))) + a + exp(1); a = 20; b = 0.2; c = 2·pi; i = 1:n; -32.768 <= x(i) <= 32.768
o Global minimum: f(x) = 0; x(i) = 0, i = 1:n
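The function definition above translates directly into code. The sketch below follows the slide's formula and constants; the function name and the list-of-floats argument are illustrative choices.

```python
import math


def ackley(x, a: float = 20.0, b: float = 0.2, c: float = 2 * math.pi) -> float:
    """Ackley's Path function for a sequence of n real values."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(c * v) for v in x) / n
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


value_at_origin = ackley([0.0, 0.0])  # zero up to floating-point rounding
```

Away from the origin the cosine term produces a regular grid of local minima, which is what makes the function a useful multimodal benchmark.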
CSE298 CSE300 DGA-45 CSE300 Ackley’s Path Function Graphic 1
CSE298 CSE300 DGA-46 CSE300 Ackley’s Path Function Graphic 2
CSE298 CSE300 DGA-47 CSE300 Ackley’s Path Function Problem Methodology
Designed to compare the performance of the agent-based distributed GA and the serial GA, and to test whether the agent-based distributed GA can reach a better solution.
o Number of alleles per individual: 80
o Number of individuals in the population: 100
o In both cases, the GA evolves for 100 generations and then stops
o In the distributed case, 10 individuals are exchanged among the distributed GAs every 10 generations
o Each case is run 3 times because the initial population is generated randomly
CSE298 CSE300 DGA-48 CSE300 Ackley’s Path Function Result Table
Experiment   Agent-based GA   Serial GA
1            3.27E-8          1.73E-4
2            7.82E-9          5.57E-6
3            6.17E-9          1.79E-6
CSE298 CSE300 DGA-49 CSE300 Ackley’s Path Result Figure 1
CSE298 CSE300 DGA-50 CSE300 Ackley’s Path Result Figure 2
CSE298 CSE300 DGA-51 CSE300 Ackley’s Path Experiment Conclusion
o The agent-based distributed GA can attain a better final result than the serial GA.
o The performance of the serial GA is not steady; therefore, the distributed GA can be employed to solve complex problems where possible.
o The serial GA converges quickly in the beginning stage. The exchange of individuals in the distributed GA slows down convergence at the beginning but increases the variation in the population, which helps it attain a better final result.
CSE298 CSE300 DGA-52 CSE300 Conclusions
o Parallel GA does not speed up the computation in simple problems
o With a small communication/computation ratio, the speed-up is of the form C/n
o The communication/computation ratio increases with the number of nodes
o Speed-up does not affect the quality of the solution
CSE298 CSE300 DGA-53 CSE300 Future Research
o Migration policies
o Migration topologies
o Impact on solution quality
o Speed-up model
o Communication/computation ratio
o Multiobjective optimization
o Performance comparison with non-Java implementations
o Deme sizes