Presentation on theme: "1 November 2005 Stefano Nolfi* Dario Floreano~ *Institute of Psychology, National Research Council Viale Marx 15, Roma, Italy ~LAMI - Laboratory of Microcomputing."— Presentation transcript:
1 November 2005 Stefano Nolfi* Dario Floreano~ *Institute of Psychology, National Research Council Viale Marx 15, Roma, Italy ~LAMI - Laboratory of Microcomputing Swiss Federal Institute of Technology EPFL, Lausanne, Switzerland December 1997 (revised May 1998) Co-evolving predator and prey robots: Do ‘arms races’ arise in artificial evolution? Presented by Assaf Glazer
2 November 2005 MAIN TOPICS: Abstract; Introduction; Co-Evolving Predator and Prey Robots; The Experimental Model; Results; Conclusions; Summary
3 November 2005 ABSTRACT
Cooperative vs. competitive co-evolution.
Co-evolution: the evolution of two or more competing populations with coupled fitness.
We investigate the role of co-evolution in the context of evolutionary robotics.
How may co-evolution enhance the adaptive power of artificial evolution?
Under what conditions can co-evolution lead to “arms races”?
We will show that in some cases artificial co-evolution has a higher adaptive power than simple evolution.
5 November 2005 INTRODUCTION
Co-evolution has several features that may enhance the adaptive power of artificial evolution:
o Increasingly complex evolving challenges, which produce an “arms race”.
o Individual fitness depends on the other population, which varies during the evolutionary process, so more general solutions are selected.
o The ever-changing fitness landscape helps prevent stagnation in local minima.
Unfortunately, co-evolving populations may cycle. Cycling may cancel out all the advantages described above.
We will try to understand under which conditions co-evolution can lead to “arms races”.
6 November 2005 CO-EVOLVING PREDATOR AND PREY ROBOTS
Co-evolution is studied in the context of predators and prey in simulation.
The simulations were based on real Khepera robots.
Predators and prey belong to different species with different sensory and motor characteristics:
o The predator had a vision module.
o The prey had a maximum available speed set to twice that of the predator.
7 November 2005 THE EXPERIMENTAL MODEL
A square arena, 47 x 47 cm in size.
Both individuals were provided with eight infrared proximity sensors.
The predator had a view angle of 36°, divided into five sectors.
Neural network architecture:
o Two sigmoid units with recurrent connections.
o Predator: connections from 8 infrared sensors + 5 photoreceptors.
o Prey: connections from 8 infrared sensors; the speed output is multiplied by 2 before setting the wheel speed.
o Connection weight evolution.
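The controller described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `controller_step`, the weight layout, and the direct mapping from unit output to wheel speed are assumptions made for the example. Only the two sigmoid units, the recurrent connections, the thresholds, and the prey's factor of 2 come from the slide.

```python
import math

def sigmoid(x):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def controller_step(sensors, prev_outputs, w_in, w_rec, thresholds, speed_gain=1.0):
    """One update of a two-unit recurrent controller.

    sensors      -- sensor readings (13 for the predator, 8 for the prey)
    prev_outputs -- the two unit outputs from the previous update
    w_in         -- 2 rows of input weights, one row per unit
    w_rec        -- 2 x 2 recurrent weights between the units
    thresholds   -- one threshold per unit
    speed_gain   -- 2.0 for the prey, 1.0 for the predator
    """
    outputs = []
    for j in range(2):
        net = sum(w * s for w, s in zip(w_in[j], sensors))
        net += sum(w * o for w, o in zip(w_rec[j], prev_outputs))
        outputs.append(sigmoid(net - thresholds[j]))
    # Assumption: the unit output is used directly as the wheel speed command.
    wheel_speeds = [speed_gain * o for o in outputs]
    return outputs, wheel_speeds
```

With all weights and thresholds at zero, both units output 0.5, so the prey's wheel command is twice the predator's for the same network state.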
8 November 2005 THE EXPERIMENTAL MODEL: Encoding
16 synapses from the infrared sensors.
4 synapses from recurrent connections between the perceptrons.
2 sigmoid thresholds.
10 synapses from the vision sensors (predator only).
8 bits per parameter.
Predator: genotype of 8 * (30 synapses + 2 thresholds) = 256 bits.
Prey: genotype of 8 * (20 synapses + 2 thresholds) = 176 bits.
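The genotype sizes on this slide follow from simple arithmetic, which the sketch below reproduces. The `decode` helper is hypothetical: the slide does not say how an 8-bit field maps to a weight, so the linear mapping onto [-1, 1] is purely an assumption for illustration.

```python
BITS_PER_PARAM = 8

def genotype_length(n_synapses, n_thresholds=2):
    """Number of bits needed to encode all synapses and thresholds."""
    return BITS_PER_PARAM * (n_synapses + n_thresholds)

# Synapse counts taken from the slide: infrared + recurrent (+ vision).
PREDATOR_BITS = genotype_length(16 + 4 + 10)  # 8 * 32
PREY_BITS = genotype_length(16 + 4)           # 8 * 22

def decode(bits, low=-1.0, high=1.0):
    """Map each 8-bit field to a real value in [low, high].

    Assumption: plain binary code, scaled linearly. The range is not
    given in the slides.
    """
    if len(bits) % BITS_PER_PARAM != 0:
        raise ValueError("genotype length must be a multiple of 8")
    params = []
    for i in range(0, len(bits), BITS_PER_PARAM):
        field = bits[i:i + BITS_PER_PARAM]
        value = int("".join(str(b) for b in field), 2)
        params.append(low + (high - low) * value / 255)
    return params
```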
9 November 2005 THE EXPERIMENTAL MODEL: 1st Experiment
Experimental Parameters:
The competition ended either when the predator touched the prey or after 500 motor updates.
Each individual was tested against the best competitors of the ten previous generations.
Evolutionary Parameters:
Two populations of 100 individuals.
100 generations.
Initial population: randomly assigned genotypes.
Fitness function: sum of the scores (1 or 0) over the 10 competitions.
Selection: the best 20 individuals were allowed to reproduce, with 5 offspring each.
One-point crossover.
Random mutation: pm = 0.02.
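One generation of the selection scheme listed above might look like the sketch below. Several details are assumptions, since the slide does not give them: the mate is drawn at random from the 20 selected parents, crossover is applied to every offspring, and mutation is a per-bit flip with probability pm. The function name `next_generation` is made up for this example.

```python
import random

def next_generation(population, fitness, rng, n_parents=20, n_offspring=5, p_mut=0.02):
    """Truncation selection, one-point crossover, bit-flip mutation.

    population -- list of genotypes, each a list of 0/1 bits
    fitness    -- one score per genotype (0..10 in the experiments)
    rng        -- a random.Random instance
    """
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    parents = [population[i] for i in ranked[:n_parents]]
    children = []
    for parent in parents:
        for _ in range(n_offspring):
            mate = rng.choice(parents)               # assumption: random mate
            cut = rng.randrange(1, len(parent))      # one crossover point
            child = parent[:cut] + mate[cut:]
            # Assumption: each bit flips independently with probability p_mut.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            children.append(child)
    return children
```

With 20 parents and 5 offspring each, the new population again has 100 individuals, matching the population size on the slide.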
10 November 2005 THE EXPERIMENTAL MODEL: 1st Experiment
Monitoring Parameters:
“Red Queen Effect”: it is hard to monitor progress by measuring fitness across generations, because each individual’s fitness depends on opponents that are themselves evolving.
“Master Tournament”: avoids this problem by testing the best individual of each generation against all the best competing ancestors.
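A Master Tournament can be sketched as a simple cross-table. In this illustration the best individual of each generation is scored against the best opponents of every generation, which is one reading of the slide. The names `master_tournament` and `a_wins` are hypothetical, and the toy "individuals" are just numbers.

```python
def master_tournament(best_a, best_b, a_wins):
    """Score the best individual of each generation of population A.

    best_a, best_b -- best individual per generation for each population
    a_wins(a, b)   -- returns True if individual a defeats individual b
    Returns, per generation, the fraction of B's champions that A's
    champion of that generation defeats.
    """
    return [sum(a_wins(a, b) for b in best_b) / len(best_b) for a in best_a]

# Toy example: an individual is a number, and the larger number wins.
scores = master_tournament([1, 3, 5], [2, 4, 6], lambda a, b: a > b)
```

Because every champion faces the same fixed set of opponents, scores from different generations are comparable, which is not the case for fitness measured during co-evolution.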
11 November 2005 RESULTS: 1st Experiment
Performance does not increase at all across generations.
Sudden drops occur.
Effective strategies may be lost instead of being retained and refined.
Cycling (A – predator, B – prey):
o A1 – Chasing the prey.
o A2 – Tracking the prey and attacking on special occasions.
o B1 – Staying still close to walls.
o B2 – Moving fast.
o A1 > B1, B2 > A1, A2 > B2, B1 > A2
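The dominance relation on this slide forms a closed loop, which a few lines of code make explicit. This is only an illustration of the cycle: it treats each "beats" relation as deterministic, whereas the advantage in the experiments is probabilistic. The names `BEATS`, `COUNTER`, and `best_response_sequence` are made up for the example.

```python
# winner -> loser, as listed on the slide
BEATS = {"A1": "B1", "B2": "A1", "A2": "B2", "B1": "A2"}

# strategy -> the opposing strategy that defeats it
COUNTER = {loser: winner for winner, loser in BEATS.items()}

def best_response_sequence(start, steps):
    """Follow the chain of counter-strategies starting from `start`."""
    sequence = [start]
    for _ in range(steps):
        sequence.append(COUNTER[sequence[-1]])
    return sequence

cycle = best_response_sequence("A1", 4)
```

Starting from A1, four successive counter-strategies lead back to A1, so neither population ends up with a strategy that beats everything the other side has tried.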
12 November 2005 RESULTS: 1st Experiment
The cycling process is generally driven by the prey.
The efficacy and generality of the selected strategies do not increase.
In fact, individuals of later generations do not necessarily score well against competitors of much earlier generations.
14 November 2005 CONCLUSIONS: 1st Experiment
The experiment is not simple to interpret because:
o There are many different strategies.
o The advantage of one strategy over another is probabilistic.
o It is hard to tell which strategy a given generation is converging toward.
The cycling can be clearly identified.
“Hall of Fame”: fights cycling by testing individuals against all previously discovered solutions.
15 November 2005 THE EXPERIMENTAL MODEL: 2nd Experiment
Experimental Parameters:
“Hall of Fame” competitions: 10 opponents randomly selected from all previous generations.
All other parameters remain the same.
Evolutionary Parameters: the same.
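The difference between the two opponent-selection schemes can be sketched as below. `best_history` is a hypothetical list holding the best opponent of each past generation, oldest first. What happens in the first generations, when fewer than 10 champions exist, is an assumption (all available champions are used).

```python
import random

def standard_opponents(best_history, n=10):
    """1st experiment: the best competitors of the n most recent generations."""
    return best_history[-n:]

def hall_of_fame_opponents(best_history, rng, n=10):
    """2nd experiment: n champions drawn at random from all past generations."""
    if len(best_history) <= n:
        return list(best_history)  # assumption: use everything available
    return rng.sample(best_history, n)

history = list(range(50))  # toy champions, identified by generation number
recent = standard_opponents(history)
sampled = hall_of_fame_opponents(history, random.Random(0))
```

The standard scheme only ever sees recent opponents, so an old counter-strategy can become effective again, while sampling from the whole history keeps old strategies in the test set.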
16 November 2005 RESULTS: 2nd Experiment
We obtain a progressive increase in performance.
The same classes of strategies appear, but they are evolutionarily more stable.
This enables the co-evolutionary process to progressively refine current strategies.
17 November 2005 RESULTS: 2nd Experiment
The evolutionary process finds strategies that are more general.
This hypothesis is verified in the following experiment:
18 November 2005 CONCLUSIONS: 2nd Experiment
As the process goes on, there is less and less pressure to discover strategies that are effective against the opponent of the current generation.
This type of solution is of course implausible from a biological point of view.
The prey cannot improve its strategy above a certain level.
The length of ‘arms races’ may vary in different conditions.
Next step: increase the richness of the prey’s sensory system.
19 November 2005 THE EXPERIMENTAL MODEL: 3rd Experiment
Experimental Parameters:
The prey is given a camera with a view angle of 240°, divided into 5 sectors of 48°.
Prey and predator have 13 sensors each.
All other parameters remain the same as in the 1st experiment.
Evolutionary Parameters:
Prey and predator have the same genotype length.
All parameters remain the same as in the 1st experiment.
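How a camera field of view splits into sectors can be illustrated with a short helper. This is a sketch under assumptions: the bearing is measured in degrees relative to the robot's heading, the field of view is centred on the heading, and the sectors are of equal width. The function name `sector_index` is hypothetical.

```python
def sector_index(bearing_deg, view_angle=240.0, n_sectors=5):
    """Return the index (0 = leftmost) of the sector containing a bearing.

    bearing_deg -- angle to the target relative to the heading, in degrees
    Returns None if the target is outside the field of view.
    """
    half = view_angle / 2
    if not -half <= bearing_deg < half:
        return None
    sector_width = view_angle / n_sectors  # 48 degrees for the prey camera
    return int((bearing_deg + half) // sector_width)
```

The same helper covers the predator's camera by passing `view_angle=36.0`, which gives five sectors of 7.2° each.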
20 November 2005 RESULTS: 3rd Experiment
The prey in general outperform the predators.
A significant increase in performance is observed in both populations.
21 November 2005 RESULTS: 3rd Experiment
With ‘Hall of Fame’ selection, performance measured using the Master Tournament increased even more.
However, if we test ‘Standard’ vs. ‘Hall of Fame’ individuals, the results do not remain the same (figure below):
‘Standard’ is better in all criteria.
Besides this case, it is always easier to defeat the ‘Hall of Fame’ individual.
22 November 2005 CONCLUSIONS: 3rd Experiment
The ‘Hall of Fame’ might become even less effective across generations.
Using simulated robots instead of real ones affects the course of the evolutionary process.
The experimenter may unintentionally introduce constraints.
By changing the initial conditions, ‘arms races’ can continue to produce better and better solutions in both populations.
If one or both sides fail to improve, the process is likely to end in a limit cycle.
The richness of the environment may prevent the cycling.
23 November 2005 THE EXPERIMENTAL MODEL: 4th Experiment
Experimental Parameters:
Five different environments (right figure).
10 epochs, 2 per environment.
All other parameters remain the same as in the ‘Standard’ 1st experiment.
Evolutionary Parameters:
Prey and predator have the same genotype length.
All parameters remain the same as in the 1st experiment.
24 November 2005 RESULTS: 4th Experiment
A significant increase in the performance of the best replication.
The average results, however, show a slight increase only in the first 20 generations.
25 November 2005 CONCLUSIONS: 4th Experiment
The richness of the environment may delay the convergence of the co-evolutionary process towards a limit cycle.
The larger the number of fixed constraints, the lower the importance of the co-evolutionary dynamics may be.
How can co-evolution enhance the adaptive power of artificial evolution?
Can artificial co-evolution solve tasks that cannot be solved using a simple evolutionary process?
26 November 2005 THE EXPERIMENTAL MODEL: 5th Experiment
Experimental Parameters:
Increasing the problem complexity:
o Predator and prey were equipped with 8 ambient light sensors.
o 60 x 60 cm environment with 13 cylindrical obstacles.
Each prey individual was tested against the best predator obtained using co-evolution, and conversely for each predator individual.
All other parameters remain the same.
Evolutionary Parameters:
Simple evolution.
One population of 100 prey.
One population of 100 predators.
Initial population: randomly assigned genotypes.
All other parameters remain the same as in the ‘Standard’ selection.
27 November 2005 RESULTS: 5th Experiment
A significant increase in the performance of both the average and the best replications.
Predators of the very first generations have close to null performance.
We ran a new set of experiments in which predators competed against the best prey obtained using simple evolution, and conversely for prey:
– In 8 cases out of 10, simple evolution failed to select predators able to catch the co-evolved prey.
28 November 2005 CONCLUSIONS: 5th Experiment
Simple evolution can create very effective prey or predators against the best co-evolved predators or prey, respectively.
“Bootstrap Problem”: the problem arises when starting from scratch.
Two reasons why co-evolution can have a higher adaptive power than simple evolution:
o Individuals face a larger number of different environmental events.
o The emergence of ‘arms races’.
29 November 2005 SUMMARY
Evolutionary robotics is a promising new approach.
Fighting the “Bootstrap Problem” by:
o ‘Incremental evolution’ (supervision required).
o Using co-evolution to produce increasingly complex challenges.
The “Cycling Problem”:
o Preserving previous solutions may affect the evolutionary pressure.
o Like the local minima problem, it is an intrinsic problem.
o When both sides can produce better strategies, ‘arms races’ may last longer.
o The richness of the environment may limit the cycling problem.
Co-evolution may succeed in producing individuals able to cope with very effective competitors while simple evolution is unable to do so.
30 November 2005 SUMMARY – Cont.
If completely general solutions do not exist, we should reconsider the ‘cycling problem’.
In that case, the best we can do is to select the appropriate strategy for the current counter-strategy.
Co-evolution will lead to increased complexity when completely general solutions exist and can be selected. Otherwise, it may lead to cycling.
“Fully General” vs. “Plastic General”.
In most of our experiments, simple ‘plastic-general’ solutions can be found while ‘fully-general’ solutions cannot.
31 November 2005 NEURAL NETWORKS The motivation.
32 November 2005 NEURAL NETWORKS – Cont.
The Perceptron model.
Threshold functions (figure): step function, sign function, linear function, sigmoid function.
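The four threshold functions named on this slide can be written out as below. The slide's own formulas did not survive in the transcript, so these are the common textbook forms. The value returned exactly at the threshold and the unit slope of the linear function are assumptions.

```python
import math

def step(x, theta=0.0):
    """Step function: 1 at or above the threshold theta, otherwise 0."""
    return 1.0 if x >= theta else 0.0

def sign(x):
    """Sign function: +1 for non-negative input, otherwise -1."""
    return 1.0 if x >= 0 else -1.0

def linear(x):
    """Linear (identity) function: the output equals the net input."""
    return x

def sigmoid(x):
    """Sigmoid (logistic) function: smooth output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))
```

The sigmoid is the one used by the two output units of the robot controllers described earlier in the presentation.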