Using Genetic Programming to Evolve Sumobots Shai Sharabi Dept. of Computer Science Ben-Gurion University, Israel
Overview Introduction Sumobot system description GP system description –Preparatory steps –Evolutionary process Results Conclusions
Sumo history Sumo has its roots in the Shinto religion (800 A.D.). A Sumo bout is decided in one of two principal ways: –The first wrestler to touch the ground outside the circle loses –The first wrestler to touch the ground with any part of his body other than the soles of his feet loses. Pictured: a Sumo match (Ozeki Kaio vs. Tamanoshima, May 2005).
Sumobot rules Sumobot contests are hosted in Seattle. Two robots try to push each other outside the arena boundaries.
Sumobots The complexity of robot behavior Studies in Evolutionary Robotics (Nolfi & Floreano) Floreano and Mondada utilized a Khepera robot to validate a neural network system evolved using a genetic algorithm – for navigation and obstacle avoidance Liu & Zhang, Multi-Phase Genetic Programming: A Case Study in Sumo Maneuver Evolution
Sumobot System Description Two overhead web cameras, each connected to its own computer. Each computer transmits via its remote controller the maneuver commands to the respective sumobot, which then acts accordingly.
GP System Description: Preparatory steps
1. Determining the set of terminals
2. Determining the set of functions
3. Determining the fitness measure
4. Determining the parameters for the run (population size, number of generations, minor parameters)
5. Determining the method for designating a result and the criterion for terminating a run
Functions & Terminals (steps 1-2)
Fitness Function (step 3)
$$\mathrm{fitness} = w_1\,\mathrm{radius}(\mathit{count}) + w_2\,\mathrm{sticky}(\mathit{count}) + w_3\,\mathrm{closer}(\mathit{count}) + w_4\,\mathrm{speed}(\mathit{count}) + w_5\,\mathrm{push}(\mathit{count}) + w_6\,\mathrm{bonuspush}(\mathit{count}) + w_7\,\mathrm{exploring}(\mathit{count}) + \mathrm{bonustay}$$
w1 … w7 – empirically derived weights
count – number of iterations in one fight
radius – distance between the robot's starting location and its farthest location
sticky – rewards spending time close to the target
closer – rewards approaching the target
speed – rewards higher speed
push – rewards pushing the opponent
bonuspush – rewards programs that win faster
exploring – rewards exploring the grounds
bonustay – a bonus added for staying in the arena
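The weighted sum on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FightStats` container and the `fitness` function are hypothetical names, each component is assumed to have already been measured over the `count` iterations of one fight, and the weight values are placeholders because the slides do not give the empirically derived ones.

```python
from dataclasses import dataclass


@dataclass
class FightStats:
    """Per-fight measurements (hypothetical container, one field per slide term)."""
    radius: float
    sticky: float
    closer: float
    speed: float
    push: float
    bonuspush: float
    exploring: float
    bonustay: float  # added unweighted, as in the slide's formula


def fitness(stats, weights):
    """Return w1*radius + ... + w7*exploring + bonustay."""
    if len(weights) != 7:
        raise ValueError("expected exactly seven weights, w1..w7")
    components = (stats.radius, stats.sticky, stats.closer, stats.speed,
                  stats.push, stats.bonuspush, stats.exploring)
    return sum(w * c for w, c in zip(weights, components)) + stats.bonustay


# Placeholder weights only; the real values were tuned empirically.
example = fitness(FightStats(1, 2, 3, 4, 5, 6, 7, 10), [1] * 7)
```

Keeping `bonustay` outside the weighted part mirrors the formula, where it is the only term without a weight.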
Parameters (steps 4-5)
Population size: 20
Generation count: [value not preserved]
Selection method: rank
Reproduction probability: 0.2
Crossover probability: 0.8
Mutation probability: [value not preserved]
Elitism group: 2
Linear Ranking Selection Based on sorting the individuals by decreasing fitness. The probability of selecting the i-th individual in the ranking is a linear function of its rank, controlled by a single parameter that can be interpreted as the expected sampling rate of the best individual.
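A minimal sketch of linear ranking follows, assuming the standard textbook form p_i = (1/N) * (beta - 2 * (beta - 1) * (i - 1) / (N - 1)) with 1 <= beta <= 2 and i = 1 for the best individual. The slide's own equation and symbol are not preserved in this transcript, so the name `beta`, its default of 1.5, and both function names are assumptions rather than the author's settings.

```python
import random


def linear_ranking_probabilities(n, beta=1.5):
    """Selection probability per rank, best rank first (index 0 = best)."""
    if n < 1:
        raise ValueError("population must not be empty")
    if not 1.0 <= beta <= 2.0:
        raise ValueError("beta must lie in [1, 2]")
    if n == 1:
        return [1.0]
    # Index i here is (rank - 1), so the best individual gets beta / n.
    return [(beta - 2.0 * (beta - 1.0) * i / (n - 1)) / n for i in range(n)]


def select(population, fitnesses, beta=1.5, rng=random):
    """Draw one individual; a higher fitness means a better rank."""
    order = sorted(range(len(population)),
                   key=lambda j: fitnesses[j], reverse=True)
    probs = linear_ranking_probabilities(len(population), beta)
    return population[rng.choices(order, weights=probs, k=1)[0]]
```

With a population of 20, as in the parameter slide, the best individual is drawn with probability beta / 20, so over 20 draws it is expected to be sampled beta times. Because only ranks matter, the scale of the raw fitness values has no influence on selection pressure.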
Various Linear Ranking Parameters
GP System Description: Evolutionary Process
1. Create a random initial population using the functions and terminals
2. Run and evaluate all the programs using the fitness measure
3. If the termination criterion for the run is satisfied, stop the evolutionary process
4. Select programs and apply genetic operations to them
5. Go to step 2
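The loop above can be sketched generically as follows. This is an illustration, not the presenter's implementation: `evolve` and its callback parameters are hypothetical names, termination is simplified to a fixed generation budget, and the mutation probability default is a placeholder because the transcript does not preserve the real value. The crossover probability of 0.8 and the elitism group of 2 are taken from the parameter slide.

```python
import random


def evolve(population, evaluate, select, crossover, mutate, generations,
           p_crossover=0.8, p_mutation=0.05, elite=2, rng=random):
    """Run a generational loop and return the best individual found."""
    population = list(population)
    for _ in range(generations):  # step 3: stop when the budget is used up
        # Step 2: evaluate every program, best first.
        scored = sorted(((evaluate(p), p) for p in population),
                        key=lambda s: s[0], reverse=True)
        # Elitism: the top programs survive unchanged.
        next_gen = [p for _, p in scored[:elite]]
        # Step 4: select programs and apply genetic operations.
        while len(next_gen) < len(population):
            if rng.random() < p_crossover:
                child = crossover(select(scored, rng), select(scored, rng), rng)
            else:
                child = select(scored, rng)  # reproduction: plain copy
            if rng.random() < p_mutation:
                child = mutate(child, rng)
            next_gen.append(child)
        population = next_gen  # step 5: back to step 2
    return max(population, key=evaluate)
```

The `select` callback receives the list of `(fitness, program)` pairs sorted best first, so a rank-based scheme can work directly on list positions. On the real robots, `evaluate` would correspond to running a fight, which is why the population and generation counts stay small.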
Initial Population Ramped half-and-half
–Using md = 8 [max depth]
–Divide the population evenly into md-1 bins
–Each bin is assigned its own depth limit md', running from md'=2 up to md'=md
–Half of each bin is created using the "Grow" method, the other half using the "Full" method
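Ramped half-and-half initialization can be sketched as below. The function and terminal sets here are hypothetical stand-ins, since the slide listing the real ones did not survive in this transcript, and depth is assumed to be counted in edges from the root. Trees are nested tuples of the form `(function_name, child, ...)`, with plain strings as terminals.

```python
import random

FUNCTIONS = {"if_less": 4, "add": 2, "sub": 2}  # hypothetical name -> arity
TERMINALS = ["x", "y", "0", "1"]                # hypothetical terminals


def make_tree(depth, full, rng, at_root=True):
    """Build one tree. Full: every leaf at `depth`. Grow: leaves may come earlier."""
    early_leaf = (not full and not at_root and
                  rng.random() < len(TERMINALS) / (len(TERMINALS) + len(FUNCTIONS)))
    if depth == 0 or early_leaf:
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    children = (make_tree(depth - 1, full, rng, False)
                for _ in range(FUNCTIONS[name]))
    return (name, *children)


def tree_depth(tree):
    """Depth in edges; a lone terminal has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(tree_depth(child) for child in tree[1:])


def ramped_half_and_half(pop_size, max_depth, rng):
    """Spread individuals over depth limits 2..max_depth, alternating Full and Grow."""
    depths = list(range(2, max_depth + 1))  # md - 1 bins
    population = []
    for i in range(pop_size):
        depth = depths[i % len(depths)]      # cycle through the bins
        full = (i // len(depths)) % 2 == 0   # alternate method on each pass
        population.append(make_tree(depth, full, rng))
    return population


population = ramped_half_and_half(20, 8, random.Random(1))
```

With 20 individuals and 7 bins the split cannot be exactly even, so this sketch simply cycles through the bins and alternates the method on each pass; how the original system handled the remainder is not stated.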
Four Batches of Experiments
Typical Results of Batch A a) Evolved fighter rotates toward the target and approaches it if starting from a position where < -11 b) Evolved fighter circles widely unconditionally c) Simplified program of robot (a) d) Simplified program of robot (b)
Typical Result of a Fight in Batch B a) Demonstration of a fight between two individuals, one from generation 44 of Sumo1 and one from generation 39 of Sumo2 b) Sumo1's simplified code (bottom robot) c) Sumo2's simplified code (top robot)
Batch C Details Start with a fitness function that "easily" evolves simple sumo strategies (such as exploring the arena). At some point (depending on the chosen criteria), continue evolution with a different, more demanding fitness function (such as a higher score for pushing and for winning quickly).
Changes of Dynamic Fitness in Batch C Dynamic fitness computation: after 10 generations the fitness weights were adjusted to give each fitness component a new range.
Typical Results of Batch C a) A fight at generation 3 b) A fight at generation 7 c) Left robot scored Yuko at generation 13. This is the highest possible score, given when one contender manages to push its opponent out of the arena
Typical Fitness Progress of One Experiment in Batch C After 10 generations the fitness computation changed
Typical Results of Batch D a) Evolved bot fights a contact-avoiding opponent b) Evolved bot fights a pushing opponent and scores a Yuko c) Evolved bot fights a spinning opponent d) Fitness graph of the run that produced the evolved sumobot. The drop in fitness at generation 12 was due to a mechanical failure in the sumobot's wheels. Nevertheless, evolution overcame this problem and slowly yielded an adapted sumo program.
Batch A: Best Programs Fitness Progress of 10 Runs
Batch A: Average Fitness Progress of 10 Runs
Batch D: Best Programs Fitness progress of 9 runs
Batch D: Average Fitness Progress of 9 Runs
Movies Presentation from Batch C (co-evolution) Dancing Approach Riding the Enemy Winning Yuko
Conclusion
–GP can be utilized to evolve sumo fighter strategies for a simple robot with only 5 input terminals
–All 4 batches yielded positive results
–Fitness oscillation
–Convergence
–Comparing results to others
Future Work
–Using platforms that are more difficult to operate
–Adding more domain-specific terminals (e.g., past location)
–Using ADFs (automatically defined functions)