MENG PROJECT Design of an adaptive robot controller for a predator-prey task using e-puck robots
The Goal To design an adaptive robot controller capable of performing a predator-prey task, in a reconfigurable maze
Project Specifics Hardware: IR Sensors; Camera; Obstacles; Maze. Software: Programming in C; Player/Stage.
Project Breakdown The project can be broken down into 4 separate problems: Obstacle Avoidance – how to prevent a robot from colliding with obstacles in the maze; Object Identification – how each robot can identify each other and some additional objects; Predator-Prey Task – how to make a predator chase a prey and a prey escape from a predator; Evolution – how to evolve the controllers so that their performance increases. Evolution can also give a controller some adaptability.
Object Identification First solution: Predator should identify yellow objects as its food; Prey should identify green objects as its food and yellow objects as predators. A problem comes up: how to isolate an object of a certain colour in an image? Image subtraction was the first solution implemented.
Object Identification Which images should be subtracted?
Object Identification [Images: Red Channel, Blue Channel, Green Channel]
Object Identification [Images of channel differences: Red - Blue, Blue - Green, Green - Blue, Red - Green, Blue - Red, Green - Red]
Object Identification Segmenting the chosen images using the Otsu Method. Due to poor results, green is ruled out as a possible colour for the prey’s food. Blue and red are the new candidates.
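The subtraction and segmentation steps above can be sketched in C as follows. This is a minimal illustration, not the project's actual code: the function names, the 8-bit single-channel buffers and the saturating subtraction (negative differences clamped to 0) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating per-pixel subtraction of two 8-bit channels: out = max(a - b, 0).
   Hypothetical helper; clamping at 0 is an assumption. */
void subtract_channels(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (a[i] > b[i]) ? (uint8_t)(a[i] - b[i]) : 0;
}

/* Otsu's method: returns the grey level t that maximises the between-class
   variance. Pixels with value > t are then treated as foreground. */
int otsu_threshold(const uint8_t *img, size_t n)
{
    assert(img != NULL && n > 0);

    size_t hist[256] = {0};
    for (size_t i = 0; i < n; i++)
        hist[img[i]]++;

    double total_sum = 0.0;
    for (int v = 0; v < 256; v++)
        total_sum += (double)v * (double)hist[v];

    double sum_bg = 0.0, best_var = -1.0;
    size_t w_bg = 0;
    int best_t = 0;
    for (int t = 0; t < 256; t++) {
        w_bg += hist[t];                       /* pixels with value <= t */
        sum_bg += (double)t * (double)hist[t];
        if (w_bg == 0)
            continue;                          /* background class still empty */
        size_t w_fg = n - w_bg;                /* pixels with value > t */
        if (w_fg == 0)
            break;                             /* foreground class empty from here on */
        double diff = sum_bg / (double)w_bg - (total_sum - sum_bg) / (double)w_fg;
        double var = (double)w_bg * (double)w_fg * diff * diff;
        if (var > best_var) {
            best_var = var;
            best_t = t;
        }
    }
    return best_t;
}
```

Otsu's method always returns some threshold, whatever the image contains, so it cannot by itself report that no coloured object is present.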
Object Identification Problem found: when no coloured objects are present in the image, the subtraction and Otsu Method give false results.
Object Detection New solution: include an mbed board that has 4 bright blue LEDs.
Object Identification Establish new fixed threshold of 205, determined experimentally. New feature: food is available for 3 minutes and unavailable for 20 seconds, cyclically.
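A minimal C sketch of the fixed threshold and of the cyclic food availability is given below. The transcript gives only the threshold value (205) and the 3 minute / 20 second cycle; the function names, the strict "greater than" comparison, the 8-bit input image and the choice to start the cycle with the food available are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_THRESHOLD  205   /* value quoted on the slide */
#define FOOD_ON_SECONDS  180u  /* food available for 3 minutes */
#define FOOD_OFF_SECONDS 20u   /* then unavailable for 20 seconds */

/* Writes a binary mask (255 = above threshold) and returns the number of
   pixels set. Using ">" rather than ">=" is an assumption. */
size_t threshold_fixed(const uint8_t *in, uint8_t *mask, size_t n)
{
    assert(in != NULL && mask != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        mask[i] = (in[i] > FIXED_THRESHOLD) ? 255 : 0;
        if (mask[i] != 0)
            count++;
    }
    return count;
}

/* True while the food source is in the "available" part of its cycle.
   Assumes the cycle starts with the food available at elapsed_s == 0. */
bool food_available(unsigned long elapsed_s)
{
    return (elapsed_s % (FOOD_ON_SECONDS + FOOD_OFF_SECONDS)) < FOOD_ON_SECONDS;
}
```

With a fixed threshold, an image that contains no bright source yields an empty mask (a returned count of 0), which the caller can treat as "nothing detected".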
Object Identification [Images: Red Channel, Blue Channel, Green Channel]
Object Identification By looking at the images from all 3 channels, one can see that the yellow object is very dark in the blue channel, but bright in both the red and green channels. Each pixel is therefore processed individually: if both the red and green components of a pixel are 40% larger than the blue component, then that pixel is considered to be yellow.
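The per-pixel yellow rule can be sketched in C as below. The function names and the interleaved RGB buffer layout are assumptions, and "40% larger" is read here as strictly greater than 1.4 times the blue component.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A pixel counts as yellow when both red and green exceed 1.4 * blue.
   Integer form of the test: 10 * r > 14 * b, which avoids floating point. */
bool is_yellow(uint8_t r, uint8_t g, uint8_t b)
{
    unsigned limit = 14u * b;
    return 10u * r > limit && 10u * g > limit;
}

/* Builds a binary mask (255 = yellow) from an interleaved RGB image
   (assumed layout: R, G, B, R, G, B, ...) and returns the yellow pixel count. */
size_t mark_yellow(const uint8_t *rgb, uint8_t *mask, size_t n_pixels)
{
    assert(rgb != NULL && mask != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n_pixels; i++) {
        bool yellow = is_yellow(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
        mask[i] = yellow ? 255 : 0;
        if (yellow)
            count++;
    }
    return count;
}
```

Because the rule is a ratio, very dark pixels with a blue component near zero can also pass it; whether the project added a minimum-brightness check is not stated in the slides.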
Object Identification Results of the yellow object identification
Object Identification How to retrieve information from the thresholded images?
A scope is then defined for both food searching and for predator avoidance. Food scope – if the target’s centre of mass is within the scope, no turning occurs. This provides additional stability to the controller. Avoid scope – if the centre of mass of the object identified is outside of the scope, no turning takes place. If it is within the scope, then the robot must turn to escape. It is the equivalent of the field of view in the natural world.
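One way to read the two scopes is sketched below in C. Everything here is illustrative: the names, the use of the horizontal centre of mass only, a scope centred on the middle image column, and the turning directions (towards food, away from a threat) are assumptions rather than details given in the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TURN_NONE, TURN_LEFT, TURN_RIGHT } turn_t;

/* Horizontal centre of mass of the non-zero pixels in a binary mask.
   Returns false (and leaves *cx untouched) when the mask is empty. */
bool centre_of_mass_x(const uint8_t *mask, int width, int height, double *cx)
{
    assert(mask != NULL && cx != NULL && width > 0 && height > 0);
    unsigned long count = 0;
    double sum_x = 0.0;
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            if (mask[(size_t)y * (size_t)width + (size_t)x] != 0) {
                sum_x += x;
                count++;
            }
    if (count == 0)
        return false;
    *cx = sum_x / (double)count;
    return true;
}

/* True when cx lies inside a window of scope_width pixels centred on the image. */
bool within_scope(double cx, int width, int scope_width)
{
    double offset = cx - (width - 1) / 2.0;
    if (offset < 0.0)
        offset = -offset;
    return offset <= scope_width / 2.0;
}

/* Food scope: no turning while the target is inside; otherwise turn towards it. */
turn_t food_turn(double cx, int width, int scope_width)
{
    if (within_scope(cx, width, scope_width))
        return TURN_NONE;
    return (cx < (width - 1) / 2.0) ? TURN_LEFT : TURN_RIGHT;
}

/* Avoid scope: ignore threats outside; turn away from threats inside. */
turn_t avoid_turn(double cx, int width, int scope_width)
{
    if (!within_scope(cx, width, scope_width))
        return TURN_NONE;
    return (cx < (width - 1) / 2.0) ? TURN_RIGHT : TURN_LEFT;
}
```

In this reading the food scope acts as a dead band that stops the robot from oscillating around a centred target, while the avoid scope acts as a field of view.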
Predator-Prey Task The main premises of the task are: The predator must chase the prey, and get as close to it as possible; The prey must try to escape from the predator; The prey, like the predator, also feeds. In its case, the food is the blue light source.
Predator-Prey Task An energy level was included in each controller, to reflect the events of feeding and dying. Therefore, new specifications are made: Each robot starts off with a random energy level between 80 and 100%; If the robot’s energy falls below a defined Hungry Level, then the robot becomes hungry and starts to look for food. The predator begins searching for the prey, and the prey begins searching for the blue light source; When a prey is caught by the predator, its energy level becomes 0% (death); When a predator catches a prey, it eats until it is no longer hungry; A prey, when eating, should take approximately 3 seconds to gain 20 percentage points of energy. It should eat until it is no longer hungry; To be allowed to “eat”, the distance between the robot and the food source must not be larger than 5 cm.
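The energy rules above could be expressed along the following lines. This is a sketch under stated assumptions: the struct, the function names, the time-step interface and the cap at 100% are hypothetical; only the 80 to 100% start range, the rate of 20 points per 3 seconds and the 5 cm limit come from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define EAT_RATE_PER_S      (20.0 / 3.0)  /* 20 energy points in about 3 seconds */
#define MAX_EAT_DISTANCE_CM 5.0

typedef struct {
    double energy;        /* percent, 0 = dead */
    double hungry_level;  /* below this the robot looks for food */
} robot_t;

/* Random starting energy between 80 and 100 percent. */
double initial_energy(void)
{
    return 80.0 + 20.0 * ((double)rand() / (double)RAND_MAX);
}

bool is_hungry(const robot_t *r)
{
    assert(r != NULL);
    return r->energy < r->hungry_level;
}

/* Advances feeding by dt_s seconds. Returns true if the robot ate, which
   requires it to be hungry and within 5 cm of the food source. */
bool eat_step(robot_t *r, double distance_cm, double dt_s)
{
    assert(r != NULL && dt_s >= 0.0);
    if (!is_hungry(r) || distance_cm > MAX_EAT_DISTANCE_CM)
        return false;
    r->energy += EAT_RATE_PER_S * dt_s;
    if (r->energy > 100.0)        /* capping at 100% is an assumption */
        r->energy = 100.0;
    return true;
}

/* A prey caught by the predator dies: its energy drops to 0%. */
void mark_caught(robot_t *prey)
{
    assert(prey != NULL);
    prey->energy = 0.0;
}
```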
Predator-Prey Task Additional features included: Randomness in rebirth – when a robot dies, it is reborn after a few seconds. When it is reborn, there is a 51% chance it will be reborn with the same role it had before, and a 49% chance it will be reborn with the other role (i.e., predator becomes prey, and vice-versa); “Cannibalism” – one predator might make the decision of eating another predator. This is possible because of randomness in rebirth; Role-changing – if a predator who is very low on energy attacks a prey that has a very high energy value, then they change roles; 360 “sweep” – the robot comes to a halt and rotates 360 degrees around itself, hoping to find food.
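The randomness-in-rebirth rule can be sketched in C as follows. The names are hypothetical, and splitting the draw from the decision (so that the decision can be checked with a known random number) is a design choice made here, not something taken from the project.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ROLE_PREDATOR, ROLE_PREY } role_t;

/* Decides the role after rebirth from a uniform draw u in [0, 1):
   51% chance of keeping the previous role, 49% chance of swapping. */
role_t rebirth_role(role_t previous, double u)
{
    assert(u >= 0.0 && u < 1.0);
    if (u < 0.51)
        return previous;
    return (previous == ROLE_PREDATOR) ? ROLE_PREY : ROLE_PREDATOR;
}

/* Convenience wrapper that draws u with the C library generator. */
role_t rebirth_role_random(role_t previous)
{
    return rebirth_role(previous, (double)rand() / ((double)RAND_MAX + 1.0));
}
```

Since each robot draws its role independently, two robots can end up with the same role, which is what makes predator-predator ("cannibalism") and prey-prey situations possible.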
Predator-Prey Task Communication between robots
Evolution [Diagram of the cycle: Birth, Life, Death, Evaluation, Selection, Mutation]
Evolution After a robot dies, the following processes take place: Evaluation – if the controller’s fitness value for the role it is playing is equal to or greater than the best fitness value found up to that point, then the controller is selected. Otherwise, the controller is discarded, and the process skips to Mutation. Selection – if the controller is selected, then its parameters become the best for that role, and are stored in the robot’s memory. Mutation – the best set of parameters for the role it is playing is retrieved from the robot’s memory, and each parameter is slightly changed, by an amount in [-RANGE; RANGE]. The range is defined for each parameter as a percentage of its value, so the larger the parameter, the wider the range. After these tasks are performed, the robot is then reborn with its new set of parameters, and competes until it runs out of energy. The process is then repeated.
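The evaluation, selection and mutation steps can be sketched in C as below. The data layout, the names, the three-parameter example size and the uniform distribution of the mutation are assumptions; the slides only say that the change lies in [-RANGE; RANGE] and that the range is a percentage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define N_PARAMS 3  /* illustrative size only */

typedef struct { double value[N_PARAMS]; } params_t;

/* Best parameter set found so far for one role, as kept in the robot's memory. */
typedef struct {
    params_t best;
    double   best_fitness;
} role_memory_t;

/* Evaluation + Selection: keep the controller if its fitness is equal to or
   greater than the best so far. Returns true when it was selected. */
bool evaluate_and_select(role_memory_t *mem, const params_t *current, double fitness)
{
    assert(mem != NULL && current != NULL);
    if (fitness < mem->best_fitness)
        return false;               /* discarded: go straight to mutation */
    mem->best = *current;
    mem->best_fitness = fitness;
    return true;
}

/* Mutation: child = best + delta, with delta in [-range, +range) where
   range = range_pct[i] * best value. u[i] are uniform draws in [0, 1). */
void mutate(const role_memory_t *mem, const double range_pct[N_PARAMS],
            const double u[N_PARAMS], params_t *child)
{
    assert(mem != NULL && child != NULL);
    for (int i = 0; i < N_PARAMS; i++) {
        double range = range_pct[i] * mem->best.value[i];
        child->value[i] = mem->best.value[i] + (2.0 * u[i] - 1.0) * range;
    }
}

/* Same as mutate(), drawing the random numbers with rand(). */
void mutate_random(const role_memory_t *mem, const double range_pct[N_PARAMS],
                   params_t *child)
{
    double u[N_PARAMS];
    for (int i = 0; i < N_PARAMS; i++)
        u[i] = (double)rand() / ((double)RAND_MAX + 1.0);
    mutate(mem, range_pct, u, child);
}
```

Because a controller is kept only if it matches or beats the best so far, the stored best fitness never decreases; each new generation is a mutation of the current best.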
How to design the fitness function? First of all, it is important to be aware that if one chooses the fractional configuration for the fitness function, the variables should be included in the function as shown. The three variables chosen to compose the fitness function are: Number of times eaten by the predator; Number of times that the robot fed; Ratio between chases in which the robot caught food and total chases that the robot performed.
Evolution Keeping the fitness model in mind, the first value has to decrease the fitness, so it goes in the denominator. The other two contribute positively to the robot’s performance, and therefore should go in the numerator of the fraction model.
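The exact formula is not reproduced in this transcript, so the C sketch below shows only one possible fractional form that matches the description: the two positive terms multiplied in the numerator, and the number of times eaten in the denominator. The product in the numerator, the "+ 1" that guards against division by zero, and all names are assumptions, not the project's actual function.

```c
#include <assert.h>
#include <stddef.h>

/* Counters collected during one life of a controller (hypothetical layout). */
typedef struct {
    unsigned times_eaten;        /* times eaten by the predator */
    unsigned times_fed;          /* times the robot fed */
    unsigned chases_total;       /* chases the robot performed */
    unsigned chases_successful;  /* chases in which it caught food */
} life_stats_t;

/* Ratio of successful chases to total chases; 0 when no chase took place. */
double chase_ratio(const life_stats_t *s)
{
    assert(s != NULL && s->chases_successful <= s->chases_total);
    if (s->chases_total == 0)
        return 0.0;
    return (double)s->chases_successful / (double)s->chases_total;
}

/* ASSUMED form: fitness = (times_fed * chase_ratio) / (1 + times_eaten). */
double fitness(const life_stats_t *s)
{
    assert(s != NULL);
    return ((double)s->times_fed * chase_ratio(s)) / (1.0 + (double)s->times_eaten);
}
```

Whatever the exact form, a fraction of this kind has nothing pulling the value down when the robot is never eaten, so the fitness can only grow with feeding.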
Evolution The parameters chosen to be included in the Evolutionary Algorithm are as follows: Prey’s Food Gain – gain associated with rule 20, if prey; Prey’s Avoid Gain – gain associated with rule 21, if prey; Predator’s Food Gain – gain associated with rule 20, if predator; Predator’s Avoid Gain – gain associated with rule 21, if predator; Size of “food scope” – width of the scope previously mentioned; Size of “avoid scope” – width of the “field of view”; Hungry Level – energy level below which the robot becomes hungry; Energy Danger Level – energy level below which the robot knows it is about to die; Variance Threshold – the limit that regulates the exploratory behaviour of the robot.
Evolution Each robot then has two different sets of parameters: the ones assigned to the predator and the ones assigned to the prey. Since two robots participate in the experiment, and they evolve differently, 4 sets of parameters will evolve in different ways.
Evolution Standard Configuration refers to when robot A is the predator and robot B the prey. Swapped Configuration refers to when robot B is the predator and robot A the prey.
Results The controllers were allowed to run for a few generations, and an interesting result came up: [Plots: Standard Configuration, Swapped Configuration]
Results The prey’s fitness value increased dramatically because a prey-prey scenario occurred. Without anything to decrease its fitness value, the parameters’ path of evolution became corrupted. The environment would make the prey even less suitable to compete with the predator. Therefore, to prevent both prey-prey and predator-predator scenarios, randomness in rebirth and cannibalism were excluded from the project. Even so, the 4 sets of parameters all get a chance to evolve, since role-changing still occurs when a weak predator catches a strong prey.
Evolution Evolution is restarted, and tests are carried out. During the rest of the project, the controllers were left to evolve, without any interference. The Standard Configuration prey evolved over 141 generations, and the predator over 29. The Swapped Configuration prey evolved over 72 generations, and the predator over 19. It is important to be aware that all sets of parameters evolve from the same seed. The original set of parameters was defined through some basic experimentation.
Results The fitness values of each generation Standard Configuration
Results The fitness values of each generation Swapped Configuration
Results To keep it simple, the results will now focus on the Standard Configuration, which evolved over more generations.
Results This is how the parameters evolved in the Standard Configuration: [Plots: Prey, Predator]
However, it is difficult to look at this data and draw conclusions. There are, nevertheless, some comments to be made: The predator develops a much narrower food scope, in order to quickly deal with an evasive prey. For some reason, the evolution of the Hungry Level and also of the Energy Danger Level parameters in both predator and prey is very similar. What can this mean? The predator’s variance threshold seems to have a tendency to become smaller, whereas the prey’s seems to be rising. Why? Maybe because the predator already performs periodic 360 sweeps, and therefore does not need to explore the maze as much as the prey. The avoid scope is the strange result. Since it represents the robot’s “field of view” for the objects it is trying to avoid, shouldn’t the prey develop a wide avoid scope?
Results The fitness obviously increased, but has the performance of the controller followed? To test this, a battery of tests was carried out, creating competitions between the Swapped Configuration predator/prey, the Standard Configuration predator/prey, and also the predator/prey with the original non-evolved set of parameters. For these tests, evolution was halted. Each competition lasted approximately 20 minutes.
Major observations: Both predator and prey of the Standard Configuration show the best performances on all tests. The predator shows a fitness improvement of 680%, and when competing against a non-evolved prey it didn’t even die. The prey shows a smaller improvement, of around 66%. Both Standard and Swapped Configurations appear to have improved the controllers’ performance. The Swapped Configuration, having had less time to evolve, had a milder improvement of performance, of about 115% for the predator, and the prey actually suffered a 3% decrease in performance. This can be due to the fact that the number of generations is simply too small. Also interesting is the fact that the Standard Configuration prey only dies once without eating when competing against a non-evolved predator. This is a major achievement.
Results Major observations: Also, there are some parameters that evolve in totally different ways in each configuration. For instance, the prey’s avoid gain and avoid scope size have a tendency to become large in the Swapped Configuration, and an opposing tendency to become small in the Standard Configuration. Comparing the performances, maybe it’s better for the prey to have no fear of the predator. The Standard Configuration predator (the best) has also developed a food scope about 5 times smaller than the Swapped Configuration one. That may prove to be decisive in improving its performance.
Conclusions Implementing a behaviour-based robotic controller using only a set of rules proved to be a clean and simple way to solve problems such as Obstacle Avoidance and Exploring, or even chasing a target. The global behaviour of the robot is made up of small individual behaviours, competing and cooperating among themselves. The image processing part was the bottleneck of the project. A lot of time was spent on finding alternative solutions to the problem.
Conclusions Image processing is also the most sensitive part of the system. If lighting conditions change significantly, the Object Identification might yield false results, compromising the predator-prey task. A larger number of generations would have produced a larger amount of data, which would have been useful to draw more conclusions. On one pair of parameter sets, the performance of both predator and prey improved, thus confirming the Red Queen Effect.
Conclusions If the maze changes, the parameters should be reset, and evolution should make the robots adapt to new conditions. Without a map of their surroundings, the controllers are unable to come up with new strategies, and they are also unable to calculate either an absolute or relative localization in the maze. Without localization, they cannot effectively understand their surroundings, and therefore adaptation to the environment is not exactly possible; instead, the robots end up adapting to each other.
Future Work To add further constraints to the robot’s movement, one needs only to include new rules, and incorporate them in the functions that control the actuators. The program is ready to incorporate a GRN. A camera with improved quality could really boost the performance of the controllers, and allow for simpler processing. A third robot might produce some very interesting developments in the project. With 2-on-1 scenarios, the fitness values might change completely, and even help speed up evolution. Changes to the fitness function could also speed up evolution, for instance by rewarding a prey for the amount of time it stays alive.
The End That’s it, thank you for your attention!