Evolving Pac-Man Players: Can We Learn from Raw Input? Marcus Gallagher and Mark Ledwich, School of Information Technology and Electrical Engineering, University of Queensland, 4072, Australia. Presented by Sumaira Saeed.

Abstract This paper describes an approach to developing Pac-Man playing agents that learn game-play based on minimal onscreen information. The agents are based on evolving neural network controllers using a simple evolutionary algorithm.

Pac-Man Pac-Man is a simple predator-prey style game in which the human player maneuvers an agent (Pac-Man) through a maze. The aim of the game is to score points by eating dots initially distributed throughout the maze while avoiding four "ghost" characters. If Pac-Man collides with a ghost, he loses one of his three lives. When Pac-Man eats a power pill, he can briefly turn the tables and eat the ghosts. Bonus "fruit" objects wander through the maze at random and can be eaten for extra points. The game ends when Pac-Man has lost all of his lives.

Approach For the experiments, a freely available Java-based implementation of Pac-Man developed by Chow was used. For the majority of experiments in this paper, the game was simplified by removing three of the four ghosts, all power pills, and the fruit.

Strategy of Ghosts The "first" (red) ghost moves toward Pac-Man by the shortest path 90% of the time; the remaining 10% of the time it chooses the second-best path to Pac-Man. If an internal game option ("InsaneAI") is set, the single ghost becomes deterministic and always selects the shortest path to Pac-Man.
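The ghost's move rule described above can be sketched as follows. This is a minimal illustration, not code from the Chow implementation; the function name and the representation of a "path" are assumptions, and the candidate paths are taken to be pre-sorted from shortest to longest.

```python
import random

def choose_ghost_move(paths, insane_ai=False, rng=random):
    """Pick the path a ghost follows toward Pac-Man.

    `paths` lists candidate paths ordered shortest-first (at least two
    entries). With the "InsaneAI" option the ghost is deterministic and
    always takes the shortest path; otherwise it takes the shortest
    path 90% of the time and the second-best path the other 10%.
    """
    if insane_ai or rng.random() < 0.9:
        return paths[0]
    return paths[1]
```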

Neural-network based Controller The networks take input from a window centered on the current location of Pac-Man in the maze (windows of sizes 5×5, 7×7 and 9×9 have been implemented). Each network has 4 output units representing the four possible directions (up, down, left and right) in which Pac-Man can attempt to move at any time-step of the game. Each output unit has a logistic sigmoid activation function, and the movement direction at each time-step is chosen according to the network output with the maximum value.
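The output stage of such a controller can be sketched as below. This is an illustrative fragment under stated assumptions (the paper does not give code): it assumes the hidden-layer activations are already computed, and shows only the four sigmoidal output units and the max-value direction choice.

```python
import math

DIRECTIONS = ["up", "down", "left", "right"]

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def choose_direction(hidden_activations, output_weights):
    """Map hidden-layer activations to one of the four moves.

    Each of the 4 output units applies a logistic sigmoid to a weighted
    sum of the hidden activations; Pac-Man attempts to move in the
    direction whose output unit has the maximum value.
    """
    outputs = []
    for weights in output_weights:  # one weight vector per output unit
        net = sum(w * h for w, h in zip(weights, hidden_activations))
        outputs.append(sigmoid(net))
    return DIRECTIONS[outputs.index(max(outputs))]
```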

Information from the Maze Input Window Walls and dots are each represented using a window-sized binary matrix of inputs. Ghosts are represented in a third matrix with a value of -1, while the absence of a ghost is indicated by a value of 0. When power pills are in play, a blue (edible) ghost is represented by a value of +1. Four additional inputs are also used: the total number of dots remaining in each of the four primary directions in the maze. The total number of inputs to each network is therefore 3w² + 4, where w is the window height/width.
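Assembling that input representation can be sketched as follows; the function name and the use of plain nested lists for the w×w matrices are assumptions made for illustration, but the layout (three flattened window matrices plus four dot counts, giving 3w² + 4 inputs) follows the description above.

```python
def build_input_vector(walls, dots, ghosts, dot_counts):
    """Flatten the three w×w window matrices plus the four
    dots-remaining counts into one input vector of length 3*w**2 + 4.

    walls, dots : w×w binary matrices (1 = present, 0 = absent)
    ghosts      : w×w matrix with -1 for a ghost, +1 for a blue
                  (edible) ghost, 0 for no ghost
    dot_counts  : dots remaining in each of the four directions
    """
    vector = []
    for matrix in (walls, dots, ghosts):
        for row in matrix:
            vector.extend(row)
    vector.extend(dot_counts)
    return vector
```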

Evolving Neural Networks The connection weights in the neural networks were evolved using a (μ+λ)-Evolution Strategy. Mutation (without self-adaptation of parameters) was applied to each weight in a network with a given probability pm, using a Gaussian mutation distribution with zero mean and standard deviation vm. No recombination operator was used. The weights for the initial population of networks in each experiment were generated uniformly in the range [0, 0.1]. The fitness function was also very simple: the average number of points scored per game (note that a game consists of Pac-Man's three lives), averaged over the number of games played per agent, per generation.
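A minimal sketch of such a (μ+λ)-Evolution Strategy over flat weight vectors is given below. It is not the authors' code: the function name and the uniform-random choice of parents are assumptions, and fitness evaluation (playing Ngames games per agent) is abstracted into a caller-supplied `fitness` function.

```python
import random

def evolve(initial_population, fitness, mu, lam, generations,
           p_m=0.1, v_m=0.1, rng=random):
    """Minimal (mu+lambda) evolution strategy over weight vectors.

    Each generation, lam offspring are created by Gaussian mutation
    (no recombination, no self-adaptation): every weight is perturbed
    with probability p_m by a sample from N(0, v_m). Parents and
    offspring then compete together and the best mu survive.
    """
    population = list(initial_population)
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            child = [w + rng.gauss(0.0, v_m) if rng.random() < p_m else w
                     for w in parent]
            offspring.append(child)
        # (mu+lambda) selection: parents compete with offspring
        population = sorted(population + offspring, key=fitness,
                            reverse=True)[:mu]
    return population
```

Because selection is elitist (parents stay in the pool), the best fitness in the population never decreases between generations.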

One Deterministic Ghost The first experimental scenario was the simplest possible: the game used a single ghost behaving deterministically (moving towards Pac-Man by the shortest path), so the entire game dynamics were deterministic. Other parameters were as follows: input window size 5 × 5; number of hidden units = 8; pm = 0.1, vm = 0.1; population size μ = 30, λ = 15.

Results In the early stages of evolution, low-scoring agents are those that quickly get stuck attempting to move in a direction where there is a wall. The better-performing agents become responsive to the Walls input window and move around the maze much more effectively. The behavior of agents was also responsive to changes in the Dots window inputs (different paths were typically observed for different lives during a single game). Responsiveness to the Ghosts input was much less evident from observing game-play of the fittest individuals, although in certain situations Pac-Man clearly did exhibit avoidance of the ghost.

Results (Trial 1)

Results (Trial 2)

Results (Trial 3)

More Complex Games and Varying Experimental Parameters 1. Non-deterministic Ghosts, Number of Hidden Units, Population Size, Ngames 2. Larger Input Windows

1. Non-deterministic Ghost The experimental configurations tested are summarized as follows: (e1): w = 5, μ = 10, λ = 5, 3 hidden units, non-deterministic ghost, Ngames = 5. (e2): w = 5, μ = 10, λ = 5, 8 hidden units, non-deterministic ghost, Ngames = 5. (e3): w = 5, μ = 50, λ = 25, 3 hidden units, non-deterministic ghost, Ngames = 3. (e4): w = 5, μ = 30, λ = 1, 2 hidden units, 4 ghosts, Ngames = 2. Results for these experiments are shown in Table I.

2. Larger Input Windows Further experiments were conducted with larger input window sizes, specifically: (e5): w = 7, 4 hidden units, non-deterministic ghost, Ngames = 5. (e6): w = 7, 8 hidden units, non-deterministic ghost, Ngames = 5. (e7): w = 9, 8 hidden units, non-deterministic ghost, Ngames = 3 (2 trials).

Conclusion The results demonstrate that it is possible to use neuroevolution to produce Pac-Man playing agents with basic playing ability from a minimal amount of raw on-screen information. No agent was able to clear a maze of dots in our experiments. Nevertheless, it is encouraging, and perhaps surprising, that it is possible to learn anything at all given the limited game representation, performance feedback, and incorporated prior knowledge.

Thank you