An Agent that plays Pacman


Abbas Mehrabian, Arian Khosravi
Computer Engineering Department, Sharif University of Technology
{mehrabian, ar_khosravi}@ce.sharif.edu
3 December 2007

About the game of Pacman
The program gets its input by capturing the screen! This slows the program down by roughly 50%.
2/17/2019 An Agent that plays Pacman

Glossary
Pacman, Ghost, Edible, Pill, Power Pill, Snack, Walls and Tunnels

Game objectives
Escape from the ghosts.
Eat all of the pills to finish the level.
When hunting, eat the maximum number of edibles.
Eat the power pills at good times, i.e. when there are lots of ghosts around.

Modeling the game: gametable
Gametable: a 31x31 table that models the game state. Distances between positions can be defined on the gametable.
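The slides do not say how distance is defined on the gametable. One plausible choice is the number of steps on the shortest walkable path, found with a breadth-first search. The sketch below assumes a hypothetical encoding where "#" marks a wall and any other character is walkable; it ignores tunnels, and `grid_distance` and `demo_table` are illustrative names, not the authors'.

```python
from collections import deque

def grid_distance(table, start, goal):
    """Number of steps on the shortest walkable path, or None if unreachable."""
    rows, cols = len(table), len(table[0])
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and table[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return None

# A tiny 5x5 table stands in for the 31x31 gametable (assumed layout).
demo_table = [
    "#####",
    "#...#",
    "#.#.#",
    "#...#",
    "#####",
]
print(grid_distance(demo_table, (1, 1), (3, 3)))  # prints 4
```

A path distance of this kind respects walls, which a straight-line distance would not.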

Modeling the game: gameobjects
Class GameObject {
    position in gametable
    direction
    distance to other GameObjects
}
Pacman, Ghost, Edible, and Snack are subclasses of GameObject.
Definition: A connected set is a set of connected pixels with the same color. Every connected set may be the graphical representation of some gameobject.
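Connected sets as defined above can be extracted with a flood fill. This is a minimal sketch, assuming that "connected" means 4-adjacent and that the screen is given as rows of color codes; the slides specify neither, and `connected_sets` and `demo_image` are illustrative names.

```python
def connected_sets(image):
    """Group 4-adjacent pixels of equal color.

    Returns a list of (color, set of (row, col)) in scan order.
    """
    rows, cols = len(image), len(image[0])
    seen = set()
    result = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            color = image[r][c]
            component = set()
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                pr, pc = stack.pop()
                component.add((pr, pc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = pr + dr, pc + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and (nr, nc) not in seen and image[nr][nc] == color:
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            result.append((color, component))
    return result

# Each character is one pixel color: "." background, "Y" and "R" sprites (assumed).
demo_image = [
    "..YY",
    "..YY",
    "R...",
]
for color, pixels in connected_sets(demo_image):
    print(color, len(pixels))
```

Matching each set to a gameobject (by color, size, or position) is a separate step that this sketch does not attempt.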

Program outline
while true {
    // Image Processing Phase
    capture the screen
    extract all connected sets
    for each connected set cs:
        find a gameObject obj such that cs is the representation of obj
        update position of obj in gametable
    // Decision Making Phase
    among the four possible positions to go:
        find the position bestNext such that evaluate(bestNext) is maximum
    press the key to go to bestNext
}
Note: Since this is an infinite loop, it consumes a lot of CPU time!
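The decision-making phase of the loop above amounts to an argmax over the reachable neighbors. The sketch below shows only that step, assuming positions are (row, column) pairs, `walkable` is a set of free cells, and `evaluate` is any scoring function; all three are illustrative stand-ins rather than the authors' data structures.

```python
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def choose_move(position, walkable, evaluate):
    """Return the name of the move whose target cell scores highest, or None."""
    candidates = {}
    for name, (dr, dc) in MOVES.items():
        target = (position[0] + dr, position[1] + dc)
        if target in walkable:
            candidates[name] = target
    if not candidates:
        return None
    return max(candidates, key=lambda name: evaluate(candidates[name]))

# Toy usage: prefer cells further to the right (a made-up scoring rule).
demo_walkable = {(1, 1), (1, 2), (2, 1)}
print(choose_move((1, 1), demo_walkable, lambda cell: cell[1]))  # prints right
```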

Locality of decision making
At each moment, the agent only decides which of the neighboring positions to go to.
This looks greedy. Why don't we predict the future and devise long-term plans, or use the minimax algorithm?
The ghosts are very hard to predict.
The number of possible future game states is large, and it would take a lot of time to search deeper. Recall that this is a real-time game!
Greedy is simple to implement, and it worked!

How to evaluate a point?
evaluate (point n) {
    Compute parameters p1, p2, …, pm for point n
    return c1*p1 + c2*p2 + … + cm*pm
}
Some of the parameters are:
Distance of n to the nearest pill
Distance of n to the nearest ghost
Distance of n to the nearest power pill
The coefficients c1, c2, …, cm are important and should be optimized! When a coefficient is large, it means that the corresponding parameter is important.
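The evaluation above is a weighted sum of the parameters. A minimal sketch follows; the parameter values and coefficients in the example are invented for illustration (a negative coefficient makes a smaller distance score higher) and are not the authors' tuned values.

```python
def evaluate(parameters, coefficients):
    """Weighted sum c1*p1 + c2*p2 + ... + cm*pm."""
    if len(parameters) != len(coefficients):
        raise ValueError("need exactly one coefficient per parameter")
    return sum(c * p for c, p in zip(coefficients, parameters))

# Assumed parameter order: distance to nearest pill, nearest ghost, nearest power pill.
demo_parameters = [3, 5, 10]
demo_coefficients = [-1.0, 2.0, -0.5]
print(evaluate(demo_parameters, demo_coefficients))  # prints 2.0
```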

How to meet game objectives?
Escape from the ghosts.
Eat all of the pills to finish the level.
When hunting, eat the maximum number of edibles.
Eat the power pills at good times, i.e. when there are lots of ghosts around.
To achieve each of these objectives, some parameters must be defined. Today I will talk about just the first objective, since it is the most important and challenging!

Escape parameters - 1
The most obvious parameter is the distance to the nearest ghost. Try to maximize it.
But this is not perfect! In the situation shown in the figure on the slide, it is obviously better for Pacman to go down, but this parameter says that it should go up!
Conclusion: We should find more far-sighted parameters!

Escape parameters - 2
Definition: Location f is called available from point p if Pacman can start from p and reach f before any ghost.
Definition: The safeness of a point is the number of locations available from that point.
The more safeness a point has, the more likely we are to go there.
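Safeness as defined above can be estimated by comparing shortest-path distances. The sketch below assumes that Pacman and the ghosts move at the same speed, that ghosts take shortest paths, and that "#" marks a wall; the slides state none of these, so treat it as one possible reading of the definition. The function names are illustrative.

```python
from collections import deque

def bfs_distances(table, sources):
    """Shortest step count from the nearest source to every reachable cell."""
    dist = {source: 0 for source in sources}
    queue = deque(sources)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(table) and 0 <= nc < len(table[0])
            if inside and table[nr][nc] != "#" and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def safeness(table, point, ghosts):
    """Count cells Pacman reaches from `point` strictly before any ghost."""
    pacman = bfs_distances(table, [point])
    ghost = bfs_distances(table, ghosts)
    return sum(1 for cell, d in pacman.items() if d < ghost.get(cell, float("inf")))

# A straight corridor with Pacman at column 2 and one ghost at column 5 (toy layout).
demo_corridor = [
    "#######",
    "#.....#",
    "#######",
]
print(safeness(demo_corridor, (1, 2), [(1, 5)]))  # prints 3
```

The count includes the starting point itself; whether the authors counted it is not stated.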

Escape parameters - 3
The ghosts rarely reverse their direction, so we can use this property to predict them a little.
Every ghost spreads its energy throughout the gametable. Most of the energy goes forward.
Definition: The energy of a point is the sum of the ghosts' energies at that point.
The less energy a point has, the more likely we are to go there.
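The slides do not give the rule by which a ghost's energy spreads, so the following is only one way to realize the idea. It assumes that energy decays by a constant factor per step along walkable paths and that paths starting with a reversal are scaled down. The `decay` and `reverse_weight` values, the "#" wall encoding, and all function names are assumptions made for this sketch.

```python
from collections import deque

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def is_open(table, cell):
    r, c = cell
    return 0 <= r < len(table) and 0 <= c < len(table[0]) and table[r][c] != "#"

def ghost_energy(table, ghost_pos, ghost_dir, decay=0.8, reverse_weight=0.1):
    """Energy one ghost places on each cell; reversing paths are scaled down."""
    energy = {ghost_pos: 1.0}
    reverse = (-ghost_dir[0], -ghost_dir[1])
    for first in MOVES:
        start = (ghost_pos[0] + first[0], ghost_pos[1] + first[1])
        if not is_open(table, start):
            continue
        weight = reverse_weight if first == reverse else 1.0
        dist = {start: 1}
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            value = weight * decay ** dist[cell]
            energy[cell] = max(energy.get(cell, 0.0), value)
            for dr, dc in MOVES:
                nxt = (cell[0] + dr, cell[1] + dc)
                if is_open(table, nxt) and nxt not in dist and nxt != ghost_pos:
                    dist[nxt] = dist[cell] + 1
                    queue.append(nxt)
    return energy

def total_energy(table, ghosts):
    """Sum the energies of all ghosts; `ghosts` holds (position, direction) pairs."""
    total = {}
    for pos, direction in ghosts:
        for cell, value in ghost_energy(table, pos, direction).items():
            total[cell] = total.get(cell, 0.0) + value
    return total

demo_corridor = ["#######", "#.....#", "#######"]
# One ghost in the middle of the corridor, heading right.
demo_energy = total_energy(demo_corridor, [((1, 3), (0, 1))])
print(demo_energy[(1, 4)] > demo_energy[(1, 2)])  # prints True
```

In this toy corridor the cell in front of the ghost carries ten times the energy of the cell behind it, which matches the observation that ghosts rarely reverse.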

The relative effect of parameters
The larger a coefficient ci is, the more effect the parameter pi has on our move.
But we don't want a given parameter to have the same importance in all situations. For example, the escape parameters should have more effect when the situation is "red" (dangerous)!
Conclusion: The coefficients should not be constant during all phases of the game. First, we evaluate our current position and get a feeling for the situation. Then we set the coefficients.

Implementation
Several situations are defined. In each situation, certain coefficients should be larger:
(before making decision)
if a ghost is near or safeness is low then
    // Situation is "danger"
    c1 = …
    …
else
    …
endif
(now, make decision)
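The situation test above can be pictured as a lookup that selects one coefficient set before the evaluation runs. The slides leave the actual values out, so every number, threshold, and name below is a made-up placeholder chosen only to show the mechanism, with just two situations instead of the authors' four.

```python
# Placeholder coefficient sets: NOT the authors' tuned values.
COEFFICIENTS = {
    "danger": {"pill_distance": -0.2, "ghost_distance": 3.0, "safeness": 2.0},
    "normal": {"pill_distance": -1.0, "ghost_distance": 0.5, "safeness": 0.2},
}

def classify_situation(nearest_ghost_distance, safeness, near=4, low_safeness=10):
    """Return "danger" if a ghost is near or safeness is low (assumed thresholds)."""
    if nearest_ghost_distance <= near or safeness < low_safeness:
        return "danger"
    return "normal"

def evaluate(parameters, situation):
    """Weighted sum using the coefficient set of the given situation."""
    coefficients = COEFFICIENTS[situation]
    return sum(coefficients[name] * value for name, value in parameters.items())

# Usage: classify the current position first, then score a candidate point.
demo_situation = classify_situation(nearest_ghost_distance=2, safeness=50)
demo_point = {"pill_distance": 2, "ghost_distance": 3, "safeness": 5}
print(demo_situation, evaluate(demo_point, demo_situation))
```

In the danger set the escape-related coefficients dominate, so the same point is scored mostly by how far it is from the ghosts.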

The first version of the program
Our first version, finished on 14 Sep. 2007, has 4 different situations and 14 parameters!
It gets 17300 points on average on my PC.
It holds the best official record for an agent that plays this version of Pacman.

Current issues and future work
We have made the game discrete, but this is sometimes very bad.
Devise long-term plans.
There are at least 56 coefficients in the program, which we have tuned ourselves.
Use learning algorithms to tune the parameters.
In every learning algorithm, there should be some way to evaluate the current playing strategy. Here, the strategy is represented by the coefficients.

Log-based learning
There is another program (called the teacher) that improves the agent's ability over time.
We assume that there is a human master who plays the game very well.
First, the master plays. At every moment, the teacher saves the parameters and the master's move in a log file.
Second, the teacher reads the log file, and at every moment the agent is provided with the parameters. The agent's move is logged and compared with the master's move.
The more the agent plays like the master, the better the strategy is.
The teacher then tries to change the agent's coefficients to arrive at the optimal strategy.
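The comparison step above suggests a simple score: the fraction of logged moments at which the agent picks the master's move. The sketch below computes that score and takes one pass of coordinate-wise hill climbing over the coefficients. The slides do not say how the teacher changes the coefficients, so the search procedure, the log format, and all names here are assumptions.

```python
def agent_move(candidates, coefficients):
    """Pick the move whose parameter vector has the highest weighted sum."""
    def score(move):
        return sum(c * p for c, p in zip(coefficients, candidates[move]))
    return max(candidates, key=score)

def agreement(log, coefficients):
    """Fraction of logged moments where the agent matches the master's move."""
    hits = sum(1 for candidates, master in log
               if agent_move(candidates, coefficients) == master)
    return hits / len(log)

def improve(log, coefficients, step=0.5):
    """One hill-climbing pass: keep any single-coefficient change that helps."""
    best = list(coefficients)
    best_score = agreement(log, best)
    for i in range(len(best)):
        for delta in (step, -step):
            trial = list(best)
            trial[i] += delta
            trial_score = agreement(log, trial)
            if trial_score > best_score:
                best, best_score = trial, trial_score
    return best, best_score

# Each log entry: ({move: [distance to pill, distance to ghost]}, master's move).
demo_log = [
    ({"up": [1, 2], "down": [3, 6]}, "down"),
    ({"left": [2, 5], "right": [1, 1]}, "left"),
]
start = [-1.0, 0.0]  # an agent that only chases pills
print(agreement(demo_log, start), improve(demo_log, start))
```

In this toy log the master always moves away from the ghost, so raising the ghost-distance coefficient increases the agreement.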

References
Simon Lucas, "Ms Pac-Man Competition CEC 2007 Results," University of Essex, Nov. 2007, http://cswww.essex.ac.uk/staff/sml/pacman/CEC2007Results.html.
Any Questions?