Presentation transcript:

A presentation by Matthew Dilts

 To solve problems that would take too long to solve manually, e.g. what’s the best strategy in a certain game?  Learning AI can adapt to conditions that cannot be anticipated prior to a game’s release (such as individual players’ tastes and strategies).

 Until recently, the lack of precedent for the successful application of learning in a mainstream top-rated game means that the technology is unproven and hence perceived as high risk. (Software companies are wusses!)  Learning algorithms are frequently associated with techniques such as neural networks and genetic algorithms, which are difficult to apply in-game due to their relatively low efficiency.

 Faking It  Indirect Adaptation  Direct Adaptation  Supervised Learning  Unsupervised Learning

 Simply degrade an AI that performs very well by adding random errors, then reduce the number of random errors over time.
 Advantages:
- The ‘rate of learning’ can be carefully controlled and specified prior to release, as can the behavior of the AI at each stage of its development.
- The state of the AI at any point in time is independent of the details of the player’s interaction with the game, simplifying debugging and testing.
 Disadvantage:
- This doesn’t actually solve any of the problems that an actual learning AI would solve.
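A minimal sketch of faking it, assuming hypothetical hooks `expert_action` (the strong scripted AI) and `all_actions` (the legal moves); the error rate decays on a fixed schedule so the agent appears to improve:

```python
import random

class FakedLearner:
    """Degrade a strong AI with random errors, then reduce the errors
    over time so it looks like the agent is learning."""

    def __init__(self, error_rate=0.5, decay=0.995):
        self.error_rate = error_rate  # chance of a deliberate mistake
        self.decay = decay            # per-decision shrink factor

    def choose(self, state, expert_action, all_actions):
        if random.random() < self.error_rate:
            action = random.choice(all_actions)   # deliberate error
        else:
            action = expert_action(state)         # the real (good) AI
        self.error_rate *= self.decay             # 'learn' on schedule
        return action
```

Because the decay schedule is fixed before release, the “learning” is fully reproducible for testing, which is exactly the advantage described above.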

 The AI agent gathers information which is then used by a “conventional” AI layer to adapt the agent’s behavior. For example, calculate optimal camping locations in an FPS, then hand them over to the AI layer.
 Advantages:
- The information about the game world upon which the changes in behavior are based can often be extracted very easily and reliably, resulting in fast and effective adaptation.
- Since changes in behavior are made by a conventional AI layer, they are well defined and controlled, and hence easy to debug and test.
 Disadvantage:
- It requires both the information to be learned and the changes in behavior that occur in response to it to be defined a priori by the AI designer.
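A sketch of the camping-location example under assumed names (`record_kill`, map cells as grid coordinates): the learning layer only collects statistics, and the conventional AI layer consumes the ranked list:

```python
from collections import Counter

class CampingTracker:
    """Learning layer: just gathers data about the game world."""

    def __init__(self):
        self.kills_at = Counter()   # kills credited to each map cell

    def record_kill(self, cell):
        self.kills_at[cell] += 1

    def best_spots(self, n=3):
        # The conventional AI layer turns these into ordinary
        # scripted 'move to position and hold' behavior.
        return [cell for cell, _ in self.kills_at.most_common(n)]
```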

 Using learning algorithms to adapt an agent’s behavior directly, usually by testing modifications to it in the game world to see if it can be improved.  Consider a game with no built-in AI whatsoever that evolves rules for controlling AI agents as the game is played. Such a system would be direct adaptation.  This type of learning closely mimics human learning and is a very bright-eyed, idealistic way to think about learning, but it is not very applicable in its most general form.

 Direct adaptation is the ultimate form of AI.  All the behaviors developed by the AI agents would be learned from their experience in the game world, and would therefore be unconstrained by the preconceptions of the AI designer.  The evolution of the AI would be open-ended in the sense that there would be no limit to the complexity and sophistication of the rule sets, and hence the behaviors, that could evolve.

 A measure of the agent’s performance must be developed that reflects the real aim of learning and the role of the agent in-game.  Each agent’s performance must be evaluated over a substantial period of time to minimize the impact of random events on the measured performance.  Too many evaluations are likely to be required for each agent’s performance to be measured against a representative sample of human opponents.  The lack of constraints on the types of behavior that can develop makes it impossible to guarantee that the game would continue to be playable once adaptation had begun. Testing for such cases would be difficult.

 Direct adaptation works best when used for specific subsets of an overall AI goal.
 Incorporate as much prior knowledge as possible.
 Design a good performance measure. This can be extremely difficult because:
- Many alternative measures of apparently equal merit often exist, requiring an arbitrary choice to be made between them.
- The most obvious or logical measure of performance might produce the same value for wide ranges of parameter values, providing little guide as to how to choose between them.
- Carelessly designed performance measures can encourage undesirable behavior, or introduce locally optimal behavior.

 Learn by Optimization
- Search for sets of parameters that make the agent perform well in-game.
 Learn by Reinforcement
- Learn the relationship between an action taken by the agent in a particular state of the game world and the performance of the agent.
 Learn by Imitation
- Imitate a player, or allow the player to provide performance assessments. http://youtube.com/watch?v=swxtKrVb0m0&feature=related (3:45)
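For the first of these, a minimal learning-by-optimization sketch using simple hill climbing; `evaluate` is an assumed callback that runs the agent (or a simulation of it) and returns a performance score:

```python
import random

def hill_climb(params, evaluate, step=0.1, iterations=100):
    """Search for parameter sets that make the agent perform well."""
    best = list(params)
    best_score = evaluate(best)
    for _ in range(iterations):
        # Perturb each parameter slightly and keep only improvements.
        candidate = [p + random.uniform(-step, step) for p in best]
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```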

 Avoid Locally Optimal Behaviors
- Behaviors that, while not the best possible, can’t be improved upon by making small changes.
 Minimize Dependencies
- Learned behaviors can depend on each other. Example: an AI with both a ‘location’ algorithm and a ‘weapon choice’ algorithm.
 Avoid Overfitting
- Overfitting means an agent has adapted its behavior to a very specific set of states of the game world and performs poorly in other states.
 Explore and Exploit
- Should the AI explore new strategies or repeat what it already knows? (A standard way to balance the two is sketched below.)
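One standard way (not necessarily the one the book uses) to balance exploring and exploiting is epsilon-greedy selection over estimated strategy values:

```python
import random

def pick_strategy(value_estimates, epsilon=0.1):
    """value_estimates: dict mapping strategy name -> estimated value."""
    if random.random() < epsilon:
        return random.choice(list(value_estimates))        # explore
    return max(value_estimates, key=value_estimates.get)   # exploit
```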

 Indirect adaptation works well in games because the agent’s behavior is still determined by a conventional AI layer designed during the game’s development.  Direct adaptation can be performed during a game’s development, but it is limited to specific, narrow in-game problems, and only when guided by a well-thought-out heuristic.

 Handle learning opportunities efficiently
- When do we calculate what has been learned?
 Generate effective novel behaviors
- The AI should experiment plausibly and effectively.
 Be robust with respect to nondeterminism
- Be aware that sometimes random choices will work well by sheer luck.
 Require minimal computational resources
- Typically the AI has access to only a very small proportion of a machine’s resources.

 Potentially, NPCs could have adaptive AI. Why not?  Adaptive AI takes too long to learn and, in general, the search space of behaviors is too large and complex to be explored quickly and efficiently.  Solution: Dynamic Scripting, an adaptive mechanism that uses domain knowledge to restrict the search space enough to make real-time adaptation a reality while ensuring that ALL NPC behaviors are always plausible.  We can also go back and consider how dynamic scripting satisfies all of the “requirements” slide as well.

 At a tactical level, the number of possible states and actions in a game can be huge, while the number of learning opportunities is often relatively small. In these circumstances, reinforcement learning is unlikely to achieve acceptable results unless the size of the state-action space can be dramatically reduced. This is the approach of dynamic scripting.  It is different from plain reinforcement learning (although reinforcement learning may be involved) because it is a strategy to reduce the number of state-actions that can actually be utilized, and it can also adapt more easily to the strategy in use by the opponent.  Now, how do we actually go about implementing this?

 The first step in creating an application of dynamic scripting is to create the rulebases from which it will build its scripts.  Example rules: use a melee attack against the closest enemy; use a special ability at a certain time on a certain target.  Using the rulebases for each NPC, we then create our script, the strategy the NPCs plan to use against their foes.  We also need to set priorities so the NPC can determine the order in which to take its actions. Pulling out a weapon should be higher priority than attacking, for example, since you need to do it first.
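A possible rule representation for the fighter example; the rule names and numbers here are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str       # what the NPC does when the rule fires
    priority: int   # higher priority runs first (draw weapon > attack)
    weight: float   # dynamic scripting adapts this value

fighter_rulebase = [
    Rule("draw weapon",                priority=3, weight=1.0),
    Rule("melee attack closest enemy", priority=2, weight=1.0),
    Rule("special ability on target",  priority=2, weight=1.0),
    Rule("retreat when badly hurt",    priority=1, weight=1.0),
]
```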

 Choose rules in a random or weighted-random fashion before an encounter with the player.  Choosing the right size for the script is important.  If you set the size too small, the AI won’t be very complex and won’t seem very smart.  If you set the size too large, the AI might spend too much time doing high-priority tasks and won’t have any time to do the lower-priority ones. This would be a problem, for example, for our fighter mentioned earlier if a vast number of potential high-priority tasks were defined.
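A weighted-random script builder following this scheme, reusing the hypothetical Rule objects from the previous sketch; the size cap is the design knob discussed on this slide:

```python
import random

def build_script(rulebase, script_size):
    """Pick `script_size` distinct rules, biased toward high weights,
    then order them by priority so prerequisites come first."""
    pool = list(rulebase)
    script = []
    for _ in range(min(script_size, len(pool))):
        chosen = random.choices(pool, weights=[r.weight for r in pool])[0]
        pool.remove(chosen)   # a rule appears at most once per script
        script.append(chosen)
    return sorted(script, key=lambda r: r.priority, reverse=True)
```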

 Dynamic scripting requires a fitness function that assigns a value to the result of an encounter, indicating how well the dynamically scripted AI performed.  Evaluate the “fitness” value of the encounter afterwards. Based on these evaluations, we change the weights on the rules that were used to create the script.  This fitness function is game-specific and needs to be carefully designed by the developers in advance.  Consider individual performance and also team performance. If one bot’s performance was terrible but the team’s performance was great, maybe its performance wasn’t so terrible after all? Then again, we can’t consider only team performance, because a bot’s poor performance may be dragging the rest of the team down when they could do even better.
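A hypothetical fitness function blending individual and team performance, as the slide suggests; the field names (`damage_dealt`, `damage_taken`, `won`) are assumptions, not the book’s API:

```python
def encounter_fitness(bot, team, w_individual=0.5):
    """Return a fitness in [0, 1] for one encounter."""
    # Individual contribution: share of damage dealt vs. taken.
    total = bot.damage_dealt + bot.damage_taken
    individual = bot.damage_dealt / total if total else 0.5
    # Team result: did the group actually win the encounter?
    team_score = 1.0 if team.won else 0.0
    # Blend, so a 'support' bot on a winning team is not punished,
    # while a bot dragging a winning team down still scores lower.
    return w_individual * individual + (1 - w_individual) * team_score
```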

 Update the weights of each rule used after applying the fitness function.  There are many possibilities for the design of the update formula; the book has its favorite, but it doesn’t matter so much which you use.  The main goal here is to make sure that your dynamically learning NPCs generate variability, exploring new behaviors to adapt to the player’s strategy while still utilizing strategies that work.
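One plausible update formula (again, only a sketch, over the hypothetical Rule objects above): rules used in a high-fitness script gain weight, rules in a low-fitness one lose it, and clamping keeps every rule selectable so exploration never dies out:

```python
def update_weights(script, fitness, baseline=0.5,
                   learning_rate=0.3, w_min=0.1, w_max=10.0):
    """Fitness above the baseline rewards the script's rules;
    fitness below it penalizes them."""
    adjustment = learning_rate * (fitness - baseline)
    for rule in script:
        rule.weight = min(w_max, max(w_min, rule.weight + adjustment))
```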

 In games we don’t want to create the ultimate AI that defeats the player every time. Not everyone is this good. (3:10)  Therefore we do the following:  When the computer loses a fight, change the weights so that the computer focuses on successful behaviors instead of experimenting with new ones.  When the computer wins a fight, change the weights so that the computer focuses on varied, experimental strategies instead of ones that it knows are successful against this particular player.  If these two things happen correctly, the player will find themselves facing an AI that beats them roughly 50% of the time.
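A sketch of that balancing act, assuming the adaptive AI exposes a hypothetical `exploration_rate` knob:

```python
def rebalance(ai, computer_won):
    """Lose -> exploit known-good behavior; win -> experiment more.
    Done right, the player ends up winning about half the time."""
    if computer_won:
        ai.exploration_rate = min(0.9, ai.exploration_rate + 0.1)
    else:
        ai.exploration_rate = max(0.1, ai.exploration_rate - 0.1)
```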

 Supervised learning is a machine learning technique for creating a function from training data. The training data consist of pairs of input objects (typically vectors) and desired outputs. The output of the function can be a continuous value, or it can predict a class label for the input object. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of inputs and target outputs). To achieve this, the learner has to generalize from the presented data to unseen situations in a “reasonable” way.
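A toy supervised learner in that spirit: the training pairs map made-up game-state vectors to observed best actions, and prediction generalizes by picking the nearest seen input:

```python
def nearest_neighbor_predict(training_pairs, query):
    """training_pairs: list of (input vector, label) pairs."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    inputs, label = min(training_pairs, key=lambda p: sq_dist(p[0], query))
    return label

# (distance to player, player health) -> action that worked best
data = [((1.0, 0.2), "attack"), ((8.0, 0.9), "snipe"), ((1.5, 0.9), "block")]
print(nearest_neighbor_predict(data, (1.2, 0.8)))   # -> "block"
```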

 Unsupervised learning is a type of machine learning in which manual labels of inputs are not used. It is distinguished from supervised learning approaches, which learn how to perform a task, such as classification or regression, using a set of human-prepared examples.
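A minimal unsupervised example in the same vein: clustering unlabeled 2-D points (say, positions where the player tends to hide) with k-means; no human-prepared labels are involved:

```python
import random

def k_means(points, k=2, iterations=20):
    """Group points into k clusters by iteratively reassigning each
    point to its nearest center and recomputing the centers."""
    centers = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i:
                          (p[0] - centers[i][0]) ** 2 +
                          (p[1] - centers[i][1]) ** 2)
            clusters[nearest].append(p)
        centers = [(sum(x for x, _ in c) / len(c),
                    sum(y for _, y in c) / len(c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers
```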

 In reality these four strategies intermingle. You can have direct supervised, indirect unsupervised, etc.  An example: supervised direct adaptation might be a self-learning AI that the developers run for several days’ worth of run/test time before releasing the game to the public, to figure out some of the best strategies for their AI to implement.

 The concept: many video games, and games in general, boil down to extremely overcomplicated versions of rock-paper-scissors. In a fighting game you have abilities such as kick/punch/block, and many abilities are good at countering other abilities.  If a bot can predict what the player is going to do more reliably, it can win more reliably.  Or if an opponent always does the same thing in an FPS game (moves to pick up his favorite weapon, gets a medkit, gets some armor, then goes to his favorite hiding spot, in that order), we can more reliably counteract his plans.
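The payoff of good prediction is trivial to cash in; a counter table for the fighting-game example (the counter relationships here are made up):

```python
COUNTERS = {"kick": "block", "punch": "kick", "block": "punch"}

def best_response(predicted_move):
    """If the predictor guesses the player's next move, play its counter."""
    return COUNTERS[predicted_move]
```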

 Basic idea: if you had to predict the next number in a string sequence, how would you do it?  Find the longest substring that matches the tail end of the string, then figure out what comes after that. What…? The answer to the example might help elaborate.

 It’s 1. The substring is … and the following number is 1.  Problem: this algorithm has O(N^2) run time.  The solution is instead to update our knowledge base of match sizes incrementally. A picture speaks a thousand words; instead of trying to explain this in words, here comes the next page.  RockPaperScissors Program:
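The slide’s example sequence and diagram did not survive the transcript, so here is a minimal sketch of the naive O(N^2) predictor with a made-up history:

```python
def predict_next(history):
    """Find the longest suffix of `history` that also occurs earlier,
    and predict the symbol that followed that earlier occurrence."""
    n = len(history)
    for length in range(n - 1, 0, -1):            # longest suffix first
        suffix = history[n - length:]
        # Search only history[0:n-1] so the match has a following
        # symbol and cannot be the tail matching itself.
        pos = history.find(suffix, 0, n - 1)
        if pos != -1:
            return history[pos + length]
    return None   # no repetition yet, nothing to predict

print(predict_next("01011010110101"))   # -> '1' for this made-up history
```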

 Compute all matching substrings, not just the longest one.  Prediction Value = Length(S) / DistanceToTail(S)  Take the sum of the prediction values each time the same length occurred.  The idea here is that shorter string matches that occurred recently may be more reliable than longer ones from further in the past, and string matches that have occurred multiple times are much more reliable than ones that have occurred only once or twice.
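A sketch of the weighted scheme: every earlier match votes for the symbol that followed it, scored by Length(S)/DistanceToTail(S), and the votes are summed per symbol:

```python
from collections import defaultdict

def predict_weighted(history):
    n = len(history)
    votes = defaultdict(float)
    for length in range(1, n):
        suffix = history[n - length:]
        pos = history.find(suffix, 0, n - 1)
        while pos != -1:
            follower = history[pos + length]
            distance = n - (pos + length)    # >= 1 by construction
            votes[follower] += length / distance
            pos = history.find(suffix, pos + 1, n - 1)
    return max(votes, key=votes.get) if votes else None
```

This is still quadratic in the worst case; the incremental knowledge-base update mentioned on the previous slide is how you would make it fast in practice.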

 There are several different machine learning methods one can use for various effects.  Any good examples of this happening? Yes.

 Black and White is an excellent example of a game which uses mostly Supervised Indirect Adaptation

 There are many simple, non-resource-eating methods to utilize machine learning in a game.  Although it has been done before, the AI that exists in many new games is pathetically boring, and a few simple tweaks, with learning AI that can adapt to, counteract, and learn strategies used by the player, would make mountains out of molehills. I would soil myself if AI ever learned some nonsense like this from a player. (:50)  Should anyone in this class ever be making a computer game, consider using some simple machine learning concepts to improve the game greatly.