
1 A presentation by Matthew Dilts

2
• To solve problems that would take a long time to solve manually, e.g. what is the best strategy in a certain game?
• Learning AI can adapt to conditions that cannot be anticipated prior to a game's release (such as individual players' tastes and strategies). http://www.youtube.com/watch?v=SbipWOeqF1c

3
• Until recently, the lack of any precedent for the successful application of learning in a mainstream top-rated game meant that the technology was unproven and hence perceived as high risk. (Software companies are wusses!)
• Learning algorithms are frequently associated with techniques such as neural networks and genetic algorithms, which are difficult to apply in-game due to their relatively low efficiency.

4 (no transcript text)

5
• Faking It
• Indirect Adaptation
• Direct Adaptation
• Supervised Learning
• Unsupervised Learning

6
• Simply degrade an AI that performs very well by adding random errors, then reduce the number of random errors over time.
• Advantages:
- The 'rate of learning' can be carefully controlled and specified prior to release, as can the behavior of the AI at each stage of its development.
- The state of the AI at any point in time is independent of the details of the player's interaction with the game, simplifying debugging and testing.
• Disadvantage:
- This doesn't actually solve any of the problems that genuine learning would solve.
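
A minimal sketch of "faking it" in Python (the action names, starting error rate, and decay schedule are illustrative assumptions, not from the slides): wrap a strong scripted policy and inject random mistakes whose probability decays as play goes on, so the AI merely appears to learn.

```python
import random

class FakedLearningAI:
    """Degrades a strong scripted policy with random errors that become
    rarer over time, giving the appearance of learning."""

    def __init__(self, initial_error_rate=0.4, decay=0.995):
        self.error_rate = initial_error_rate   # chance of a deliberate mistake
        self.decay = decay                     # multiplicative decay per decision

    def choose_action(self, expert_action, all_actions):
        # With probability error_rate, replace the expert action with a random one.
        action = random.choice(all_actions) if random.random() < self.error_rate else expert_action
        # Shrink the error rate so the AI seems to improve as the game goes on.
        self.error_rate *= self.decay
        return action

# Example: the scripted policy always picks "aimed_shot"; early on the AI
# blunders often, later it almost always plays the expert action.
ai = FakedLearningAI()
actions = ["aimed_shot", "wild_shot", "reload", "idle"]
for turn in range(5):
    print(turn, ai.choose_action("aimed_shot", actions))
```

Because the error schedule is fixed in advance, the AI's apparent progress is fully predictable and testable, which is exactly the advantage described above.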

7
• The AI agent gathers information which is then used by a "conventional" AI layer to adapt the agent's behavior.
• Example: calculate optimal camping locations in an FPS, then hand them over to the AI layer.
• Advantages:
- The information about the game world upon which the changes in behavior are based can often be extracted very easily and reliably, resulting in fast and effective adaptation.
- Since changes in behavior are made by a conventional AI layer, they are well defined and controlled, and hence easy to debug and test.
• Disadvantages:
- It requires both the information to be learned and the changes in behavior that occur in response to it to be defined a priori by the AI designer.
• Fun: http://www.20q.net/
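
A rough sketch of indirect adaptation using the camping-location example (the grid cells, death log, and pick_camping_spot rule are hypothetical): the learning layer only accumulates statistics about the world, and a conventional, hand-written AI layer turns them into a concrete behavior.

```python
from collections import Counter

# Learning layer: passively record the grid cells where the player scores kills.
death_log = Counter()

def record_ai_death(cell):
    """Called by the game whenever an AI agent dies; `cell` is a coarse (x, y) grid cell."""
    death_log[cell] += 1

# Conventional AI layer: a fixed, easily tested rule that consumes the learned
# statistics -- camp near the cell where the player has been most lethal.
def pick_camping_spot(default_spot):
    if not death_log:
        return default_spot                  # nothing learned yet
    hottest_cell, _ = death_log.most_common(1)[0]
    return hottest_cell

# Example usage
for cell in [(3, 4), (3, 4), (7, 1)]:
    record_ai_death(cell)
print(pick_camping_spot(default_spot=(0, 0)))   # -> (3, 4)
```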

8
• Using learning algorithms to adapt an agent's behavior directly, usually by testing modifications to it in the game world to see if it can be improved.
• Consider a game with no built-in AI whatsoever which evolves rules for controlling AI agents as the game is played. Such a system would be direct adaptation.
• This type of learning closely mimics human learning and is a very bright-eyed, idealistic way to think about learning, but it is not very applicable in its most general form.

9
• Direct adaptation is the ultimate form of AI. http://www.youtube.com/watch?v=s2CS9XijFvs
• All the behaviors developed by the AI agents would be learned from their experience in the game world, and would therefore be unconstrained by the preconceptions of the AI designer.
• The evolution of the AI would be open-ended in the sense that there would be no limit to the complexity and sophistication of the rule sets, and hence the behaviors, that could evolve.

10
• A measure of the agent's performance must be developed that reflects the real aim of learning and the role of the agent in-game.
• Each agent's performance must be evaluated over a substantial period of time to minimize the impact of random events on the measured performance.
• Too many evaluations are likely to be required for each agent's performance to be measured against a representative sample of human opponents.
• The lack of constraints on the types of behavior that can develop makes it impossible to guarantee that the game would remain playable once adaptation had begun, and testing for such cases would be difficult.

11
• Direct adaptation works best when applied to specific subsets of an overall AI goal.
• Incorporate as much prior knowledge as possible.
• Design a good performance measure. This can be extremely difficult because:
- Many alternative measures of apparently equal merit often exist, requiring an arbitrary choice to be made between them.
- The most obvious or logical measure of performance might produce the same value for wide ranges of parameter values, providing little guide as to how to choose between them.
- Carelessly designed performance measures can encourage undesirable behavior, or introduce locally optimal behavior.

12
• Learn by Optimization
- Search for sets of parameters that make the agent perform well in-game (a hill-climbing sketch follows below).
• Learn by Reinforcement
- Learn the relationship between an action taken by the agent in a particular state of the game world and the performance of the agent.
• Learn by Imitation
- Imitate a player, or allow the player to provide performance assessments. http://youtube.com/watch?v=swxtKrVb0m0&feature=related (3:45)
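
To make "learn by optimization" concrete, here is a hedged sketch (the parameter names, the evaluate_in_game stand-in, and the step sizes are all invented): a simple hill climber tweaks bot parameters and keeps a change only when the measured in-game performance improves.

```python
import random

def evaluate_in_game(params):
    """Stand-in for running the bot in-game and measuring its performance.
    Here it is a made-up function with a peak near aggression=0.7, accuracy=0.6."""
    return -((params["aggression"] - 0.7) ** 2 + (params["accuracy"] - 0.6) ** 2)

def hill_climb(params, steps=200, step_size=0.05):
    best_score = evaluate_in_game(params)
    for _ in range(steps):
        # Propose a small random tweak to one parameter, test it, and keep it
        # only if the performance measure improves (adaptation by optimization).
        candidate = dict(params)
        key = random.choice(list(candidate))
        candidate[key] = min(1.0, max(0.0, candidate[key] + random.uniform(-step_size, step_size)))
        score = evaluate_in_game(candidate)
        if score > best_score:
            params, best_score = candidate, score
    return params, best_score

print(hill_climb({"aggression": 0.2, "accuracy": 0.2}))
```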

13
• Avoid Locally Optimal Behaviors
- Behaviors that, while not the best possible, can't be improved upon by making small changes.
• Minimize Dependencies
- Multiple learned behaviors can depend on each other; for example, an AI with both a 'location' algorithm and a 'weapon choice' algorithm that adapt simultaneously.
• Avoid Overfitting
- Overfitting means an agent has adapted its behavior to a very specific set of states of the game world and performs poorly in other states.
• Explore and Exploit
- Should the AI explore new strategies or repeat what it already knows? (A small epsilon-greedy sketch follows below.)
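
The explore/exploit trade-off is commonly handled with an epsilon-greedy rule; this minimal sketch (the strategy names and value estimates are invented) usually repeats the best-known strategy but occasionally tries something else.

```python
import random

def epsilon_greedy(estimated_value, epsilon=0.1):
    """estimated_value maps strategy name -> current performance estimate.
    With probability epsilon explore a random strategy, otherwise exploit the best."""
    if random.random() < epsilon:
        return random.choice(list(estimated_value))        # explore
    return max(estimated_value, key=estimated_value.get)   # exploit

values = {"rush": 0.30, "camp": 0.55, "flank": 0.42}
print(epsilon_greedy(values))   # usually "camp", occasionally a random alternative
```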

14
• Indirect adaptation works well in games because the agent's behavior is determined by an AI layer that is specified during the design phase of the game.
• Direct adaptation can be performed during a game's development, but it is limited to specific, narrowly scoped in-game problems, and only when guided by a well-thought-out heuristic.

15
• Handle learning opportunities efficiently
- When do we calculate what has been learned?
• Generate effective novel behaviors
- The AI should experiment plausibly and effectively.
• Be robust with respect to nondeterminism
- Be aware that sometimes random choices will work well by sheer luck.
• Require minimal computational resources
- Typically the AI has access to only a very small proportion of a machine's resources.

16
• Potentially NPCs could have adaptive AI. Why not?
• Adaptive AI takes too long to learn and, in general, the search space of behaviors is too large and complex to be explored quickly and efficiently.
• Solution: Dynamic Scripting. An adaptive mechanism that uses domain knowledge to restrict the search space enough to make real-time adaptation a reality while ensuring that ALL NPC behaviors are always plausible.
• We can also go back and consider how dynamic scripting satisfies each point on the earlier "requirements" slide.

17
• At a tactical level, the number of possible states and actions in a game can be huge, while the number of learning opportunities is often relatively small. In these circumstances, reinforcement learning is unlikely to achieve acceptable results unless the size of the state-action space can be dramatically reduced. This is the approach of dynamic scripting.
• This differs from plain reinforcement learning because (although reinforcement learning may be involved) it is a strategy for reducing the number of state-action pairs that can actually be used, and it can also adapt more easily to the strategy in use by the opponent.
• Now, how do we actually go about implementing this?

18
• The first step in creating an application of dynamic scripting is to create the rulebases from which it will build its scripts.
• Example rules: use a melee attack against the closest enemy; use a special ability at a certain time on a certain target.
• Using the rulebases for each NPC, we then create our script, the strategy that the NPC will use against its foes.
• We also need to set priorities so the NPC can determine the order in which to take its actions. Pulling out a weapon should be a higher priority than attacking, for example, since you need to do it first. (A minimal rulebase sketch follows below.)
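
One minimal way to represent such a rulebase in Python (the rule texts, priorities, and starting weights are illustrative, not from the slides): each rule carries a selection weight that learning will later adjust, and a priority that fixes execution order within a script.

```python
# A rulebase for a hypothetical fighter NPC. Higher priority runs earlier in a
# script; weight controls how likely the rule is to be picked into a script.
fighter_rulebase = [
    {"rule": "draw weapon",                                "priority": 10, "weight": 1.0},
    {"rule": "drink healing potion when health < 25%",     "priority": 8,  "weight": 1.0},
    {"rule": "flee when outnumbered three to one",         "priority": 7,  "weight": 1.0},
    {"rule": "use special ability on the strongest enemy", "priority": 5,  "weight": 1.0},
    {"rule": "melee attack the closest enemy",             "priority": 5,  "weight": 1.0},
]
```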

19
• Choose rules in a random or weighted-random fashion before an encounter with the player (see the sketch below).
• Choosing the right size for the script is important.
• If you set the size too small, the AI won't be very complex and won't seem very smart.
• If you set the size too large, the AI might spend all its time on high-priority tasks and never get to the lower-priority ones. This would be a problem, for example, for our fighter mentioned earlier if a vast number of potential high-priority tasks were defined.
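
Continuing the fighter_rulebase sketch above, script generation might look like this (the script size and the use of Python's random.choices are assumptions): draw rules weighted-randomly without replacement until the script is full, then sort by priority so that, e.g., drawing a weapon comes before attacking.

```python
import random

def generate_script(rulebase, script_size):
    """Pick script_size distinct rules, weighted by their current weights,
    then order them by priority (highest first)."""
    pool = list(rulebase)
    script = []
    for _ in range(min(script_size, len(pool))):
        weights = [r["weight"] for r in pool]
        chosen = random.choices(pool, weights=weights, k=1)[0]
        pool.remove(chosen)
        script.append(chosen)
    return sorted(script, key=lambda r: r["priority"], reverse=True)

script = generate_script(fighter_rulebase, script_size=3)
for r in script:
    print(r["priority"], r["rule"])
```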

20
• Dynamic scripting requires a fitness function that assigns a value to the result of an encounter, indicating how well the dynamically scripted AI performed.
• Evaluate the "fitness" of the encounter afterwards. Based on these evaluations, we change the weights on the rules we used to create the script.
• This fitness function is game-specific and needs to be carefully designed by the developers in advance.
• Consider individual performance and also team performance. If one bot's performance was terrible but team performance was great, maybe its performance wasn't so terrible after all? Then again, we can't consider only team performance, because perhaps that bot is dragging the rest of the team down and they could do even better without it. (A hypothetical fitness function is sketched below.)
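
A hypothetical fitness function for a single encounter (the statistics used and the 60/40 blend are assumptions, not the slides' formula): it mixes the bot's own showing with how the team fared, so a self-sacrificing bot on a winning team is not punished too heavily.

```python
def encounter_fitness(bot, team, won):
    """Return a fitness in [0, 1] for one bot after an encounter, given simple
    post-encounter statistics (all expressed as fractions in [0, 1])."""
    individual = 0.5 * bot["health_remaining"] + 0.5 * bot["damage_dealt_fraction"]
    team_term = 0.5 * (1.0 if won else 0.0) + 0.5 * team["members_alive_fraction"]
    # Blend: mostly the bot's own contribution, partly the team result.
    return 0.6 * individual + 0.4 * team_term

print(encounter_fitness(
    bot={"health_remaining": 0.2, "damage_dealt_fraction": 0.8},
    team={"members_alive_fraction": 0.75},
    won=True,
))   # -> 0.65
```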

21
• After applying the fitness function, update the weights in the rulebase as needed.
• There are many possibilities for the design of the update formula; the book has its favorite, but it doesn't matter much which one you use. (One simple possibility is sketched below.)
• The main goal is to make sure that your dynamically scripted NPCs generate variability, exploring new behaviors to adapt to the player's strategy, while still using strategies that work.
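
One simple weight-update rule, continuing the sketches above (this particular formula is an assumption, not the book's): rules that were in the script gain or lose weight in proportion to how far the fitness was above or below a break-even value; weights are clamped and then rescaled so the total weight in the rulebase stays constant.

```python
def update_weights(rulebase, script, fitness, break_even=0.5,
                   learning_rate=0.4, min_w=0.1, max_w=5.0):
    """Adjust the weights of the rules used in `script` based on encounter
    fitness, then rescale every weight so the rulebase total is unchanged."""
    total_before = sum(r["weight"] for r in rulebase)
    adjustment = learning_rate * (fitness - break_even)     # reward or penalty
    for rule in script:
        rule["weight"] = min(max_w, max(min_w, rule["weight"] + adjustment))
    scale = total_before / sum(r["weight"] for r in rulebase)
    for rule in rulebase:
        rule["weight"] *= scale    # redistribute so unused rules absorb the change

update_weights(fighter_rulebase, script, fitness=0.8)
print([round(r["weight"], 2) for r in fighter_rulebase])
```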

22
• In games we don't want to create the ultimate AI that defeats the player every time. Not everyone is this good: http://www.youtube.com/watch?v=Jen46qkZVNI&feature=related (3:10)
• Therefore we do the following:
• When the computer loses a fight, change the weights so that the computer focuses on successful behaviors instead of experimenting with new ones.
• When the computer wins a fight, change the weights so that the computer focuses on varied and experimental strategies instead of the ones it knows are successful against this particular player.
• If these two things happen correctly, the player will find themselves facing an AI that beats them roughly 50% of the time. (A sketch of this balancing rule follows below.)
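
A hedged sketch of that balancing idea (the specific multipliers and bounds are invented): after the computer loses it adapts conservatively, after it wins it experiments more, which tends to push the player's win rate toward roughly 50%.

```python
def tune_adaptation(computer_won, settings):
    """Nudge the adaptation settings after each fight.
    settings = {"learning_rate": ..., "exploration": ...}, both in (0, 1)."""
    if computer_won:
        # The computer won: experiment more so the player gets a chance.
        settings["learning_rate"] = min(1.0, settings["learning_rate"] * 1.25)
        settings["exploration"] = min(0.5, settings["exploration"] * 1.25)
    else:
        # The computer lost: lean on the rules that are known to work.
        settings["learning_rate"] = max(0.1, settings["learning_rate"] * 0.8)
        settings["exploration"] = max(0.05, settings["exploration"] * 0.8)
    return settings

print(tune_adaptation(computer_won=True, settings={"learning_rate": 0.4, "exploration": 0.1}))
```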

23 • Supervised learning is a machine learning technique for creating a function from training data. The training data consist of pairs of input objects (typically vectors) and desired outputs. The output of the function can be a continuous value, or can predict a class label of the input object. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this, the learner has to generalize from the presented data to unseen situations in a "reasonable" way.
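
A toy supervised learner in Python (the features, labels, and training pairs are invented): a 1-nearest-neighbour classifier that generalizes from labelled (input, output) pairs to an unseen input by copying the label of the closest training example.

```python
import math

# Training data: (input features, desired output) pairs, e.g.
# (distance_to_player, own_health) -> action label.
training = [
    ((1.0, 0.9), "attack"),
    ((1.5, 0.2), "retreat"),
    ((8.0, 0.8), "advance"),
    ((9.0, 0.3), "hide"),
]

def predict(x):
    """1-nearest-neighbour: return the label of the closest training input."""
    _, label = min(training, key=lambda pair: math.dist(pair[0], x))
    return label

print(predict((1.2, 0.15)))   # -> "retreat"
```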

24 • Unsupervised learning is a type of machine learning where manual labels of inputs are not used. It is distinguished from supervised learning approaches, which learn how to perform a task, such as classification or regression, using a set of human-prepared examples.
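
A small unsupervised-learning sketch (the positions and the choice of k are made up): k-means clustering groups unlabelled player positions into hot spots with no human-prepared examples at all.

```python
import random

def k_means(points, k, iterations=20):
    """Cluster 2-D points into k groups with plain k-means (no labels needed)."""
    centroids = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:   # keep the old centroid if a cluster went empty
                centroids[i] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids

positions = [(1, 1), (1, 2), (2, 1), (9, 9), (8, 9), (9, 8)]
print(k_means(positions, k=2))   # two cluster centres, roughly (1.3, 1.3) and (8.7, 8.7)
```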

25
• In reality these four strategies intermingle; you can have direct supervised, indirect unsupervised, etc.
• An example: supervised direct adaptation might be a self-learning AI that the developers run for several days' worth of run/test time before release, to figure out some of the best strategies for their AI to use.

26
• The concept: many video games, and games in general, boil down to extremely overcomplicated versions of rock-paper-scissors. In a fighting game you have abilities such as kick/punch/block, and many abilities are good at countering other abilities.
• If a bot can predict what the player is going to do more reliably, it can win more reliably.
• Or if an opponent always does the same thing in an FPS game, moving to pick up his favorite weapon, then a medkit, then some armor, then heading to his favorite hiding spot, in that order, we can more reliably counteract his plans.

27
• Basic idea: if you had to predict the next number in the following string sequence, how would you do it? 10010110111000010001101
• Find the longest earlier substring that matches the tail end of the string, then look at what came immediately after that earlier occurrence. What…? The answer to the example should make this clearer.

28
• 1001 01101 11000010001101
• It's 1. The matching substring is 01101 and the number that followed it is 1.
• Problem: this algorithm has O(N^2) run time.
• The solution is instead to update our knowledge base of match sizes incrementally. A picture speaks a thousand words; instead of trying to explain this in words, here come the next pages.
• RockPaperScissors program:
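
A direct, naive implementation of the idea so far (O(N^2) per prediction, as the slide notes; the function name is mine): compare every earlier position against the tail of the history, keep the longest match, and predict the symbol that followed it.

```python
def predict_next(history):
    """Find the longest earlier substring equal to a suffix of `history`
    and return the symbol that followed it (None if there is no match)."""
    n = len(history)
    best_len, prediction = 0, None
    for end in range(n - 1):                  # candidate match ending at index `end`
        length = 0
        # Grow the match backwards while characters agree with the tail.
        while length <= end and history[end - length] == history[n - 1 - length]:
            length += 1
        if length > best_len:
            best_len, prediction = length, history[end + 1]
    return prediction

print(predict_next("10010110111000010001101"))   # -> '1', via the match "01101"
```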

29 (no transcript text)

30 (no transcript text)
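
The incremental bookkeeping those slides illustrate can be sketched as follows (a guess at the mechanism, not a transcription of the pictures): keep, for every earlier position, the length of its current match against the tail; when the player's next symbol arrives, each stored length can be refreshed with a single comparison, so each update costs O(N) rather than recomputing everything in O(N^2).

```python
class IncrementalPredictor:
    """Maintains, for each position i, the length of the longest common suffix
    of history[:i+1] and the full history, refreshed in O(N) per new symbol."""

    def __init__(self):
        self.history = []
        self.match = []     # match[i] for the current history

    def predict(self):
        """Predict the next symbol from the longest match that has a successor."""
        best_len, prediction = 0, None
        for i in range(len(self.history) - 1):        # need a following symbol
            if self.match[i] > best_len:
                best_len, prediction = self.match[i], self.history[i + 1]
        return prediction

    def observe(self, symbol):
        """Append the symbol that was actually played and refresh all match lengths."""
        new_match = []
        for i, old in enumerate(self.history):
            # The match ending at i extends the match that ended at i-1 by one
            # if and only if the character at i equals the new tail symbol.
            prev = self.match[i - 1] if i > 0 else 0
            new_match.append(prev + 1 if old == symbol else 0)
        self.history.append(symbol)
        new_match.append(len(self.history))           # the full string matches itself
        self.match = new_match

p = IncrementalPredictor()
for ch in "10010110111000010001101":
    p.observe(ch)
print(p.predict())   # -> '1', agreeing with the worked example
```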

31
• Compute all matching substrings, not just the longest one.
• Prediction Value = Length(S) / DistanceToTail(S)
• Take the sum of the prediction values over every occurrence, so matches that recur accumulate a larger total. (A sketch follows below.)
• The idea here is that short string matches that occurred recently may be more reliable than longer ones from further in the past, and string matches that have occurred multiple times are much more reliable than ones that have only occurred once or twice.
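
A sketch of this weighted scheme (how distances and ties are handled here is an assumption): every matching substring votes for the symbol that followed it, with a vote of length/distance-to-tail, and the symbol with the largest total wins.

```python
from collections import defaultdict

def predict_weighted(history):
    """Score each candidate next symbol by summing Length(S)/DistanceToTail(S)
    over all matching substrings S, then return the highest-scoring symbol."""
    n = len(history)
    votes = defaultdict(float)
    for end in range(n - 1):                       # match ending at index `end`
        length = 0
        while length <= end and history[end - length] == history[n - 1 - length]:
            length += 1
        if length == 0:
            continue
        distance_to_tail = n - 1 - end             # how long ago this match ended
        votes[history[end + 1]] += length / distance_to_tail
    return max(votes, key=votes.get) if votes else None

print(predict_weighted("10010110111000010001101"))
```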

32
• There are several different methods of machine learning that one can use to various effect in games.
• Any good examples of this happening? Yes.

33 • Black and White is an excellent example of a game which uses mostly Supervised Indirect Adaptation.

34 http://www.youtube.com/watch?v=sTu41uCOXL8 http://youtube.com/watch?v=LKv-YPhrrx8 http://www.wischik.com/lu/senses/bwcreature.html#basics

35
• There are many simple, non-resource-hungry methods for using machine learning in a game.
• Although it has been done before, the AI in many new games is pathetically boring, and a few simple tweaks, with learning AI that can adapt to, counteract, and learn the strategies used by the player, would go a very long way. I would soil myself if an AI ever learned some nonsense like this from a player: http://www.youtube.com/watch?v=wh6JPbIWNU0 (:50)
• Should anyone in this class ever make a computer game, consider using some simple machine learning concepts to improve it greatly.

