AI in Game Programming, IT University of Copenhagen: Reinforcement Learning [Intro], Marco Loog

1 AI in Game Programming, IT University of Copenhagen: Reinforcement Learning [Intro], Marco Loog

2 Introduction
- How can an agent learn if there is no teacher around who tells it with every action what’s right and what’s wrong?
- E.g., an agent can learn how to play chess by supervised learning, given that examples of states and their correct actions are provided
- But what if these examples are not available?

3 Introduction
- But what if these examples are not available?
- Through random moves, i.e., exploratory behavior, the agent may be able to infer knowledge about the environment it is in
- But what is good and what is bad? = necessary knowledge to decide what to do in order to reach its goal

4 Introduction
- But what is good and what is bad? = necessary knowledge to decide what to do in order to reach its goal
- ‘Rewarding’ the agent when it did something good and ‘punishing’ it when it did something bad is called reinforcement
- The task of reinforcement learning is to use observed rewards to learn a [best] policy for the environment

5 E.g. [D. Terzopoulos et al.]

6 E.g. [T. Streeter]

7 E.g. [K. Sims]

8 Reinforcement Learning
- Use observed rewards to learn an [almost?] optimal policy for an environment
- Reward R(s) assigns to every state s a number
- Utility of an environment history is [as an example] the sum of the rewards received
- Policy describes agent’s action from any state s in order to reach the goal
- Optimal policy is policy with highest expected utility
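
The definitions on this slide can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration and does not come from the slides: a hypothetical five-state corridor with terminal rewards -1 and +1 at its two ends, a reward of -0.04 in every other state, and moves that go in the intended direction with probability 0.8. The sketch shows R(s) as a function, the utility of a history as a sum of rewards, a policy as a state-to-action table, and expected utility estimated by averaging over simulated histories.

```python
import random

# Hypothetical environment (not from the slides): a corridor of states 0..4.
# States 0 and 4 are terminal, with rewards -1 and +1 respectively.
TERMINALS = {0: -1.0, 4: +1.0}
STEP_REWARD = -0.04  # assumed reward for every non-terminal state


def reward(s):
    """R(s): assigns a number to every state s."""
    return TERMINALS.get(s, STEP_REWARD)


def utility(history):
    """Utility of an environment history: the sum of the rewards received."""
    return sum(reward(s) for s in history)


def step(s, action, rng):
    """Move by action (+1 = right, -1 = left); assumed to slip the other way 20% of the time."""
    move = action if rng.random() < 0.8 else -action
    return s + move


def run_episode(policy, start, rng):
    """Follow the policy from start until a terminal state; return the state history."""
    history = [start]
    while history[-1] not in TERMINALS:
        s = history[-1]
        history.append(step(s, policy[s], rng))
    return history


def expected_utility(policy, start, episodes=2000, seed=0):
    """Monte Carlo estimate of the expected utility of a policy from a start state."""
    rng = random.Random(seed)
    total = sum(utility(run_episode(policy, start, rng)) for _ in range(episodes))
    return total / episodes


# A policy maps every non-terminal state to an action.
always_right = {s: +1 for s in range(1, 4)}
always_left = {s: -1 for s in range(1, 4)}
```

Calling `expected_utility(always_right, 2)` and `expected_utility(always_left, 2)` compares the two policies from the middle state. In this toy setting the policy that heads for the +1 state should come out ahead, which is what "optimal policy is policy with highest expected utility" means in practice.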

9 Rewards, Utilities, &c.
[Figure not preserved; surviving labels: +1, -1]

10 Rewards, Utilities, &c.
[Figure not preserved; surviving labels: +1, -1]

11 Reinforcement Learning
- How to learn a policy like the previous one?
- Complicating factors
- Normally, both the environment and the reward function are unknown
- In many complex domains, reinforcement learning is the only feasible way to succeed

12 Reinforcement Learning
- Might be considered to encompass all of AI: an agent is dropped off somewhere and has to figure everything out by itself
- We will concentrate on simple settings and agent designs to keep things manageable
- E.g. fully observable environment

13 3 Agent Designs
- Utility-based agent: learns a utility function, on the basis of which it chooses actions
- Q-learning agent: learns an action-value function giving the expected utility of taking a given action in a given state
- Reflex agent: learns a policy that maps directly from states to actions
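
As a rough illustration of the second design, here is a minimal tabular Q-learning sketch. The slides defer the details to a later lecture, so this is the standard textbook update rule rather than anything taken from the deck, and every concrete choice is an assumption: a hypothetical deterministic corridor with states 0..4, terminal rewards -1 and +1, a -0.04 reward for entering any other state, epsilon-greedy exploration, and the values of `alpha`, `gamma`, and `epsilon`.

```python
import random

# Hypothetical deterministic corridor (not from the slides): states 0..4,
# terminal at both ends, actions -1 (left) and +1 (right).
ACTIONS = (-1, +1)
TERMINALS = {0: -1.0, 4: +1.0}
STEP_REWARD = -0.04  # assumed reward for entering a non-terminal state


def reward(s):
    return TERMINALS.get(s, STEP_REWARD)


def q_update(q, s, a, r, s_next, alpha=0.5, gamma=1.0):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s_next, b)."""
    if s_next in TERMINALS:
        best_next = 0.0  # nothing more to gain after a terminal state
    else:
        best_next = max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def train(episodes=2000, epsilon=0.1, seed=0):
    """Learn an action-value table from experience with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(1, 4) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice([1, 2, 3])  # random non-terminal start state
        while s not in TERMINALS:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                a = max(ACTIONS, key=lambda b: q[(s, b)])  # exploit
            s_next = s + a
            q_update(q, s, a, reward(s_next), s_next)
            s = s_next
    return q


def greedy_policy(q):
    """Read a policy off the table: the highest-valued action in each state."""
    return {s: max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(1, 4)}
```

Calling `greedy_policy(train())` turns the learned action values into a state-to-action table. That table is what a reflex agent would learn directly, and a utility-based agent would instead store one value per state and need a model of the environment to choose actions from it.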

14 More
- Next week...
