1
ai in game programming it university of copenhagen Reinforcement Learning [Intro] Marco Loog
2
Introduction
How can an agent learn if there is no teacher around to tell it, with every action, what is right and what is wrong?
E.g., an agent can learn to play chess by supervised learning, provided that examples of states and their correct actions are available.
But what if these examples are not available?
3
Introduction
Through random moves, i.e., exploratory behavior, the agent may be able to infer knowledge about the environment it is in.
But what is good and what is bad? This is the knowledge the agent needs to decide what to do in order to reach its goal.
4
Introduction
‘Rewarding’ the agent when it did something good and ‘punishing’ it when it did something bad is called reinforcement.
The task of reinforcement learning is to use observed rewards to learn a [best] policy for the environment.
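A minimal sketch of the idea above: an agent with no teacher takes random moves and only finds out what is good or bad through the rewards it observes. The corridor environment, the names `REWARDS`, `step` and `explore`, and the reward values are all assumptions made for illustration; they are not part of the slides.

```python
import random

# Hypothetical toy environment (not from the slides): a corridor of 5 cells.
# Reaching cell 4 is "good" (+1), reaching cell 0 is "bad" (-1).
REWARDS = {0: -1.0, 4: +1.0}
START = 2


def step(state, action):
    """Move left (-1) or right (+1); return (next_state, reward, done)."""
    next_state = state + action
    reward = REWARDS.get(next_state, 0.0)  # reinforcement signal
    done = next_state in REWARDS           # episode ends at either end
    return next_state, reward, done


def explore(episodes, rng):
    """Take random moves and record the reward observed in each visited state."""
    observed = {}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            action = rng.choice([-1, +1])  # purely exploratory behavior
            state, reward, done = step(state, action)
            observed[state] = reward
    return observed


observed = explore(episodes=200, rng=random.Random(0))
print(sorted(observed.items()))
```

The agent here only collects rewards; turning those observations into a policy is the actual learning task.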
5
E.g. [D. Terzopoulos et al.]
6
E.g. [T. Streeter]
7
E.g. [K. Sims]
8
Reinforcement Learning
Use observed rewards to learn an [almost?] optimal policy for an environment.
Reward R(s) assigns a number to every state s.
Utility of an environment history is [as an example] the sum of the rewards received.
Policy describes the agent’s action from any state s in order to reach the goal.
Optimal policy is the policy with the highest expected utility.
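The definitions above can be sketched in a few lines. This is an illustration under stated assumptions, not the course's implementation: the 5-cell corridor, the reward values in `R` (including the small step cost of -0.04), the `slip` probability, and the names `run` and `expected_utility` are all hypothetical. Expected utility is estimated here by averaging over sampled histories.

```python
import random

# Hypothetical reward function R(s): one number per state of a 5-cell corridor.
R = {0: -1.0, 1: -0.04, 2: -0.04, 3: -0.04, 4: +1.0}
TERMINAL = {0, 4}


def utility(history):
    """Utility of an environment history: the sum of the rewards received."""
    return sum(R[s] for s in history)


def run(policy, start, rng, slip=0.2, max_steps=100):
    """Follow the policy from `start`; with probability `slip` a move is reversed."""
    history = [start]
    state = start
    while state not in TERMINAL and len(history) < max_steps:
        action = policy[state]        # the policy picks the action for state s
        if rng.random() < slip:       # assumed source of uncertainty
            action = -action
        state += action
        history.append(state)
    return history


def expected_utility(policy, start, rng, episodes=2000):
    """Estimate expected utility as the average utility of sampled histories."""
    total = sum(utility(run(policy, start, rng)) for _ in range(episodes))
    return total / episodes


# Two candidate policies: a mapping from each non-terminal state to an action.
go_right = {1: +1, 2: +1, 3: +1}
go_left = {1: -1, 2: -1, 3: -1}

rng = random.Random(0)
print("go_right:", expected_utility(go_right, 2, rng))
print("go_left: ", expected_utility(go_left, 2, rng))
```

Of the two candidates, the one with the higher estimate is the better policy; an optimal policy would have the highest expected utility among all policies.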
9
Rewards, Utilities, &c.
[Figure: rewards +1 and -1]
10
Rewards, Utilities, &c.
[Figure: rewards +1 and -1]
11
Reinforcement Learning
How to learn a policy like the previous one?
Complicating factors: normally, both the environment and the reward function are unknown.
In many complex domains, reinforcement learning is the only feasible route to success.
12
Reinforcement Learning
Might be considered to encompass all of AI: an agent is dropped off somewhere and has to figure everything out by itself.
We will concentrate on simple settings and agent designs to keep things manageable.
E.g., a fully observable environment.
13
3 Agent Designs
Utility-based agent: learns a utility function, based on which it chooses actions.
Q-learning agent: learns an action-value function giving the expected utility of taking a given action in a given state.
Reflex agent: learns a policy that maps directly from states to actions.
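To make the difference between the three designs concrete, here is a small sketch of what each agent stores and how it would pick an action. The tables are filled in by hand for a hypothetical 5-cell corridor, so nothing is learned here, and the names `U`, `Q`, `POLICY` and `model` are assumptions for illustration only. Note that the utility-based agent also needs a model of where actions lead, while the other two do not.

```python
# Hypothetical 5-cell corridor; actions are -1 (left) and +1 (right).
ACTIONS = (-1, +1)

# 1. Utility-based agent: stores a utility U(s) per state.
#    Values are hand-picked for illustration, not learned.
U = {0: -1.0, 1: -0.5, 2: 0.0, 3: 0.5, 4: 1.0}


def model(state, action):
    """Assumed deterministic transition model: where does an action lead?"""
    return state + action


def utility_based_action(state):
    # Look one step ahead with the model and pick the best successor state.
    return max(ACTIONS, key=lambda a: U[model(state, a)])


# 2. Q-learning agent: stores an action value Q(s, a) per state-action pair.
#    Derived from U here only to keep the example consistent.
Q = {(s, a): U[s + a] for s in (1, 2, 3) for a in ACTIONS}


def q_action(state):
    # No model needed: compare the stored action values directly.
    return max(ACTIONS, key=lambda a: Q[(state, a)])


# 3. Reflex agent: stores the policy itself, a direct state-to-action map.
POLICY = {1: +1, 2: +1, 3: +1}


def reflex_action(state):
    return POLICY[state]


for s in (1, 2, 3):
    print(s, utility_based_action(s), q_action(s), reflex_action(s))
```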
14
More next week...