Rational Agents (Chapter 2)


1 Rational Agents (Chapter 2)

2 Agents An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators

3 Example: Vacuum-Agent
Percepts: Location and status, e.g., [A,Dirty]
Actions: Left, Right, Suck, NoOp

function Vacuum-Agent([location,status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
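The pseudocode above translates almost line for line into Python. This is a minimal sketch: the tuple encoding of the percept and the NoOp fallback for an unrecognized location are assumptions, not part of the slide.

```python
def vacuum_agent(percept):
    """Simple reflex agent for the two-square vacuum world.

    percept is assumed to be a (location, status) pair, e.g. ("A", "Dirty").
    """
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    if location == "B":
        return "Left"
    # Assumption: the slide does not cover other locations, so do nothing.
    return "NoOp"


print(vacuum_agent(("A", "Dirty")))  # Suck
print(vacuum_agent(("A", "Clean")))  # Right
print(vacuum_agent(("B", "Clean")))  # Left
```

Note that the agent looks only at the current percept; it keeps no memory of earlier ones.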

4 Rational agents
For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and the agent’s built-in knowledge
Performance measure (utility function): An objective criterion for success of an agent's behavior
Expected utility: ExpectedUtility(action) = sum over outcomes of Utility(outcome) * P(outcome | action)
Can a rational agent make mistakes?
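The expected-utility formula can be worked through on a small example. The sketch below is illustrative only: the two actions, their utilities, and their probabilities are made-up numbers, and `expected_utility` and `best_action` are hypothetical helper names.

```python
def expected_utility(outcomes):
    """Sum of Utility(outcome) * P(outcome | action) for one action.

    outcomes is a list of (utility, probability) pairs.
    """
    return sum(utility * probability for utility, probability in outcomes)


def best_action(model):
    """Pick the action with the highest expected utility."""
    return max(model, key=lambda action: expected_utility(model[action]))


# Hypothetical numbers, chosen only for illustration.
model = {
    "safe": [(5.0, 1.0)],                  # always yields utility 5
    "risky": [(10.0, 0.9), (-20.0, 0.1)],  # usually 10, occasionally -20
}

print(best_action(model))  # risky
```

Here "risky" has the higher expected utility (about 7 versus 5), so it is the rational choice, yet one time in ten it produces the worst outcome. Rationality maximizes expected performance, not actual performance, which is one way to read the question on the slide.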

5 Back to Vacuum-Agent
Percepts: Location and status, e.g., [A,Dirty]
Actions: Left, Right, Suck, NoOp

function Vacuum-Agent([location,status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

Is this agent rational?
    Depends on performance measure, environment properties
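To see why the answer depends on the performance measure, the agent can be run in a small simulation under two different scoring rules. Everything about the simulation is an assumption made for illustration: both squares start dirty, the agent starts in A, dirt never reappears, and the score is +1 per clean square per time step minus an optional cost per move.

```python
def vacuum_agent(percept):
    """The reflex agent from the slide."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def run(agent, steps, move_cost):
    """Simulate the two-square world and return the total score.

    Assumed scoring: +1 per clean square per time step,
    minus move_cost for every Left or Right action.
    """
    status = {"A": "Dirty", "B": "Dirty"}
    location = "A"
    score = 0
    for _ in range(steps):
        action = agent((location, status[location]))
        if action == "Suck":
            status[location] = "Clean"
        elif action == "Right":
            location = "B"
            score -= move_cost
        elif action == "Left":
            location = "A"
            score -= move_cost
        score += sum(1 for s in status.values() if s == "Clean")
    return score


print(run(vacuum_agent, steps=10, move_cost=0))  # 18
print(run(vacuum_agent, steps=10, move_cost=1))  # 10
```

With free movement the agent cleans both squares as fast as possible, so its score is hard to beat. Once movement is penalized, it keeps shuttling between clean squares and loses points that an agent which stopped after cleaning would keep.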

6 Specifying the task environment
PEAS: Performance measure, Environment, Actuators, Sensors
P: a function the agent is maximizing (or minimizing)
    Assumed given
    In practice, needs to be computed somewhere
E: a formal representation for world states
    For concreteness, a tuple (var1=val1, var2=val2, … ,varn=valn)
A: actions that change the state according to a transition model
    Given a state and action, what is the successor state (or distribution over successor states)?
S: observations that allow the agent to infer the world state
    Often come in very different form than the state itself
    E.g., in tracking, observations may be pixels and state variables 3D coordinates
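As a concrete instance of the E, A, and S parts, here is a sketch for the two-square vacuum world. The names `State`, `successor`, and `observe` and the choice of state variables are assumptions made for this example.

```python
from typing import NamedTuple


class State(NamedTuple):
    """E: the world state as a tuple of (variable = value) pairs."""
    location: str  # "A" or "B"
    status_a: str  # "Clean" or "Dirty"
    status_b: str  # "Clean" or "Dirty"


def successor(state, action):
    """A: deterministic transition model, state x action -> next state."""
    if action == "Suck":
        if state.location == "A":
            return state._replace(status_a="Clean")
        return state._replace(status_b="Clean")
    if action == "Right":
        return state._replace(location="B")
    if action == "Left":
        return state._replace(location="A")
    return state  # NoOp leaves the state unchanged


def observe(state):
    """S: the agent only senses its location and that square's status."""
    status = state.status_a if state.location == "A" else state.status_b
    return (state.location, status)


start = State(location="A", status_a="Dirty", status_b="Dirty")
print(observe(start))              # ('A', 'Dirty')
print(successor(start, "Suck"))
```

The observation has a different form than the state: the state has three variables, while the percept exposes only two values, so the agent cannot see the status of the other square.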

7 PEAS Example: Autonomous taxi
Performance measure: Safe, fast, legal, comfortable trip, maximize profits
Environment: Roads, other traffic, pedestrians, customers
Actuators: Steering wheel, accelerator, brake, signal, horn
Sensors: Cameras, LIDAR, speedometer, GPS, odometer, engine sensors, keyboard

8 Another PEAS example: Spam filter
Performance measure: Minimizing false positives, false negatives
Environment: A user’s account, server
Actuators: Mark as spam, delete, etc.
Sensors: Incoming messages, other information about user’s account

9 Environment types
Fully observable vs. partially observable
Deterministic vs. stochastic
Episodic vs. sequential
Static vs. dynamic
Discrete vs. continuous
Single agent vs. multi-agent
Known vs. unknown

10 Fully observable vs. partially observable
Do the agent's sensors give it access to the complete state of the environment?
    For any given world state, are the values of all the variables known to the agent?
[Figure omitted: fully observable vs. partially observable example. Source: L. Zettlemoyer]

11 Deterministic vs. stochastic
Is the next state of the environment completely determined by the current state and the agent’s action?
    Is the transition model deterministic (unique successor state given current state and action) or stochastic (distribution over successor states given current state and action)?
Strategic: the environment is deterministic except for the actions of other agents
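The difference between the two kinds of transition model shows up in the return type: a single successor state versus a distribution over successors. The sketch below uses a one-square world with state (location, dirty); the 0.8 success probability for Suck and all function names are assumptions chosen for illustration.

```python
import random


def deterministic_step(state, action):
    """Deterministic model: exactly one successor per (state, action)."""
    location, dirty = state
    if action == "Suck":
        return (location, False)
    return state


def stochastic_step(state, action, p_success=0.8):
    """Stochastic model: a {successor: probability} distribution.

    Assumption: Suck removes the dirt only with probability p_success.
    """
    location, dirty = state
    if action == "Suck" and dirty:
        return {(location, False): p_success, (location, True): 1 - p_success}
    return {state: 1.0}


def sample(distribution, rng=random):
    """Draw one successor state according to its probability."""
    states = list(distribution)
    weights = [distribution[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


state = ("A", True)
print(deterministic_step(state, "Suck"))  # ('A', False)
print(stochastic_step(state, "Suck"))
print(sample(stochastic_step(state, "Suck")))  # one of the two successors
```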

12 Episodic vs. sequential
Is the agent’s experience divided into unconnected single decisions/actions, or is it a coherent sequence of observations and actions in which the world evolves according to the transition model?
Episodic: The agent's experience is divided into atomic “episodes,” and the choice of action in each episode depends only on the episode itself

13 Static vs. dynamic
Is the world changing while the agent is thinking?
Semidynamic: the environment does not change with the passage of time, but the agent's performance score does

14 Discrete vs. continuous
Does the environment provide a fixed number of distinct percepts, actions, and environment states?
    Are the values of the state variables discrete or continuous?
    Time can also evolve in a discrete or continuous fashion

15 Single-agent vs. multiagent
Is an agent operating by itself in the environment?

16 Known vs. unknown
Are the rules of the environment (transition model and rewards associated with states) known to the agent?
    Strictly speaking, not a property of the environment, but of the agent’s state of knowledge

17 Examples of different environments
                 Word jumble solver   Chess with a clock   Scrabble     Autonomous driving
Observable       Fully                Fully                Partially    Partially
Deterministic    Deterministic        Strategic            Stochastic   Stochastic
Episodic         Episodic             Sequential           Sequential   Sequential
Static           Static               Semidynamic          Static       Dynamic
Discrete         Discrete             Discrete             Discrete     Continuous
Single agent     Single               Multi                Multi        Multi

18 Preview of the course
Deterministic environments: search, constraint satisfaction, classical planning
    Can be sequential or episodic
Multi-agent, strategic environments: minimax search, games
    Can also be stochastic, partially observable
Stochastic environments
    Episodic: Bayesian networks, pattern classifiers
    Sequential, known: Markov decision processes
    Sequential, unknown: reinforcement learning

19 Review: PEAS

20 Review: PEAS
P: Performance measure
    Function the agent is maximizing (or minimizing)
E: Environment
    A formal representation for world states
    For concreteness, a tuple (var1=val1, var2=val2, … ,varn=valn)
A: Actions
    Transition model: Given a state and action, what is the successor state (or distribution over successor states)?
S: Sensors
    Observations that allow the agent to infer the world state
    Often come in very different form than the state itself

21 Review: Environment types
Fully observable vs. partially observable
Deterministic vs. stochastic (vs. strategic)
Episodic vs. sequential
Static vs. dynamic (vs. semidynamic)
Discrete vs. continuous
Single agent vs. multi-agent
Known vs. unknown

