Decision Theory: Single Stage Decisions Computer Science cpsc322, Lecture 33 (Textbook Chpt 9.2) March, 30, 2009.

Slides:



Advertisements
Similar presentations
Computer Science CPSC 322 Lecture 25 Top Down Proof Procedure (Ch 5.2.2)
Advertisements

Computer Science CPSC 322 Lecture 3 AI Applications.
Decision Theory: Sequential Decisions Computer Science cpsc322, Lecture 34 (Textbook Chpt 9.3) Nov, 28, 2012.
CPSC 322, Lecture 4Slide 1 Search: Intro Computer Science cpsc322, Lecture 4 (Textbook Chpt ) Sept, 11, 2013.
Department of Computer Science Undergraduate Events More
Slide 1 Planning: Representation and Forward Search Jim Little UBC CS 322 October 10, 2014 Textbook §8.1 (Skip )- 8.2)
CPSC 322, Lecture 2Slide 1 Representational Dimensions Computer Science cpsc322, Lecture 2 (Textbook Chpt1) Sept, 6, 2013.
Decision Making Under Uncertainty Jim Little Decision 1 Nov
CPSC 502, Lecture 11Slide 1 Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 11 Oct, 18, 2011.
CPSC 322, Lecture 16Slide 1 Stochastic Local Search Variants Computer Science cpsc322, Lecture 16 (Textbook Chpt 4.8) February, 9, 2009.
CPSC 322, Lecture 26Slide 1 Reasoning Under Uncertainty: Belief Networks Computer Science cpsc322, Lecture 27 (Textbook Chpt 6.3) March, 16, 2009.
CPSC 322, Lecture 35Slide 1 Finish VE for Sequential Decisions & Value of Information and Control Computer Science cpsc322, Lecture 35 (Textbook Chpt 9.4)
CPSC 322, Lecture 23Slide 1 Logic: TD as search, Datalog (variables) Computer Science cpsc322, Lecture 23 (Textbook Chpt 5.2 & some basic concepts from.
CPSC 322, Lecture 19Slide 1 Propositional Logic Intro, Syntax Computer Science cpsc322, Lecture 19 (Textbook Chpt ) February, 23, 2009.
CPSC 322, Lecture 37Slide 1 Finish Markov Decision Processes Last Class Computer Science cpsc322, Lecture 37 (Textbook Chpt 9.5) April, 8, 2009.
CPSC 322, Lecture 4Slide 1 Search: Intro Computer Science cpsc322, Lecture 4 (Textbook Chpt ) January, 12, 2009.
Decision Theory: Sequential Decisions Computer Science cpsc322, Lecture 34 (Textbook Chpt 9.3) April, 1, 2009.
CPSC 322, Lecture 18Slide 1 Planning: Heuristics and CSP Planning Computer Science cpsc322, Lecture 18 (Textbook Chpt 8) February, 12, 2010.
CPSC 322, Lecture 30Slide 1 Reasoning Under Uncertainty: Variable elimination Computer Science cpsc322, Lecture 30 (Textbook Chpt 6.4) March, 23, 2009.
CPSC 322, Lecture 11Slide 1 Constraint Satisfaction Problems (CSPs) Introduction Computer Science cpsc322, Lecture 11 (Textbook Chpt 4.0 – 4.2) January,
CPSC 322, Lecture 2Slide 1 Representational Dimensions Computer Science cpsc322, Lecture 2 (Textbook Chpt1) January, 6, 2010.
CPSC 322, Lecture 23Slide 1 Logic: TD as search, Datalog (variables) Computer Science cpsc322, Lecture 23 (Textbook Chpt 5.2 & some basic concepts from.
CPSC 322, Lecture 12Slide 1 CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12 (Textbook Chpt ) January, 29, 2010.
Decision Theory: Sequential Decisions Computer Science cpsc322, Lecture 34 (Textbook Chpt 9.3) April, 12, 2010.
CPSC 322, Lecture 17Slide 1 Planning: Representation and Forward Search Computer Science cpsc322, Lecture 17 (Textbook Chpt 8.1 (Skip )- 8.2) February,
CPSC 322, Lecture 17Slide 1 Planning: Representation and Forward Search Computer Science cpsc322, Lecture 17 (Textbook Chpt 8.1 (Skip )- 8.2) February,
CPSC 322, Lecture 31Slide 1 Probability and Time: Markov Models Computer Science cpsc322, Lecture 31 (Textbook Chpt 6.5) March, 25, 2009.
CPSC 322, Lecture 22Slide 1 Logic: Domain Modeling /Proofs + Top-Down Proofs Computer Science cpsc322, Lecture 22 (Textbook Chpt 5.2) March, 8, 2010.
CPSC 322, Lecture 32Slide 1 Probability and Time: Hidden Markov Models (HMMs) Computer Science cpsc322, Lecture 32 (Textbook Chpt 6.5) March, 27, 2009.
CPSC 322, Lecture 35Slide 1 Value of Information and Control Computer Science cpsc322, Lecture 35 (Textbook Chpt 9.4) April, 14, 2010.
CPSC 322, Lecture 24Slide 1 Reasoning under Uncertainty: Intro to Probability Computer Science cpsc322, Lecture 24 (Textbook Chpt 6.1, 6.1.1) March, 15,
Computer Science CPSC 322 Lecture 3 AI Applications 1.
Computer Science CPSC 322 Lecture 4 Search: Intro (textbook Ch: ) 1.
CPSC 322, Lecture 22Slide 1 Logic: Domain Modeling /Proofs + Top-Down Proofs Computer Science cpsc322, Lecture 22 (Textbook Chpt 5.2) Oct, 26, 2010.
Department of Computer Science Undergraduate Events More
Constraint Satisfaction Problems (CSPs) CPSC 322 – CSP 1 Poole & Mackworth textbook: Sections § Lecturer: Alan Mackworth September 28, 2012.
Computer Science CPSC 502 Lecture 14 Markov Decision Processes (Ch. 9, up to 9.5.3)
CPSC 322, Lecture 32Slide 1 Probability and Time: Hidden Markov Models (HMMs) Computer Science cpsc322, Lecture 32 (Textbook Chpt 6.5.2) Nov, 25, 2013.
Lecture 33, Bayesian Networks Wrap Up Intro to Decision Theory Slide 1.
CPSC 322, Lecture 19Slide 1 (finish Planning) Propositional Logic Intro, Syntax Computer Science cpsc322, Lecture 19 (Textbook Chpt – 5.2) Oct,
CPSC 322, Lecture 4Slide 1 Search: Intro Computer Science cpsc322, Lecture 4 (Textbook Chpt ) Sept, 12, 2012.
CPSC 322, Lecture 2Slide 1 Representational Dimensions Computer Science cpsc322, Lecture 2 (Textbook Chpt1) Sept, 7, 2012.
CPSC 322, Lecture 16Slide 1 Stochastic Local Search Variants Computer Science cpsc322, Lecture 16 (Textbook Chpt 4.8) Oct, 11, 2013.
Decision Theory: Single & Sequential Decisions. VE for Decision Networks. CPSC 322 – Decision Theory 2 Textbook §9.2 April 1, 2011.
CPSC 322, Lecture 26Slide 1 Reasoning Under Uncertainty: Belief Networks Computer Science cpsc322, Lecture 27 (Textbook Chpt 6.3) Nov, 13, 2013.
Lecture 21, Bayesian Networks Wrap Up Intro to Decision Theory Slide 1.
Example: Localization for “Pushed around” Robot
Please Also Complete Teaching Evaluations
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 2
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Chapter 10 (part 3): Using Uncertain Knowledge
Reasoning Under Uncertainty: Conditioning, Bayes Rule & Chain Rule
Planning: Representation and Forward Search
Planning Under Uncertainty Computer Science cpsc322, Lecture 11
& Decision Theory: Intro
Decision Theory: Single Stage Decisions
Planning: Representation and Forward Search
Probability and Time: Markov Models
Probability and Time: Markov Models
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 2
Decision Theory: Single & Sequential Decisions.
Decision Theory: Sequential Decisions
Decision Theory: Single Stage Decisions
Probability and Time: Markov Models
Please Also Complete Teaching Evaluations
Probability and Time: Markov Models
Domain Splitting CPSC 322 – CSP 4 Textbook §4.6 February 4, 2011.
Planning: Representation and Forward Search
Presentation transcript:

Decision Theory: Single Stage Decisions Computer Science cpsc322, Lecture 33 (Textbook Chpt 9.2) March, 30, 2009

Lecture Overview Intro One-Off Decision Example Utilities / Preferences and optimal Decision Single stage Decision Networks

CPSC 322, Lecture 2Slide 3 Planning in Stochastic Environments Environment Problem Query Planning Deterministic Stochastic Search Arc Consistency Search Value Iteration Var. Elimination Constraint Satisfaction Logics STRIPS Belief Nets Vars + Constraints Decision Nets Markov Decision Processes Var. Elimination Static Sequential Representation Reasoning Technique SLS Markov Chains and HMMs

Planning Under Uncertainty: Intro Planning how to select and organize a sequence of actions/decisions to achieve a given goal. Deterministic Goal: A possible world in which some propositions are true Planning under Uncertainty: how to select and organize a sequence of actions/decisions to “maximize the probability” of achieving a given goal Goal under Uncertainty: we'll move from all-or- nothing goals to a richer notion: rating how happy the agent is in different possible worlds.

“Single” Action vs. Sequence of Actions Set of primitive decisions that can be treated as a single macro decision to be made before acting Agents makes observations Decides on an action Carries out the action

Lecture Overview Intro One-Off Decision Example Utilities / Preferences and Optimal Decision Single stage Decision Networks

Recap: One-off decision example Delivery Robot Example Robot needs to reach a certain room Going through stairs may cause an accident. It can go the short way through long stairs, or the long way through short stairs (that reduces the chance of an accident but takes more time) The Robot can choose to wear pads to protect itself or not (to protect itself in case of an accident) but pads slow it down If there is an accident the Robot does not get to the room

Decision Tree for Delivery Robot This scenario can be represented as the following decision tree The agent has a set of decisions to make (a macro-action it can perform) Decisions can influence random variables Decisions have probability distributions over outcomes Which way Accident long short true false true false

Decision Variables: Some general Considerations A possible world specifies a value for each random variable and each decision variable. For each assignment of values to all decision variables, the probabilities of the worlds satisfying that assignment sum to 1.

Lecture Overview Intro One-Off Decision Problems Utilities / Preferences and Optimal Decision Single stage Decision Networks

What are the optimal decisions for our Robot? It all depends on how happy the agent is in different situations. For sure getting to the room is better than not getting there….. but we need to consider other factors..

Utility / Preferences Utility: a measure of desirability of possible worlds to an agent Let U be a real-valued function such that U ( w ) represents an agent's degree of preference for world w. This would be a reasonable utility function for our Robot Which way Accident Wear PadsUtilityWorld short true true short false true long true true long false true short true false short false false long false false long true false w0, moderate damage w1, reaches room, quick, extra weight w2, moderate damage, low energy w3, reaches room, slow, extra weight w4, severe damage w5, reaches room, quick w6, severe damage, low energy w7, reaches room, slow

Utility: Simple Goals Can simple (boolean) goals still be specified? Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false

Optimal decisions: How to combine Utility with Probability What is the utility of achieving a certain probability distribution over possible worlds? It is its expected utility/value i.e., its average utility, weighting possible worlds by their probability

Optimal decision in one-off decisions Given a set of n decision variables var i (e.g., Wear Pads, Which Way), the agent can choose: D = d i for any d i  dom(var 1 ) x.. x dom(var n ). Wear Pads Which way true short true long false short false long

Optimal decision: Maximize Expected Utility The expected utility of decision D = d i is E (U | D = d i ) =  w╞ D = d i P( w | D = d i ) U( w ) e.g., E (U | D = {WP=, WW= })= An optimal decision is the decision D = d max whose expected utility is maximal: Wear Pads Which way true short true long false short false long

Lecture Overview Intro One-Off Decision Problems Utilities / Preferences and Optimal Decison Single stage Decision Networks

Single-stage decision networks Extend belief networks with: Decision nodes, that the agent chooses the value for. Drawn as rectangle. Utility node, the parents are the variables on which the utility depends. Drawn as a diamond. Shows explicitly which decision nodes affect random variables Which way Accident long short true false true false Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false

Finding the optimal decision: We can use VE Suppose the random variables are X 1, …, X n, the decision variables are the set D, and utility depends on pU⊆ {X 1, …, X n } ∪ D E (U |D ) = = To find the optimal decision we can use VE: 1. Create a factor for each conditional probability and for the utility 2. Multiply factors and sum out all of the random variables (This creates a factor on that gives the expected utility for each ) 3. Choose the D with the maximum value in the factor.

Example Initial Factors (Step1) Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false Which wayAccidentProbability long short true false true false

Example: Multiply Factors (Step 2a) Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false Which wayAccidentProbability long short true false true false Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false 30 *…………

Example: Sum out var s and choose max (Steps 2b-3) Which way Accident Wear PadsUtility long true true long true false long false true long false false short true true short true false short false true short false false 0.01* *0 0.99* *80 0.2*35 0.2*3 0.8*95 0.8*100 Which wayWear PadsExpected Utility long short true false true false 0.01* *75= *0+0.99*80= *35+0.8*95=83 0.2*3+0.8*100=80.6 Sum out accident: Thus the optimal policy is to take the short way and wear pads, with an expected utility of 83.

CPSC 322, Lecture 4Slide 23 Learning Goals for today’s class You can: Compare and contrast stochastic single-stage (one-off) decisions vs. multistage decisions Define a Utility Function on possible worlds Define and compute optimal one-off decision (max expected utility) Represent one-off decisions as single stage decision networks and compute optimal decisions by Variable Elimination

Next Class (textbook sec. 9.3) Set of primitive decisions that can be treated as a single macro decision to be made before acting Agents makes observations Decides on an action Carries out the action Sequential Decisions