Please Also Complete Teaching Evaluations

Presentation transcript:

Please Also Complete Teaching Evaluations
TA evaluations will close on Sun, June 25th.
TAs:
Dylan Dong <wdong@cs.ubc.ca>
Johnson, David <davewj@cs.ubc.ca>
Johnson, Jordon <jordon@cs.ubc.ca>
Kazemi, Seyed Mehran <smkazemi@cs.ubc.ca>
Rahman, MD Abed <abed90@cs.ubc.ca, abedrahman1990@gmail.com>
Wang, Wenyi <wenyi.wang@alumni.ubc.ca>

Decision Theory: Sequential Decisions. Computer Science cpsc322, Lecture 34 (Textbook Chpt 9.3). June 22, 2017.

“Single” Action vs. Sequence of Actions
One-off decisions: a set of primitive decisions that can be treated as a single macro decision to be made before acting.
Sequential decisions: the agent repeatedly makes observations, decides on an action, and carries out the action.

Lecture Overview
Sequential Decisions: Representation; Policies
Finding Optimal Policies

Sequential decision networks extend single-stage decision networks by allowing both chance nodes and decision nodes to be parents of decision nodes. Arcs coming into decision nodes represent the information that will be available when the decision is made. Arcs coming into chance nodes represent probabilistic dependence. Arcs coming into the utility node represent what the utility depends on.

Sequential Decision Problems
A sequential decision problem consists of a sequence of decision variables D1, ..., Dn. Each Di has an information set of variables pDi, whose values will be known at the time decision Di is made. (Simplest / degenerate case: the Weather - Forecast - Umbrella example below.)

Sequential Decisions: Simplest Possible
Only one decision! (but different from one-off decisions)
Early in the morning I listen to the weather forecast: shall I take my umbrella today? (I'll have to go for a long walk at noon.)
What is a reasonable decision network?
(Clicker question: candidate networks A, B, and C over the nodes Weather@12, Morning Forecast, Take Umbrella, and the utility node U, or "None of these".)
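For concreteness, here is a minimal sketch of how one reasonable structure for this network (consistent with the textbook's treatment of the umbrella example) could be written down; the dict layout and the domain values are illustrative assumptions, not from the slides.

```python
# A hypothetical encoding (not from the slides) of a reasonable decision
# network for the umbrella example: the decision observes only the forecast,
# while the utility depends on the actual weather and the action taken.
umbrella_network = {
    "chance": {
        "Weather@12":       {"parents": [],             "domain": ["rain", "noRain"]},
        "Morning Forecast": {"parents": ["Weather@12"], "domain": ["sunny", "cloudy", "rainy"]},
    },
    "decision": {
        "Take Umbrella":    {"parents": ["Morning Forecast"], "domain": ["take", "leave"]},
    },
    "utility": {
        "U": {"parents": ["Weather@12", "Take Umbrella"]},
    },
}
```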

Sequential Decisions: Simplest Possible
Only one decision! (but different from one-off decisions)
Early in the morning: shall I take my umbrella today? (I'll have to go for a long walk at noon.)
Relevant random variables?

Policies for Sequential Decision Problems: Intro
A policy specifies what an agent should do under each circumstance (for each decision, consider the parents of the decision node).
In the umbrella "degenerate" case there is a single decision D1 with information set pD1: one possible policy maps each observed value of pD1 (the forecast) to a value of D1 (take or leave the umbrella). How many policies are there?

Sequential Decision Problems: “Complete” Example
A sequential decision problem consists of a sequence of decision variables D1, ..., Dn. Each Di has an information set of variables pDi, whose values will be known at the time decision Di is made.
What should an agent do? What an agent should do at any time depends on what it will do in the future, and what an agent does in the future depends on what it did before.
No-forgetting decision network: the decisions are totally ordered, and if a decision Db comes before Da, then Db is a parent of Da and any parent of Db is a parent of Da (a small check of this property is sketched below).
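To make the no-forgetting condition concrete, here is a small sketch that checks the two conditions above for decisions given in their total order; representing parents as a plain dict is an assumption for the sketch, not from the slides.

```python
# A minimal sketch: check the no-forgetting property given decisions in their
# total order and a map from each decision to its parents.
def is_no_forgetting(ordered_decisions, parents):
    for i, db in enumerate(ordered_decisions):
        for da in ordered_decisions[i + 1:]:
            # An earlier decision must itself be a parent of every later decision...
            if db not in parents[da]:
                return False
            # ...and everything observed for the earlier decision must still be observed later.
            if not set(parents[db]).issubset(parents[da]):
                return False
    return True

# Fire-alarm example (parents as in the lecture's decision network):
parents = {
    "CheckSmoke": ["Report"],
    "Call": ["Report", "CheckSmoke", "SeeSmoke"],
}
print(is_no_forgetting(["CheckSmoke", "Call"], parents))  # True
```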

Policies for Sequential Decision Problems
A policy is a sequence δ1, ..., δn of decision functions
δi : dom(pDi) → dom(Di)
This means that when the agent has observed O ∈ dom(pDi), it will do δi(O).
Example: a decision function for Call is a table over its parents Report, CheckSmoke, and SeeSmoke that specifies a value of Call for each combination of parent values. (CheckSmoke = false together with SeeSmoke = true is impossible, so that entry will never occur.)
How many policies?
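As a sketch of this definition, a policy can be represented as one decision function per decision variable; the particular choices below (check smoke when there is a report, call when smoke is seen) are illustrative, not taken from the slides.

```python
# A hypothetical policy for the fire-alarm network: one decision function per
# decision variable, each mapping an observed assignment of the decision's
# parents to a chosen value.
policy = {
    # delta_CheckSmoke : dom(Report) -> dom(CheckSmoke)
    "CheckSmoke": lambda obs: obs["Report"],   # check smoke iff there is a report
    # delta_Call : dom(Report, CheckSmoke, SeeSmoke) -> dom(Call)
    "Call": lambda obs: obs["SeeSmoke"],       # call iff smoke was seen (one illustrative choice)
}

print(policy["CheckSmoke"]({"Report": True}))                                    # True
print(policy["Call"]({"Report": True, "CheckSmoke": True, "SeeSmoke": False}))   # False
```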

Lecture Overview
Recap Sequential Decisions
Finding Optimal Policies

When Does a Possible World Satisfy a Policy?
A possible world specifies a value for each random variable and each decision variable.
Possible world w satisfies policy δ, written w ⊨ δ, if the value of each decision variable is the value selected by its decision function in the policy (when applied in w).
Let's parse this together! The slide shows the decision functions for CheckSmoke (over its parent Report) and for Call (over its parents Report, CheckSmoke, and SeeSmoke), together with one possible world over the variables Fire, Tampering, Alarm, Leaving, Report, Smoke, SeeSmoke, CheckSmoke, and Call.

When Does a Possible World Satisfy a Policy?
Possible world w satisfies policy δ, written w ⊨ δ, if the value of each decision variable is the value selected by its decision function in the policy (when applied in w).
Let's parse this together! Given the world and the decision functions for CheckSmoke and Call shown on the slide, does w satisfy the policy? A. B. C. Cannot tell.

Expected Value of a Policy
Each possible world w has a probability P(w) and a utility U(w).
The expected utility of policy δ is the sum, over all possible worlds w that satisfy δ, of P(w) U(w):
E(δ) = Σ_{w ⊨ δ} P(w) U(w)
The optimal policy is the one with the highest expected utility.
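The definitions on this slide and the previous one translate almost directly into code. This is a sketch under the assumptions that possible worlds are dicts assigning a value to every variable, that decision functions read only their parents' values from such a dict, and that P and U are given as functions.

```python
# A minimal sketch of the two definitions above: a world satisfies a policy if
# every decision variable takes the value its decision function selects, and
# the expected utility of a policy sums P(w) * U(w) over exactly those worlds.
def satisfies(world, policy):
    # policy: dict mapping decision name -> decision function over the world dict
    return all(world[d] == delta(world) for d, delta in policy.items())

def expected_utility(worlds, policy, P, U):
    # worlds: list of dicts assigning a value to every random and decision variable
    return sum(P(w) * U(w) for w in worlds if satisfies(w, policy))
```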

Lecture Overview
Recap Sequential Decisions
Finding Optimal Policies (Efficiently)

Complexity of Finding the Optimal Policy: How Many Policies?
If a decision D has k binary parents, how many assignments of values to the parents are there? 2^k
If there are b possible actions (possible values for D), how many different decision functions are there? b^(2^k)
If there are d decisions, each with k binary parents and b possible actions, how many policies are there? (b^(2^k))^d
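A quick sanity check of these counts, with illustrative values of k, b, and d (not from the slides):

```python
# Counting assignments, decision functions, and policies for small numbers.
k = 2   # binary parents per decision
b = 2   # possible actions per decision
d = 3   # number of decisions

assignments_per_decision = 2 ** k        # assignments to a decision's parents
decision_functions = b ** (2 ** k)       # decision functions for one decision
policies = (b ** (2 ** k)) ** d          # policies for the whole network

print(assignments_per_decision, decision_functions, policies)  # 4 16 4096
```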

Finding the Optimal Policy More Efficiently: VE
1. Create a factor for each conditional probability table and a factor for the utility.
2. Sum out random variables that are not parents of a decision node.
3. Eliminate the decision variables (by maximizing; details on the next slide).
4. Sum out the remaining random variables.
5. Multiply the factors: this is the expected utility of the optimal policy.

Finally, there are a few restrictions on decision networks. First, there may be only one utility node. Also, the utility of that node and the probabilities of any nodes which have decision nodes as parents are undefined until the decision functions for their parents are defined. A final restriction is that a decision may not be optimized if one of its parents has an observed value: a decision function defines the agent's actions for every context, so if some decisions are observed or have observed parents, they cannot be optimized, because no decision function can be found for a context that is already pinned down by a parent's observed value.
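As an illustration of these five steps, here is a compact run of VE on the single-decision umbrella network; the probabilities and utilities below are assumed numbers for the sketch (not taken from the slides), and factors are simply dicts from assignment tuples to numbers.

```python
# Illustrative VE run for the umbrella network: Weather -> Forecast,
# decision Umbrella observes Forecast, utility depends on Weather and Umbrella.
from itertools import product

# Step 1: factors for P(Weather), P(Forecast | Weather), and U(Weather, Umbrella).
p_weather = {("rain",): 0.3, ("norain",): 0.7}
p_forecast = {  # (Weather, Forecast) -> probability
    ("norain", "sunny"): 0.7, ("norain", "cloudy"): 0.2, ("norain", "rainy"): 0.1,
    ("rain", "sunny"): 0.15, ("rain", "cloudy"): 0.25, ("rain", "rainy"): 0.6,
}
utility = {  # (Weather, Umbrella) -> utility
    ("norain", "take"): 20, ("norain", "leave"): 100,
    ("rain", "take"): 70, ("rain", "leave"): 0,
}

# Step 2: sum out Weather (not a parent of the decision), multiplying the
# factors that mention it; this leaves a factor on (Forecast, Umbrella).
f_forecast_umbrella = {}
for forecast, umbrella in product(["sunny", "cloudy", "rainy"], ["take", "leave"]):
    f_forecast_umbrella[(forecast, umbrella)] = sum(
        p_weather[(w,)] * p_forecast[(w, forecast)] * utility[(w, umbrella)]
        for w in ["rain", "norain"])

# Step 3: eliminate the decision Umbrella by maximizing, recording its decision function.
decision_function, f_forecast = {}, {}
for forecast in ["sunny", "cloudy", "rainy"]:
    best = max(["take", "leave"], key=lambda a: f_forecast_umbrella[(forecast, a)])
    decision_function[forecast] = best
    f_forecast[(forecast,)] = f_forecast_umbrella[(forecast, best)]

# Step 4: sum out the remaining random variable (Forecast). With a single
# factor left, step 5's "multiply the factors" is trivial, and the result is
# the expected utility of the optimal policy.
print(decision_function, sum(f_forecast.values()))
```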

Eliminate the Decision Variables: Step 3 Details
Select a variable D that corresponds to the latest decision to be made; this variable will appear in only one factor with its parents. Eliminate D by maximizing. This returns:
a new factor to use in VE, max_D f
the optimal decision function for D, arg max_D f
Repeat till there are no more decision nodes.

Example: eliminate CheckSmoke from the factor

Report   CheckSmoke   Value
true     true         -5.0
true     false        -5.6
false    true         -23.7
false    false        -17.5

New factor (maximizing over CheckSmoke):

Report   Value
true     -5.0
false    -17.5

Decision function for CheckSmoke:

Report   CheckSmoke
true     true
false    false

Why does the latest decision appear in only one factor with its parents? Because of the no-forgetting assumption, the parents of the latest decision include all the parents of the other decisions, and therefore all the variables that may influence the utility factor (the one and only factor that will contain the latest decision). Also, we have already eliminated all the random variables that are not parents of a decision variable (i.e., variables that could be "downstream" of the latest decision).
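The maximization step can be sketched directly from the factor over Report and CheckSmoke above; maximizing over CheckSmoke yields both the new factor (on Report alone) and the decision function for CheckSmoke.

```python
# Max out CheckSmoke from the factor on (Report, CheckSmoke) shown above.
factor = {  # (Report, CheckSmoke) -> value
    (True, True): -5.0, (True, False): -5.6,
    (False, True): -23.7, (False, False): -17.5,
}

new_factor, decision_function = {}, {}
for report in (True, False):
    best_action = max((True, False), key=lambda cs: factor[(report, cs)])
    decision_function[report] = best_action              # arg max_CheckSmoke f
    new_factor[report] = factor[(report, best_action)]   # max_CheckSmoke f

print(new_factor)         # {True: -5.0, False: -17.5}
print(decision_function)  # {True: True, False: False}
```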

VE Reduces the Complexity of Finding the Optimal Policy
We have seen that if a decision D has k binary parents and b possible actions, and there are d decisions, then there are (b^(2^k))^d policies.
Doing variable elimination lets us find the optimal policy after considering only d · b^(2^k) policies (we eliminate one decision at a time).
VE is much more efficient than searching through policy space. However, this complexity is still doubly exponential, so we'll only be able to handle relatively small problems.
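With the same illustrative numbers as before (k = 2, b = 2, d = 3, not from the slides), the saving looks like this:

```python
# Naive policy-space search vs. the number of policies effectively considered by VE.
k, b, d = 2, 2, 3
print((b ** (2 ** k)) ** d)   # 4096 policies to search naively
print(d * b ** (2 ** k))      # 48 policies effectively considered with VE
```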

Learning Goals for Today's Class
You can:
Represent sequential decision problems as decision networks, and explain the no-forgetting property.
Verify whether a possible world satisfies a policy, and define the expected value of a policy.
Compute the number of policies for a decision problem.
Compute the optimal policy by variable elimination.

Big Picture: Planning under Uncertainty
Probability Theory + Decision Theory → One-Off Decisions / Sequential Decisions → Markov Decision Processes (MDPs): Fully Observable MDPs and Partially Observable MDPs (POMDPs).
Applications: Decision Support Systems (medicine, business, ...), Economics, Control Systems, Robotics.

CPSC 322 Big Picture
The course's big-picture table classifies problems by environment (deterministic vs. stochastic) and by whether they are static or sequential; each cell is an R&R (representation and reasoning) system.
Static, constraint satisfaction: variables + constraints, solved by search, arc consistency, and SLS.
Static, query: logics with search (deterministic); belief nets and Markov chains with variable elimination (stochastic).
Sequential, planning: STRIPS actions (preconditions and effects) with search (deterministic); decision nets with variable elimination (stochastic).

422 Big Picture: Breadth and Depth
Deterministic: logics (full resolution, SAT), first-order logics, ontologies; query and planning.
Stochastic: belief nets (approximate inference, e.g., Gibbs sampling); Markov chains and HMMs (forward, Viterbi, approximate particle filtering); undirected graphical models (Markov networks, conditional random fields); Markov decision processes and partially observable MDPs (value iteration); reinforcement learning.
Hybrid (deterministic + stochastic): probabilistic CFGs, probabilistic relational models, Markov logics (StarAI, statistical relational AI).
Applications of AI.

More AI ...
Where are the components of our representations coming from? The probabilities? The utilities? The logical formulas?
From people and from data! Machine learning, knowledge acquisition, and preference elicitation supply these components for the R&R systems above (logics and first-order logics, ontologies, belief nets, Markov chains and HMMs, undirected graphical models, Markov decision processes and POMDPs, planning, reinforcement learning, and StarAI models such as probabilistic CFGs, probabilistic relational models, and Markov logics).

Some of Our Grad Courses
522: Artificial Intelligence II: Reasoning and Acting Under Uncertainty
Sample advanced topics:
Relational Reinforcement Learning for Agents in Worlds with Objects (relational learning)
Probabilistic Relational Learning and Inductive Logic Programming at a Global Scale

Some of Our Grad Courses
503: Computational Linguistics I / Natural Language Processing
Sample advanced topics:
Topic Modeling (LDA) and Large-Scale Graphical Models
Discourse Parsing by Deep Learning (Neural Nets)
Abstractive Summarization

Other AI Grad Courses: Check Them Out
532: Topics in Artificial Intelligence (different offerings), e.g., User-Adaptive Systems and Intelligent Learning Environments; Foundations of Multiagent Systems
540: Machine Learning
505: Image Understanding I: Image Analysis
525: Image Understanding II: Scene Analysis
515: Computational Robotics

Announcements
Assignment 4 is due Sunday, June 25th @ 11:59 pm. Late submissions will not be accepted, and late days may not be used.
FINAL EXAM: Thu, Jun 29, 7-9:30 PM. Room: BUCH A101.
The final will comprise 10-15 short questions + 3-4 problems.
Work on all practice exercises (including 9.B) and the sample review questions and problems (will be posted over the weekend). While you revise the learning goals, work on the review questions - I may even reuse some verbatim.
Come to the remaining office hours! (The schedule for next week will be posted on Piazza.) My office hour tomorrow will be at 2 PM.