CS 188: Artificial Intelligence Fall 2009
Lecture 18: Decision Diagrams
10/29/2009
Dan Klein – UC Berkeley
Announcements
 Mid-Semester Evaluations
   Link is in your email
 Assignments
   W3 out later this week
   P4 out next week
 Contest!

Decision Networks
 MEU: choose the action which maximizes the expected utility given the evidence
 Decision networks operationalize this directly
   Bayes nets with added nodes for utility and actions
   Let us calculate the expected utility of each action
 New node types:
   Chance nodes (just like BNs)
   Action nodes (rectangles, cannot have parents, act as observed evidence)
   Utility nodes (diamonds, depend on action and chance nodes)
(Diagram: chance nodes Weather and Forecast, action node Umbrella, utility node U)
[DEMO: Ghostbusters]

Decision Networks
 Action selection:
   Instantiate all evidence
   Set the action node(s) each possible way
   Calculate the posterior over the parents of the utility node, given the evidence
   Calculate the expected utility for each action
   Choose the maximizing action
(Diagram: Weather, Forecast, Umbrella, U)

Example: Decision Networks

  W     P(W)        A      W     U(A,W)
  sun   0.7         leave  sun   100
  rain  0.3         leave  rain    0
                    take   sun    20
                    take   rain   70

EU(Umbrella = leave) = 0.7 · 100 + 0.3 · 0 = 70
EU(Umbrella = take) = 0.7 · 20 + 0.3 · 70 = 35
Optimal decision = leave
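
A minimal sketch of this computation in Python, with the two tables above as dictionaries (the variable and function names are ours, not the lecture's):

    # Expected-utility action selection for the umbrella network above.
    P_W = {"sun": 0.7, "rain": 0.3}                     # prior over Weather
    U = {("leave", "sun"): 100, ("leave", "rain"): 0,   # utility table U(A, W)
         ("take", "sun"): 20, ("take", "rain"): 70}

    def expected_utility(action, belief):
        # EU(a) = sum_w P(w) * U(a, w)
        return sum(p * U[(action, w)] for w, p in belief.items())

    for a in ("leave", "take"):
        print(a, expected_utility(a, P_W))              # leave 70.0, take 35.0
    best = max(("leave", "take"), key=lambda a: expected_utility(a, P_W))
    print("MEU action:", best)                          # leave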

Decisions as Outcome Trees
 Almost exactly like expectimax / MDPs
 What's changed?
(Diagram: a max node over actions take/leave given no evidence {}; each action leads to a chance node over Weather with branches sun/rain ending in leaves U(t,s), U(t,r), U(l,s), U(l,r))

Evidence in Decision Networks
 Find P(W | F = bad)
 Select for evidence
 First we join P(W) and P(F = bad | W), then we normalize

  W     P(W)        F     P(F | sun)      F     P(F | rain)
  sun   0.7         good  0.8             good  0.1
  rain  0.3         bad   0.2             bad   0.9

Join:
  W     P(W)   P(F=bad | W)   P(W, F=bad)
  sun   0.7    0.2            0.14
  rain  0.3    0.9            0.27

Normalize (divide by 0.14 + 0.27 = 0.41):
  W     P(W | F=bad)
  sun   0.34
  rain  0.66

(Diagram: Weather, Forecast, Umbrella, U)
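
The same join-and-normalize step as a short Python sketch, reusing the tables above:

    # Inference for P(W | F = bad): join P(W) with P(F | W), then normalize.
    P_W = {"sun": 0.7, "rain": 0.3}
    P_F_given_W = {"sun": {"good": 0.8, "bad": 0.2},
                   "rain": {"good": 0.1, "bad": 0.9}}

    def posterior(f):
        joint = {w: P_W[w] * P_F_given_W[w][f] for w in P_W}   # join
        z = sum(joint.values())                                # normalizer
        return {w: p / z for w, p in joint.items()}            # normalize

    print(posterior("bad"))   # {'sun': 0.341..., 'rain': 0.658...}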

Example: Decision Networks

Evidence: Forecast = bad

  W     P(W | F=bad)     A      W     U(A,W)
  sun   0.34             leave  sun   100
  rain  0.66             leave  rain    0
                         take   sun    20
                         take   rain   70

EU(Umbrella = leave | F=bad) = 0.34 · 100 + 0.66 · 0 = 34
EU(Umbrella = take | F=bad) = 0.34 · 20 + 0.66 · 70 = 53
Optimal decision = take
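
Chaining the two sketches above gives action selection under evidence (illustrative, not the course's code):

    # Action selection given F = bad: posterior from the inference sketch,
    # then expected utilities as before.
    P_W_given_bad = {"sun": 0.34, "rain": 0.66}
    U = {("leave", "sun"): 100, ("leave", "rain"): 0,
         ("take", "sun"): 20, ("take", "rain"): 70}

    eu = {a: sum(p * U[(a, w)] for w, p in P_W_given_bad.items())
          for a in ("leave", "take")}
    print(eu)                    # {'leave': 34.0, 'take': 53.0}
    print(max(eu, key=eu.get))   # take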

Decisions as Outcome Trees
(Diagram: the same tree, but the chance nodes are now conditioned on the evidence {b}: after take or leave, the branches follow W | {b} to leaves U(t,s), U(t,r), U(l,s), U(l,r))

Conditioning on Action Nodes
 An action node can be a parent of a chance node
   The chance node then conditions on the outcome of the action
 Action nodes are like observed variables in a Bayes' net, except we max over their values
(Diagram: action A and state S are parents of next state S' via the transition model T(s,a,s'); the utility node U carries the reward R(s,a,s'))

Value of Information
 Idea: compute the value of acquiring evidence
   Can be done directly from the decision network
 Example: buying oil drilling rights
   Two blocks, A and B; exactly one has oil, worth k
   You can drill in one location
   Prior probabilities 0.5 each, mutually exclusive
   Drilling in either A or B has MEU = k/2
 Question: what's the value of information?
   Value of knowing which of A or B has oil
   Value is the expected gain in MEU from the new information
   The survey may say "oil in A" or "oil in B", with probability 0.5 each
   If we know OilLoc, MEU is k (either way)
   Gain in MEU from knowing OilLoc: VPI(OilLoc) = k - k/2 = k/2
   Fair price of the information: k/2

  O   P(O)       D  O  U(D,O)
  a   1/2        a  a  k
  b   1/2        a  b  0
                 b  a  0
                 b  b  k

(Diagram: OilLoc and DrillLoc are parents of U)
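
A quick numeric check of this example, with k set to 1 for concreteness (a sketch, not the lecture's code):

    # VPI check for the oil example, taking k = 1.
    k = 1.0
    P_O = {"a": 0.5, "b": 0.5}
    U = {("a", "a"): k, ("a", "b"): 0.0, ("b", "a"): 0.0, ("b", "b"): k}

    meu_now = max(sum(P_O[o] * U[(d, o)] for o in P_O) for d in ("a", "b"))
    # If the survey reveals OilLoc = o, drill there and collect k; average over o.
    meu_after = sum(P_O[o] * max(U[(d, o)] for d in ("a", "b")) for o in P_O)
    print(meu_now, meu_after, meu_after - meu_now)   # 0.5 1.0 0.5 -> VPI = k/2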

Value of Information
 Assume we have evidence E = e. Value if we act now:
   MEU(e) = max_a Σ_s P(s | e) U(s, a)
 Assume we see that E' = e'. Value if we act then:
   MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)
 BUT E' is a random variable whose value is unknown, so we don't know what e' will be
 Expected value if E' is revealed and then we act:
   MEU(e, E') = Σ_{e'} P(e' | e) MEU(e, e')
 Value of information: how much MEU goes up by revealing E' first:
   VPI(E' | e) = MEU(e, E') - MEU(e)
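
These formulas translate directly into code. Below is a generic sketch for a one-shot decision with a single state variable S and a candidate observation E'; the function names and table format are our assumptions:

    # Generic MEU and VPI for a one-shot decision network.
    # P_S: current belief P(s | e); P_E_given_S: sensor model P(e' | s);
    # U: utility table keyed by (action, state).
    def meu(P_S, U, actions):
        return max(sum(P_S[s] * U[(a, s)] for s in P_S) for a in actions)

    def vpi(P_S, P_E_given_S, U, actions):
        observations = {e for dist in P_E_given_S.values() for e in dist}
        value_with_info = 0.0
        for e in observations:
            joint = {s: P_S[s] * P_E_given_S[s][e] for s in P_S}  # join
            p_e = sum(joint.values())                             # P(e' | e)
            post = {s: p / p_e for s, p in joint.items()}         # P(s | e, e')
            value_with_info += p_e * meu(post, U, actions)
        return value_with_info - meu(P_S, U, actions)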

VPI Example: Weather

  A      W     U(A,W)       Forecast distribution:
  leave  sun   100            F     P(F)
  leave  rain    0            good  0.59
  take   sun    20            bad   0.41
  take   rain   70

MEU with no evidence: 70 (leave)
MEU if forecast is bad: 53 (take, using P(sun | F=bad) = 0.34)
MEU if forecast is good: ≈ 95 (leave, using P(sun | F=good) = 0.56/0.59 ≈ 0.95)
VPI(Forecast) = 0.59 · 95 + 0.41 · 53 - 70 ≈ 7.8

[Demo]
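
Feeding the weather network to the meu()/vpi() sketch above reproduces these numbers:

    # Weather example run through the meu()/vpi() sketch above.
    P_W = {"sun": 0.7, "rain": 0.3}
    P_F_given_W = {"sun": {"good": 0.8, "bad": 0.2},
                   "rain": {"good": 0.1, "bad": 0.9}}
    U = {("leave", "sun"): 100, ("leave", "rain"): 0,
         ("take", "sun"): 20, ("take", "rain"): 70}

    print(meu(P_W, U, ("leave", "take")))               # 70.0
    print(vpi(P_W, P_F_given_W, U, ("leave", "take")))  # ~7.7 (7.8 after rounding)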

VPI Properties
 Nonnegative: VPI(E' | e) ≥ 0 (in expectation, information cannot hurt)
 Nonadditive: VPI(E_j, E_k | e) ≠ VPI(E_j | e) + VPI(E_k | e) in general; consider, e.g., obtaining E_j twice
 Order-independent: VPI(E_j, E_k | e) = VPI(E_j | e) + VPI(E_k | e, e_j) = VPI(E_k | e) + VPI(E_j | e, e_k)

Quick VPI Questions
 The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
 There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?
 You have $10 to bet double-or-nothing and there is a 75% chance that Berkeley will beat Stanford. What's the value of knowing the outcome in advance?
 You must bet on Cal, either way. What's the value now?

Reasoning over Time
 Often, we want to reason about a sequence of observations
   Speech recognition
   Robot localization
   User attention
   Medical monitoring
 Need to introduce time into our models
 Basic approach: hidden Markov models (HMMs)
 More general: dynamic Bayes' nets

Markov Models
 A Markov model is a chain-structured BN
   Each node is identically distributed (stationarity)
   The value of X at a given time is called the state
 As a BN: X1 → X2 → X3 → X4 → ...
 Parameters: called transition probabilities or dynamics; they specify how the state evolves over time (plus the initial state probabilities)
[DEMO: Ghostbusters]

Conditional Independence
 Basic conditional independence:
   Past and future are independent given the present
   Each time step only depends on the previous
   This is called the (first order) Markov property
 Note that the chain is just a (growing) BN
   We can always use generic BN reasoning on it if we truncate the chain at a fixed length
(Diagram: X1 → X2 → X3 → X4)

Example: Markov Chain
 Weather:
   States: X = {rain, sun}
   Transitions: shown as a two-state diagram between rain and sun (this is a CPT, not a BN!)
   Initial distribution: 1.0 sun
 What's the probability distribution after one step?

Mini-Forward Algorithm
 Question: what's the probability of being in state x at time t?
 Slow answer:
   Enumerate all sequences of length t which end in x
   Add up their probabilities:
   P(X_t = x) = Σ_{x_1, ..., x_{t-1}} P(x_1, ..., x_{t-1}, x)

Mini-Forward Algorithm
 Better way: cached, incremental belief updates
   P(x_t) = Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1})
 An instance of variable elimination!
(Diagram: forward simulation, tracking the sun/rain belief at each time step)
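
A minimal sketch of this update; the transition numbers below are illustrative assumptions, since the slide's diagram values are not in this transcript:

    # Mini-forward algorithm: P(x_t) = sum_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1}).
    # Transition probabilities are assumed for illustration only.
    T = {"sun": {"sun": 0.9, "rain": 0.1},
         "rain": {"sun": 0.3, "rain": 0.7}}

    def forward_step(belief):
        return {x: sum(T[xp][x] * belief[xp] for xp in belief) for x in belief}

    belief = {"sun": 1.0, "rain": 0.0}   # initial distribution: 1.0 sun
    for t in range(1, 4):
        belief = forward_step(belief)
        print(t, belief)                 # t=1: {'sun': 0.9, 'rain': 0.1}, ...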

Example
 From an initial observation of sun: P(X_1), P(X_2), P(X_3), ..., P(X_∞)
 From an initial observation of rain: P(X_1), P(X_2), P(X_3), ..., P(X_∞)
 (The slide's tables show the beliefs converging to the same distribution from either start)

Stationary Distributions
 If we simulate the chain long enough:
   What happens?
   Uncertainty accumulates
   Eventually, we have no idea what the state is!
 Stationary distributions:
   For most chains, the distribution we end up in is independent of the initial distribution
   It is called the stationary distribution of the chain
 Usually, we can only predict a short time out
[DEMO: Ghostbusters]
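
Iterating the forward_step sketch above from two different priors shows this convergence (still with the assumed transition numbers):

    # Both priors converge to the same stationary distribution of T.
    for start in ({"sun": 1.0, "rain": 0.0}, {"sun": 0.0, "rain": 1.0}):
        b = dict(start)
        for _ in range(50):
            b = forward_step(b)
        print(start, "->", b)   # both approach {'sun': 0.75, 'rain': 0.25}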

Web Link Analysis
 PageRank over a web graph
   Each web page is a state
   Initial distribution: uniform over pages
   Transitions:
     With prob. c, jump uniformly to a random page (dotted lines)
     With prob. 1-c, follow a random outlink (solid lines)
 Stationary distribution
   Will spend more time on highly reachable pages
   E.g. there are many ways to get to the Acrobat Reader download page!
   Somewhat robust to link spam
   Google 1.0 returned the set of pages containing all your keywords, ordered by rank; now all search engines use link analysis along with many other factors
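
A sketch of PageRank as power iteration on this chain; the three-page link graph is made up for illustration:

    # PageRank: stationary distribution of the "random surfer" Markov chain.
    # The tiny link graph here is a hypothetical example, not from the lecture.
    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    c = 0.15                                      # probability of a uniform jump
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}   # initial distribution: uniform

    for _ in range(100):                          # iterate to convergence
        new = {p: c / len(pages) for p in pages}  # uniform-jump mass
        for p in pages:
            for q in links[p]:                    # follow a random outlink of p
                new[q] += (1 - c) * rank[p] / len(links[p])
        rank = new
    print(rank)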

Value of (Perfect) Information
 Current evidence E = e; utility depends on S = s
 Potential new evidence E': suppose we knew E' = e'
 BUT E' is a random variable whose value is currently unknown, so:
   We must compute the expected gain over all possible values e'
 (VPI = value of perfect information)

VPI Example: Ghostbusters
 Reminder: the ghost is hidden, the sensors are noisy
 T: top square is red; B: bottom square is red; G: ghost is in the top
 Sensor model:
   P(+t | +g) = 0.8   P(+t | -g) = 0.4
   P(+b | +g) = 0.4   P(+b | -g) = 0.8

Joint distribution P(T, B, G):
  T   B   G   P(T,B,G)
  +t  +b  +g  0.16
  +t  +b  -g  0.16
  +t  -b  +g  0.24
  +t  -b  -g  0.04
  -t  +b  +g  0.04
  -t  +b  -g  0.24
  -t  -b  +g  0.06
  -t  -b  -g  0.06

[Demo]

VPI Example: Ghostbusters
 Utility of a bust is 2, of no bust is 0
 Q1: What's the value of knowing T if I know nothing?
   Q1': E_{P(T)} [MEU(t) - MEU()]
 Q2: What's the value of knowing B if I already know that T is true (red)?
   Q2': E_{P(B|t)} [MEU(t, b) - MEU(t)]
 How low can the value of information ever be?
(Joint distribution P(T, B, G) as above)
[Demo]
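
A sketch of Q1 under a natural reading of the action model: you bust either the top or the bottom square and get utility 2 if the ghost is there, 0 otherwise. That action model is our assumption; the demo's exact rules are not on the slide:

    # VPI of observing T in the Ghostbusters example.
    # Assumed actions: bust top or bottom; utility 2 if the ghost is there.
    P = {("+t", "+b", "+g"): 0.16, ("+t", "+b", "-g"): 0.16,
         ("+t", "-b", "+g"): 0.24, ("+t", "-b", "-g"): 0.04,
         ("-t", "+b", "+g"): 0.04, ("-t", "+b", "-g"): 0.24,
         ("-t", "-b", "+g"): 0.06, ("-t", "-b", "-g"): 0.06}

    def meu(evidence):
        # Best bust location given observed values for T and/or B.
        rows = {k: v for k, v in P.items() if all(e in k for e in evidence)}
        z = sum(rows.values())
        p_top = sum(v for k, v in rows.items() if k[2] == "+g") / z
        return max(2 * p_top, 2 * (1 - p_top))

    p_t = sum(v for k, v in P.items() if k[0] == "+t")            # P(+t) = 0.6
    print(p_t * meu(["+t"]) + (1 - p_t) * meu(["-t"]) - meu([]))  # VPI(T) = 0.4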