CSC 384 Lecture Slides (c) 2002-2003, C. Boutilier and P. Poupart

CSC384: Lecture 25
- Last time: decision trees and decision networks
- Today: wrap up

Policies
- A policy is associated with every decision node; it tells what to do given the values of that node's parents.
- A policy for BT might be:
  - $\delta_{BT}(c, f) = bt$
  - $\delta_{BT}(c, {\sim}f) = {\sim}bt$
  - $\delta_{BT}({\sim}c, f) = bt$
  - $\delta_{BT}({\sim}c, {\sim}f) = {\sim}bt$

[Figure: Chills and Fever are the parents of the decision node BloodTst (BT)]
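A policy of this kind is just a lookup table from parent settings to decisions. The sketch below encodes the BT policy from the slide as a Python dictionary; the names `delta_BT` and `act` are illustrative and do not come from the lecture.

```python
# Policy for the BloodTst (BT) decision node, keyed by the values of its
# parents (Chills, Fever). True stands for c / f, False for ~c / ~f.
delta_BT = {
    (True, True): "bt",
    (True, False): "~bt",
    (False, True): "bt",
    (False, False): "~bt",
}


def act(policy, chills, fever):
    """Return the decision the policy prescribes for this parent setting."""
    return policy[(chills, fever)]


# Example: no chills but a fever, so this policy orders the blood test.
decision = act(delta_BT, chills=False, fever=True)
```

Because the policy has exactly one entry per setting of the parents, a decision node with k binary parents needs a table with 2^k rows.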

Value of a Policy
- The value of a policy $\delta$ is the expected utility obtained when the decision nodes are set according to $\delta$.
- Once all of the chance nodes are set, the policy $\delta$ fixes the decision nodes, and the utility node then gives the utility obtained.
- Every setting of the chance nodes has a certain probability, given by the CPTs in the decision net.
- Thus we can compute the expected value of a policy:

$$EU(\delta) = \sum_{X} P(X, \delta(X))\, U(X, \delta(X))$$

Here $P(X, \delta(X))$ is the probability of this setting $X$ of the chance nodes, and $U(X, \delta(X))$ is the utility obtained.

Value of a Policy

$$EU(\delta) = \sum_{X} P(X, \delta(X))\, U(X, \delta(X))$$

- Note that one needs the values of the decision nodes to compute the probability of a setting of the chance nodes.
- Similarly, the utility may depend on the settings of both types of nodes.

[Figure: decision network with nodes Disease, TstResult, BloodTst, Drug, and U]
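As a minimal sketch of the sum above, the following code evaluates a fixed policy by enumerating the chance nodes of a network shaped like the Disease / TstResult / BloodTst / Drug figure. Every number, the assumption that Drug observes both BloodTst and TstResult, and the names `P_disease`, `P_result`, `utility`, and `expected_utility` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical CPTs (not from the lecture).
P_disease = {"d": 0.3, "~d": 0.7}
P_result = {  # P(TstResult | Disease, BloodTst)
    ("d", "bt"): {"pos": 0.9, "neg": 0.1, "none": 0.0},
    ("~d", "bt"): {"pos": 0.2, "neg": 0.8, "none": 0.0},
    ("d", "~bt"): {"pos": 0.0, "neg": 0.0, "none": 1.0},
    ("~d", "~bt"): {"pos": 0.0, "neg": 0.0, "none": 1.0},
}


def utility(disease, drug, blood_test):
    """Hypothetical utility; taking the blood test costs 5."""
    base = {
        ("d", "drug"): 80, ("d", "~drug"): 0,
        ("~d", "drug"): 60, ("~d", "~drug"): 100,
    }[(disease, drug)]
    return base - (5 if blood_test == "bt" else 0)


def expected_utility(delta_bt, delta_drug):
    """EU(delta): sum over chance-node settings of P(X, delta(X)) * U(X, delta(X))."""
    total = 0.0
    for disease, result in product(P_disease, ("pos", "neg", "none")):
        bt = delta_bt                    # BloodTst has no parents in this sketch
        drug = delta_drug[(bt, result)]  # assumed: Drug sees BloodTst and TstResult
        prob = P_disease[disease] * P_result[(disease, bt)][result]
        total += prob * utility(disease, drug, bt)
    return total


# A policy for Drug: treat only after a positive test result.
delta_drug = {(bt, r): ("drug" if r == "pos" else "~drug")
              for bt in ("bt", "~bt") for r in ("pos", "neg", "none")}

eu_test = expected_utility("bt", delta_drug)
eu_no_test = expected_utility("~bt", delta_drug)
```

Note how the probability of TstResult cannot be looked up until the BloodTst decision is known, which is the point the slide makes about needing the decision values to compute the probability of a chance-node setting.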

Computing the Best Policy
- We can work backwards.
- First compute the optimal policy for the last decision D:
  - For each assignment to its parents and each setting of D, compute the expected value of choosing that value of D.
  - The policy maps each setting of the parents to the value of D that has maximum expected value.
  - Since all other decision nodes are parents of D, setting D and its parents yields a BN with a single utility node.

Computing the Best Policy
- Trick: include the utility node's mapping (from settings of its parents to utility values) in the collection of input factors, then eliminate all variables to obtain the expected value directly, without first computing the probability and then the expectation.

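Computing the Best Policy

[Figure: Bayes net A → B → C, with B and C as the parents of the utility node U]

Two-step computation, treating U as a variable with numbers as values (here Util(U,B,C) plays the role of the CPT of U given its parents B and C):

$$\Pr(U) = \sum_{A,B,C} P(A,B,C)\,\mathrm{Util}(U,B,C) = \sum_{A,B,C} P(C \mid B)\,P(B \mid A)\,P(A)\,\mathrm{Util}(U,B,C)$$

$$\mathrm{Expt}(U) = \sum_{U} \Pr(U) \times \mathrm{Val}(U)$$

One-step computation, with the utility mapping included as a factor:

$$\mathrm{Expt}(U) = \sum_{A,B,C} P(A,B,C)\,\mathrm{Util}(B,C) = \sum_{A,B,C} P(C \mid B)\,P(B \mid A)\,P(A)\,\mathrm{Util}(B,C)$$

- Util(B,C) is the value of U given a setting of its parents.
- Now just eliminate all variables. The final result is a single value!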
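The one-step computation can be sketched for the chain A → B → C with utility factor Util(B,C). The CPT entries and utilities below are hypothetical; the code compares a brute-force sum over all settings with a sum that eliminates one variable at a time, which is the idea behind using VE here.

```python
from itertools import product

VALS = (True, False)

# Hypothetical CPTs for the chain A -> B -> C (not from the lecture).
P_A = {True: 0.4, False: 0.6}
P_B_given_A = {True: {True: 0.7, False: 0.3},   # P_B_given_A[a][b]
               False: {True: 0.2, False: 0.8}}
P_C_given_B = {True: {True: 0.9, False: 0.1},   # P_C_given_B[b][c]
               False: {True: 0.5, False: 0.5}}
# Hypothetical utility factor Util(B, C).
UTIL = {(True, True): 10.0, (True, False): -5.0,
        (False, True): 2.0, (False, False): 0.0}


def expected_utility_brute_force():
    """Sum P(C|B) P(B|A) P(A) Util(B,C) over every setting of A, B, C."""
    return sum(
        P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c] * UTIL[(b, c)]
        for a, b, c in product(VALS, repeat=3)
    )


def expected_utility_by_elimination():
    """Same sum, but pushing each summation inward as far as it will go."""
    # Eliminate A: f1(B) = sum_a P(a) P(B|a)
    f1 = {b: sum(P_A[a] * P_B_given_A[a][b] for a in VALS) for b in VALS}
    # Eliminate C: f2(B) = sum_c P(c|B) Util(B,c)
    f2 = {b: sum(P_C_given_B[b][c] * UTIL[(b, c)] for c in VALS) for b in VALS}
    # Eliminate B: a single number remains.
    return sum(f1[b] * f2[b] for b in VALS)


brute = expected_utility_brute_force()
eliminated = expected_utility_by_elimination()
```

Both functions return the same single value; the elimination version never builds the full joint over A, B, and C.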

Computing the Best Policy
- Next compute the policy for the previous decision C, given the policy $\delta_D$ just determined for the last decision D.
  - $\delta_D$ can be treated as an ordinary CPT with 0/1 probabilities, and D as an ordinary BN variable.
  - With $\delta_D$ treated this way, setting C and all of its parents again yields an ordinary Bayes net with only a single utility node.
  - VE can again be used to compute the expected value of every choice of C given a setting of its parents.
  - A policy $\delta_C$ for C can then be produced that maps each setting of C's parents to the value of C that yielded the highest expected value.
- Then we continue to the decision before C.
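Treating a policy as a CPT with 0/1 probabilities can be sketched as follows. The policy `delta_D`, its parents (the earlier decision C and a chance node named E here), and the helper `policy_to_cpt` are all hypothetical names used only for illustration.

```python
def policy_to_cpt(policy, decision_values):
    """Encode a policy as a CPT whose rows put probability 1 on the chosen value."""
    return {
        parents: {v: (1.0 if v == chosen else 0.0) for v in decision_values}
        for parents, chosen in policy.items()
    }


# Hypothetical policy for the last decision D, keyed by (C, E).
delta_D = {
    ("c", "e"): "d",
    ("c", "~e"): "~d",
    ("~c", "e"): "d",
    ("~c", "~e"): "d",
}

# D can now be handled like any other Bayes net variable with CPT P(D | C, E).
cpt_D = policy_to_cpt(delta_D, decision_values=("d", "~d"))
```

Each row of `cpt_D` sums to 1, so the result is a valid conditional distribution and can be passed to variable elimination along with the other factors.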

Decision Network Notes
- Decision networks are commonly used by decision analysts to help structure decision problems.
- Much work has been put into computationally effective techniques to solve them.
  - Common trick: replace the decision nodes with random variables at the outset and solve a plain Bayes net (a subtle but useful transformation).
- Complexity is much greater than that of BN inference.
  - We need to perform a number of variable eliminations to compute the policy for each decision node.
  - There is one VE problem for each setting of the decision node's parents and each value of the decision node.
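The last point gives a simple count of how many VE problems the backward procedure solves. The sketch below assumes a hypothetical network with two decision nodes; the function names are illustrative.

```python
from math import prod


def ve_runs_for_decision(parent_domain_sizes, num_decision_values):
    """One VE problem per setting of the parents and per value of the decision."""
    return prod(parent_domain_sizes) * num_decision_values


def total_ve_runs(decisions):
    """Total over all decision nodes, each given as (parent sizes, number of values)."""
    return sum(ve_runs_for_decision(sizes, k) for sizes, k in decisions)


# Hypothetical network: the first decision has no parents and 2 values; the last
# decision has two parents with 2 and 3 values, and 2 values of its own.
runs = total_ve_runs([((), 2), ((2, 3), 2)])
```

The count grows with the product of the parent domain sizes, so a decision node with many parents dominates the cost.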

A Detailed Decision Net Example
- Setting: you want to buy a used car, but there’s a good chance it is a “lemon” (i.e., prone to breakdown). Before deciding to buy it, you can take it to a mechanic for inspection. S/he will give you a report on the car, labelling it either “good” or “bad”. A good report is positively correlated with the car being sound, while a bad report is positively correlated with the car being a lemon.
- The report costs $50, however, so you could risk it and buy the car without the report.
- Owning a sound car is better than having no car, which is better than owning a lemon.

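Car Buyer’s Network

[Figure: decision network with chance nodes Lemon and Report, decision nodes Inspect and Buy, and utility node U]

P(Lemon):

    l      ~l
    0.5    0.5

P(Report | Lemon, Inspect), where Rep is good (g), bad (b), or none (n):

    Lemon  Inspect    g      b      n
    l      i          0.2    0.8    0
    ~l     i          0.9    0.1    0
    l      ~i         0      0      1
    ~l     ~i         0      0      1

Utility U(Buy, Lemon), with 50 subtracted if inspect:

    Buy    Lemon    U
    b      l        -600
    b      ~l       1000
    ~b     l        -300
    ~b     ~l       -300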

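Evaluate Last Decision: Buy (1)

$$EU(B \mid I, R) = \sum_{L} P(L, R \mid I)\, U(L, B, I) = \sum_{L} P(L)\, P(R \mid L, I)\, U(L, B, I)$$

These values are not normalized by $P(R \mid I)$. That factor is the same for every choice of $B$, so it does not change which choice is best.

- $I = i$, $R = g$:

$$EU(buy) = P(l)\,P(g \mid l, i)\,U(l, buy, i) + P({\sim}l)\,P(g \mid {\sim}l, i)\,U({\sim}l, buy, i) = 0.5 \times 0.2 \times (-650) + 0.5 \times 0.9 \times 950 = 362.5$$

$$EU({\sim}buy) = P(l)\,P(g \mid l, i)\,U(l, {\sim}buy, i) + P({\sim}l)\,P(g \mid {\sim}l, i)\,U({\sim}l, {\sim}buy, i) = 0.5 \times 0.2 \times (-350) + 0.5 \times 0.9 \times (-350) = -192.5$$

- So the optimal policy has $\delta_{Buy}(i, g) = buy$.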
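The arithmetic for the case I = i, R = g can be checked with a short script. The numbers are the ones given for the car buyer's network; the variable and function names are illustrative. The script also divides by P(g | i) to show the conditional expected utilities, a step the slide skips because it does not affect the comparison.

```python
P_L = {"l": 0.5, "~l": 0.5}
P_G_GIVEN_L_I = {"l": 0.2, "~l": 0.9}  # P(Report = g | Lemon, Inspect = i)

BASE_U = {("buy", "l"): -600, ("buy", "~l"): 1000,
          ("~buy", "l"): -300, ("~buy", "~l"): -300}


def utility(lemon, buy, inspect):
    """Utility table from the network, minus 50 if the car was inspected."""
    return BASE_U[(buy, lemon)] - (50 if inspect == "i" else 0)


def unnormalized_eu(buy):
    """sum_L P(L) P(g | L, i) U(L, buy, i), as computed on the slide."""
    return sum(P_L[l] * P_G_GIVEN_L_I[l] * utility(l, buy, "i") for l in P_L)


eu_buy = unnormalized_eu("buy")
eu_not_buy = unnormalized_eu("~buy")

# Normalizing constant P(g | i); dividing by it gives the conditional EU.
p_g_given_i = sum(P_L[l] * P_G_GIVEN_L_I[l] for l in P_L)
cond_eu_buy = eu_buy / p_g_given_i
cond_eu_not_buy = eu_not_buy / p_g_given_i
```

After normalization, not buying is worth -350, which matches the utility table directly: with an inspection paid for and no purchase, the outcome is -300 - 50 whatever the state of the car.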

Evaluate Last Decision: Buy (2)
- The remaining cases are evaluated the same way:
  - I = ~i, R = g
  - I = i, R = b
  - I = ~i, R = b
  - I = i, R = n
  - I = ~i, R = n
- Now we have a policy for Buy,
- and can compute the policy for Inspect.
- I’ll post the details to the web site.
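The whole backward pass for this example can be sketched in a few lines. This is not the posted solution, only a direct enumeration under the CPTs and utilities given for the car buyer's network; the function names are illustrative. Cases that have probability zero (for example a good report without an inspection) score 0 for both choices, so their policy entries are arbitrary.

```python
P_L = {"l": 0.5, "~l": 0.5}
REPORTS = ("g", "b", "n")
P_R = {  # P(Report | Lemon, Inspect)
    ("l", "i"): {"g": 0.2, "b": 0.8, "n": 0.0},
    ("~l", "i"): {"g": 0.9, "b": 0.1, "n": 0.0},
    ("l", "~i"): {"g": 0.0, "b": 0.0, "n": 1.0},
    ("~l", "~i"): {"g": 0.0, "b": 0.0, "n": 1.0},
}
BASE_U = {("buy", "l"): -600, ("buy", "~l"): 1000,
          ("~buy", "l"): -300, ("~buy", "~l"): -300}


def utility(lemon, buy, inspect):
    """Utility table from the network, minus 50 if the car was inspected."""
    return BASE_U[(buy, lemon)] - (50 if inspect == "i" else 0)


def weighted_value(inspect, report, buy):
    """sum_L P(L) P(report | L, inspect) U(L, buy, inspect)."""
    return sum(P_L[l] * P_R[(l, inspect)][report] * utility(l, buy, inspect)
               for l in P_L)


def solve():
    """Work backwards: best Buy for each (Inspect, Report), then value of Inspect."""
    delta_buy, eu_inspect = {}, {}
    for inspect in ("i", "~i"):
        total = 0.0
        for report in REPORTS:
            values = {b: weighted_value(inspect, report, b) for b in ("buy", "~buy")}
            best = max(values, key=values.get)  # ties go to "buy" (zero-probability cases)
            delta_buy[(inspect, report)] = best
            total += values[best]  # already weighted by P(report | inspect)
        eu_inspect[inspect] = total
    return delta_buy, eu_inspect


delta_buy, eu_inspect = solve()
delta_inspect = max(eu_inspect, key=eu_inspect.get)
```

Under these numbers the Buy policy is to buy after a good report, not buy after a bad report, and buy when there is no report. Inspecting has expected utility 205 against 200 for not inspecting, so the computed policy for Inspect is to inspect.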

Course Summary: Study Guide
- Main emphasis in the exam will be on questions that involve showing how particular algorithms operate.
- There will also be some questions about concepts.
- At most one question will involve a proof (you should be able to get a very good mark even if you don’t do this question).
- You should know how to do questions of the type that appeared in the two term tests.
- Only material covered in lecture and your assignments will be on the test.
