1
**CS344 : Introduction to Artificial Intelligence**

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

Lecture 19: Probabilistic Planning

2
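Example: Blocks World

STRIPS: a planning system whose rules each have a precondition list, a deletion list, and an addition list.

[Figure: a robot hand above two block configurations. START: C on A, with A and B on the table. GOAL: A on B, B on C, C on the table.]

START:
- on(B, table)
- on(A, table)
- on(C, A)
- hand empty
- clear(C)
- clear(B)

GOAL:
- on(C, table)
- on(B, C)
- on(A, B)
- hand empty
- clear(A)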

3
Rules

R1: pickup(x)
- Precondition & Deletion List: handempty, on(x,table), clear(x)
- Add List: holding(x)

R2: putdown(x)
- Precondition & Deletion List: holding(x)
- Add List: handempty, on(x,table), clear(x)

4
Rules

R3: stack(x,y)
- Precondition & Deletion List: holding(x), clear(y)
- Add List: on(x,y), clear(x), handempty

R4: unstack(x,y)
- Precondition & Deletion List: on(x,y), clear(x), handempty
- Add List: holding(x), clear(y)

5
**Plan for the blocks world problem**

For the given problem, Start → Goal can be achieved by the following sequence:
1. Unstack(C,A)
2. Putdown(C)
3. Pickup(B)
4. Stack(B,C)
5. Pickup(A)
6. Stack(A,B)

Execution of a plan: achieved through a data structure called a Triangular Table.
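The six-step plan can be checked mechanically against rules R1 to R4. The following is a minimal sketch, assuming states are represented as sets of ground facts written as tuples; the helper names (`apply`, `run`) and this encoding are illustrative choices, not part of the lecture.

```python
# Minimal STRIPS-style executor for rules R1-R4 (illustrative encoding).
# Each rule returns (preconditions, delete list, add list); in these slides
# the precondition list and the deletion list are the same set.

def pickup(x):
    pre = {("handempty",), ("on", x, "table"), ("clear", x)}
    return pre, pre, {("holding", x)}

def putdown(x):
    pre = {("holding", x)}
    return pre, pre, {("handempty",), ("on", x, "table"), ("clear", x)}

def stack(x, y):
    pre = {("holding", x), ("clear", y)}
    return pre, pre, {("on", x, y), ("clear", x), ("handempty",)}

def unstack(x, y):
    pre = {("on", x, y), ("clear", x), ("handempty",)}
    return pre, pre, {("holding", x), ("clear", y)}

def apply(state, action):
    """Apply one ground action; fail if a precondition does not hold."""
    pre, delete, add = action
    if not pre <= state:
        raise ValueError(f"unsatisfied preconditions: {sorted(pre - state)}")
    return (state - delete) | add

def run(plan, state):
    """Apply the actions of a plan in order and return the final state."""
    for action in plan:
        state = apply(state, action)
    return state

START = frozenset({
    ("on", "B", "table"), ("on", "A", "table"), ("on", "C", "A"),
    ("handempty",), ("clear", "C"), ("clear", "B"),
})
GOAL = frozenset({
    ("on", "C", "table"), ("on", "B", "C"), ("on", "A", "B"),
    ("handempty",), ("clear", "A"),
})
PLAN = [
    unstack("C", "A"), putdown("C"),
    pickup("B"), stack("B", "C"),
    pickup("A"), stack("A", "B"),
]

final_state = run(PLAN, START)
print(GOAL <= final_state)  # every goal fact holds in the final state
```

Because each action is checked before it is applied, reordering the plan (for example, trying `pickup("A")` first, while C is still on A) raises an error instead of silently producing an invalid state.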

6
**Why Probability?**

(discussion based on the book “Automated Planning” by Dana Nau)

7
Motivation

- In many situations, actions may have more than one possible outcome
  - Action failures, e.g., gripper drops its load
  - Exogenous events, e.g., road closed
- Would like to be able to plan in such situations
- One approach: Markov Decision Processes

[Figure: the action "Grasp block c" applied to blocks a, b, c, showing an intended outcome and an unintended outcome.]

8
**Stochastic Systems**

Stochastic system: a triple $\Sigma = (S, A, P)$
- $S$ = finite set of states
- $A$ = finite set of actions
- $P_a(s' \mid s)$ = probability of going to $s'$ if we execute $a$ in $s$
- $\sum_{s' \in S} P_a(s' \mid s) = 1$
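As a small illustration of the definition, the sketch below stores P as nested dictionaries and checks the normalization condition for every action and every state in which that action is applicable. The two-state system, its action names, and its probabilities are all hypothetical, chosen only to exercise the check.

```python
import math

# Hypothetical stochastic system (S, A, P); names and numbers are illustrative.
S = {"s1", "s2"}
A = {"go", "wait"}
# P[a][s] is the distribution over next states when a is executed in s.
P = {
    "go": {"s1": {"s2": 0.8, "s1": 0.2}},          # "go" is applicable only in s1
    "wait": {"s1": {"s1": 1.0}, "s2": {"s2": 1.0}},
}

def is_valid(S, A, P):
    """Check that every P_a(. | s) is a probability distribution over S."""
    for a, by_state in P.items():
        if a not in A:
            return False
        for s, dist in by_state.items():
            if s not in S or not set(dist) <= S:
                return False
            if any(p < 0 for p in dist.values()):
                return False
            if not math.isclose(sum(dist.values()), 1.0):
                return False
    return True

print(is_valid(S, A, P))
```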

9
**Example**

- Robot r1 starts at location l1 (state s1 in the diagram)
- Objective is to get r1 to location l4 (state s4 in the diagram)

[Figure: state-transition diagram, with s1 marked Start and s4 marked Goal.]

10
Example

No classical plan (sequence of actions) can be a solution, because we can’t guarantee we’ll be in a state where the next action is applicable, e.g.,

π = ⟨move(r1,l1,l2), move(r1,l2,l3), move(r1,l3,l4)⟩

11
Policies

Policy: a function that maps states into actions. Write it as a set of state-action pairs.

- π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, wait)}
- π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4))}
- π3 = {(s1, move(r1,l1,l4)), (s2, move(r1,l2,l1)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4))}

12
Initial States

For every state s, there will be a probability P(s) that the system begins in the state s.

13
**Histories**

History: sequence of system states

- h = ⟨s0, s1, s2, s3, s4, …⟩
- h0 = ⟨s1, s3, s1, s3, s1, …⟩
- h1 = ⟨s1, s2, s3, s4, s4, …⟩
- h2 = ⟨s1, s2, s5, s5, s5, …⟩
- h3 = ⟨s1, s2, s5, s4, s4, …⟩
- h4 = ⟨s1, s4, s4, s4, s4, …⟩
- h5 = ⟨s1, s1, s4, s4, s4, …⟩
- h6 = ⟨s1, s1, s1, s4, s4, …⟩
- h7 = ⟨s1, s1, s1, s1, s1, …⟩

Each policy induces a probability distribution over histories. If h = ⟨s0, s1, …⟩ then

$P(h \mid \pi) = P(s_0) \prod_{i \ge 0} P_{\pi(s_i)}(s_{i+1} \mid s_i)$
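The product formula can be evaluated directly for a finite prefix of a history. The sketch below does this for policy π1. The transition probabilities and the initial distribution are assumptions made for illustration, since the diagram that carries the actual values is not reproduced in this transcript; the function name `history_probability` is also hypothetical.

```python
# Policy pi1 from the Policies slide, as a state -> action mapping.
pi1 = {
    "s1": "move(r1,l1,l2)",
    "s2": "move(r1,l2,l3)",
    "s3": "move(r1,l3,l4)",
    "s4": "wait",
    "s5": "wait",
}

# ASSUMED transition probabilities: P[(state, action)] = {next_state: prob}.
# These numbers are illustrative, not read from the lecture's diagram.
P = {
    ("s1", "move(r1,l1,l2)"): {"s2": 1.0},
    ("s2", "move(r1,l2,l3)"): {"s3": 0.8, "s5": 0.2},
    ("s3", "move(r1,l3,l4)"): {"s4": 1.0},
    ("s4", "wait"): {"s4": 1.0},
    ("s5", "wait"): {"s5": 1.0},
}

# ASSUMED initial distribution: the system always starts in s1.
initial = {"s1": 1.0}

def history_probability(history, policy, P, initial):
    """P(h | pi) for a finite prefix h: P(s0) times the product of
    P_{pi(s_i)}(s_{i+1} | s_i) over consecutive pairs of states."""
    prob = initial.get(history[0], 0.0)
    for s, s_next in zip(history, history[1:]):
        action = policy[s]
        prob *= P.get((s, action), {}).get(s_next, 0.0)
    return prob

h1 = ["s1", "s2", "s3", "s4", "s4"]
h2 = ["s1", "s2", "s5", "s5", "s5"]
h3 = ["s1", "s2", "s5", "s4", "s4"]
for name, h in [("h1", h1), ("h2", h2), ("h3", h3)]:
    print(name, history_probability(h, pi1, P, initial))
```

Under these assumed numbers, h3 has probability zero under π1, because π1 waits in s5 and so never leaves it; a policy such as π2, which moves from s5 to l4, would need its own transition entry to be evaluated the same way.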

14
Hidden Markov Models

15
**Hidden Markov Model**

- Set of states: S, where |S| = N
- Output alphabet: V
- Transition probabilities: A = {a_ij}
- Emission probabilities: B = {b_j(o_k)}
- Initial state probabilities: π

16
**Three Basic Problems of HMM**

Given a model λ = (A, B, π) and an observation sequence O = {o1 … oT}:
1. Efficiently estimate P(O | λ)
2. Get the best state sequence Q = {q1 … qT}, i.e., maximize P(Q | O, λ)
3. How to adjust λ to best maximize P(O | λ), i.e., re-estimate λ

17
**Solutions**

- Problem 1: Likelihood of a sequence
  - Forward Procedure
  - Backward Procedure
- Problem 2: Best state sequence
  - Viterbi Algorithm
- Problem 3: Re-estimation
  - Baum-Welch (Forward-Backward Algorithm)
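For Problem 1, the forward procedure computes P(O | λ) in O(N²T) time instead of summing over all N^T state sequences. Below is a minimal sketch in the (A, B, π) notation of the Hidden Markov Model slide; the two-state model and all of its numbers are hypothetical, and `forward_likelihood` is an illustrative name.

```python
# Hypothetical 2-state HMM in (A, B, pi) form; the numbers are illustrative.
A = [[0.7, 0.3],          # A[i][j] = P(next state j | current state i)
     [0.4, 0.6]]
B = [{"a": 0.9, "b": 0.1},  # B[j][o] = P(emit o | state j)
     {"a": 0.2, "b": 0.8}]
pi = [0.6, 0.4]             # pi[i] = P(first state is i)

def forward_likelihood(obs, A, B, pi):
    """Forward procedure: return P(obs | lambda) for a non-empty obs."""
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    # Termination: sum over the final states
    return sum(alpha)

print(forward_likelihood("aabb", A, B, pi))
```

A quick sanity check on any such implementation is that the likelihoods of all observation sequences of a fixed length add up to 1.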

18
**Problem 2**

Given observation sequence O = {o1 … oT}, get the “best” state sequence Q = {q1 … qT}, i.e., maximize P(Q | O, λ).

Solution:
1. Best state individually likely at a position i
2. Best state given all the previously observed states and observations: Viterbi Algorithm

19
**Example**

- Output observed: aabb
- What state sequence is most probable? Since the state sequence cannot be predicted with certainty, the machine is given the qualification “hidden”.
- Note: ∑ P(outlinks) = 1 for all states

20
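**Probabilities for different possible sequences**

All sequences start in state 1.

| State sequence | Probability |
|---|---|
| 1,1 | 0.4 |
| 1,2 | 0.15 |
| 1,1,1 | 0.16 |
| 1,1,2 | 0.06 |
| 1,2,1 | 0.0375 |
| 1,2,2 | 0.0225 |
| 1,1,1,1 | 0.016 |
| 1,1,1,2 | 0.056 |
| 1,1,2,1 | 0.018 |
| 1,1,2,2 | 0.018 |

...and so on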
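The listed probabilities are enough to recover the arc probabilities of the example machine by division (for instance 0.06 / 0.4 = 0.15 for leaving state 1 to state 2 on output a). The sketch below uses those inferred values to run Viterbi on the output aabb. The arc table `T` is an inference from the listed numbers rather than a copy of the slide's diagram, and the function name and data layout are illustrative.

```python
# Arc probabilities T[state][symbol][next_state]: probability of emitting
# `symbol` while moving from `state` to `next_state`.
# INFERRED from the listed sequence probabilities, not read from the diagram.
T = {
    1: {"a": {1: 0.4, 2: 0.15}, "b": {1: 0.1, 2: 0.35}},
    2: {"a": {1: 0.25, 2: 0.15}, "b": {1: 0.3, 2: 0.3}},
}

def viterbi(output, start=1):
    """Return (probability, path) of a most probable state sequence."""
    # best[s] = (probability, path) of the best path so far that ends in s
    best = {start: (1.0, [start])}
    for symbol in output:
        nxt = {}
        for s, (p, path) in best.items():
            for s2, q in T[s][symbol].items():
                cand = p * q
                # Keep only the best-scoring path into each state s2.
                if s2 not in nxt or cand > nxt[s2][0]:
                    nxt[s2] = (cand, path + [s2])
        best = nxt
    return max(best.values(), key=lambda entry: entry[0])

prob, path = viterbi("aabb")
print(prob, path)
```

With these inferred values, the best path passes through 1,1,1,2 (probability 0.056 after three symbols, as listed above) and reaches a probability of about 0.0168 after the fourth symbol. The final state is a tie, because both arcs out of state 2 on b have probability 0.3.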

21
**Viterbi for higher order HMM**

If the transition probability is P(s_i | s_{i-1}, s_{i-2}) (order 2 HMM), then the Markovian assumption will take effect only after two levels. (Generalizing to an order-n HMM: after n levels.)

22
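**Viterbi Algorithm**

Define [equation not recovered from the slide] such that [equation not recovered from the slide], i.e., the sequence which has the best joint probability so far.

By induction, we have [equation not recovered from the slide].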
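The formulas on this slide did not survive extraction. For reference, the textbook formulation of the Viterbi quantity and its induction step, written in the (A, B, π) notation of the Hidden Markov Model slide, is given below. This is the standard form and is presumably what the slide showed, but that is an assumption.

```latex
% Best joint probability of any state sequence that ends in state i at time t
\delta_t(i) = \max_{q_1, \dots, q_{t-1}}
    P(q_1, \dots, q_{t-1},\; q_t = i,\; o_1, \dots, o_t \mid \lambda)

% Induction step
\delta_{t+1}(j) = \Bigl[\max_{1 \le i \le N} \delta_t(i)\, a_{ij}\Bigr]\, b_j(o_{t+1})
```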

23
Viterbi Algorithm

24
Viterbi Algorithm
