Published by Jovan Cowser. Modified about 1 year ago.

1
CS344: Introduction to Artificial Intelligence
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 19: Probabilistic Planning

2
Example: Blocks World
STRIPS: a planning system whose rules have a precondition list, a deletion list and an addition list.
START: on(B, table), on(A, table), on(C, A), hand empty, clear(C), clear(B)
GOAL: on(C, table), on(B, C), on(A, B), hand empty, clear(A)
[Figure: robot hand above the blocks. START: C on A, with B beside them on the table. GOAL: A on B on C.]

3
Rules
R1: pickup(x)
  Precondition & Deletion List: handempty, on(x,table), clear(x)
  Add List: holding(x)
R2: putdown(x)
  Precondition & Deletion List: holding(x)
  Add List: handempty, on(x,table), clear(x)

4
Rules
R3: stack(x,y)
  Precondition & Deletion List: holding(x), clear(y)
  Add List: on(x,y), clear(x), handempty
R4: unstack(x,y)
  Precondition & Deletion List: on(x,y), clear(x), handempty
  Add List: holding(x), clear(y)

5
Plan for the blocks world problem
For the given problem, Start → Goal can be achieved by the following sequence:
1. Unstack(C,A)
2. Putdown(C)
3. Pickup(B)
4. Stack(B,C)
5. Pickup(A)
6. Stack(A,B)
Execution of a plan: achieved through a data structure called the Triangular Table.
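The four rules and the six-step plan can be checked mechanically. The following is a minimal sketch, not the Triangular Table mentioned on the slide: it simply applies each rule's deletion and addition lists in order. Representing facts as strings and the helper names apply_rule and run_plan are assumptions made for illustration.

```python
# Minimal STRIPS-style simulator for the four blocks-world rules.
# Facts are plain strings; a state is a set of facts.
# Each rule returns (precondition, deletion list, addition list);
# in these rules the precondition and the deletion list coincide.

def pickup(x):
    pre = {"handempty", f"on({x},table)", f"clear({x})"}
    return pre, pre, {f"holding({x})"}

def putdown(x):
    pre = {f"holding({x})"}
    return pre, pre, {"handempty", f"on({x},table)", f"clear({x})"}

def stack(x, y):
    pre = {f"holding({x})", f"clear({y})"}
    return pre, pre, {f"on({x},{y})", f"clear({x})", "handempty"}

def unstack(x, y):
    pre = {f"on({x},{y})", f"clear({x})", "handempty"}
    return pre, pre, {f"holding({x})", f"clear({y})"}

def apply_rule(state, rule):
    """Apply one rule instance, failing if its precondition does not hold."""
    pre, delete, add = rule
    if not pre <= state:
        raise ValueError(f"precondition not satisfied: {sorted(pre - state)}")
    return (state - delete) | add

def run_plan(state, plan):
    """Apply the rule instances in order and return the final state."""
    for rule in plan:
        state = apply_rule(state, rule)
    return state

START = {"on(B,table)", "on(A,table)", "on(C,A)",
         "handempty", "clear(C)", "clear(B)"}
GOAL = {"on(C,table)", "on(B,C)", "on(A,B)", "handempty", "clear(A)"}

PLAN = [unstack("C", "A"), putdown("C"), pickup("B"),
        stack("B", "C"), pickup("A"), stack("A", "B")]

final_state = run_plan(set(START), PLAN)
print(GOAL <= final_state)  # True when every goal fact holds at the end
```

Trying pickup("A") directly in START raises an error in this sketch, because clear(A) does not hold while C is on A.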

6
Why Probability? (discussion based on the book “Automated Planning” by Dana Nau)

7
Motivation
In many situations, actions may have more than one possible outcome:
- Action failures, e.g., gripper drops its load
- Exogenous events, e.g., road closed
Would like to be able to plan in such situations.
One approach: Markov Decision Processes
[Figure: blocks a, b, c; the action "Grasp block c" has an intended outcome and an unintended outcome]

8
Stochastic Systems
Stochastic system: a triple $\Sigma = (S, A, P)$
- $S$ = finite set of states
- $A$ = finite set of actions
- $P_a(s' \mid s)$ = probability of going to $s'$ if we execute $a$ in $s$
- $\sum_{s' \in S} P_a(s' \mid s) = 1$
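One way to hold such a triple in code is a nested dictionary with a check that each outgoing distribution sums to 1. This is only a sketch: the nested-dict layout, the function name is_valid_stochastic_system, the state names, and the 0.9 / 0.1 split for the gripper are all assumptions made for illustration, and the check is applied only to states where an action is listed as applicable.

```python
import math

def is_valid_stochastic_system(states, actions, P):
    """Check that every P[a][s] is a probability distribution over `states`.

    P is a nested dict: P[a][s][s_next] = P_a(s_next | s).
    Assumption: action a is applicable in s exactly when s is a key of P[a].
    """
    for a, by_state in P.items():
        if a not in actions:
            return False
        for s, dist in by_state.items():
            # Source and successor states must both belong to S.
            if s not in states or any(t not in states for t in dist):
                return False
            # Probabilities must be non-negative and sum to 1.
            if any(p < 0.0 for p in dist.values()):
                return False
            if not math.isclose(sum(dist.values()), 1.0):
                return False
    return True

# Hypothetical gripper example; the numbers are made up.
S = {"c_on_table", "holding_c", "c_dropped"}
A = {"grasp_c"}
P = {"grasp_c": {"c_on_table": {"holding_c": 0.9, "c_dropped": 0.1}}}

print(is_valid_stochastic_system(S, A, P))
```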

9
Example
Robot r1 starts at location l1 (state s1 in the diagram).
Objective is to get r1 to location l4 (state s4 in the diagram).

10
Example
No classical plan (sequence of actions) can be a solution, because we can't guarantee we'll be in a state where the next action is applicable,
e.g., π = ⟨move(r1,l1,l2), move(r1,l2,l3), move(r1,l3,l4)⟩

11
Policies
Policy: a function that maps states into actions. Write it as a set of state-action pairs.
π1 = { (s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, wait) }
π2 = { (s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4)) }
π3 = { (s1, move(r1,l1,l4)), (s2, move(r1,l2,l1)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4)) }

12
Initial States
For every state s, there will be a probability P(s) that the system begins in the state s.

13
Histories
History: sequence of system states, h = ⟨s_0, s_1, s_2, s_3, s_4, …⟩
h0 = ⟨s1, s3, s1, s3, s1, …⟩
h1 = ⟨s1, s2, s3, s4, s4, …⟩
h2 = ⟨s1, s2, s5, s5, s5, …⟩
h3 = ⟨s1, s2, s5, s4, s4, …⟩
h4 = ⟨s1, s4, s4, s4, s4, …⟩
h5 = ⟨s1, s1, s4, s4, s4, …⟩
h6 = ⟨s1, s1, s1, s4, s4, …⟩
h7 = ⟨s1, s1, s1, s1, s1, …⟩
Each policy induces a probability distribution over histories.
If h = ⟨s_0, s_1, …⟩ then $P(h \mid \pi) = P(s_0) \prod_{i \ge 0} P_{\pi(s_i)}(s_{i+1} \mid s_i)$
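The product formula for P(h | π) can be evaluated directly on a finite prefix of a history. The sketch below does so for the robot example. The transition probabilities are illustrative assumptions, since the state diagram is not reproduced in this text, and so is the initial distribution P(s1) = 1. The policies are the ones listed on the Policies slide, and the name history_probability is hypothetical.

```python
# Assumed transition model: P[(s, a)] = {s_next: probability}.
# These numbers are illustrative only; they are not given in the text.
P = {
    ("s1", "move(r1,l1,l2)"): {"s2": 1.0},
    ("s1", "move(r1,l1,l4)"): {"s4": 0.5, "s1": 0.5},
    ("s2", "move(r1,l2,l3)"): {"s3": 0.8, "s5": 0.2},
    ("s2", "move(r1,l2,l1)"): {"s1": 1.0},
    ("s3", "move(r1,l3,l4)"): {"s4": 1.0},
    ("s5", "move(r1,l5,l4)"): {"s4": 1.0},
    ("s4", "wait"): {"s4": 1.0},
    ("s5", "wait"): {"s5": 1.0},
}

# Policies as dicts from state to action.
pi1 = {"s1": "move(r1,l1,l2)", "s2": "move(r1,l2,l3)",
       "s3": "move(r1,l3,l4)", "s4": "wait", "s5": "wait"}
pi2 = {**pi1, "s5": "move(r1,l5,l4)"}
pi3 = {"s1": "move(r1,l1,l4)", "s2": "move(r1,l2,l1)",
       "s3": "move(r1,l3,l4)", "s4": "wait", "s5": "move(r1,l5,l4)"}

# Assumed initial distribution: the robot always starts in s1.
initial = {"s1": 1.0}

def history_probability(history, policy, initial, P):
    """P(s_0) times the product of P_{pi(s_i)}(s_{i+1} | s_i) over the prefix."""
    prob = initial.get(history[0], 0.0)
    for s, s_next in zip(history, history[1:]):
        action = policy[s]
        prob *= P[(s, action)].get(s_next, 0.0)
    return prob

h1 = ["s1", "s2", "s3", "s4", "s4"]
h3 = ["s1", "s2", "s5", "s4", "s4"]
for name, pi in [("pi1", pi1), ("pi2", pi2)]:
    print(name, history_probability(h1, pi, initial, P),
          history_probability(h3, pi, initial, P))
```

Under these assumed numbers, h3 has probability 0 under π1 (which waits in s5 forever) but a positive probability under π2, which is the only place the two policies differ.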

14
Hidden Markov Models

15
Hidden Markov Model
- Set of states: $S$, where $|S| = N$
- Output alphabet: $V$
- Transition probabilities: $A = \{a_{ij}\}$
- Emission probabilities: $B = \{b_j(o_k)\}$
- Initial state probabilities: $\pi$

16
Three Basic Problems of HMM
1. Given observation sequence $O = \{o_1 \ldots o_T\}$, efficiently estimate $P(O \mid \lambda)$.
2. Given observation sequence $O = \{o_1 \ldots o_T\}$, get the best $Q = \{q_1 \ldots q_T\}$, i.e. maximize $P(Q \mid O, \lambda)$.
3. How to adjust $\lambda = (A, B, \pi)$ to best maximize $P(O \mid \lambda)$: re-estimate $\lambda$.

17
Solutions
- Problem 1: Likelihood of a sequence
  Forward Procedure, Backward Procedure
- Problem 2: Best state sequence
  Viterbi Algorithm
- Problem 3: Re-estimation
  Baum-Welch (Forward-Backward Algorithm)
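For Problem 1, the forward procedure in its standard form can be sketched as follows. It keeps, for each state, the probability of the observations seen so far together with ending in that state, then sums over states at the end. The two-state model and all of its numbers are hypothetical, and emissions are attached to states, matching the b_j(o_k) notation.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(O | lambda) using the forward procedure.

    alpha[s] holds P(o_1 .. o_t, q_t = s | lambda) for the current t.
    """
    # Initialisation with the first observation.
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    # Induction over the remaining observations.
    for o in obs[1:]:
        alpha = {
            j: sum(alpha[i] * trans_p[i][j] for i in states) * emit_p[j][o]
            for j in states
        }
    # Termination: sum over the final state.
    return sum(alpha.values())

# Hypothetical two-state HMM over the alphabet {a, b}; numbers are made up.
states = ("1", "2")
start_p = {"1": 0.6, "2": 0.4}
trans_p = {"1": {"1": 0.7, "2": 0.3}, "2": {"1": 0.4, "2": 0.6}}
emit_p = {"1": {"a": 0.8, "b": 0.2}, "2": {"a": 0.3, "b": 0.7}}

print(forward("aabb", states, start_p, trans_p, emit_p))
```

The cost is proportional to T·N² for T observations and N states, instead of the N^T terms needed to sum over every state sequence explicitly.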

18
Problem 2
Given observation sequence $O = \{o_1 \ldots o_T\}$, get the "best" $Q = \{q_1 \ldots q_T\}$, i.e. maximize $P(Q \mid O, \lambda)$.
Solution:
1. Best state individually likely at a position i
2. Best state given all the previously observed states and observations: Viterbi Algorithm

19
Example
Output observed: aabb
What state sequence is most probable? Since the state sequence cannot be predicted with certainty, the machine is given the qualification "hidden".
Note: ∑ P(outlinks) = 1 for all states

20
Probabilities for different possible sequences
[Figure: tree of state-sequence prefixes starting from state 1 (such as 1,2 and 1,1,2, and so on), each with its probability]

21
Viterbi for higher order HMM
If the transition probability is $P(s_i \mid s_{i-1}, s_{i-2})$ (order 2 HMM), then the Markovian assumption will take effect only after two levels (generalizing for n-order … after n levels).

22
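Viterbi Algorithm
Define $\delta_t(i)$ such that
$\delta_t(i) = \max_{q_1, \ldots, q_{t-1}} P(q_1, \ldots, q_{t-1}, q_t = i, o_1, \ldots, o_t \mid \lambda)$,
i.e. the sequence ending in state $i$ at time $t$ which has the best joint probability so far.
By induction, we have
$\delta_{t+1}(j) = \left[ \max_i \delta_t(i)\, a_{ij} \right] b_j(o_{t+1})$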

23
Viterbi Algorithm
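A minimal sketch of the Viterbi algorithm in its standard form: for each state it keeps the best joint probability of any state sequence ending there, plus a back-pointer to recover that sequence. The two-state model and its probabilities are hypothetical (the slide's machine is not reproduced in this text); only the observed output aabb comes from the example.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (best state sequence, its joint probability with obs)."""
    # delta[s]: best joint probability of any path ending in s so far.
    delta = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    backpointers = []
    for o in obs[1:]:
        prev = delta
        delta, psi = {}, {}
        for j in states:
            # Best predecessor of j, before emitting the new symbol.
            best_i = max(states, key=lambda i: prev[i] * trans_p[i][j])
            delta[j] = prev[best_i] * trans_p[best_i][j] * emit_p[j][o]
            psi[j] = best_i
        backpointers.append(psi)
    # Backtrack from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for psi in reversed(backpointers):
        path.append(psi[path[-1]])
    path.reverse()
    return path, delta[last]

# Hypothetical two-state HMM over the alphabet {a, b}; numbers are made up.
states = ("1", "2")
start_p = {"1": 0.6, "2": 0.4}
trans_p = {"1": {"1": 0.7, "2": 0.3}, "2": {"1": 0.4, "2": 0.6}}
emit_p = {"1": {"a": 0.8, "b": 0.2}, "2": {"a": 0.3, "b": 0.7}}

best_path, best_prob = viterbi("aabb", states, start_p, trans_p, emit_p)
print(best_path, best_prob)
```

With these assumed parameters the most probable state sequence for aabb is 1, 1, 2, 2. A different machine would in general give a different answer.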
