# CS344 : Introduction to Artificial Intelligence

Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 19- Probabilistic Planning

Example: Blocks World
STRIPS: a planning system whose rules each have a precondition list, a deletion list, and an addition list.

- START: on(B, table), on(A, table), on(C, A), hand empty, clear(C), clear(B)
- GOAL: on(C, table), on(B, C), on(A, B), hand empty, clear(A)

Rules

- R1: pickup(x). Precondition & deletion list: handempty, on(x,table), clear(x). Add list: holding(x).
- R2: putdown(x). Precondition & deletion list: holding(x). Add list: handempty, on(x,table), clear(x).
- R3: stack(x,y). Precondition & deletion list: holding(x), clear(y). Add list: on(x,y), clear(x), handempty.
- R4: unstack(x,y). Precondition & deletion list: on(x,y), clear(x), handempty. Add list: holding(x), clear(y).

Plan for the blocks world problem
For the given problem, Start → Goal can be achieved by the following sequence:

1. Unstack(C,A)
2. Putdown(C)
3. Pickup(B)
4. Stack(B,C)
5. Pickup(A)
6. Stack(A,B)

Execution of a plan is achieved through a data structure called the triangular table.
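The four rules and the six-step plan can be checked mechanically. The following is a minimal sketch, assuming a state is represented as a set of ground facts written as strings; the helper names `rule` and `apply_action` are illustrative and not part of the lecture.

```python
def rule(name, x, y=None):
    """Return (precondition-and-delete list, add list) for a ground action."""
    if name == "pickup":
        return ({"handempty", f"on({x},table)", f"clear({x})"}, {f"holding({x})"})
    if name == "putdown":
        return ({f"holding({x})"}, {"handempty", f"on({x},table)", f"clear({x})"})
    if name == "stack":
        return ({f"holding({x})", f"clear({y})"},
                {f"on({x},{y})", f"clear({x})", "handempty"})
    if name == "unstack":
        return ({f"on({x},{y})", f"clear({x})", "handempty"},
                {f"holding({x})", f"clear({y})"})
    raise ValueError(f"unknown rule: {name}")


def apply_action(state, action):
    """Apply one ground action; fail if its preconditions do not hold."""
    pre_del, add = rule(*action)
    if not pre_del <= state:
        raise ValueError(f"{action} not applicable; missing {pre_del - state}")
    # In these rules the precondition list doubles as the deletion list.
    return frozenset((state - pre_del) | add)


START = frozenset({"on(B,table)", "on(A,table)", "on(C,A)",
                   "handempty", "clear(C)", "clear(B)"})
GOAL = frozenset({"on(C,table)", "on(B,C)", "on(A,B)", "handempty", "clear(A)"})
PLAN = [("unstack", "C", "A"), ("putdown", "C"), ("pickup", "B"),
        ("stack", "B", "C"), ("pickup", "A"), ("stack", "A", "B")]

state = START
for step in PLAN:
    state = apply_action(state, step)

print(GOAL <= state)  # True when every goal fact holds in the final state
```

Because each rule deletes exactly what it requires, an inapplicable step (for example, pickup(A) in the start state, where A is not clear) is rejected rather than silently corrupting the state.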

Why Probability? (discussion based on the book “Automated Planning” by Dana Nau)

Motivation

- In many situations, actions may have more than one possible outcome:
  - Action failures, e.g., gripper drops its load
  - Exogenous events, e.g., road closed
- We would like to be able to plan in such situations.
- One approach: Markov Decision Processes.

[Figure: the action "Grasp block c" on blocks a, b, c, showing the intended outcome and an unintended outcome.]

Stochastic Systems
Stochastic system: a triple $\Sigma = (S, A, P)$

- $S$ = finite set of states
- $A$ = finite set of actions
- $P_a(s' \mid s)$ = probability of going to $s'$ if we execute $a$ in $s$
- $\sum_{s' \in S} P_a(s' \mid s) = 1$
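The normalization condition on P is easy to check in code. The following is a minimal sketch, assuming P is stored as a dictionary keyed by (action, state) pairs, with actions that are not applicable in a state simply left out; the two-state system and the function name `is_valid_stochastic_system` are hypothetical.

```python
import math


def is_valid_stochastic_system(S, A, P, tol=1e-9):
    """Check that every listed P_a(. | s) is a distribution over S."""
    for (a, s), dist in P.items():
        if a not in A or s not in S:
            return False
        if any(s2 not in S or p < 0 for s2, p in dist.items()):
            return False
        # The outgoing probabilities of each (action, state) pair must sum to 1.
        if not math.isclose(sum(dist.values()), 1.0, abs_tol=tol):
            return False
    return True


# Hypothetical toy system, for illustration only.
S = {"s1", "s2"}
A = {"go", "stay"}
P = {
    ("go", "s1"): {"s2": 0.9, "s1": 0.1},  # "go" may fail and leave us in s1
    ("stay", "s1"): {"s1": 1.0},
    ("stay", "s2"): {"s2": 1.0},
}

print(is_valid_stochastic_system(S, A, P))
```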

Example
Robot r1 starts at location l1 (state s1 in the diagram). The objective is to get r1 to location l4 (state s4 in the diagram).

Example
No classical plan (sequence of actions) can be a solution, because we can't guarantee we'll be in a state where the next action is applicable, e.g., π = ⟨move(r1,l1,l2), move(r1,l2,l3), move(r1,l3,l4)⟩.

Policies
Policy: a function that maps states into actions. Write it as a set of state-action pairs.

- π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, wait)}
- π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4))}
- π3 = {(s1, move(r1,l1,l4)), (s2, move(r1,l2,l1)), (s3, move(r1,l3,l4)), (s4, wait), (s5, move(r1,l5,l4))}

Initial States
For every state s, there will be a probability P(s) that the system begins in the state s.

Histories
History: sequence of system states, $h = \langle s_0, s_1, s_2, s_3, s_4, \dots \rangle$

- h0 = ⟨s1, s3, s1, s3, s1, …⟩
- h1 = ⟨s1, s2, s3, s4, s4, …⟩
- h2 = ⟨s1, s2, s5, s5, s5, …⟩
- h3 = ⟨s1, s2, s5, s4, s4, …⟩
- h4 = ⟨s1, s4, s4, s4, s4, …⟩
- h5 = ⟨s1, s1, s4, s4, s4, …⟩
- h6 = ⟨s1, s1, s1, s4, s4, …⟩
- h7 = ⟨s1, s1, s1, s1, s1, …⟩

Each policy induces a probability distribution over histories. If $h = \langle s_0, s_1, \dots \rangle$ then

$$P(h \mid \pi) = P(s_0) \prod_{i \ge 0} P_{\pi(s_i)}(s_{i+1} \mid s_i)$$
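The probability of a history under a policy can be computed for any finite prefix by multiplying the initial-state probability by one transition probability per step. The following is a minimal sketch; the transcript does not reproduce the transition probabilities from the diagram, so the numbers in `P` and `INIT` are assumptions chosen for illustration, and `history_probability` is a hypothetical helper name.

```python
# ASSUMED transition probabilities, keyed by (state, action). Illustrative only.
P = {
    ("s1", "move(r1,l1,l2)"): {"s2": 1.0},
    ("s2", "move(r1,l2,l3)"): {"s3": 0.8, "s5": 0.2},
    ("s3", "move(r1,l3,l4)"): {"s4": 1.0},
    ("s4", "wait"): {"s4": 1.0},
    ("s5", "wait"): {"s5": 1.0},
    ("s5", "move(r1,l5,l4)"): {"s4": 1.0},
}
INIT = {"s1": 1.0}  # ASSUMED: the robot always starts in s1

# Policies pi1 and pi2 from the slide, written as state -> action maps.
PI1 = {"s1": "move(r1,l1,l2)", "s2": "move(r1,l2,l3)",
       "s3": "move(r1,l3,l4)", "s4": "wait", "s5": "wait"}
PI2 = dict(PI1, s5="move(r1,l5,l4)")


def history_probability(history, policy, P, init):
    """P(h | pi) for a finite prefix h = [s0, s1, ...] of a history."""
    prob = init.get(history[0], 0.0)
    for s, s_next in zip(history, history[1:]):
        # Transition under the action the policy prescribes in state s.
        prob *= P[(s, policy[s])].get(s_next, 0.0)
    return prob


h1 = ["s1", "s2", "s3", "s4", "s4"]
h3 = ["s1", "s2", "s5", "s4", "s4"]
print(history_probability(h1, PI1, P, INIT))
print(history_probability(h3, PI1, P, INIT))
print(history_probability(h3, PI2, P, INIT))
```

Under these assumed numbers, h3 is impossible under π1 (which waits forever in s5) but possible under π2, which shows how the distribution over histories depends on the policy.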

Hidden Markov Models

Hidden Markov Model

- Set of states: $S$, where $|S| = N$
- Output alphabet: $V$
- Transition probabilities: $A = \{a_{ij}\}$
- Emission probabilities: $B = \{b_j(o_k)\}$
- Initial state probabilities: $\pi$

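Three Basic Problems of HMM
Given an observation sequence $O = \{o_1 \dots o_T\}$ and a model $\lambda = (A, B, \pi)$:

1. Efficiently estimate $P(O \mid \lambda)$.
2. Get the best $Q = \{q_1 \dots q_T\}$, i.e., maximize $P(Q \mid O, \lambda)$.
3. How to adjust $\lambda$ to best maximize $P(O \mid \lambda)$, i.e., re-estimate $\lambda$.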

Solutions

- Problem 1: Likelihood of a sequence. Forward Procedure, Backward Procedure.
- Problem 2: Best state sequence. Viterbi Algorithm.
- Problem 3: Re-estimation. Baum-Welch (Forward-Backward Algorithm).
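For Problem 1, the forward procedure avoids enumerating all state sequences by carrying forward, for each state, the probability of the observations so far ending in that state. The following is a minimal sketch for a non-empty observation sequence, assuming emissions are attached to states as in $b_j(o_k)$; the two-state model and its numbers are hypothetical.

```python
def forward(obs, states, pi, a, b):
    """Return P(O | lambda) using the forward procedure."""
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = {i: pi[i] * b[i][obs[0]] for i in states}
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = {j: sum(alpha[i] * a[i][j] for i in states) * b[j][o]
                 for j in states}
    # Termination: sum over the final states
    return sum(alpha.values())


# Hypothetical two-state model over the alphabet {a, b}, for illustration only.
STATES = [1, 2]
PI = {1: 0.6, 2: 0.4}
A = {1: {1: 0.7, 2: 0.3}, 2: {1: 0.4, 2: 0.6}}
B = {1: {"a": 0.5, "b": 0.5}, 2: {"a": 0.1, "b": 0.9}}

print(forward("aabb", STATES, PI, A, B))
```

The work is proportional to N²T for N states and T observations, compared with Nᵀ state sequences for direct enumeration.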

Problem 2
Given an observation sequence O = {o1… oT}, get the "best" Q = {q1… qT}.
Solution:

1. Best state individually likely at a position i
2. Best state given all the previously observed states and observations: Viterbi Algorithm

Example
Output observed: aabb. What state sequence is most probable? Since the state sequence cannot be predicted with certainty, the machine is given the qualification "hidden". Note: ∑ P(outlinks) = 1 for all states.

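Probabilities for different possible sequences (all starting from state 1)

| State sequence | Probability |
|---|---|
| 1,1 | 0.4 |
| 1,2 | 0.15 |
| 1,1,1 | 0.16 |
| 1,1,2 | 0.06 |
| 1,2,1 | 0.0375 |
| 1,2,2 | 0.0225 |
| 1,1,1,1 | 0.016 |
| 1,1,1,2 | 0.056 |
| 1,1,2,1 | 0.018 |
| 1,1,2,2 | 0.018 |

...and so on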

Viterbi for higher order HMM
If the transition probability is $P(s_i \mid s_{i-1}, s_{i-2})$ (order 2 HMM), then the Markovian assumption will take effect only after two levels (generalizing for n-order: after n levels).

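Viterbi Algorithm
Define $\delta_t(i)$ such that

$$\delta_t(i) = \max_{q_1, \dots, q_{t-1}} P(q_1 \dots q_{t-1},\ q_t = i,\ o_1 \dots o_t \mid \lambda)$$

i.e., the probability of the sequence ending in state $i$ at time $t$ which has the best joint probability so far. By induction, we have

$$\delta_{t+1}(j) = \left[\max_i \delta_t(i)\, a_{ij}\right] b_j(o_{t+1})$$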
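The induction can be turned into code by keeping, for each state, the best joint probability so far and a back-pointer to the predecessor that achieved it. The following is a minimal sketch for a non-empty observation sequence, assuming a first-order HMM with state emissions; the two-state model is hypothetical, and `joint_probability` is an illustrative helper for checking the result.

```python
def viterbi(obs, states, pi, a, b):
    """Return (best state sequence, its joint probability with obs)."""
    # delta[i]: best joint probability of any path ending in state i so far
    delta = {i: pi[i] * b[i][obs[0]] for i in states}
    backpointers = []
    for o in obs[1:]:
        prev = delta
        # psi[j]: predecessor of j on the best path ending in j
        psi = {j: max(states, key=lambda i: prev[i] * a[i][j]) for j in states}
        delta = {j: prev[psi[j]] * a[psi[j]][j] * b[j][o] for j in states}
        backpointers.append(psi)
    # Pick the best final state, then follow the back-pointers.
    last = max(states, key=lambda i: delta[i])
    path = [last]
    for psi in reversed(backpointers):
        path.append(psi[path[-1]])
    path.reverse()
    return path, delta[last]


def joint_probability(path, obs, pi, a, b):
    """P(Q, O | lambda) for one explicit state sequence."""
    p = pi[path[0]] * b[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= a[path[t - 1]][path[t]] * b[path[t]][obs[t]]
    return p


# Hypothetical two-state model over the alphabet {a, b}, for illustration only.
STATES = [1, 2]
PI = {1: 0.6, 2: 0.4}
A = {1: {1: 0.7, 2: 0.3}, 2: {1: 0.4, 2: 0.6}}
B = {1: {"a": 0.5, "b": 0.5}, 2: {"a": 0.1, "b": 0.9}}

best_path, best_prob = viterbi("aabb", STATES, PI, A, B)
print(best_path, best_prob)
```

Maximizing the joint probability P(Q, O | λ) yields the same sequence as maximizing P(Q | O, λ), since the two differ only by the constant factor P(O | λ).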
