Sequential Stochastic Models CSE 573
Logistics
Reading: AIMA 17 through 17.2, and 17.4
Project 1: due Wed 11/10, end of day
Emphasis on report, experiments, and conclusions
Please hand in 2 printouts of your report: one to Dan, one to Miao
Also send a .doc or .pdf file; we will link it to the class web page © Daniel S. Weld
Sequential Stochastic Models
Why sequential? What kind of sequences?
“Doing Time”
(Hidden) Markov Models
Dynamic Bayesian Networks
573 Topics
Reinforcement Learning
Supervised Learning
Planning
Knowledge Representation & Inference (Logic-Based, Probabilistic)
Search
Problem Spaces
Agency
Where in Time are We?
STRIPS Representation
Uncertainty: Bayesian Networks
Sequential Stochastic Processes
(Hidden) Markov Models
Dynamic Bayesian Networks (DBNs)
Markov Decision Processes (MDPs)
Reinforcement Learning
Defn: Markov Model
Q: set of states
p: init prob distribution
A: transition probability distribution
E.g. Predict Web Behavior
When will the visitor leave the site?
Q: set of states (pages)
p: init prob distribution (likelihood of site entry point)
A: transition probability distribution (user navigation model)
Probability Distribution, A
Forward causality: the probability of s_t does not depend directly on values of future states.
The probability of the new state could depend on the whole history of states visited: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0)
Markovian assumption: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0) = Pr(s_t | s_{t-1})
Stationary model assumption: Pr(s_t | s_{t-1}) = Pr(s_k | s_{k-1}) for all k.
Representing A
Q: set of states; p: init prob distribution; A: transition probabilities. How can we represent these?
[Matrix with a row and a column for each state s0 … s6; entry p12 is the probability of transitioning from s1 to s2. What must each row sum to?]
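The row-sum question has a concrete answer: A must be row-stochastic, i.e. each row sums to 1. A minimal sketch in Python; the page names and all probabilities below are invented for illustration:

```python
# A Markov model as defined on the slides: Q (states), p (init), A (transitions).
# A[i][j] = Pr(next = s_j | current = s_i), so every row of A must sum to 1.
Q = ["home", "products", "checkout"]   # hypothetical web pages
p = [0.7, 0.2, 0.1]                    # likelihood of each site entry point
A = [
    [0.1, 0.7, 0.2],   # transitions out of "home"
    [0.3, 0.4, 0.3],   # transitions out of "products"
    [0.5, 0.4, 0.1],   # transitions out of "checkout"
]

def is_row_stochastic(M, tol=1e-9):
    """True iff each row sums to 1 (within floating-point tolerance)."""
    return all(abs(sum(row) - 1.0) < tol for row in M)

assert is_row_stochastic(A)
```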
How Do We Learn A?
Count observed transitions in data; apply smoothing for transitions never observed.
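One simple way to learn A, with the smoothing the slide alludes to, is to count transitions and add a Laplace pseudo-count so unseen transitions keep a small nonzero probability. A sketch; the function name and the toy sequences are mine:

```python
from collections import defaultdict

def learn_transitions(sequences, states, alpha=1.0):
    """Estimate A from observed state sequences by counting transitions.
    alpha is a Laplace (add-alpha) smoothing pseudo-count, so transitions
    never seen in the data still get a small nonzero probability."""
    counts = {s: defaultdict(float) for s in states}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    A_hat = {}
    for s in states:
        total = sum(counts[s].values()) + alpha * len(states)
        A_hat[s] = {t: (counts[s][t] + alpha) / total for t in states}
    return A_hat

# Toy data: two short visit sequences over states "a" and "b".
A_hat = learn_transitions([["a", "b", "b"], ["a", "b", "a"]], ["a", "b"])
```

Each row of the smoothed estimate still sums to 1, and the transition a→b (seen twice) gets more mass than a→a (never seen).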
Defn: Hidden Markov Model
Q: set of states
p: init prob distribution
A: transition probability distribution
Add an observation model:
O: set of possible observations (with an observation probability distribution Pr(obs | state))
Ex: Speech Recognition — ground it (e.g., hidden states = words or phonemes; observations = features of the acoustic signal)
What can we do with HMMs?
State estimation (filtering, monitoring): compute the belief state given evidence
Prediction: predict the posterior distribution over a future state
Smoothing (hindsight): predict the posterior distribution over a past state
Most likely explanation: given an observation sequence, predict the most likely sequence of states
Planning: if you add rewards (next class)
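The first task, filtering, can be sketched with the standard forward recursion. All numbers below are toy values; `hmm_filter`, and the convention that `O[i][y]` is Pr(observation y | state i), are my assumptions for illustration:

```python
def hmm_filter(p, A, O, observations):
    """Filtering: belief over the current state given all evidence so far.
    p[i]: init prob, A[i][j]: transition prob, O[i][y]: Pr(obs y | state i)."""
    n = len(p)

    def normalize(b):
        z = sum(b)
        return [x / z for x in b]

    # Incorporate the first observation into the prior.
    belief = normalize([p[i] * O[i][observations[0]] for i in range(n)])
    for y in observations[1:]:
        # Predict one step forward, then condition on the new observation.
        predicted = [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]
        belief = normalize([predicted[j] * O[j][y] for j in range(n)])
    return belief

# Toy 2-state model with a noisy sensor, in the spirit of the slides.
belief = hmm_filter(
    p=[0.5, 0.5],
    A=[[0.7, 0.3], [0.3, 0.7]],
    O=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
# After two readings of 0, most belief mass sits on state 0.
```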
HMMs → DBNs
Recursive estimation:
Filtering & prediction (forward pass): O(n) time, O(1) space
Smoothing (forward-backward): O(n) time, O(1) space
Finding the most likely sequence (Viterbi): O(n) time, O(n) space
(Here n is the sequence length; N, the number of states, is exponential in the number of features.)
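The most-likely-sequence task uses the Viterbi algorithm: the forward pass with max in place of sum, plus backpointers and a trace-back. A sketch with the same toy 2-state model as above (all numbers invented for illustration):

```python
def viterbi(p, A, O, observations):
    """Most likely state sequence: max-product forward pass with
    backpointers, followed by a trace-back."""
    n = len(p)
    delta = [p[i] * O[i][observations[0]] for i in range(n)]
    back = []
    for y in observations[1:]:
        ptrs, new = [], []
        for j in range(n):
            # Best predecessor for landing in state j at this step.
            best = max(range(n), key=lambda i: delta[i] * A[i][j])
            ptrs.append(best)
            new.append(delta[best] * A[best][j] * O[j][y])
        back.append(ptrs)
        delta = new
    # Trace back from the best final state.
    state = max(range(n), key=lambda j: delta[j])
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    return path[::-1]

path = viterbi([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]],
               [[0.9, 0.1], [0.2, 0.8]], [0, 0, 1, 1])
# → [0, 0, 1, 1]
```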
Factoring Q
Represent Q simply as a set of states?
Is there internal structure?
Consider a robot domain. What is the state space?
A Factored Domain
Six Boolean variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
How many states? 2^6 = 64
Representing p Compactly
Q: set of states (s0, s1, …); p: init prob distribution. How do we represent this efficiently?
With a Bayes net (of course!) [over the state variables, e.g. hrc, u, r, w]
Representing A Compactly
Q: set of states; p: init prob distribution; A: transition probabilities
[Matrix with a row and a column for each of the 64 states; entry p12 is the probability of transitioning from s1 to s2.]
How big is the matrix version of A? 64 × 64 = 4096
2-Layer Dynamic Bayesian Network
[Two-layer network: huc, hrc, w, u, r, o at time T and at T+1, with per-variable CPTs of sizes such as 2, 4, 8, 16.]
Total values required to represent the transition probability table = 36, vs. 4096 required for a complete state-to-state probability table.
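The 36-vs-4096 comparison is just parameter counting. The exact parent sets are not fully specified here, so the parent counts below are a hypothetical assignment chosen to reproduce the slide's total of 36:

```python
n_vars = 6                      # huc, hrc, w, u, r, o
n_states = 2 ** n_vars          # 64 joint states
flat = n_states * n_states      # complete state-to-state transition table

# Hypothetical parent counts for each T+1 variable's CPT (one probability
# per parent assignment, boolean child); chosen to total 36 as on the slide.
parents = [1, 1, 2, 2, 3, 4]
factored = sum(2 ** k for k in parents)

print(flat, factored)           # 4096 36
```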
ADD: make concrete by showing CPT values. (This means we must have actions by now.)
Dynamic Bayesian Network
[Two-layer network over huc, hrc, w, u, r, o at times T and T+1.]
Defined formally as:
* Set of random vars
* BN for the initial state
* 2-layer DBN for transitions
Also known as a Factored Markov Model
This is Really a “Schema”
[The 2-layer network over huc, hrc, w, u, r, o at T and T+1 is “unrolled” across time.]
Semantics: A DBN = A Normal Bayesian Network
[The schema unrolled into time slices 1, 2, 3 over huc, hrc, w, u, r, o.]
Add: Actions must come before inference; observations can come later. Or: add a concrete example of why this would help. Talk about noisy sensors (early on, assume some vars are observed perfectly).
Inference
Expand the DBN (schema) into a BN
Use inference as shown last week. Problem?
Particle Filters
Multiple Actions
Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
Actions: buy_coffee, deliver_coffee, get_umbrella, move
We need an extra variable (alternatively, a separate 2-layer DBN per action)
Actions in DBN
[Two-layer network over huc, hrc, w, u, r, o at T and T+1, plus an action variable a.]
Simplifying Assumptions
Environment: Static vs. Dynamic; Deterministic vs. Stochastic; Fully Observable vs. Partially Observable
Percepts: Perfect vs. Noisy
Actions: Instantaneous vs. Durative; Full vs. Partial satisfaction
What action next?
STRIPS Action Schemata
Instead of defining ground actions pickup-A, pickup-B, …, define a schema:

(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
  :effect (and (not (clear ?ob1)) (not (on-table ?ob1))
               (not (arm-empty)) (holding ?ob1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
Deterministic “STRIPS”?
pick_up_A: preconditions arm_empty, clear_A, on_table_A; effects - arm_empty, - on_table_A, + holding_A
Probabilistic STRIPS?
pick_up_A: preconditions arm_empty, clear_A, on_table_A; effects (P: 0.9): - arm_empty, - on_table_A, + holding_A
NEED!!! A concrete DBN corresponding to STRIPS. Ideally change the example to the robot domain; use robot-coffee examples of STRIPS.
State Estimation
[Unrolled DBN over o, r, huc, hrc, u, w; actions Move, BuyC; query states marked ?]
Noisy Observations
True state vars (unobserved): huc, hrc, w, u, r, o at times T and T+1
Yo: noisy sensor reading of o
a: action var, known with certainty
Sensor model: Pr(Yo = T | o = T) = 0.9, Pr(Yo = T | o = F) = 0.2
State Estimation
[Unrolled DBN over o, r, huc, hrc, u, w; actions Move, BuyC; evidence Yo = T, …; query states marked ?]
Maybe We Don’t Know the Action!
True state vars (unobserved): huc, hrc, w, u, r, o at T and T+1
Yo: noisy sensor reading of o
a: action var (another agent’s action, now also unobserved)
Sensor model: Pr(Yo = T | o = T) = 0.9, Pr(Yo = T | o = F) = 0.2
State Estimation
[Unrolled DBN over o, r, huc, hrc, u, w; actions now unknown (?); evidence So = T, …; query states marked ?]
Particle Filters
Create 1000s of “particles”
Each one has a “copy” of each random var, with concrete values
The distribution is approximated by the whole set
An approximate technique, but fast!
Simulate: each time step, iterate per particle; use the 2-layer DBN + a random # generator to predict the next value of each random var
Resample!
Resampling
Simulate: each time step, iterate per particle; use the 2-layer DBN + random # generator to predict the next value of each random var
Resample: given observations (if any) at the new time point, calculate the likelihood of each particle, then create a new population of particles by sampling from the old population proportionally to likelihood
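The simulate-and-resample loop above can be sketched as one generic step. Here `transition_sample` and `likelihood` stand in for the 2-layer DBN and the sensor model, and the toy usage at the end (perfect sensor, frozen states) is my own illustration:

```python
import random

def particle_filter_step(particles, transition_sample, likelihood, obs, rng=random):
    """One simulate-and-resample step.
    1. Simulate: push each particle through the transition model.
    2. Weight: score each particle by the observation likelihood.
    3. Resample: draw a new population proportionally to the weights."""
    moved = [transition_sample(p, rng) for p in particles]
    weights = [likelihood(p, obs) for p in moved]
    if sum(weights) == 0:          # every particle contradicts the evidence
        return moved
    return rng.choices(moved, weights=weights, k=len(moved))

# Toy usage: states never change and the sensor is perfect, so after one
# step only particles agreeing with the observation survive resampling.
survivors = particle_filter_step(
    [0, 1, 0, 1, 0, 1],
    transition_sample=lambda p, r: p,
    likelihood=lambda p, obs: 1.0 if p == obs else 0.0,
    obs=0,
)
```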
Add the Dieter example, or the KDD Cup Gazelle example, and the Thad Starner example.