Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Theocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi.

Outline
– Define HHMMs
  – Flattening HHMMs
– Define H-POMDPs
  – Flattening H-POMDPs
– Approximate H-POMDPs with DBNs
– Inference and learning in H-POMDPs

Introduction
H-POMDPs represent the state space at multiple levels of abstraction, and therefore:
– Scale much better to large environments
– Simplify planning: abstract states behave more deterministically
– Simplify learning: the number of free parameters is reduced

Hierarchical HMMs
A generalization of HMMs for modeling domains with hierarchical structure.
– Application: NLP
– Concrete states emit a single observation
– Abstract states emit strings of observations
– The strings emitted by abstract states are governed by sub-HMMs

Example
An HHMM representing the regular expression a(xy)+b | c(xy)+d.
When a sub-HHMM finishes, control returns to wherever it was called from.
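To make the call-and-return behavior concrete, here is a minimal generative sketch of this example in Python. The function names and the geometric repeat probability p_repeat are my own assumptions, not taken from the paper:

```python
import random

# A minimal, hypothetical sampler for the example HHMM: each top-level branch
# emits its prefix, calls a sub-HMM that emits "xy" one or more times, and
# emits its suffix once the sub-HMM returns control.

def sample_sub_hmm(p_repeat=0.5):
    """Sub-HMM that emits 'xy' one or more times, then enters its exit state."""
    out = []
    while True:
        out += ["x", "y"]
        if random.random() > p_repeat:   # enter the exit state
            return out                   # control returns to the caller

def sample_hhmm():
    prefix, suffix = random.choice([("a", "b"), ("c", "d")])
    return [prefix] + sample_sub_hmm() + [suffix]

random.seed(0)
print(["".join(sample_hhmm()) for _ in range(3)])   # e.g. 'axyb', 'cxyxyd', ...
```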

HHMM to HMM
Create a state for every leaf in the HHMM.
Flat transition probability = sum over the probabilities of all corresponding paths in the HHMM.
Disadvantages:
– Flattening loses modularity
– Learning requires more samples
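A small sketch of the path-summing rule, under an assumed 2-level parameterization (the tables T_sub, p_exit, T_abs, and pi_sub are hypothetical): a flat leaf-to-leaf transition either stays inside the current sub-HMM, or exits, moves at the abstract level, and re-enters.

```python
import numpy as np

# Hypothetical 2-level HHMM: two abstract states, each with a 2-state sub-HMM.
# Within sub-HMM k, the rows of T_sub[k] plus the exit probability sum to 1.
T_sub = [0.9 * np.array([[0.2, 0.8], [0.6, 0.4]]),
         0.9 * np.array([[0.5, 0.5], [0.5, 0.5]])]
p_exit = [np.array([0.1, 0.1]), np.array([0.1, 0.1])]
T_abs = np.array([[0.3, 0.7], [0.7, 0.3]])              # abstract-level transitions
pi_sub = [np.array([1.0, 0.0]), np.array([0.5, 0.5])]   # sub-HMM entry distributions

def flat_transition(k, i, l, j):
    """P(leaf (l, j) at t | leaf (k, i) at t-1) in the flattened HMM:
    the no-exit path plus the exit -> abstract move -> entry path."""
    stay = T_sub[k][i, j] if k == l else 0.0
    leave = p_exit[k][i] * T_abs[k, l] * pi_sub[l][j]
    return stay + leave

# Sanity check: outgoing flat probabilities sum to 1.
print(sum(flat_transition(0, 0, l, j) for l in range(2) for j in range(2)))  # 1.0
```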

Representing HHMMs as DBNs
– Q_t^d: the state at level d at time t
– F_t^d = 1 if the HMM at level d has finished at time t
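A sketch of the key CPD in this encoding, following the standard Murphy-Paskin construction but simplified (it omits the case where level d itself finishes at the same step); the function and variable names are mine:

```python
import numpy as np

def q_node_cpd(q_prev, child_finished, A_d):
    """P(Q_t^d = . | Q_{t-1}^d = q_prev, F_{t-1}^{d+1} = child_finished).

    A_d is the horizontal transition matrix of the level-d HMM.
    """
    if not child_finished:
        dist = np.zeros(A_d.shape[0])
        dist[q_prev] = 1.0     # child still running: stay put, deterministically
        return dist
    return A_d[q_prev]         # child finished: ordinary horizontal transition

print(q_node_cpd(0, False, np.array([[0.3, 0.7], [0.6, 0.4]])))  # [1. 0.]
print(q_node_cpd(0, True,  np.array([[0.3, 0.7], [0.6, 0.4]])))  # [0.3 0.7]
```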

H-POMDPs
HHMMs augmented with inputs (actions) and a reward function. Problems:
– Planning: find a mapping from belief states to actions
– Filtering: compute the belief state P(X_t | y_{1:t}) online
– Smoothing: compute P(X_t | y_{1:T}) offline
– Learning: find the MLE of the model parameters

H-POMDP for Robot Navigation
Flat model: robot position X_t (1..10); observation Y_t (4 bits).
Hierarchical model: abstract state X_t^1 (1..4); concrete state X_t^2 (1..3); observation Y_t (4 bits).
This paper ignores the problem of how to choose the actions.

State Transition Diagram for a 2-level H-POMDP
[Figure: state transition diagram, with a sample path through it]

State Transition Diagram for the Corridor Environment
[Figure: diagram labeling abstract states, entry states, exit states, and concrete states]

Flattening H-POMDPs
Advantages of an H-POMDP over the corresponding flat POMDP:
– Learning is easier: sub-models can be learned separately
– Planning is easier: reason in terms of “macro” actions

Dynamic Bayesian Networks
[Figure: state-based POMDP vs. factored DBN POMDP, comparing the number of parameters]

Representing H-POMDPs as DBNs
[Figure sequence, built up across five slides: the state-based H-POMDP next to the factored DBN H-POMDP for an EAST/WEST corridor]

H-POMDPs as DBNs
– X_t^1: abstract location
– Orientation node
– X_t^2: concrete location
– E_t: exit node (5 values: no-exit, s-exit, n-exit, l-exit, r-exit)
– Y_t: observation
– A_t: action node

Transition Model
Abstract horizontal transition matrix:
P(X_t^1 = j | X_{t-1}^1 = i, E_{t-1} = e, A_{t-1} = a) = δ(i, j) if e = no-exit, and T^1(i, a, j) otherwise.

Transition Model (cont.)
Concrete transitions:
P(X_t^2 = j | X_t^1 = k, X_{t-1}^2 = i, E_{t-1} = e, A_{t-1} = a) = T^2(k, i, a, j) if e = no-exit (concrete horizontal transition matrix), and V^2(k, a, j) otherwise (concrete vertical entry vector).
Exit node:
P(E_t = e | X_t^1 = k, X_t^2 = i, A_t = a) = probability of entering exit state e.
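The following sketch samples one time step of this transition model. The table names (T1, T2, V2, P_exit), their shapes, and the random parameters are hypothetical stand-ins for the matrices defined above:

```python
import numpy as np

rng = np.random.default_rng(0)
nA, nC, nU, nE = 4, 3, 2, 5   # abstract states, concrete states, actions, exit values

T1 = rng.dirichlet(np.ones(nA), size=(nA, nU))           # abstract horizontal: T1[i, a, j]
T2 = rng.dirichlet(np.ones(nC), size=(nA, nC, nU))       # concrete horizontal: T2[k, i, a, j]
V2 = rng.dirichlet(np.ones(nC), size=(nA, nU))           # vertical entry: V2[k, a, j]
P_exit = rng.dirichlet(np.ones(nE), size=(nA, nC, nU))   # exit node; value 0 = no-exit

def step(x1, x2, e, a):
    """Sample (x1', x2', e') given the previous state, exit value, and action."""
    if e == 0:                                    # no-exit: stay inside the sub-model
        x1_new = x1
        x2_new = rng.choice(nC, p=T2[x1, x2, a])
    else:                                         # exited: abstract move, then vertical entry
        x1_new = rng.choice(nA, p=T1[x1, a])
        x2_new = rng.choice(nC, p=V2[x1_new, a])
    e_new = rng.choice(nE, p=P_exit[x1_new, x2_new, a])
    return x1_new, x2_new, e_new

print(step(x1=0, x2=0, e=0, a=1))
```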

Observation Model
The observation is whether the robot sees a wall or an opening on each of its 4 sides.
Naïve Bayes assumption: P(Y_t | X_t) = Π_{i=1..4} P(Y_t^i | X_t), where the global coordinate frame is first mapped to the robot's local coordinate frame.
We then learn the appearance of each cell in all four directions.
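A sketch of this naïve Bayes likelihood under assumed conventions (the appearance table, the front/right/back/left side ordering, and the heading encoding are my assumptions):

```python
import numpy as np

# appearance[cell, world_dir] = P(wall on that world-frame side of the cell),
# with world directions ordered N, E, S, W.
appearance = np.array([[0.9, 0.1, 0.9, 0.1],    # a corridor cell: walls N/S, open E/W
                       [0.9, 0.9, 0.9, 0.1]])   # a dead-end cell

def obs_likelihood(y_local, cell, heading):
    """P(y | cell, heading) for a 4-bit local observation.

    y_local[i] = 1 if a wall is seen on the robot's side i (front, right,
    back, left); heading rotates the robot frame into the world frame.
    """
    p = 1.0
    for side in range(4):
        world_dir = (heading + side) % 4          # local -> world coordinate frame
        p_wall = appearance[cell, world_dir]
        p *= p_wall if y_local[side] else (1.0 - p_wall)   # independent sides
    return p

print(obs_likelihood([1, 0, 1, 0], cell=0, heading=0))  # facing north in a corridor
```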

Example
[Figure]

Inference
Online filtering:
– The controller takes as input the MLE of the abstract and concrete states
Offline smoothing:
– Exact: O(D K^{1.5D} T), where D = # of levels in the hierarchy and K = # of states per level
– 1.5D is the size of the largest clique in the DBN: the state nodes at t-1 plus half of the state nodes at t
– Approximate (belief propagation): O(D K T)
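For contrast, here is the standard forward recursion on the flattened joint state space, assuming a fixed action so a single S x S transition matrix applies; the factored DBN representation exists precisely to avoid enumerating these S joint states:

```python
import numpy as np

def forward_filter(prior, trans, liks):
    """prior: (S,); trans: (S, S) row-stochastic; liks: T x (S,) likelihoods.

    Returns the filtered beliefs P(X_t | y_{1:t}) for t = 1..T.
    """
    belief = prior * liks[0]
    belief /= belief.sum()
    out = [belief]
    for lik in liks[1:]:
        belief = (trans.T @ belief) * lik   # predict, then correct
        belief = belief / belief.sum()
        out.append(belief)
    return np.array(out)

# Tiny usage example with 2 joint states and 3 observations:
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
liks = np.array([[0.7, 0.1], [0.7, 0.1], [0.1, 0.7]])
print(forward_filter(np.array([0.5, 0.5]), trans, liks))
```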

Learning
Maximum likelihood parameter estimation using EM.
– E step: compute the expected sufficient statistics, i.e. smoothed marginals such as P(X_{t-1}, X_t, E_{t-1} | y_{1:T}, a_{1:T})
– M step: normalize the matrix of expected counts to obtain the new parameters

Learning (cont.)
Each table is re-estimated from its normalized expected counts:
– Concrete horizontal transition matrix: expected # of i → j transitions within each abstract state, per action
– Exit probabilities: expected # of times each exit state e is entered, per (abstract, concrete) state and action
– Vertical transition vector: expected # of entries into each concrete state when its abstract state is entered, per action
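The M-step pattern is the same for every table: divide the expected counts by their totals. A generic sketch, with shapes matching the hypothetical T2 table from the transition-model sketch:

```python
import numpy as np

def normalize_counts(counts, axis=-1, eps=1e-12):
    """Turn a matrix of expected counts into conditional probabilities along `axis`."""
    totals = counts.sum(axis=axis, keepdims=True)
    return counts / np.maximum(totals, eps)

# e.g. expected i -> j counts within abstract state k under action a,
# shaped like a hypothetical T2[k, i, a, j] table:
counts_T2 = np.random.rand(4, 3, 2, 3)     # placeholder expected counts
T2_new = normalize_counts(counts_T2)       # rows over j now sum to 1
print(T2_new.sum(axis=-1).round(6).min())  # 1.0
```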

Estimating Observation Model
Map local (robot-centered) observations into world-centered coordinates, then estimate the probability of observing y in each cell when facing north (and similarly for the other directions).
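A sketch of this estimation step, reusing the conventions assumed in the observation-model sketch and additionally assuming the heading is known (in the paper it is part of the hidden state):

```python
import numpy as np

n_cells = 2
wall_counts = np.zeros((n_cells, 4))   # expected "saw a wall" counts per world side
visits = np.zeros((n_cells, 4))        # expected visit counts

def world_frame(y_local, heading):
    """Rotate a local (front, right, back, left) observation to world N/E/S/W."""
    y_world = np.zeros(4)
    for side in range(4):
        y_world[(heading + side) % 4] = y_local[side]
    return y_world

def accumulate(y_local, heading, cell_posterior):
    """Add smoothed-posterior-weighted counts for one observation."""
    y_world = world_frame(np.asarray(y_local, dtype=float), heading)
    for cell in range(n_cells):
        wall_counts[cell] += cell_posterior[cell] * y_world
        visits[cell] += cell_posterior[cell]

accumulate([1, 0, 1, 0], heading=1, cell_posterior=[0.8, 0.2])
appearance_hat = wall_counts / np.maximum(visits, 1e-12)  # estimate of P(wall | cell, dir)
print(appearance_hat)
```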

Hierarchical localizes better
[Figure: before training, the factored DBN H-POMDP and the state-based H-POMDP localize better than the flat state-based POMDP]

Conclusions
Representing H-POMDPs as DBNs makes it possible to learn large models with less data.
Difference from SLAM: SLAM is harder to generalize.

Complexity of Inference
[Figure: number of states in the state-based H-POMDP vs. the factored DBN H-POMDP for the EAST/WEST corridor example]