
1 Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Theocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi

2 Outline Define H-HMM –Flattening H-HMM Define H-POMDP –Flattening H-POMDP Approximate H-POMDP with DBN Inference and Learning in H-POMDP

3 Introduction H-POMDPs represent the state space at multiple levels of abstraction –Scale much better to large environments –Simplify planning: abstract states are more deterministic –Simplify learning: the number of free parameters is reduced

4 Hierarchical HMMs A generalization of the HMM for modeling domains with hierarchical structure –Application: NLP Concrete states emit a single observation; abstract states emit strings of observations The strings emitted by an abstract state are governed by its sub-HMM (see the sketch below)
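
To make "abstract states emit strings governed by sub-HMMs" concrete, here is a minimal generative sketch of a two-level HHMM in Python. All names and numbers (ABS_TRANS, ENTRY, SUB_TRANS, EXIT_PROB, EMIT) are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-level HHMM with 2 abstract states and 2 concrete states each.
ABS_TRANS = np.array([[0.7, 0.3],       # abstract-level transition matrix
                      [0.4, 0.6]])
ENTRY     = np.array([[0.9, 0.1],       # entry distribution into each sub-HMM
                      [0.2, 0.8]])
SUB_TRANS = np.array([[[0.6, 0.4],      # within-sub-HMM transitions (given no exit)
                       [0.5, 0.5]],
                      [[0.3, 0.7],
                       [0.8, 0.2]]])
EXIT_PROB = np.array([[0.2, 0.3],       # prob. that a concrete state ends its sub-HMM
                      [0.1, 0.4]])
EMIT      = ["ax", "ay", "bx", "by"]    # one symbol per (abstract, concrete) pair

def sample_string(abstract, max_len=20):
    """An abstract state 'emits' a string: run its sub-HMM until it exits."""
    out, c = [], rng.choice(2, p=ENTRY[abstract])
    for _ in range(max_len):
        out.append(EMIT[2 * abstract + c])          # concrete state emits one symbol
        if rng.random() < EXIT_PROB[abstract, c]:   # sub-HMM finished -> return control
            break
        c = rng.choice(2, p=SUB_TRANS[abstract, c]) # otherwise stay inside the sub-HMM
    return out

# Top level: a sequence of abstract states, each contributing one emitted sub-string.
a, sequence = 0, []
for _ in range(3):
    sequence += sample_string(a)
    a = rng.choice(2, p=ABS_TRANS[a])
print(sequence)
```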

5 Example HHMM representing a(xy)+b | c(xy)+d When a sub-HHMM finishes, control returns to wherever it was called from

6 HHMM to HMM Create a state for every leaf in the HHMM

7 HHMM to HMM Create a state for every leaf in the HHMM Flat transition probability = sum of P(path) over all HHMM paths between the corresponding leaves Disadvantages: flattening loses modularity; learning requires more samples
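
A hedged sketch of what "flat transition probability = sum over HHMM paths" means for a two-level model: between any two leaves there are exactly two kinds of paths (stay inside the current sub-HMM, or exit and re-enter), and the flat matrix sums their probabilities. The parameter names and sizes are assumptions for illustration.

```python
import numpy as np

# Illustrative two-level HHMM: 2 abstract states, 2 concrete states per sub-HMM.
ABS_TRANS = np.array([[0.7, 0.3], [0.4, 0.6]])   # abstract transition (taken on exit)
ENTRY     = np.array([[0.9, 0.1], [0.2, 0.8]])   # entry distribution of each sub-HMM
SUB_TRANS = np.array([[[0.6, 0.4], [0.5, 0.5]],  # within-sub-HMM transition (no exit)
                      [[0.3, 0.7], [0.8, 0.2]]])
EXIT_PROB = np.array([[0.2, 0.3], [0.1, 0.4]])   # prob. a concrete state exits its sub-HMM

NA, NC = 2, 2                     # abstract / concrete state counts
flat = np.zeros((NA * NC, NA * NC))

for a in range(NA):
    for i in range(NC):
        for a2 in range(NA):
            for j in range(NC):
                # Path 1: stay inside the current sub-HMM (only possible if a2 == a).
                stay = (a2 == a) * (1 - EXIT_PROB[a, i]) * SUB_TRANS[a, i, j]
                # Path 2: exit the sub-HMM, move at the abstract level, re-enter.
                leave = EXIT_PROB[a, i] * ABS_TRANS[a, a2] * ENTRY[a2, j]
                # Flat probability = sum over all HHMM paths between the two leaves.
                flat[a * NC + i, a2 * NC + j] = stay + leave

print(flat.sum(axis=1))   # each row sums to 1: a proper flat HMM transition matrix
```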

8 Representing HHMMs as DBNs X_t^d: state at level d; F_t^d = 1 if the HMM at level d has finished
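
A sketch of these DBN semantics for two levels: each time slice carries a state node per level plus a binary "finished" indicator F, and the top level may only change when the bottom level has just finished. The parameter tables are the same illustrative placeholders as above, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative two-level parameters (assumed, not from the paper).
ABS_TRANS = np.array([[0.7, 0.3], [0.4, 0.6]])
ENTRY     = np.array([[0.9, 0.1], [0.2, 0.8]])
SUB_TRANS = np.array([[[0.6, 0.4], [0.5, 0.5]],
                      [[0.3, 0.7], [0.8, 0.2]]])
EXIT_PROB = np.array([[0.2, 0.3], [0.1, 0.4]])

def dbn_slice(x1_prev, x2_prev, f_prev):
    """Sample one DBN time slice (X^1_t, X^2_t, F_t) given the previous slice.

    F_{t-1} = True means the bottom-level HMM finished at t-1, so the top level
    is allowed to move and the bottom level re-enters; otherwise the top level
    is frozen and the bottom level transitions horizontally.
    """
    if f_prev:
        x1 = rng.choice(2, p=ABS_TRANS[x1_prev])     # top level transitions
        x2 = rng.choice(2, p=ENTRY[x1])              # bottom level re-enters
    else:
        x1 = x1_prev                                 # top level stays put
        x2 = rng.choice(2, p=SUB_TRANS[x1, x2_prev])
    f = rng.random() < EXIT_PROB[x1, x2]             # does the sub-HMM finish now?
    return x1, x2, f

state = (0, 0, False)
for t in range(5):
    state = dbn_slice(*state)
    print(t, state)
```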

9 H-POMDPs HHMMs with inputs (actions) and a reward function Problems: –Planning: find a mapping from belief states to actions –Filtering: compute the belief state P(X_t | y_{1:t}) online –Smoothing: compute P(X_t | y_{1:T}) offline –Learning: find the MLE of the model parameters

10 H-POMDP for Robot Navigation Flat model vs. hierarchical model: –Abstract state: X_t^1 ∈ {1..4} –Concrete state: X_t^2 ∈ {1..3} –Observation: Y_t (4 bits) –Robot position in the flat model: X_t ∈ {1..10} In this paper the problem of how to choose the actions is ignored

11 State Transition Diagram for a two-level H-POMDP [Figure: state transition diagram with a sample path]

12 State Transition Diagram for a Corridor Environment [Figure: abstract states, entry states, exit states, and concrete states]

13 Flattening H-POMDPs Advantages of the H-POMDP over the corresponding flat POMDP: –Learning is easier: learn sub-models –Planning is easier: reason in terms of “macro” actions

14 Dynamic Bayesian Networks [Figure: state POMDP vs. factored DBN POMDP, comparing the number of parameters]

15 Representing H-POMDPs as DBNs [Figure: state H-POMDP vs. factored DBN H-POMDP with east/west transitions; built up incrementally and repeated on slides 16-19]

20 H-POMDPs as DBNs Nodes: X_t^1: abstract location; orientation node; X_t^2: concrete location; E_t: exit node (5 values, representing no-exit, s-, n-, l-, r-exit); Y_t: observation; A_t: action node
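
A small sketch of the value spaces of these DBN nodes, using the deck's X^1/X^2/E/Y/A naming; the action set and the encodings of orientation and the 4 observation bits are assumptions for illustration only.

```python
from enum import Enum
from typing import NamedTuple, Tuple

class Exit(Enum):
    """The exit node E_t takes 5 values, as on the slide."""
    NO_EXIT = 0
    SOUTH   = 1
    NORTH   = 2
    LEFT    = 3
    RIGHT   = 4

class Action(Enum):          # assumed action set for a corridor robot
    FORWARD    = 0
    TURN_LEFT  = 1
    TURN_RIGHT = 2

class Slice(NamedTuple):
    """One time slice of the H-POMDP DBN."""
    x1: int                       # abstract location (e.g. which corridor)
    x2: int                       # concrete location (cell within the corridor)
    heading: int                  # orientation node: 0=N, 1=E, 2=S, 3=W (assumed encoding)
    e: Exit                       # exit node
    y: Tuple[int, int, int, int]  # observation: wall/opening bit on each of 4 sides
    a: Action                     # action node

s = Slice(x1=0, x2=2, heading=1, e=Exit.NO_EXIT, y=(1, 0, 1, 1), a=Action.FORWARD)
print(s)
```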

21 Transition Model (abstract level) If e = no-exit, the abstract state stays put: P(X_t^1 = j | X_{t-1}^1 = i, E_{t-1} = e, A_t = a) = δ(i, j); otherwise it follows the abstract horizontal transition matrix, conditioned on the exit direction e and the action a
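
A sketch of this conditional distribution: a delta function when e = no-exit, and an assumed abstract horizontal transition table T1 (indexed by exit type and action) otherwise. The table contents are random placeholders, not learned values.

```python
import numpy as np

N_ABS, N_EXIT, N_ACT = 4, 5, 3     # assumed sizes (4 abstract states as in the corridor example)
NO_EXIT = 0                        # value 0 of the exit node means "no exit"

# Assumed abstract horizontal transition matrix, indexed by (exit type, action):
# T1[e, a, i, j] = P(next abstract state j | current i, exit e, action a).
T1 = np.random.default_rng(2).dirichlet(np.ones(N_ABS), size=(N_EXIT, N_ACT, N_ABS))

def abstract_transition(i, e, a):
    """P(X^1_t = . | X^1_{t-1} = i, E_{t-1} = e, A_t = a)."""
    if e == NO_EXIT:
        dist = np.zeros(N_ABS)
        dist[i] = 1.0              # no exit: the abstract state cannot change
        return dist
    return T1[e, a, i]             # exit: abstract horizontal transition matrix

print(abstract_transition(1, NO_EXIT, 0))   # delta on state 1
print(abstract_transition(1, 3, 0))         # a proper distribution over the 4 abstract states
```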

22 Transition Model (concrete level) If e = no-exit, the concrete state follows the concrete horizontal transition matrix within the current sub-model; otherwise it enters the new abstract state's sub-model according to the concrete vertical entry vector The exit node gives the probability of entering exit state e from the current abstract and concrete state (given the action)
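
A matching sketch of the concrete-level and exit-node distributions; the tables T2 (concrete horizontal), V (vertical entry), and EXIT are random placeholders with assumed shapes, only the conditioning structure follows the slide.

```python
import numpy as np

rng = np.random.default_rng(3)
N_ABS, N_CON, N_EXIT, N_ACT = 4, 3, 5, 3   # assumed sizes (match the corridor example)
NO_EXIT = 0

# Assumed parameter tables:
T2   = rng.dirichlet(np.ones(N_CON), size=(N_ABS, N_ACT, N_CON))   # concrete horizontal transitions
V    = rng.dirichlet(np.ones(N_CON), size=(N_ABS, N_EXIT))         # concrete vertical entry vectors
EXIT = rng.dirichlet(np.ones(N_EXIT), size=(N_ABS, N_CON, N_ACT))  # exit-node distribution

def concrete_transition(j_abs, i_con, e_prev, a):
    """P(X^2_t = . | X^1_t = j_abs, X^2_{t-1} = i_con, E_{t-1} = e_prev, A_t = a)."""
    if e_prev == NO_EXIT:
        return T2[j_abs, a, i_con]     # move horizontally inside the current sub-model
    return V[j_abs, e_prev]            # entered a new abstract state: use the entry vector

def exit_distribution(j_abs, j_con, a):
    """P(E_t = . | X^1_t = j_abs, X^2_t = j_con, A_t = a): each exit type, or no exit."""
    return EXIT[j_abs, j_con, a]

print(concrete_transition(0, 1, NO_EXIT, 2))
print(exit_distribution(0, 1, 2))
```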

23 Observation Model Probability of seeing a wall or opening on each of the 4 sides of the robot Naïve Bayes assumption: P(Y_t | X_t) = ∏_{i=1..4} P(Y_t^i | X_t), where Y_t^i is the bit for side i Map the global coordinate frame to the robot's local coordinate frame; then learn the appearance of each cell in all four directions
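
A sketch of the naïve Bayes observation likelihood, including the global-to-local rotation by the robot's heading; the WALL_PROB appearance table and the front/right/back/left bit ordering are assumptions for illustration.

```python
import numpy as np

# Assumed appearance model: for each cell, the probability of seeing a wall on
# each of the 4 world-centered sides N, E, S, W.
WALL_PROB = np.array([[0.9, 0.1, 0.9, 0.8],    # cell 0
                      [0.9, 0.9, 0.9, 0.1],    # cell 1
                      [0.1, 0.8, 0.9, 0.9]])   # cell 2

def observation_likelihood(cell, heading, y_local):
    """Naive Bayes likelihood P(Y_t | X_t) = product over the 4 sides.

    y_local is the 4-bit observation in the robot's frame (front, right, back, left);
    heading rotates it into the world frame (0=N, 1=E, 2=S, 3=W) so it can be compared
    with the cell's world-centered appearance probabilities.
    """
    lik = 1.0
    for side_local, bit in enumerate(y_local):
        side_world = (heading + side_local) % 4          # robot frame -> world frame
        p_wall = WALL_PROB[cell, side_world]
        lik *= p_wall if bit == 1 else (1.0 - p_wall)    # independent Bernoulli per side
    return lik

# Robot in cell 1, facing East: wall in front, wall to the right, opening behind, wall to the left.
print(observation_likelihood(cell=1, heading=1, y_local=(1, 1, 0, 1)))
```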

24 Example

25 Inference Online filtering: –The controller takes as input the MLE of the abstract and concrete states Offline smoothing: –O(D K^{1.5D} T), where D = # of dimensions (levels), K = # of states in each level –1.5D = size of the largest clique in the DBN = the state nodes at t-1 plus half of the state nodes at t –Approximation (belief propagation): O(D K T)
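
For intuition, online filtering in the flattened model is just the HMM forward recursion (predict, weight by the observation likelihood, renormalize); the controller then reads off the most likely state. The O(D K^{1.5D} T) bound above refers to exact junction-tree smoothing in the DBN, which is not reproduced here. The sizes and stand-in observation likelihoods below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 12                                    # flat state space (e.g. abstract x concrete)

# Assumed flat transition matrix (each row a distribution over next states).
TRANS = rng.dirichlet(np.ones(N), size=N)

def filter_step(belief, obs_lik):
    """One step of online filtering: predict with the transition model,
    weight by the observation likelihood, and renormalize."""
    predicted = belief @ TRANS            # O(N^2) per step in the flat model
    posterior = predicted * obs_lik
    return posterior / posterior.sum()

belief = np.full(N, 1.0 / N)              # uniform prior over states
for t in range(5):
    obs_lik = rng.uniform(0.1, 1.0, size=N)   # stand-in for P(y_t | x) from the obs. model
    belief = filter_step(belief, obs_lik)

print(belief.argmax(), belief.max())      # MLE state fed to the controller
```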

26 Learning Maximum likelihood parameter estimation using EM In the E step, compute the expected counts of the transition, exit, and entry events from the smoothed posteriors In the M step, normalize the matrices of expected counts to obtain the updated parameters

27 Learning (Cont.) Re-estimate each parameter by normalizing its expected counts: the concrete horizontal transition matrix, the exit probabilities, and the vertical transition vector (see the sketch below)
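
A minimal sketch of this M step: each table of expected counts (assumed shapes, random placeholder values standing in for the E-step output) is normalized along its last axis to give the updated conditional probabilities.

```python
import numpy as np

rng = np.random.default_rng(5)
N_ABS, N_CON, N_EXIT, N_ACT = 4, 3, 5, 3    # assumed sizes

# Pretend these were accumulated in the E step (expected, not integer, counts).
counts_T2   = rng.random((N_ABS, N_ACT, N_CON, N_CON))   # concrete horizontal transitions
counts_exit = rng.random((N_ABS, N_CON, N_ACT, N_EXIT))  # exit-node events
counts_V    = rng.random((N_ABS, N_EXIT, N_CON))         # vertical entry events

def normalize(counts):
    """M step: divide each row of expected counts by its total to get probabilities."""
    return counts / counts.sum(axis=-1, keepdims=True)

T2_new   = normalize(counts_T2)     # concrete horizontal transition matrix
exit_new = normalize(counts_exit)   # exit probabilities
V_new    = normalize(counts_V)      # vertical transition (entry) vector

# Every conditional distribution now sums to 1 over its last axis.
assert np.allclose(T2_new.sum(axis=-1), 1.0)
```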

28 Estimating Observation Model Map local observations into world-centered coordinates Estimate, e.g., the probability of observing y while facing north
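
A sketch of how such world-centered appearance statistics could be accumulated: rotate each local observation into the world frame, add soft (expected) counts weighted by the smoothed cell posterior, then normalize. The shapes, orderings, and example numbers are assumptions.

```python
import numpy as np

N_CELLS, N_SIDES = 3, 4      # assumed: 3 cells, 4 world-centered sides (N, E, S, W)

wall_counts  = np.zeros((N_CELLS, N_SIDES))   # expected "saw a wall" counts
total_counts = np.zeros((N_CELLS, N_SIDES))

def accumulate(cell_posterior, heading, y_local):
    """Add one observation's soft counts, mapped from the robot frame to the world frame.

    cell_posterior is the smoothed probability of each cell at this time step, so the
    accumulated quantities are expected counts, as in the EM procedure above.
    """
    for side_local, bit in enumerate(y_local):
        side_world = (heading + side_local) % 4   # local (front,right,back,left) -> world (N,E,S,W)
        total_counts[:, side_world] += cell_posterior
        if bit == 1:
            wall_counts[:, side_world] += cell_posterior

# Two illustrative soft observations.
accumulate(np.array([0.7, 0.2, 0.1]), heading=1, y_local=(1, 1, 0, 1))
accumulate(np.array([0.1, 0.8, 0.1]), heading=0, y_local=(1, 0, 1, 1))

wall_prob = wall_counts / np.maximum(total_counts, 1e-9)   # e.g. P(wall | cell, facing north)
print(wall_prob)
```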

29 Hierarchical models localize better [Figure: localization performance before training for the state POMDP, state H-POMDP, and factored DBN H-POMDP]

30 Conclusions Represent H-POMDPs with DBNs –Learn large models with less data Difference from SLAM: –SLAM is harder to generalize

31 Complexity of Inference [Figure: state H-POMDP vs. factored DBN H-POMDP; comparison of the number of states in each representation]

