By: Messias, Spaan, Lima Presented by: Mike Plasker DMES – Ocean Engineering.


1 By: Messias, Spaan, Lima Presented by: Mike Plasker DMES – Ocean Engineering

2 Introduction Robotic Planning under uncertainty MDP solutions Limited real-world application

3 Assumptions for Multi-Robot teams Communication (Inexpensive, free, or costly) Synchronous and steady state transitions Discretization of environment

4 A Different Approach States and actions discrete (like MDP) Continuous measure of time State transitions regarded as random ‘events’

5 Advantages Non-Markovian effects of discretization minimized Fully reactive to changes Communication only required for ‘events’

6 GSMDPs Generic temporal probability distributions over events Can model concurrent (persistently enabled) events Solvable by discrete-time MDP algorithms by obtaining an equivalent (semi-)Markovian model Avoids negative effects of synchronous alternatives

7 Why GSMDPs for Robotics Cooperative Robotics requires: Operation in inherently continuous environments Uncertainty in actions (and observations) Joint decision making for optimization Reactive

8 Definitions Multiagent GSMDP: a tuple ⟨d, S, X, A, T, F, R, C, h⟩ where d = number of agents; S = state space (contains state factors); X = state factors; A = set of joint actions; T = transition function; F = time model; R = instantaneous reward function; C = cumulative reward rate; h = planning horizon over continuous time
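To make the tuple concrete, here is a minimal Python sketch of a container for it. The field names follow the symbols on the slide; the callable signatures (for example, F as a density over transition time) and the two-robot toy instance are assumptions made for illustration, not definitions taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[int, ...]        # one value per state factor in X
JointAction = Tuple[str, ...]  # one individual action per agent


@dataclass
class MultiagentGSMDP:
    d: int                                   # number of agents
    states: Tuple[State, ...]                # S, the state space
    state_factors: Tuple[str, ...]           # X, the state factors
    joint_actions: Tuple[JointAction, ...]   # A, the set of joint actions
    T: Callable[[State, JointAction, State], float]         # transition function
    F: Callable[[State, JointAction, State, float], float]  # time model (assumed: density of time t)
    R: Callable[[State, JointAction], float]  # instantaneous reward function
    C: Callable[[State, JointAction], float]  # cumulative reward rate
    h: float                                  # planning horizon (continuous time)


PASS = ("pass", "receive")  # hypothetical joint action for two robots

# Hypothetical toy instance: one binary factor saying whether the ball was passed.
model = MultiagentGSMDP(
    d=2,
    states=((0,), (1,)),
    state_factors=("ball_passed",),
    joint_actions=(("wait", "wait"), PASS),
    T=lambda s, a, s2: 1.0 if s2 == ((1,) if a == PASS else s) else 0.0,
    F=lambda s, a, s2, t: math.exp(-t) if t >= 0 else 0.0,  # assumed exponential timing
    R=lambda s, a: 1.0 if (s == (0,) and a == PASS) else 0.0,
    C=lambda s, a: -0.1,  # assumed small cost per unit of time
    h=math.inf,
)
```

Keeping T and F as separate fields mirrors the slide: one describes where the system goes, the other how long it takes to get there.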

9 Definitions Event in a GSMDP: An abstraction over state transitions that share the same properties Persistently enabled events: Events that are enabled at both step ‘t’ and step ‘t+1’, but were not triggered at step ‘t’
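The definition of persistently enabled events can be read as a set operation. The sketch below is one such reading, assuming a single event is triggered per step; the event names are hypothetical.

```python
def persistently_enabled(enabled_t, enabled_t1, triggered_t):
    """Events enabled at step t and at step t+1 that were not triggered at step t."""
    return (set(enabled_t) & set(enabled_t1)) - {triggered_t}


# Hypothetical example: "battery_low" stays enabled while "ball_arrives" fires.
still_running = persistently_enabled(
    enabled_t={"battery_low", "ball_arrives"},
    enabled_t1={"battery_low", "goal_scored"},
    triggered_t="ball_arrives",
)
```

Here only `battery_low` is persistently enabled, so its timer should keep running rather than restart when the state changes.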

10 Common Approach Synchronous action Pre-defined time step Performance Reaction time
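The reaction-time cost of a pre-defined time step can be illustrated with a short calculation. This is a sketch that ignores computation and communication latency; the event times are hypothetical, and the 4 s step is the value quoted for the MDP in the results slide.

```python
import math


def reaction_delay(event_time: float, step: float) -> float:
    """Time between an event and the next synchronous decision point."""
    next_decision = math.ceil(event_time / step) * step
    return next_decision - event_time


step_seconds = 4.0                  # fixed decision period of the synchronous MDP
event_times = [0.5, 1.5, 2.5, 3.5]  # hypothetical event times within one period
delays = [reaction_delay(t, step_seconds) for t in event_times]
average_delay = sum(delays) / len(delays)
```

For events spread evenly over a period the average wait is half the step, whereas an event-driven controller would act as soon as the event is detected.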

11 GSMDPs Persistently enabled events modeled by allowing their temporal distributions to depend on the time they were enabled Explicit modeling of non-Markovian effects from discretization Communication efficiency

12 Modeling Events Group state transitions into events to minimize the number of temporal distributions and transition functions to specify (e.g. battery low) Transition function found by estimating the relative frequency of each transition in the event Time model found by timing the transition data Approximated as a phase-type distribution Replaces events with acyclic Markov chains
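The two estimation steps can be sketched as follows. The transition function is estimated by relative frequency, as the slide describes. For the time model, this sketch swaps in a simple stand-in: an Erlang distribution (a chain of k identical exponential phases, one of the simplest phase-type distributions) fitted by moment matching. The fitting method used by the authors is not stated on the slide, and the sample data and state names are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def estimate_event_model(samples):
    """samples: list of (next_state, duration) pairs observed for one event.

    Returns (transition, k, rate): relative-frequency transition probabilities
    and the parameters of a moment-matched Erlang(k, rate) time model.
    """
    counts = Counter(next_state for next_state, _ in samples)
    total = len(samples)
    transition = {state: count / total for state, count in counts.items()}

    durations = [duration for _, duration in samples]
    m = mean(durations)
    v = pvariance(durations)
    if v <= 0:
        raise ValueError("need durations with positive variance")
    # Erlang(k, rate) has mean k/rate and variance k/rate**2.
    k = max(1, round(m * m / v))
    rate = k / m
    return transition, k, rate


# Hypothetical timing data for a single event.
data = [("near_goal", 2.0), ("near_goal", 4.0), ("lost_ball", 2.0), ("near_goal", 4.0)]
transition, k, rate = estimate_event_model(data)
```

An Erlang chain of k phases is acyclic, which matches the slide's remark that events are replaced with acyclic Markov chains.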

13 Events (cont.) Not always possible Decompose events with minimum duration into deterministically timed transitions Can then better approximate using phase-type distribution
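One way to read this decomposition: split each duration into a fixed minimum delay plus a random remainder, and fit the phase-type part to the remainder only. The sketch below uses the smallest observed duration as the delay and a single exponential phase for the remainder; both choices are simplifying assumptions, and the data are hypothetical.

```python
import random
from statistics import mean


def split_min_duration(durations):
    """Return (delay, rate): a deterministic delay and an exponential rate
    fitted to the time remaining after that delay."""
    delay = min(durations)
    remainders = [d - delay for d in durations]
    mean_remainder = mean(remainders)
    if mean_remainder <= 0:
        raise ValueError("all durations are equal; nothing left to fit")
    return delay, 1.0 / mean_remainder


def sample_duration(delay, rate, rng):
    """Draw a duration: fixed delay followed by an exponential phase."""
    return delay + rng.expovariate(rate)


observed = [3.0, 3.5, 4.0, 5.5]  # hypothetical durations in seconds
delay, rate = split_min_duration(observed)
rng = random.Random(0)
samples = [sample_duration(delay, rate, rng) for _ in range(1000)]
```

No sampled duration falls below the delay, which a phase-type distribution fitted directly to the raw durations could not guarantee.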

14 Solving a GSMDP Can be viewed as an equivalent discrete-time MDP Almost all solution algorithms for MDPs work
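As an example of a standard algorithm that applies once an equivalent discrete-time model is available, here is a minimal value iteration sketch. The three-state model, its rewards, and the fixed discount factor are hypothetical and chosen only for illustration; they are not the model from the paper.

```python
def q_value(P, R, V, s, a, gamma):
    """Expected return of taking action a in state s under value estimate V."""
    return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])


def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s][a] is a list of (probability, next_state); R[s][a] is the expected reward."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, R, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(P[s], key=lambda a: q_value(P, R, V, s, a, gamma)) for s in P}
    return V, policy


# Hypothetical model: shoot from far away, or advance first and shoot from close range.
P = {
    "far": {"shoot": [(0.1, "done"), (0.9, "far")], "advance": [(1.0, "near")]},
    "near": {"shoot": [(0.8, "done"), (0.2, "far")]},
    "done": {"stay": [(1.0, "done")]},
}
R = {
    "far": {"shoot": 1.0, "advance": 0.0},
    "near": {"shoot": 8.0},
    "done": {"stay": 0.0},
}
V, policy = value_iteration(P, R)
```

In this toy model the computed policy advances before shooting, because the better chance of scoring from close range outweighs the discounting incurred by the extra step.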

15 Experiment Robotic soccer Score a goal (reward 150) Passing around obstacle (reward 60)

16 Results MDP: T = 4s GSMDP

17 Results No idle time Reduced communication Improved scoring efficiency System failures (zero goals) independent of model

18 Example Video

19 Future Work Extend to partially observable domains Apply bilateral phase-type distributions to widen the class of non-Markovian events that can be modeled

20 Questions?

21 MESSIAS, J.; SPAAN, M.; LIMA, P. GSMDPs for Multi-Robot Sequential Decision-Making. AAAI Conference on Artificial Intelligence, North America, Jun. 2013. Available at: http://www.aaai.org/ocs/index.php/AAAI/AAAI13/paper/view/6432/6843. Date accessed: 06 Apr. 2014.

