Twenty-Second Conference on Artificial Intelligence (AAAI 2007): Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces

Presentation transcript:

Twenty-Second Conference on Artificial Intelligence (AAAI 2007)
Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces
Prashant Doshi, Dept. of Computer Science, University of Georgia
Speaker: Yifeng Zeng, Aalborg University, Denmark

State Estimation
Single agent setting: estimate the physical state (location, orientation, ...).

State Estimation
Multiagent setting: estimate the interactive state, which includes the physical state (location, orientation, ...) together with models of the other agents (see AAMAS'05).

State Estimation in Multiagent Settings
- Ascribe intentional models (POMDPs) to the other agents
- Update the other agents' beliefs
- Estimate the interactive state (see JAIR'05)
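
As a rough illustration of what the estimated quantity contains, here is a minimal Python sketch of an interactive state: the physical state together with an ascribed intentional model of the other agent. The names (OtherAgentModel, InteractiveState) are hypothetical; the formal definition is given in the I-POMDP framework cited above (JAIR'05).

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class OtherAgentModel:
    """Intentional (level 0 POMDP) model ascribed to the other agent."""
    belief_mean: np.ndarray   # mean of the other agent's Gaussian belief
    belief_cov: np.ndarray    # covariance of that belief
    frame: str                # identifier of the other agent's POMDP frame

@dataclass
class InteractiveState:
    """Interactive state = physical state + model of the other agent."""
    physical_state: np.ndarray    # e.g., location, orientation, ...
    other_model: OtherAgentModel
```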

Previous Approach
Interactive particle filter (I-PF; see AAMAS'05, AAAI'05)
- Generalizes the PF to multiagent settings
- Approximate simulation of the state estimation
Limitations of the I-PF
- A large number of particles is needed even for small state spaces, because particles are distributed over both the physical state and model spaces
- Poor performance when the physical state space is large or continuous

Factoring the State Estimation
- Update the physical state space
- Update the other agent's model

Factoring the State Estimation
- Sample particles from just the physical state space and substitute them into the state estimation; implement this part using a PF
- Perform the update of the other agent's model as exactly as possible
- This is a Rao-Blackwellisation of the I-PF
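
A schematic Python sketch of the factored update just described, assuming the helpers propagate_physical, update_model_belief, and observation_weight stand in for the paper's components; they are placeholders, not an actual implementation.

```python
import numpy as np

def rao_blackwellised_step(particles, weights, own_action, own_observation,
                           propagate_physical, update_model_belief,
                           observation_weight, rng):
    """One Rao-Blackwellised update step (schematic sketch).

    particles: list of (physical_state, model_belief) pairs; only the
    physical state is sampled, the model belief is kept analytically.
    """
    new_particles, new_weights = [], []
    for (x, model_belief), w in zip(particles, weights):
        # Sample the next physical state (particle filter part).
        x_next = propagate_physical(x, own_action, rng)
        # Update the belief over the other agent's model in (approximately)
        # closed form (Rao-Blackwellised part).
        model_belief_next = update_model_belief(model_belief, x, x_next, own_action)
        # Reweight by the likelihood of the received observation.
        new_particles.append((x_next, model_belief_next))
        new_weights.append(w * observation_weight(own_observation, x_next, model_belief_next))
    new_weights = np.asarray(new_weights)
    new_weights /= new_weights.sum()
    return new_particles, new_weights
```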

Assumptions on Distributions
- Prior beliefs: singly nested and conditional linear Gaussian (CLG)
- Transition functions: deterministic or CLG
- Observation functions: softmax or CLG
Why these distributions?
- Good statistical properties
- Well-known methods for learning these distributions from data
- Applications in target tracking, fault diagnosis
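
For concreteness, a minimal sketch of the two families assumed above, with made-up parameter values: a conditional linear Gaussian (CLG) transition and a softmax observation function.

```python
import numpy as np

def clg_transition(x, A=np.array([[1.0]]), b=np.array([0.1]),
                   Q=np.array([[0.05]]), rng=None):
    """Conditional linear Gaussian transition: x' ~ N(A x + b, Q)."""
    rng = rng if rng is not None else np.random.default_rng()
    mean = A @ x + b
    return rng.multivariate_normal(mean, Q)

def softmax_observation_probs(x, W=np.array([[2.0], [-2.0]]), c=np.array([0.0, 0.0])):
    """Softmax observation function: P(o_k | x) proportional to exp(w_k . x + c_k)."""
    logits = W @ x + c
    logits -= logits.max()           # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```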

Belief Update over Models
Step 1: Update the other agent's level 0 beliefs
- The update involves the product of a Gaussian and a softmax
- Use the variational approximation of the softmax (see Jordan '99): the softmax is lower-bounded by a Gaussian form, and the bound is tight
- The update is then analogous to the Kalman filter
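
The variational bound itself is not reproduced here. Instead, the grid-based sketch below illustrates the same object numerically: the product of a Gaussian prior and a sigmoid (two-action softmax) likelihood, collapsed back to a Gaussian by moment matching. The actual method uses Jordan's variational lower bound, which yields a Kalman-filter-like closed-form update; the parameter values below are illustrative.

```python
import numpy as np

def gaussian_times_sigmoid_1d(mu, var, a=4.0, b=0.0, half_width=6.0, n=2001):
    """Approximate N(mu, var) * sigmoid(a*x + b) by a Gaussian (moment matching on a grid)."""
    sd = np.sqrt(var)
    x = np.linspace(mu - half_width * sd, mu + half_width * sd, n)
    prior = np.exp(-0.5 * (x - mu) ** 2 / var)
    likelihood = 1.0 / (1.0 + np.exp(-(a * x + b)))  # two-action softmax reduces to a sigmoid
    post = prior * likelihood
    post /= post.sum()                               # normalise on the grid
    post_mean = np.sum(x * post)
    post_var = np.sum((x - post_mean) ** 2 * post)
    return post_mean, post_var
```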

Belief Update over Models
Step 2: Update the belief over the other agent's beliefs
- Solve the other agent's models to compute its policy (e.g., a belief with large variance maps to the Listen action)
- This yields a piecewise distribution: the updated Gaussian where the prior belief supports the action, and 0 otherwise
- Approximate the piecewise distribution with a Gaussian using maximum likelihood (ML)
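
A rough numerical illustration (not the paper's derivation) of this piecewise object: keep the updated Gaussian only over the belief region that supports the other agent's chosen action, zero it elsewhere, and fit a single Gaussian by maximum likelihood, which for a Gaussian amounts to moment matching. The support interval used below is made up.

```python
import numpy as np

def fit_gaussian_to_piecewise(mu, var, support_lo=-0.2, support_hi=0.2, n=2001):
    """Zero the updated Gaussian N(mu, var) outside the belief region where the prior
    supports the chosen action, then fit a Gaussian by moment matching (the ML fit)."""
    sd = np.sqrt(var)
    x = np.linspace(mu - 6.0 * sd, mu + 6.0 * sd, n)
    density = np.exp(-0.5 * (x - mu) ** 2 / var)
    density[(x < support_lo) | (x > support_hi)] = 0.0   # piecewise: 0 outside the support
    total = density.sum()
    if total == 0.0:
        raise ValueError("the prior belief does not support the action on this grid")
    density /= total
    mean = np.sum(x * density)
    variance = np.sum((x - mean) ** 2 * density)
    return mean, variance
```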

Belief Update over Models
Step 3: Form a mixture of Gaussians
- Each Gaussian corresponds to the optimal action and a possible observation of the other agent
- Weight each Gaussian by the likelihood of the other agent receiving that observation
- The number of mixture components grows unbounded with the number of update steps
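
A small sketch of the bookkeeping in this step, with illustrative numbers: one Gaussian component per possible observation of the other agent under its optimal action, weighted by the likelihood of that observation.

```python
import numpy as np

def build_mixture(updated_components, observation_likelihoods):
    """Form a Gaussian mixture over the other agent's belief.

    updated_components: list of (mean, var), one per possible observation
        of the other agent under its optimal action.
    observation_likelihoods: likelihood of the other agent receiving each
        observation; used as (normalised) mixture weights.
    """
    w = np.asarray(observation_likelihoods, dtype=float)
    w /= w.sum()
    return list(zip(w, updated_components))

# After one step there is one component per possible observation (per prior
# component); each further step multiplies the count again, hence the
# unbounded growth noted on the slide.
mixture = build_mixture([(0.1, 0.02), (-0.1, 0.02)], [0.7, 0.3])
```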

Comparative Performance
Compare the accuracy of state estimation with the I-PF (L1 metric) on:
- The continuous multiagent tiger problem
- The public good problem with punishment
RB-IPF focuses its particles on the large physical state space and updates the beliefs over the other agent's models more accurately (supporting plots in the paper).

Comparative Performance
- Compare run times with the I-PF (Linux, Xeon 3.4 GHz, 4 GB RAM)
- Sensitivity to the Gaussian approximation of the piecewise distribution

Discussion
- How restrictive are the assumptions on the distributions? Can we generalize RB-IPF, as the I-PF is general?
- Will RB-IPF scale to a large number of update steps? Closed-form mixtures are needed.
- Is RB-IPF applicable to multiply-nested beliefs? Recursive application may not improve performance over the I-PF.

Thank you. Questions?