
1 Random Disambiguation Paths. Al Aksakalli, in collaboration with Carey Priebe & Donniell Fishkind, Department of Applied Mathematics and Statistics, Johns Hopkins University. Adaptive Sensing MURI Workshop, June 28, 2006, Duke University.

2 Outline: 1) Problem Description 2) Markov Decision Process Formulation 3) Simulated Risk Disambiguation Protocol 4) Computational Experiments 5) Ongoing Research 6) Summary and Conclusions

3 Problem Description: Spatial arrangement of detections (true detections and false detections).

Problem Description: Spatial arrangement of detections (true detections and false detections). Assume for each detection that ρ_i is the probability that detection i is a true detection. We only see the detections, not which are true and which are false.

5 Problem Description: Given start s and destination t.

6 Problem Description: Given start s and destination t. About each detection there is a hazard region, an open disk of fixed radius.

Problem Description: Given start s and destination t, and a hazard region (an open disk of fixed radius) about each detection, we seek a continuous curve from s to t, avoiding the true hazard regions, of shortest achievable arclength.

Problem Description: …and we assume the ability to disambiguate detections from the boundary of their hazard regions.

Problem Description: A disambiguation reveals the detection to be true …or false, and the rest of the traversal then proceeds accordingly.

12 Definition: A disambiguation protocol is a function that, given the decision maker's information to date, the number K of disambiguations still allowed, and the cost c per disambiguation, specifies which detection is disambiguated next… …and where on its boundary the disambiguation is performed.
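For concreteness, such a protocol can be viewed as a function from the decision maker's current knowledge to a disambiguation decision. The Python sketch below is illustrative only; its names and fields are not the notation of the original slides.

```python
# Illustrative sketch: a disambiguation protocol as a callable from current knowledge
# to a decision about which detection to disambiguate next and from where.
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass(frozen=True)
class Knowledge:
    position: Tuple[float, float]          # current location of the traverser
    statuses: Tuple[Optional[bool], ...]   # per detection: True/False if disambiguated, None if ambiguous
    remaining: int                         # disambiguations still allowed (out of K)

# (detection index, boundary point at which to disambiguate), or None to head straight to t
Decision = Optional[Tuple[int, Tuple[float, float]]]

Protocol = Callable[[Knowledge], Decision]
```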

13 Example 1: The protocol gives rise to the RDP shown. Length = 707.97, prob = .89670; Length = , prob = .10330.

14 Example 2: Protocol gives rise to the RDP (superimposed composite)

15 Random Disambiguation Paths (RDP) Problem: Given the detections, their probabilities, the number K of allowed disambiguations, and the cost c per disambiguation, find a protocol of minimum expected traversal length.
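In symbols, with K and c as on slide 12 (the notation here is chosen for exposition), the objective can be written roughly as

```latex
\min_{\pi}\ \mathbb{E}\Big[\,\operatorname{length}\big(\text{traversal from } s \text{ to } t \text{ under protocol } \pi\big)
      \;+\; c \cdot \#\{\text{disambiguations performed by } \pi\}\Big],
```

where π ranges over disambiguation protocols that use at most K disambiguations and the expectation is over the true/false statuses of the detections.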

16 Related work: The Canadian Traveller Problem (CTP), a graph-theoretic analogue of RDP. Given a finite graph whose edges have specified probabilities of being traversable, and a starting and a destination vertex, each edge's status is revealed only when one of its endpoints is visited; the objective is to minimize the expected traversal length. CTP has been shown to be #P-hard.

17 Markov Decision Process (MDP) formulation: Let the information vector keep track of the decision maker's current knowledge, and let the set of all possible disambiguation points be given. The RDP problem can be cast as a K-stage finite-horizon MDP with
States: a current vertex together with the information vector
Actions: pairs (v, i), where v is a disambiguation point and i is a hazard region index
Rewards: the negative of the shortest-path distance between the state vertex and the action vertex, minus c, if not going to d; d is an absorbing state for which there is a one-time and very large reward for entering
Transitions: governed by the ρ_i's
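Solving such a K-stage finite-horizon MDP exactly comes down to the standard backward value-iteration backup V_k(s) = max_a Σ_{s'} p(s'|s,a)[r(s,a,s') + V_{k-1}(s')]. The sketch below is a generic, illustrative implementation; the tiny MDP at the bottom is invented purely to exercise it and is not the RDP model itself.

```python
# Generic finite-horizon value iteration: V_k(s) = max_a sum_{s'} p(s'|s,a) * (r + V_{k-1}(s')).
# The MDP is given as a dict: mdp[s][a] = list of (probability, next_state, reward).
from collections import defaultdict

def finite_horizon_value_iteration(mdp, horizon):
    V = defaultdict(float)            # V_0(s) = 0 for every state s
    policy = {}
    for k in range(1, horizon + 1):
        V_new = defaultdict(float)
        for s, actions in mdp.items():
            best_value, best_action = float("-inf"), None
            for a, outcomes in actions.items():
                q = sum(p * (r + V[s2]) for p, s2, r in outcomes)
                if q > best_value:
                    best_value, best_action = q, a
            V_new[s] = best_value
            policy[(k, s)] = best_action
        V = V_new
    return V, policy

# Tiny invented MDP (absorbing goal 'd' carries a large one-time entry reward, as on the slide):
toy = {
    "s": {"disambiguate": [(0.3, "clear", -2.0), (0.7, "blocked", -2.0)],
          "go_direct":    [(1.0, "d", -10.0 + 1000.0)]},
    "clear":   {"go_direct": [(1.0, "d", -4.0 + 1000.0)]},
    "blocked": {"go_around": [(1.0, "d", -9.0 + 1000.0)]},
    "d": {"stay": [(1.0, "d", 0.0)]},
}
print(finite_horizon_value_iteration(toy, horizon=3)[0]["s"])
```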

18 Simulated Risk Protocol: For the purpose of deciding the next disambiguation point, we pretend that ambiguous disks are riskily traversable…

19 Simulated Risk Protocol: For the purpose of deciding the next disambiguation point, we pretend that ambiguous disks are riskily traversable… The surprise length of a candidate traversal is the negative logarithm of the probability that it is traversable in actuality; its Euclidean length is the usual arclength.
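Writing ℓ(y) for the usual Euclidean length of a candidate traversal y and r(y) for its surprise length (symbols chosen here for exposition), and assuming the detections' statuses are independent with ρ_i the probability that detection i is true, the verbal definition above reads

```latex
r(y) \;=\; -\ln \Pr\big(y \text{ is traversable}\big)
     \;=\; -\sum_{i\,:\,\, y \text{ enters disk } i} \ln\big(1-\rho_i\big),
\qquad
\ell(y) \;=\; \text{Euclidean arclength of } y .
```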

20 Given an undesirability function of the Euclidean length and the surprise length (henceforth, monotonically non-decreasing in its arguments) and, say, a particular choice of such a function…

21 Given an undesirability function of the Euclidean length and the surprise length (henceforth, monotonically non-decreasing in its arguments), Definition: The simulated risk protocol is defined as dictating that the next disambiguation be at the first ambiguous point of the traversal that minimizes the undesirability under the simulated risk.

22 Given an undesirability function of the Euclidean length and the surprise length (henceforth, monotonically non-decreasing in its arguments), Definition: The simulated risk protocol is defined as dictating that the next disambiguation be at the first ambiguous point of the traversal that minimizes the undesirability under the simulated risk. How to proceed once this disambiguation is performed: update the detection's status and the associated probabilities, decrement K, and set the new s to be the disambiguation point y.
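To make the mechanics concrete, here is a toy sketch of a single protocol step: plan a shortest path in which edges through ambiguous disks are charged their length plus a multiple of their surprise length, then disambiguate at the first ambiguous disk the planned path enters. The graph, the edge format, the weight `length + alpha * surprise`, and all names are illustrative assumptions, not the authors' code or their exact parameterization.

```python
# Toy sketch of one simulated-risk protocol step (all names and the weight form are illustrative).
import heapq, math, random

def dijkstra(adj, source, target):
    """Shortest path on adj[u] = [(v, weight), ...]; returns (distance, path) or (inf, None)."""
    dist, prev, pq = {source: 0.0}, {}, [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == target:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return d, path[::-1]
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return math.inf, None

def simulated_risk_step(edges, status, rho, s, t, alpha, k_left):
    """Plan as if ambiguous disks were riskily traversable, then disambiguate at the first
    ambiguous disk encountered along the planned path.
    edges: list of (u, v, length, disk) with disk=None for unobstructed edges.
    status[disk]: True (known true), False (known false), or None (still ambiguous)."""
    adj = {}
    for u, v, length, disk in edges:
        if disk is not None and status[disk] is True:
            continue                                   # known-true disks are impassable
        w = length
        if disk is not None and status[disk] is None:  # ambiguous: charge the surprise length
            if k_left == 0:
                continue                               # no budget left: treat as impassable
            w = length + alpha * (-math.log(1.0 - rho[disk]))
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    _, path = dijkstra(adj, s, t)
    if path is None:
        return None
    lookup = {(min(u, v), max(u, v)): disk for u, v, _, disk in edges}
    for u, v in zip(path, path[1:]):                   # first ambiguous disk along the path
        disk = lookup[(min(u, v), max(u, v))]
        if disk is not None and status[disk] is None:
            status[disk] = (random.random() < rho[disk])   # the disambiguation outcome
            return {"disambiguated": disk, "at_vertex": u, "outcome": status[disk]}
    return {"disambiguated": None, "path": path}       # no ambiguity left on the planned path

# Tiny example: s -- a -- t, where edge (a, t) passes through ambiguous disk 0.
edges = [("s", "a", 2.0, None), ("a", "t", 1.0, 0), ("a", "b", 3.0, None), ("b", "t", 3.0, None)]
print(simulated_risk_step(edges, status={0: None}, rho={0: 0.4}, s="s", t="t", alpha=1.0, k_left=1))
```

An outer loop would then pay the cost c, decrement the budget, restart from the returned vertex (the new s), and repeat until the planned path contains no ambiguous disk.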

23 How to navigate in this continuous setting: The Tangent Arc Graph (TAG) is the superimposition/subdivision of all visibility graphs generated by all subsets of disks. For any undesirability function, the minimizing curve is an s,t-path in TAG!

24 Linear undesirability functions: Because of the efficiency of their realization, we will consider simulated risk protocols generated by linear undesirability functions with a chosen parameter. As a further shorthand, we denote such a protocol by its parameter.
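One simple linear form consistent with the requirements above (monotonically non-decreasing in both the Euclidean length ℓ and the surprise length r); the exact parameterization used on the original slides is an assumption here:

```latex
h_\alpha(\ell, r) \;=\; \ell \;+\; \alpha\, r, \qquad \alpha \ge 0 .
```

Larger α makes entering ambiguous disks look more costly during the simulation-of-risk phase, so the planned path detours around them more readily; α = 0 ignores the ambiguity altogether.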

25-29 How the path planned during the simulation-of-risk phase can be affected by the choice of that parameter (a sequence of illustrations for different parameter values).

30 Example 1: The protocol gives rise to the RDP shown. Length = 707.97, prob = .89670; Length = , prob = .10330.

31 Example 2: Protocol gives rise to the RDP (superimposed composite)

32 Lattice Discretization: Discretization via a subgraph of the integer lattice with unit edge lengths:
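One straightforward way to realize such a discretization is sketched below. The rule used here, tagging an edge with a disk when either endpoint lies strictly inside it, is a simplification chosen for illustration and may differ from the rule used in the original experiments.

```python
# Minimal sketch of a lattice discretization: vertices are integer grid points, edges are the
# unit horizontal/vertical steps, and an edge is tagged with the index of a hazard disk when
# either of its endpoints lies strictly inside that disk (a simplification for illustration).
def lattice_edges(width, height, disks):
    """disks: list of ((cx, cy), radius).  Yields (u, v, length, disk_index_or_None)."""
    def covering_disk(p, q):
        for i, ((cx, cy), rad) in enumerate(disks):
            for (x, y) in (p, q):
                if (x - cx) ** 2 + (y - cy) ** 2 < rad ** 2:
                    return i
        return None
    for x in range(width + 1):
        for y in range(height + 1):
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx <= width and ny <= height:
                    u, v = (x, y), (nx, ny)
                    yield u, v, 1.0, covering_disk(u, v)

edges = list(lattice_edges(40, 20, disks=[((10, 10), 5.5), ((25, 8), 5.5)]))
print(len(edges), sum(1 for *_, d in edges if d is not None))
```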

33 Example: Adapting the simulated risk protocol to lattice discretization:

34 Computational experiments: A 40-by-20 integer lattice is used. Each hazard region is a disk with radius 5.5. Disk centers are sampled from a uniform distribution of integer points in the lattice, and the ρ_i's are sampled from a uniform distribution on (0,1). The cost of disambiguation is taken as 1.5. For each N, K combination, 50 different instances were sampled. Optimal solutions are found by solving the MDP model via value iteration.
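A sketch of sampling one random problem instance with the parameters listed above. Only the quoted numbers (lattice size, radius 5.5, ρ ~ Uniform(0,1), cost 1.5, 50 instances per setting) come from the slide; the exact sampling range for the disk centers and the start/destination placement are assumptions made here.

```python
# Illustrative instance sampler for the experimental setup described on the slide.
import random

def sample_instance(n_disks, width=40, height=20, radius=5.5, cost=1.5, seed=None):
    rng = random.Random(seed)
    centers = [(rng.randint(0, width), rng.randint(0, height)) for _ in range(n_disks)]
    rho = [rng.random() for _ in range(n_disks)]       # P(detection i is true) ~ Uniform(0, 1)
    return {"disks": [(c, radius) for c in centers],
            "rho": rho,
            "cost": cost,
            "s": (0, height // 2),                     # assumed start/destination placement
            "t": (width, height // 2)}

instances = [sample_instance(n_disks=7, seed=i) for i in range(50)]   # 50 instances per (N, K)
```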

35 Illustration with N=7, K=1: Expected length:

36 Comparison of optimal versus simulated risk. Runtime shown is the time to find the overall optimal (the SR-RDP runtime is negligible). Simulated risk found the optimal solution 74% of the time. The overall mean percentage error of the simulated risk solutions was less than 1%. For N=7, K=3, value iteration (VI) took more than an hour; for N=10, K=1, VI did not run due to insufficient memory.

37 Ongoing Research: Pruning the State Space via AO*. We implemented an enhanced version of the AO* algorithm. Preliminary results suggest up to 99% of the state space can be pruned; N=15, K=2 can be solved in under 15 minutes! Not yet practical for K>2: N=15, K=3 takes 10.5 hours. The simulated risk protocol still seems to perform well.

38 Example: Enhanced AO* with N=15, K=2

39 Ongoing Research: Multiple Sensors & Neutralization. Deployment of multiple sensors with different accuracy rates and ranges at different costs; also consider a limited neutralization capability. Develop and solve the corresponding Partially Observable Markov Decision Process (POMDP) models.

40 Summary and Conclusions: RDP is an important yet hard mine-countermeasures problem. Obtaining optimal solutions is presently not feasible for realistic values of N and K. The simulated risk protocol is a sub-optimal yet efficient algorithm that performed well in computational experiments.

41 Q & A