
Slide 1: Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology
Nicola Basilico, Nicola Gatti, Francesco Amigoni
DEI, Politecnico di Milano
{basilico,ngatti,amigoni}@elet.polimi.it

Slide 2: Mobile Robot Patrolling
- A robotic patrolling situation is characterized by one or more mobile robotic patrollers and by some targets to be patrolled (Agmon et al., 2008; Amigoni et al., 2008; Gatti, 2008; Paruchuri et al., 2008)
- Usually, due to the characteristics of the setting, the patrollers cannot employ a deterministic strategy; otherwise the intruder would surely succeed in attacking a target
- Patrollers should adopt an unpredictable patrolling strategy, randomizing over the targets to reduce the intrusion probability (Pita et al., 2008)
- The scientific challenge is the development of effective patrolling strategies, i.e., the optimal probabilities with which the patroller moves in the environment

Slide 3: Example: Deterministic Strategy

Slide 4: Example: Non-Deterministic Strategy

Slide 5: Patrolling with Adversaries
- Considering adversaries can provide the patrolling robots a larger expected utility
- (Agmon et al., 2008) considers non-explicit adversary models in perimeter settings: the optimal patroller's strategy is its max-min strategy
- (Paruchuri et al., 2008) considers explicit adversary models in settings without topologies: the optimal patroller's strategy is its leader-follower strategy (von Stengel and Zamir, 2004)
- Knowing the adversary's preferences (even with uncertainty) provides the patrollers a larger expected utility than not knowing them

Slide 6: Assumptions
- Time is discretized in turns
- There is a single patrolling robot equipped with a sensor (e.g., a camera) able to detect intruders
- The environment is discretized in cells and its topology is represented by a directed graph
- Once it has attempted to enter a cell, the intruder cannot do anything else for some turns (this amounts to saying that penetration takes some time)

Slide 7: The Environment Model
- C is the set of n cells to be patrolled
- Matrix T (n x n) represents the topology: t_{i,j} = 1 means that cells i and j are adjacent (the patroller can go from i to j with one action), and t_{i,j} = 0 means that they are not
- Matrix V (n x n) represents the patroller's sensing capabilities: v_{i,j} = 1 if cell j can be sensed by the patroller from cell i, and v_{i,j} = 0 otherwise
- Each cell i requires the intruder d_i > 0 turns to enter
- X_i and Y_i (with i ∈ {1, 2, …, n}) are the values to the patroller and to the intruder, respectively, when the intruder successfully strikes cell i
- X_0 and Y_0 are the values to the patroller and to the intruder, respectively, when the intruder is captured
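The model above can be sketched in code. This is a minimal illustration on a hypothetical 4-cell line environment; the cell count, topology, penetration times, and values are made up, not taken from the paper's examples, and self-loops in T are assumed so the patroller may stay put.

```python
import numpy as np

# Hypothetical environment: 4 cells in a line, cells 1 and 3 are targets.
n = 4
T = np.eye(n, dtype=int)          # assume t_{i,i} = 1: the patroller may stay put
for i in range(n - 1):            # line topology: cell i is adjacent to cell i+1
    T[i, i + 1] = T[i + 1, i] = 1

V = np.eye(n, dtype=int)          # v_{i,j}: here the robot senses only its own cell

d = {1: 4, 3: 5}                  # d_i > 0: turns the intruder needs to enter target i
X = {0: 1.0, 1: 0.7, 3: 0.8}      # X_i: value to the patroller (index 0 = capture)
Y = {0: -1.0, 1: 0.5, 3: 0.4}     # Y_i: value to the intruder (index 0 = capture)
```

Note that T is symmetric here only because the line graph is undirected; the paper allows directed graphs, so T need not be symmetric in general.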

Slide 8: Example: Environment
[Figure: an environment with three targets, with (d = 6, X = 0.8, Y = 0.4), (d = 4, X = 0.7, Y = 0.5), and (d = 5, X = 0.8, Y = 0.4); capture values X_0 = 1, Y_0 = -1]
The shortest cycle is 12 cells long, so no deterministic strategy is effective.

Slide 9: The Game Model
- At each turn, a strategic-form game is repeated in which the players act simultaneously
- The patroller chooses the next cell to move to among those directly connected to its current cell: if i is the robot's current cell, its actions are move(j) such that t_{i,j} = 1
- If the intruder has not previously attempted to enter any cell, it chooses whether or not to enter a cell and, in the former case, which cell to enter: its actions are wait and enter(i)
- If, instead, the intruder has previously attempted to enter a cell i, it cannot take any action for d_i turns
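The per-turn action sets just described can be enumerated mechanically. The function names below are mine, chosen for illustration; the topology in the usage example is a hypothetical 3-cell directed ring.

```python
def patroller_actions(T, i):
    """move(j) is available from cell i iff t_{i,j} = 1."""
    return [("move", j) for j in range(len(T)) if T[i][j] == 1]

def intruder_actions(n, remaining):
    """wait or enter(i), unless a previous attempt still locks the intruder."""
    if remaining > 0:          # still penetrating: no action for d_i turns
        return []
    return [("wait",)] + [("enter", i) for i in range(n)]
```

For a directed ring T = [[0,1,0],[0,0,1],[1,0,0]], the patroller in cell 0 has the single action move(1), while an unengaged intruder in a 3-cell environment has wait plus enter(0), enter(1), enter(2).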

Slide 10: Example

Slide 11: Markov Hypothesis
- Since the game is an extensive-form game with infinite horizon, its study is commonly conducted by introducing symmetries: the patroller's strategy depends only on the last |H| actions (i.e., the last |H| visited cells)
- The patroller's strategy is denoted by α_{H,j}, the probability of making action move(j) after history H
- When |H| = 0, the patroller's strategy does not depend on any last visited cell, i.e., α_j
- When |H| = 1, the patroller's strategy depends only on the last visited cell, i.e., α_{i,j}
- The intruder's action space can be reduced to:
  - stay-out: the intruder never enters the environment
  - enter-when(i,j): the intruder enters cell i when the patroller is in cell j
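With |H| = 1 the strategy is just a row-stochastic matrix, and executing it means sampling the next move from the row of the current cell. A small sketch (my own helper, with a hypothetical deterministic ring strategy as the usage example):

```python
import random

# alpha[i][j] = Pr(move(j) | current cell i); each row sums to 1, and the
# support must respect the topology (alpha[i][j] > 0 only where t_{i,j} = 1).
def sample_move(alpha, i, rng=random.random):
    r, acc = rng(), 0.0
    for j, p in enumerate(alpha[i]):
        acc += p
        if r < acc:
            return j
    return len(alpha[i]) - 1   # guard against floating-point round-off
```

A deterministic strategy is the special case where every row of alpha has a single 1, which is exactly what the intruder can exploit.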

Slide 12: Leader-Follower Equilibrium
- In a leader-follower scenario, the follower chooses its strategy after having observed the leader's strategy
- The patrolling scenario is a leader-follower scenario in which the patroller is the leader and the intruder is the follower
- A leader-follower equilibrium is the appropriate solution concept for a game with a leader-follower structure: it assures the leader an expected utility no smaller than in any Nash equilibrium (von Stengel and Zamir, 2004)
- The follower's strategy is pure (von Stengel and Zamir, 2004)
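A toy 2x2 game makes the commitment advantage concrete. The payoffs below are made up for illustration (they are not from the paper), and a simple grid search stands in for the exact LP-based computation: for each leader mixed strategy, the follower plays a pure best response, and the leader picks the commitment maximizing its own payoff.

```python
import numpy as np

L = np.array([[2.0, 4.0], [1.0, 3.0]])   # leader's payoffs (rows = leader actions)
F = np.array([[1.0, 0.0], [0.0, 2.0]])   # follower's payoffs (cols = follower actions)

best = -np.inf
for p in np.linspace(0.0, 1.0, 1001):    # Pr(leader plays row 0)
    mix = np.array([p, 1.0 - p])
    br = int(np.argmax(mix @ F))         # follower observes mix, best-responds (pure)
    best = max(best, float(mix @ L[:, br]))
```

In this game row 0 strictly dominates row 1, so the unique Nash equilibrium is (row 0, column 0), giving the leader 2; committing to p just below 2/3 keeps the follower on column 1 and yields about 3.67, illustrating the "no smaller expected utility" property.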

Slide 13: Solving Algorithm
- Computing a leader-follower equilibrium is essentially a bilevel optimization problem (Conitzer and Sandholm, 2006); in our case the second optimization level is bilinear
- Our algorithm develops in two steps:
  - Step 1: stay-out as the intruder's best response. Checking whether or not there exists a patroller's strategy such that stay-out is the intruder's best response is a bilinear feasibility problem
  - Step 2: enter-when as the intruder's best response. For each possible enter-when(i,j), we compute the optimal patroller's strategy as a bilinear optimization problem
- The leader-follower strategy is the one that assures the patroller the largest expected utility among all the strategies computed above
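The quantity driving both steps is the probability that a given Markov strategy α captures an intruder playing enter-when(i, j). A simplified evaluation sketch (not the paper's bilinear formulation), assuming the patroller senses only its current cell (V = identity): this is the probability that the patroller's Markov chain, started in j, hits cell i within d_i turns, computable by making cell i absorbing.

```python
import numpy as np

def capture_prob(alpha, i, j, d_i):
    """Pr(patroller visits cell i within d_i turns | it starts in cell j)."""
    A = np.array(alpha, dtype=float)
    A[i, :] = 0.0
    A[i, i] = 1.0                         # make the detecting cell absorbing
    dist = np.zeros(len(alpha))
    dist[j] = 1.0
    for _ in range(d_i):
        dist = dist @ A                   # propagate the patroller's position
    return float(dist[i])
```

Given this, the intruder's expected utility for enter-when(i, j) is capture_prob * Y_0 + (1 - capture_prob) * Y_i, and step 1 amounts to requiring that this never exceeds the stay-out utility for any (i, j).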

Slide 14: Example: Solution
[Figure: the example environment (d = 6, X = 0.8, Y = 0.4; d = 4, X = 0.7, Y = 0.5; d = 5, X = 0.8, Y = 0.4) annotated with the computed movement probabilities of the patroller's leader-follower strategy]

Slide 15: Action Space Reduction
- We developed two reduction algorithms: one reduces the patroller's action space, the other reduces the intruder's action space
- Our algorithms save about 96%-97% of the computational time
- Example: without reduction, 13^2 = 169 bilinear optimization problems; with reduction, 10 bilinear optimization problems
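The slide does not spell out the two algorithms, but pruning dominated actions from a payoff matrix is the standard building block for this kind of reduction. A generic sketch of that building block (my own, not the paper's specific algorithms), removing rows that are weakly dominated with at least one strict inequality:

```python
import numpy as np

def remove_dominated_rows(U):
    """Indices of rows of U not dominated by any other row
    (rows = one player's actions, cols = opponent strategies)."""
    U = np.asarray(U, dtype=float)
    keep = []
    for r in range(U.shape[0]):
        dominated = any(
            s != r and np.all(U[s] >= U[r]) and np.any(U[s] > U[r])
            for s in range(U.shape[0])
        )
        if not dominated:
            keep.append(r)
    return keep
```

Every dominated enter-when(i, j) action pruned this way removes one bilinear optimization problem from step 2 of the solving algorithm.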

Slide 16: Conclusions and Future Works
Achievements:
- A satisfactory patrolling model that generalizes models from the state of the art
- An exact algorithm for leader-follower equilibrium computation
- Effective algorithms for removing dominated actions and improving computational efficiency
Future works:
- Introducing uncertainty over the intruder's preferences and the patroller's sensing capabilities
- Developing an algorithm for computing the deterministic patrolling strategy in general settings
- Extending our framework to multiple patrollers
