Slide 1: Traveling Salesman Problems Motivated by Robot Navigation
Maria Minkoff (MIT), with Avrim Blum, Shuchi Chawla, David Karger, Terran Lane, and Adam Meyerson

Slide 2: A Robot Navigation Problem
- A robot delivers packages in a building; the goal is to deliver them as quickly as possible.
- Classic model: the Traveling Salesman Problem (find a tour of minimum length).
- Additional constraints: some packages have higher priority, and there is uncertainty in the robot's behavior (battery failure, sensor error, motor-control error).

Slide 3: Markov Decision Process Model
- State space S, with a choice of actions a ∈ A at each state s.
- Transition function T(s' | s, a): each action determines a probability distribution over next states, so a sequence of actions produces a random path through the graph.
- Rewards R(s) on states: arriving in state s at time t yields discounted reward γ^t R(s), for a discount factor γ.
- Goal: a policy for picking an action from any state that maximizes total discounted reward.

Slide 4: Exponential Discounting
- Motivates getting to desired states quickly.
- Inflation: reward collected in the distant future decreases in value due to uncertainty.
- If at each time step the robot loses power with a fixed probability, then the probability of still being alive at time t decays exponentially, so discounting reflects the expected value of a reward.

Slide 5: Solving an MDP
- Fixing an action at each state produces a Markov chain with transition probabilities p_vw.
- We can compute the expected discounted reward ρ_v when starting at state v:
  ρ_v = r_v + Σ_w p_vw γ^{t(v,w)} ρ_w
- Choosing actions to optimize this recurrence is solvable in polynomial time, via linear programming or dynamic programming (much like shortest paths); see the sketch below.
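For a fixed policy, the recurrence can be evaluated by straightforward fixed-point iteration. A minimal sketch in Python; the data layout (r, p, t as dictionaries) and all names are illustrative assumptions, not from the talk.

```python
# Policy evaluation for a fixed policy: iterate
#   rho[v] = r[v] + sum_w p[v][w] * gamma**t[(v, w)] * rho[w]
# until convergence. With gamma < 1 and positive travel times t,
# the update is a contraction, so the iteration converges.

def evaluate_policy(states, r, p, t, gamma, iters=10_000, tol=1e-10):
    """r[v]: reward at v; p[v][w]: probability the policy moves v -> w;
    t[(v, w)]: travel time of that transition; gamma in (0, 1)."""
    rho = {v: 0.0 for v in states}
    for _ in range(iters):
        new = {v: r[v] + sum(p[v][w] * gamma ** t[(v, w)] * rho[w]
                             for w in p[v])
               for v in states}
        if max(abs(new[v] - rho[v]) for v in states) < tol:
            return new
        rho = new
    return rho
```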

Slide 6: Solving the Wrong Problem
- A package can only be delivered once, so the robot should not collect a reward every time it reaches a target.
- One solution: expand the state space. A new state is the current location together with the set of past locations (packages already delivered), with nonzero reward only on states whose current location is not among the previously visited ones. Then apply the MDP algorithm.
- Problem: the new state space has exponential size.

Slide 7: Tackle an Easier Problem
- The problem has two elements that are novel for theory: (1) discounting of reward based on arrival time, and (2) a probability distribution over the outcomes of actions.
- We set the second issue aside for now; in practice, the robot can control its errors.
- Even the first issue by itself is hard and interesting, and solving it is a first step toward the whole problem.

Slide 8: Discounted-Reward TSP
- Given: an undirected graph G = (V, E), edge weights (travel times) d_e ≥ 0, node weights (rewards) r_v ≥ 0, a discount factor γ ∈ (0, 1), and a root node s.
- Goal: find a path P starting at s that maximizes the total discounted reward
  ρ(P) = Σ_{v ∈ P} r_v γ^{d_P(v)},
  where d_P(v) is the time at which P reaches v (see the sketch below).
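To make the objective concrete, here is a small evaluator for ρ(P); the data layout is an assumption of mine, and it assumes each vertex appears at most once on the path.

```python
# Walk the path, accumulate travel time, and discount each vertex's
# reward by its arrival time, as in the slide-8 objective.

def discounted_reward(path, edge_length, reward, gamma):
    """path: list of vertices starting at the root s;
    edge_length[(u, v)]: travel time of edge (u, v);
    reward[v]: undiscounted reward at v."""
    total, elapsed = 0.0, 0.0
    for i, v in enumerate(path):
        if i > 0:
            elapsed += edge_length[(path[i - 1], v)]
        total += reward[v] * gamma ** elapsed  # reward decays with arrival time
    return total
```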

Slide 9: Approximation Algorithms
- Discounted-Reward TSP is NP-hard (and so is the more general MDP-type problem), by reduction from minimum-latency TSP, so it is intractable to solve exactly.
- Goal: an approximation algorithm guaranteed to collect at least some constant fraction of the best possible discounted reward.

Slide 10: Related Problems
- The goal of Discounted-Reward TSP is, roughly, to find a "short" path that collects "lots" of reward.
- Prize-Collecting TSP: given a root vertex v, find a tour containing v that minimizes total length plus foregone (undiscounted) reward. A primal-dual 2-approximation algorithm is known [GW 95].

Slide 11: k-TSP
- Find a tour of minimum length that visits at least k vertices.
- A 2-approximation algorithm is known for undirected graphs, based on the algorithm for PC-TSP [Garg 99].
- It can be extended to handle the node-weighted version.

Slide 12: Mismatch
- A constant-factor approximation on length does not exponentiate well.
- Suppose the optimum solution reaches some vertex v at time t, for reward γ^t r. A constant-factor approximation would reach v only within time 2t, for reward γ^{2t} r.
- Result: we get only a γ^t fraction of the optimum discounted reward, not a constant fraction.
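A quick numeric illustration of the mismatch; the specific values are just for demonstration.

```python
# Doubling the arrival time multiplies the discounted reward by gamma**t,
# a factor that vanishes as t grows -- so a 2-approximation on length
# gives no constant-factor guarantee on discounted reward.
gamma = 0.5
for t in [1, 5, 10, 20]:
    ratio = gamma ** (2 * t) / gamma ** t  # equals gamma**t
    print(f"t={t:2d}: approx/opt reward ratio = {ratio:.6f}")
```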

Slide 13: Orienteering Problem
- Find a path of length at most D that maximizes the reward collected.
- It is the complement of k-TSP: it approximates the reward collected instead of the length. Since it does not stretch the length, exponentiation does not hurt, and the unrooted case can be solved via k-TSP.
- Drawback: no constant-factor approximation for the rooted non-geometric version was previously known.
- Our techniques also give a constant-factor approximation for the Orienteering problem.

Slide 14: Our Results
Using an α-approximation for k-TSP as a subroutine, we obtain:
- a (3/2·α + 2)-approximation for Orienteering
- an e·(3/2·α + 2)-approximation for Discounted-Reward Collection
- constant-factor approximations for the tree and multiple-path versions of these problems

Slide 15: Our Results (instantiated)
Substituting the α = 2 approximation for k-TSP announced by Garg in 1999, we obtain:
- a (3/2·2 + 2) = 5-approximation for Orienteering
- an e·(3/2·2 + 2) ≈ 13.6-approximation for Discounted-Reward Collection
- constant-factor approximations for the tree and multiple-path versions of these problems

Slide 16: Eliminating Exponentiation
- Let d_v be the shortest-path distance (time) from s to v.
- Define the prize at v as π_v = γ^{d_v} r_v, the maximum discounted reward possibly collectable at v.
- If a given path reaches v at time t_v, define the excess e_v = t_v − d_v, the difference between the chosen path and the shortest one.
- Then the discounted reward collected at v is γ^{e_v} π_v.
- Idea: if the excess is small, prize ≈ discounted reward.
- Fact: excess only increases along a path; it reflects lost time, which cannot be made up. (See the sketch below.)
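These quantities are easy to compute given shortest-path distances; here is a sketch using a textbook Dijkstra. The data layout is an assumption of mine.

```python
import heapq

def dijkstra(adj, s):
    """adj[u]: list of (v, length) pairs; returns shortest distances from s."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def prizes_and_excesses(adj, reward, gamma, path, edge_length):
    """For each vertex on the path, return (v, prize pi_v, excess e_v)."""
    d = dijkstra(adj, path[0])
    t, out = 0.0, []
    for i, v in enumerate(path):
        if i > 0:
            t += edge_length[(path[i - 1], v)]
        pi = gamma ** d[v] * reward[v]  # prize: best-case discounted reward
        out.append((v, pi, t - d[v]))   # excess: time lost relative to d_v
    return out
```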

Slide 17: The Optimum Path
- Assume γ = 1/2 (we can scale edge lengths so this holds).
- Claim: at least half of the optimum path's discounted reward R is collected before the path's excess reaches 1.
- Proof by contradiction. Let u be the first vertex with e_u ≥ 1, and suppose more than R/2 of the reward follows u. Shortcut directly to u via a shortest path, then traverse the rest of the optimum path. This reduces every excess after u by at least 1, which "undiscounts" those rewards by a factor γ^{-1} = 2, so it doubles the discounted reward they contribute. But that reward was more than R/2, so the new path collects more than R: contradiction.
[Figure: an s-u path with edge lengths and accumulated excesses marked.]

Slide 18: A New Problem: Min-Excess Path
- Suppose there exists an s-t path P* with prize value Π and length ℓ(P*) = d_t + ε.
- Optimization version: find an s-t path P with prize value ≥ Π that minimizes the excess ℓ(P) − d_t over the shortest path to t. This is equivalent to minimizing total length, as in k-TSP.
- Approximation version: find an s-t path P with prize value ≥ Π whose excess approximates the optimum excess, i.e. with length ℓ(P) = d_t + c·ε. This is a stronger guarantee than approximating the entire path length.

Slide 19: Using Min-Excess Path
- Recall the discounted reward at v is γ^{e_v} π_v.
- Consider the prefix of the optimum discounted-reward path up to excess 1: it collects discounted reward Σ γ^{e_v} π_v ≥ R/2, hence spans prize Σ π_v ≥ R/2, and no vertex on it has excess over 1.
- Guess t, the last node on the optimum path with excess e_t ≤ 1.
- Find a path to t of approximately (4 times) minimum excess that spans prize ≥ R/2 (we can guess R/2).
- Excesses are then at most 4, so γ^{e_v} π_v ≥ π_v/16, and the discounted reward on the found path is ≥ R/32.
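The arithmetic in the last two bullets is easy to sanity-check (with γ = 1/2 as on slide 17):

```python
# If every excess on the found path is at most 4, each prize loses at most
# a factor 2**4 = 16, and the path spans prize >= R/2, so the discounted
# reward is at least R/32.
gamma = 0.5
worst_discount = gamma ** 4             # 1/16
prize_fraction = 0.5                    # path spans prize >= R/2
print(prize_fraction * worst_discount)  # 0.03125 == 1/32
```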

Slide 20: Solving Min-Excess Path, Exactly Solvable Case
- Monotonic paths: suppose the optimum path visits vertices in strictly increasing distance from the root.
- Then we can find the optimum by dynamic programming, just as one solves longest path in an acyclic graph.
- Build a table: for each vertex v, is there a monotonic path to v with length ℓ and prize π? (A sketch follows.)
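A minimal sketch of this dynamic program. Sorting vertices by shortest-path distance makes every monotonic path a path in a DAG; the integer-prize discretization and all names are my assumptions, not the talk's.

```python
def min_length_monotonic(vertices, d, prize, adj, s, max_prize):
    """d[v]: shortest-path distance from s; prize[v]: small nonneg integer;
    adj[u]: list of (v, length). Returns table[v][p] = min length of a
    monotonic s-v path with total prize exactly p (capped at max_prize);
    to get prize >= p, take the min over entries p..max_prize."""
    INF = float("inf")
    table = {v: [INF] * (max_prize + 1) for v in vertices}
    table[s][min(prize[s], max_prize)] = 0.0
    for u in sorted(vertices, key=lambda v: d[v]):  # process in DAG order
        for p, length in enumerate(table[u]):
            if length == INF:
                continue
            for v, w in adj[u]:
                if d[v] <= d[u]:
                    continue  # keep the path strictly monotonic in d
                q = min(p + prize[v], max_prize)
                if length + w < table[v][q]:
                    table[v][q] = length + w
    return table
```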

Slide 21: Solving Min-Excess Path, Approximable Case
- Wiggly paths: the length of the path to v is ℓ_v = d_v + e_v.
- If e_v > d_v, then ℓ_v > e_v > ℓ_v / 2; that is, the path takes more than twice as long as necessary to reach v.
- So approximating ℓ_v to within a constant factor also approximates e_v to within twice that constant factor.

Slide 22: Approximating Path Length
- We can use the k-TSP algorithm to find an approximately shortest s-t path collecting a specified prize.
- Merge s and t into a single vertex r; the optimum s-t path then becomes a tour through r (conceptually, closing the path by connecting s and t with a shortest path).
- Solve k-TSP with root r.
- "Unmerge" r back into s and t; this can produce one or more cycles, which are stitched into an s-t path.
[Figure: s and t merged into the root r.]
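The merge step is just a vertex identification; a small illustrative helper (my own sketch, not the authors' code):

```python
def merge_vertices(adj, s, t, r="r"):
    """adj[u]: list of (v, length). Returns a new adjacency map in which
    s and t are replaced by the single vertex r (self-loops dropped)."""
    def rename(v):
        return r if v in (s, t) else v
    merged = {}
    for u, edges in adj.items():
        mu = rename(u)
        bucket = merged.setdefault(mu, [])
        for v, w in edges:
            mv = rename(v)
            if mu != mv:  # drop edges that became self-loops at r
                bucket.append((mv, w))
    return merged
```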

Slide 23: Decomposing the Optimum Path
- Decompose the optimum path into monotone segments and wiggly segments.
- More than 2/3 of each wiggly segment is excess.
- This divides the problem into independent subproblems.
[Figure: a path split into alternating monotone and wiggly segments.]

Slide 24: Decomposition Analysis
- At least 2/3 of each wiggly segment is excess, and that excess accumulates into the whole path: the total excess of the wiggly segments is at most the excess of the whole path, so the total length of the wiggly segments is at most 3/2 of the path's excess.
- Use the dynamic program to find shortest (min-excess) monotonic segments collecting a target prize.
- Use k-TSP to find approximately shortest wiggly segments collecting a target prize; approximating their length approximates their excess.
- Over all monotonic and wiggly segments, this approximates the total excess.

Slide 25: Dynamic Program for Min-Excess Path
- For each pair of vertices and each (discretized) prize value, find the shortest monotonic path collecting the desired prize and an approximately shortest wiggly path collecting the desired prize.
- Note: there are only polynomially many subproblems.
- Use dynamic programming to find the optimum way of pasting segments together.

Slide 26: Solving Orienteering, Special Case
- Given: a path from s that collects prize Π, has length ≤ D, and ends at t, the farthest point from s.
- Then for any constant integer r ≥ 1, there exists a path from s to some vertex v with prize ≥ Π/r and excess ≤ (D − d_v)/r.
[Figure: a path from s to t with marked distances.]

Slide 27: Solving Orienteering, General Case
- The path may end at an arbitrary vertex t. Let u be the farthest point from s, and connect t back to s via a shortest path.
- One of the two path segments ending at u has prize ≥ Π/2 and length ≤ D, which reduces the general case to the special case (see the sketch below).
- Using the 4-approximation for Min-Excess Path, we get an 8-approximation for Orienteering.
[Figure: the path from s to t, with farthest point u.]
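At a high level, the reduction enumerates candidate endpoints and invokes the Min-Excess Path subroutine with an excess budget. The sketch below only illustrates the control flow; min_excess_path and prize_of are hypothetical interfaces, not the paper's actual API.

```python
def orienteering(vertices, d, D, min_excess_path, prize_of, r=2):
    """Try every candidate endpoint v: ask for an s-v path whose excess is
    at most (D - d[v]) / r, and keep the best prize found."""
    best_path, best_prize = None, 0.0
    for v in vertices:
        if d[v] > D:
            continue                       # unreachable within the budget
        budget = (D - d[v]) / r            # allowed excess at endpoint v
        path = min_excess_path(v, budget)  # None if no feasible path
        if path is not None and prize_of(path) > best_prize:
            best_path, best_prize = path, prize_of(path)
    return best_path
```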

Slide 28: Budget Prize-Collecting Steiner Tree
- Find a rooted tree of edge cost at most D that spans the maximum amount of prize (the complement of k-MST).
- Create an Euler tour of the optimal tree T*, of cost ≤ 2D, and divide the tour into two paths starting at the root, each of length ≤ D. One of them contains at least half the total prize, and a path is a special case of a tree.
- So a c-approximation algorithm for Orienteering yields a 2c-approximation for Budget PCST.
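The tour-splitting step can be made concrete as follows (a sketch with assumed names): cut at the last vertex within distance D of the start, and traverse the remainder backward from the root.

```python
def split_euler_tour(tour, edge_length, D):
    """tour: list of vertices starting and ending at the root, of total
    cost <= 2D. Returns two root-anchored walks, each of length <= D,
    that together visit every vertex of the tour."""
    # cumulative distance to each position along the tour
    cum = [0.0]
    for i in range(1, len(tour)):
        cum.append(cum[-1] + edge_length[(tour[i - 1], tour[i])])
    # last position still within distance D of the start
    j = max(i for i in range(len(tour)) if cum[i] <= D)
    first = tour[:j + 1]          # walked forward from the root
    second = tour[j + 1:][::-1]   # remainder, walked backward from the root
    return first, second
```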

Slide 29: Summary
- Showed that the maximum discounted reward can be approximated using min-excess paths.
- Showed how to approximate min-excess path using k-TSP.
- Min-excess path also solves the rooted Orienteering problem, previously an open question.
- The same techniques solve the "tree" and "cycle" versions of Orienteering.

Slide 30: Open Questions
- Non-uniform discount factors: each vertex v has its own γ_v.
- Non-uniform deadlines: each vertex specifies its own deadline by which it must be visited in order to collect its reward.
- Directed graphs: we used k-TSP, which is only solved for undirected graphs, and for directed graphs even standard TSP has no known constant-factor approximation. Note that we only use k-TSP (and hence undirectedness) in the wiggly parts.

Slide 31: Future Directions
- Stochastic actions: stochasticity seems to imply directedness. A special case: forget rewards, and, given a choice of actions, choose them to minimize the cover time of the graph.
- Applying the discounting framework to other problems, such as scheduling, with an exponential penalty in place of hard deadlines.

