The Traveling Salesman Problem in Theory & Practice Lecture 1 21 January 2014 David S. Johnson Seeley Mudd.


The Traveling Salesman Problem in Theory & Practice Lecture 1 21 January 2014 David S. Johnson Seeley Mudd 523, Tuesdays and Fridays

Today’s Outline
1. Requirements, References, & Introductions
2. Problem Definition
3. Applications
4. Paths and Cycles
5. Complexity
6. Introduction to Optimization
7. Introduction to Approximation
8. Preview of the Rest of the Course

Requirements and Grading
Class presentation of results from the literature.
Written paper:
– Survey paper on an approved topic
– Report on your own new experimental work
– Theoretical paper on new results of your own
Regular class participation.

About Me Ph.D. in Mathematics from MIT (1973). Thesis: Near-Optimal Bin Packing Algorithms. 40 years at AT&T (Bell Labs, AT&T Labs – Research), with one year off for good behavior (U. Wisconsin, ). Most famous publication: Computers and Intractability: A Guide to the Theory of NP-Completeness (1979, with Mike Garey). Many theoretical and experimental papers on the TSP with many co-authors, starting with the proof that the Euclidean version is NP-hard.

Optional Reference Books
– The Traveling Salesman Problem, Lawler, Lenstra, Rinnooy Kan, and Shmoys (Editors), Wiley (1985). $ (current amazon.com price, new)
– The Traveling Salesman Problem and Its Variations, Gutin and Punnen (Editors), Kluwer (2002). $
– The Traveling Salesman Problem: A Computational Study, Applegate, Bixby, Chvatal, and Cook, Princeton University Press (2006). $57.99/$44.99 (Kindle)
– In Pursuit of the Traveling Salesman, Cook, Princeton University Press (2012). $20.64/$15.37 (Kindle)

Web Resources
– “The Traveling Salesman Problem” (Bill Cook)
– “The 8th DIMACS Implementation Challenge: The Traveling Salesman Problem” (DSJ)
– “TSPLIB” (Testbed of Instances, Gerd Reinelt)
– (DSJ’s downloadable papers on the TSP and other topics)
– (Wikipedia Entry -- Much Improved)

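The Traveling Salesman Problem Given: Set of cities $\{c_1, c_2, \ldots, c_N\}$. For each pair of cities $\{c_i, c_j\}$, a distance $d(c_i, c_j)$. Find: Permutation $\pi$ of $\{1, \ldots, N\}$ that minimizes $\sum_{i=1}^{N-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(N)}, c_{\pi(1)})$.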
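As a small illustration of the objective, the sketch below evaluates the length of a tour given as a permutation of city indices. The distance matrix `d` and the function name `tour_length` are assumptions made for this example, not part of the lecture.

```python
def tour_length(order, d):
    """Length of the closed tour that visits cities in the given order.

    `order` is a permutation of range(len(d)); `d` is a distance matrix
    (illustrative data, not taken from the slides).
    """
    n = len(order)
    # Sum consecutive legs, wrapping around from the last city to the first.
    return sum(d[order[i]][order[(i + 1) % n]] for i in range(n))


# A hypothetical symmetric 4-city instance.
d = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(tour_length([0, 1, 3, 2], d))  # 2 + 4 + 3 + 9 = 18
print(tour_length([0, 1, 2, 3], d))  # 2 + 6 + 3 + 10 = 21
```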

Alternative Definition Given: Graph G = (V,E). Length d(e) for each edge e in E. Find: Minimum length Hamiltonian Circuit in the complete graph G’ on V, where if {u,v} is not in E, we assume d({u,v}) = ∞.

N = 10

N = 100

N = 1000

N = 10000

Jan Karel Lenstra

Planar Euclidean Application # 1 Cities: – Holes to be drilled in printed circuit boards

N = 10000

N = 2392

Planar Euclidean Application # 2 Cities: – Wires to be cut in a “Laser Logic” programmable circuit

N = 7397

N = 33,810

N = 85,900

Other Types of Instances
X-ray crystallography
– Cities: orientations of a crystal
– Distances: time for motors to rotate the crystal from one orientation to the other
High-definition video compression
– Cities: binary vectors of length 64 identifying the summands for a particular function
– Distances: Hamming distance (the number of terms that need to be added/subtracted to get the next sum)

Data Storage Layout Goal: For each row, have as many consecutive entries as possible (minimizes the number of random accesses)

Asymmetric Applications
– Payphone Money Collection with One-Way Streets
– Stacker-Crane
– No-Wait Flowshop
– Disk Scheduling
– Compiling to Minimize Branching Cost
– Minimum Length Common Superstring

The Stacker Crane Problem

No-Wait Flowshop [Figure: a job (task on Processor 1, task on Processor 2) and a schedule on Processors 1 and 2]

No-Wait Flowshop

Disk Scheduling

Locations of the fragments of a file one wants to retrieve. Distance between two fragments = time it takes to move the read head from the end of one to the beginning of the next, taking into account the spinning of the disk.

Compiling to Minimize Branching Cost
Cities: code segments ending in a branch.
In execution, the delay at the end of the segment is much less if the next instruction to be executed is the next one in the code, say 1 versus k.
Based on profiling, one can determine the empirical probability that each branch is taken.
Following A directly by B causes an expected delay of $P_B + k \cdot P_C$.
Following A directly by C causes an expected delay of $P_C + k \cdot P_B$.
Following A directly by anything else causes an expected delay of k.
[Figure: segment A branching to B with probability $P_B$ and to C with probability $P_C$]
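The expected-delay rule above can be written as a one-line computation. This is only a sketch: the function name, the dictionary of branch probabilities, and the numbers used are hypothetical.

```python
def expected_delay(successor, branch_probs, k):
    """Expected delay after a segment whose branch targets have the given
    empirical probabilities, when `successor` is placed directly after it.

    A taken branch costs 1 if its target is the next segment in the code
    and k otherwise (illustrative model following the slide).
    """
    return sum(p * (1 if target == successor else k)
               for target, p in branch_probs.items())


# Hypothetical profile: A branches to B 90% of the time and to C 10%.
probs = {"B": 0.9, "C": 0.1}
k = 5
for nxt in ("B", "C", "D"):
    print(nxt, expected_delay(nxt, probs, k))
```

These expected delays would then serve as the asymmetric distances from segment A to each candidate successor.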

Shortest Superstring Given: Finite set S of strings over some alphabet. Find: Shortest string that contains all strings in S as substrings. Cities: Strings in S. Distances: d(x,y) = |y| - maximum overlap between a suffix of x and a prefix of y. Example: x = “alphabet”, y = “betrayal”. d(x,y) = 5 (alphabet followed by betrayal overlaps in “bet”), d(y,x) = 6 (betrayal followed by alphabet overlaps in “al”).
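The superstring distance is easy to compute directly from its definition. The following is a minimal sketch with hypothetical function names; it uses a simple quadratic scan rather than any faster string-matching technique.

```python
def overlap(x, y):
    """Length of the longest suffix of x that is also a prefix of y."""
    for k in range(min(len(x), len(y)), 0, -1):
        if x.endswith(y[:k]):
            return k
    return 0


def superstring_distance(x, y):
    """d(x, y) = |y| minus the maximum suffix/prefix overlap of x and y."""
    return len(y) - overlap(x, y)


print(superstring_distance("alphabet", "betrayal"))  # 8 - 3 = 5
print(superstring_distance("betrayal", "alphabet"))  # 8 - 2 = 6
```

The two calls reproduce the slide's example and show why the resulting TSP instance is asymmetric.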

Hamiltonian Path versus Cycle Four variants (both for symmetric and asymmetric TSP). – Cycle – Path between fixed endpoints – Path with fixed starting vertex – Path with unconstrained endpoints. A code for any one can be adapted to handle any of the others.

Path with Fixed Endpoints: Cycle via Path. Call the Path algorithm once for s and each vertex t in V − {s}. Return the result with the best value of Path Length + dist(t,s).

Path with Fixed Endpoints: Path via Cycle. Add one new vertex and two new edges. Compute the shortest cycle, then delete the added vertex and edges.
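The path-via-cycle reduction can be sketched as follows. The slide does not state the lengths of the two new edges; this sketch assumes length 0 for them and an infinite length between the new vertex and every other vertex. The brute-force `shortest_cycle` stands in for any cycle code, and all names are hypothetical.

```python
from itertools import permutations

INF = float("inf")


def shortest_cycle(d):
    """Brute-force shortest Hamiltonian cycle; a stand-in for any TSP code."""
    n = len(d)
    best_len, best_order = INF, None
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        length = sum(d[order[i]][order[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_order = length, order
    return best_len, best_order


def shortest_path_via_cycle(d, s, t):
    """Shortest Hamiltonian path from s to t, using only a cycle solver."""
    n = len(d)
    z = n  # the added vertex
    ext = [row[:] + [INF] for row in d] + [[INF] * (n + 1)]
    ext[t][z] = 0  # assumed length of the new edge t -> z
    ext[z][s] = 0  # assumed length of the new edge z -> s
    length, order = shortest_cycle(ext)
    k = order.index(z)
    # Deleting z from the cycle leaves a path that starts at s and ends at t.
    return length, list(order[k + 1:] + order[:k])


d = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(shortest_path_via_cycle(d, 0, 2))
```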

Path with One Fixed Endpoint via Path with Two Fixed Endpoints. For each t in V – {s}, find the shortest Hamiltonian path from s to t. Return the best.

Path with Two Fixed Endpoints via Path with One Fixed Endpoint. Add one new vertex t’ with an edge to t. The shortest Hamiltonian path starting with s must end at t’.

Path with No Fixed Endpoints via Path with One Fixed Endpoint For each s in V, find shortest Hamiltonian path starting from s. Return the best.

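Path with One Fixed Endpoint via Path with No Fixed Endpoint. Add a new vertex s’ and an edge from s’ to s. Since s’ has only this one edge, it must be an endpoint of every Hamiltonian path, so deleting it leaves a path that starts at s.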

Directed via Undirected
Replace each vertex $v_i$ by a triplet of vertices $v_i^{in}$, $v_i$, $v_i^{out}$, and edges $\{v_i^{in}, v_i\}$ and $\{v_i, v_i^{out}\}$.
Replace each directed edge $(v_i, v_j)$ by the undirected edge $\{v_i^{out}, v_j^{in}\}$.

[Figure: the transformation applied to a four-vertex example, with vertices $v_1, \ldots, v_4$ and their in/out copies]
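A sketch of the directed-to-undirected transformation on distance matrices follows. The slides do not give lengths for the two edges inside each triplet; length 0 is assumed here, and missing edges are given infinite length. The index layout (3i, 3i+1, 3i+2 for in, middle, out) and the function name are choices made for this example.

```python
INF = float("inf")


def directed_to_undirected(d):
    """Build a symmetric 3N x 3N matrix from an asymmetric N x N matrix.

    Vertex i becomes in = 3i, mid = 3i + 1, out = 3i + 2.
    """
    n = len(d)
    u = [[INF] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        vin, mid, out = 3 * i, 3 * i + 1, 3 * i + 2
        u[vin][mid] = u[mid][vin] = 0  # assumed length for {in, mid}
        u[mid][out] = u[out][mid] = 0  # assumed length for {mid, out}
        for j in range(n):
            if i != j:
                # Directed edge (i, j) becomes undirected {out_i, in_j}.
                u[out][3 * j] = u[3 * j][out] = d[i][j]
    return u


# Hypothetical asymmetric instance: 0 -> 1 -> 2 -> 0 costs 3, the reverse 15.
d = [
    [0, 1, 5],
    [5, 0, 1],
    [1, 5, 0],
]
u = directed_to_undirected(d)
print(len(u), u[2][3], u[5][0])  # 9 vertices; d[0][1] = 1; d[1][0] = 5
```

Because each middle vertex has only two finite-length edges, any finite-length undirected tour passes through every triplet as in, mid, out (or entirely in reverse), which is what lets it be read back as a directed tour.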

TSP: The Canonical NP-Hard Problem? Commonly used in the popular press to explain NP-completeness and exponential time to the layman: The number of tours grows as N! (actually (N-1)!/2 for the symmetric case). [Table: N versus # Tours]
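The growth of the tour count is easy to tabulate. The snippet below computes (N-1)!/2 for a few values of N chosen for illustration; these are not necessarily the values shown on the original slide.

```python
from math import factorial


def num_symmetric_tours(n):
    """Number of distinct tours of n >= 3 cities in the symmetric case."""
    return factorial(n - 1) // 2


for n in (5, 10, 15, 20):
    print(f"{n:>3}  {num_symmetric_tours(n):,}")
```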

$N! = 2^{\Omega(N \log N)}$ time is not required; $O(N^2 2^N)$ suffices! [Bellman, 1963][Held & Karp, 1962]
Algorithmic technique: Dynamic Programming
States: Pairs $[U,j]$ with $2 \le j \le N$ and $\{v_1, v_j\} \subseteq U \subseteq V$. Note: There are $\Theta(N 2^N)$ states $[U,j]$.
Values: $X[U,j]$ is the length of the shortest Hamiltonian path, starting with $v_1$ and ending with $v_j$, in the subgraph of G induced by U. Note: The optimal tour length equals $\min\{X[V,j] + d(v_j, v_1) : 2 \le j \le N\}$.

Computing the Values X[U,j]
Base case: $X[\{v_1, v_j\}, j] = d(v_1, v_j)$, $2 \le j \le N$.
Now assume we have already computed $X[U,j]$, $2 \le j \le N$, for all U with $\{v_1, v_j\} \subseteq U \subseteq V$ and $|U| = k$.
Let W be such that $v_1 \in W \subseteq V$ and $|W| = k+1$. Suppose $v_i$, $i > 1$, is in W. Then $X[W,i] = \min\{X[W - \{v_i\}, j] + d(v_j, v_i) : v_j \in W - \{v_1, v_i\}\}$.
Computation takes O(N) time for each state $[W,i]$. Since there are $\Theta(N 2^N)$ states overall, this yields an overall running time of $O(N^2 2^N)$.
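The recurrence translates almost line for line into code. The sketch below is one possible rendering, assuming a distance matrix `d` with city 0 playing the role of $v_1$ and subsets encoded as bitmasks; the function name and the dictionary-based table are choices made for this example and are not tuned for speed or memory.

```python
from itertools import combinations


def held_karp(d):
    """Optimal tour length by dynamic programming over subsets."""
    n = len(d)
    if n == 1:
        return 0
    # X[(mask, j)]: shortest path from 0 to j visiting exactly the
    # vertices in mask (mask always contains 0 and j).
    X = {}
    for j in range(1, n):
        X[(1 | (1 << j), j)] = d[0][j]
    for size in range(3, n + 1):
        for rest in combinations(range(1, n), size - 1):
            mask = 1
            for v in rest:
                mask |= 1 << v
            for i in rest:
                prev = mask & ~(1 << i)  # W - {v_i}
                X[(mask, i)] = min(
                    X[(prev, j)] + d[j][i] for j in rest if j != i
                )
    full = (1 << n) - 1
    # Close the tour by returning to city 0.
    return min(X[(full, j)] + d[j][0] for j in range(1, n))


d = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(held_karp(d))  # 18, via the tour 0-1-3-2-0
```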

N = 85,900 Current World Record (2006) Using a parallelized version of the Concorde code, Helsgaun’s sophisticated variant on Iterated Lin-Kernighan, and cpu-days

Concorde “Branch-and-Cut” approach exploiting linear programming to determine lower bounds on optimal tour length. Based on 30+ years of theoretical developments in the “Mathematical Programming” community, plus some very good data structures and heuristics work from computer science. For surprisingly large instances, it finds an optimal tour and proves its optimality (unless it runs out of time/space). Executables and source code can be downloaded from

Running times (in seconds) for 10,000 Concorde runs on random 1000-city planar Euclidean instances (2.66 GHz Intel Xeon processor in dual-processor PC, purchased late 2002). Range: 7.1 seconds to 38.3 hours

Concorde Asymptotics [Hoos and Stützle, 2009 draft] Estimated median running time for random Euclidean instances. Based on – 1000 samples each for N = 500, 600, …, 2000 – 100 samples each for N = 2500, 3000, 3500, 4000, 4500 – 2.4 GHz AMD Opteron 2216 processors with 1MB L2 cache and 4 GB main memory, running Cluster Rocks Linux v · √N Actual median for N = 2000: ~57 minutes, for N = 4,500: ~96 hours

For Larger Instances: Fast Heuristics. Tour construction heuristics like Nearest Neighbor, Greedy, Christofides. Local search heuristics like 2-Opt, 3-Opt, Lin-Kernighan, Iterated Lin-Kernighan, or Helsgaun’s Algorithm. A range of heuristics may be useful, based on tradeoffs between tour quality and running time.
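To make the two families concrete, here is a minimal sketch of one construction heuristic (Nearest Neighbor) followed by one local search heuristic (2-Opt) for points in the plane. It is a plain quadratic-time illustration with hypothetical function names, with none of the data structures or neighbor lists that serious implementations rely on.

```python
import math


def tour_len(pts, tour):
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def nearest_neighbor_tour(pts, start=0):
    """Repeatedly move to the closest unvisited city."""
    unvisited = set(range(len(pts))) - {start}
    tour = [start]
    while unvisited:
        last = pts[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, pts[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(pts, tour):
    """Replace pairs of tour edges while doing so shortens the tour."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, e = pts[tour[j]], pts[tour[(j + 1) % n]]
                delta = (math.dist(a, c) + math.dist(b, e)
                         - math.dist(a, b) - math.dist(c, e))
                if delta < -1e-12:
                    # Reverse the segment between the two removed edges.
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour


# A self-crossing tour of the unit square is uncrossed by 2-Opt.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(tour_len(square, two_opt(square, [0, 1, 2, 3])))
```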

Necessary Digression: Metrics
As the TSP is defined, the city-city distances (edge lengths) are only constrained to satisfy
1. d(c,c’) ≥ 0, for all pairs of cities c,c’ (non-negativity)
2. d(c,c’) = 0 if and only if c = c’
To be a quasimetric, the distances must also satisfy the “triangle inequality”:
3. d(c,c’) ≤ d(c,c’’) + d(c’’,c’), for all triples of cities c,c’,c’’
To be a metric, the distances must also be symmetric:
4. d(c,c’) = d(c’,c), for all pairs of cities c,c’

Shortest Path “Metric”. Let d be a TSP distance function. For any pair c,c’ of cities, let $d_S(c,c’)$ be the length of the shortest path from c to c’ under d. Note that $d_S$ will be a quasimetric (and a metric if d is symmetric). For most real-world applications, $d_S$ is actually the distance function of interest, and so the triangle inequality holds. As we shall see shortly, if we have the triangle inequality, we can obtain good performance guarantees for certain heuristics.
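One standard way to obtain the shortest-path distances from an arbitrary distance matrix is the Floyd-Warshall all-pairs algorithm. The lecture does not prescribe a method, so this is an illustrative choice, and the function name is hypothetical.

```python
def shortest_path_closure(d):
    """Return the shortest-path distance matrix of d (Floyd-Warshall)."""
    n = len(d)
    ds = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Allow paths that pass through intermediate city k.
                if ds[i][k] + ds[k][j] < ds[i][j]:
                    ds[i][j] = ds[i][k] + ds[k][j]
    return ds


# Hypothetical instance in which the direct 0-2 edge violates the
# triangle inequality (10 > 1 + 2).
d = [
    [0, 1, 10],
    [1, 0, 2],
    [10, 2, 0],
]
ds = shortest_path_closure(d)
print(ds[0][2])  # 3
```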

Additional Restriction in Practice. Distances are integers. – Simplifies codes. – Yields a definitive optimal solution value. – Not a real restriction if distances are rational. – Allows us to cope with the problematic Euclidean metric.

Euclidean Difficulties. The length of a TSP tour for points in the plane under the Euclidean metric is a sum of square roots: Length = $\sum_i \sqrt{x_i}$. Given such an expression and a constant B, our current best algorithm for determining whether the length is less than B takes exponential time. Hence, we do not even know whether the decision problem version of the Euclidean TSP is in NP. And if we round the distances to some fixed precision, then we may get different optimal tours for different precisions (up to an exponential number of bits).

Rounding Conventions
1. Round Nearest: $d_n(x)$ = floor(x + 0.5)
– Likely to yield tour lengths closest to the true Euclidean length.
– Although optimal tours may opportunistically favor the rounded-down edge lengths.
– And the triangle inequality may no longer be obeyed: $d_n(x,z) = 3 > d_n(x,y) + d_n(y,z)$.

Rounding Conventions
2. Round Down: $d_f(x)$ = floor(x)
– Possibly most efficiently computable.
– But underestimates true tour length.
– Also fails to obey triangle inequality: floor(3.8) > floor(1.9) + floor(1.9).
3. Round Up: $d_c(x)$ = ceiling(x)
– Does obey the triangle inequality.
– But overestimates true tour length.
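The three conventions can be compared on three collinear points x, y, z, where the true distances satisfy d(x,z) = d(x,y) + d(y,z). The round-down numbers below come from the slide; the round-nearest numbers (1.4, 1.4, 2.8) are an assumed example, since the slide's own figures are not available. All names are hypothetical.

```python
import math


def round_nearest(x):
    return math.floor(x + 0.5)


def round_down(x):
    return math.floor(x)


def round_up(x):
    return math.ceil(x)


def violates_triangle(rounder, xy, yz, xz):
    """True if the rounded direct distance exceeds the rounded detour."""
    return rounder(xz) > rounder(xy) + rounder(yz)


print(violates_triangle(round_down, 1.9, 1.9, 3.8))     # True: 3 > 1 + 1
print(violates_triangle(round_nearest, 1.4, 1.4, 2.8))  # True: 3 > 1 + 1
print(violates_triangle(round_up, 1.9, 1.9, 3.8))       # False: 4 <= 2 + 2
```

Rounding up is safe in general because ceiling(a + b) ≤ ceiling(a) + ceiling(b) for all real a and b.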

Exploiting Triangle Inequality. Observation 1: Any connected graph in which every vertex has even degree contains an “Euler Tour” – a cycle that traverses each edge exactly once, which can be found in linear time. Observation 2: If the Δ-inequality holds, then traversing an Euler tour but skipping past previously-visited vertices yields a Traveling Salesman tour of no greater length.

Obtaining the Initial Graph
Double MST algorithm (DMST):
– Combine two copies of a Minimum Spanning Tree.
– Theorem [Folklore]: DMST(I) ≤ 2 · Opt(I).
Christofides algorithm (CH):
– Combine one copy of an MST with a minimum-length matching on its odd-degree vertices (there must be an even number of them since the total sum of degrees for any graph is even).
– Theorem [Christofides, 1976]: CH(I) ≤ 1.5 · Opt(I).
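A compact sketch of the Double MST heuristic is given below. It builds the tree with Prim's algorithm on a distance matrix and then lists vertices in depth-first preorder, which is the order in which an Euler tour of the doubled tree first reaches each vertex, so it plays the role of the shortcutting step. The function names and the choice of Prim's algorithm are assumptions of this example.

```python
def prim_mst(d):
    """Return the children lists of an MST rooted at vertex 0."""
    n = len(d)
    in_tree = [False] * n
    parent = [None] * n
    best = [float("inf")] * n
    best[0] = 0
    children = {i: [] for i in range(n)}
    for _ in range(n):
        # Add the cheapest vertex not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d[u][v] < best[v]:
                best[v] = d[u][v]
                parent[v] = u
    return children


def double_mst_tour(d):
    """Preorder walk of the MST: the shortcut Euler tour of the doubled tree."""
    children = prim_mst(d)
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour


d = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(double_mst_tour(d))
```

The factor-2 guarantee applies only when the distances satisfy the triangle inequality.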

Optimal Tour on Odd-Degree Vertices (no longer than the overall Optimal Tour, by the triangle inequality) = Matching $M_1$ + Matching $M_2$. Hence Optimal Matching ≤ min($M_1$, $M_2$) ≤ OPT(I)/2.

[Figure legend: 2-Opt, 3-Opt, Smart-Shortcut Christofides]

1 million cities on my 3.06 GHz iMac: Lin-Kernighan gets within 2% of optimal in 61 seconds. The “strip” heuristic gets within 30% in 2 seconds, compared to 40% for the much slower “double MST” heuristic.

The Held-Karp Bound and the Optimal Solution Value

Integer Programming Formulation for Symmetric TSP
Minimize $\sum_i d_i x_i$, where $d_i$ is the length of edge $e_i$
Subject to
– $x_i \in \{0,1\}$, for all edges $e_i \in C \times C$
– $\sum_{i : c \in e_i} x_i = 2$, for all cities $c \in C$
– $\sum_{i : |e_i \cap U| = 1} x_i \ge 2$, for all nonempty proper subsets $U \subset C$

Linear Programming Relaxation: “Held-Karp” or “Subtour” Bound
Minimize $\sum_i d_i x_i$, where $d_i$ is the length of edge $e_i$
Subject to
– $x_i \in [0,1]$, for all edges $e_i \in C \times C$
– $\sum_{i : c \in e_i} x_i = 2$, for all cities $c \in C$
– $\sum_{i : |e_i \cap U| = 1} x_i \ge 2$, for all nonempty proper subsets $U \subset C$

Percent by which Optimal Tour exceeds Held-Karp Bound For “Uniform Points” in the Unit Square (+), the gap appears to decline to a value of about 0.44% asymptotically.

Computing the HK Bound. Major obstacle: exponential number of cut constraints, $\sum_{i : |e_i \cap U| = 1} x_i \ge 2$ for all nonempty proper subsets $U \subset C$. However, one can find violated constraints in polynomial time by maximum flow techniques (and other heuristics). Concorde has options for computing the bound in roughly this way (5 hours on my iMac for a million cities). One can also construct an alternative LP formulation that is of polynomial size, so the HK bound can in principle be computed in polynomial time.
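The maximum-flow idea can be sketched as follows: treat the current LP values as edge capacities, and for each city t compute a minimum cut separating city 0 from t; any cut of value below 2 identifies a violated constraint. This is a simple Edmonds-Karp illustration on a dense matrix with hypothetical function names, and it is not how Concorde implements separation.

```python
from collections import deque


def min_cut(cap, s, t):
    """Max-flow value from s to t and the source side of a minimum cut."""
    n = len(cap)
    res = [row[:] for row in cap]  # residual capacities
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 1e-9:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # No augmenting path: reachable vertices form the cut's source side.
            return flow, {v for v in range(n) if parent[v] != -1}
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck


def find_violated_subtour(x, eps=1e-6):
    """Return a city set U whose cut value is below 2, or None."""
    for t in range(1, len(x)):
        value, side = min_cut(x, 0, t)
        if value < 2 - eps:
            return side
    return None


def edge_matrix(n, edges):
    x = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        x[a][b] = x[b][a] = 1.0
    return x


# Two disjoint triangles satisfy the degree constraints but not the cuts.
two_triangles = edge_matrix(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])
print(find_violated_subtour(two_triangles))
```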

Topics to Be Covered
NP-completeness proofs, hardness of approximation results.
Polynomial-time (and $2^{o(n)}$-time) solvable special cases.
Branch-and-cut optimization algorithms (Concorde, etc.): theory and engineering.
Properties of optimal solutions.
Polynomial-time approximation tour construction heuristics with good worst-case guarantees and/or average case performance.
Data structures, exploiting geometry, and other speed-up tricks for heuristics.
Local Optimization heuristics (2-Opt, 3-Opt, Lin-Kernighan).
Metaheuristics (neural nets, simulated annealing, genetic algorithms, etc.).
Variants (max TSP, min-latency TSP, prize-collecting TSP, Vehicle routing, …)