The Traveling Salesman Problem in Theory & Practice Lecture 7: Local Optimization 4 March 2014 David S. Johnson

The Traveling Salesman Problem in Theory & Practice Lecture 7: Local Optimization 4 March 2014 David S. Johnson dstiflerj@gmail.com http://davidsjohnson.net Seeley Mudd 523, Tuesdays and Fridays

Outline
1. Tour of the DIMACS TSP Challenge Website and other web resources
2. Basic local optimization heuristics and their implementations
   – 2-Opt
   – 3-Opt

Projects and Presentations
Please email me by 3/11:
– The planned subject for your project (survey paper, theoretical or experimental research project, etc.),
– The paper(s)/result(s) you plan to present in class, and
– Your preferred presentation date.
We have 7 more classes after this one: 3 more for me, 3 for presentations, and the last (4/29) for a wrap-up from me and 10-minute project descriptions from you.
Final project write-ups are due Friday 5/2.

DIMACS Implementation Challenge
– Initiated in 2000
– Major efforts wound down in 2002
– Still updateable (in theory)

Challenge Testbeds
All provided by means of instance generation code and specific seeds for the random number generator.

Running Time Normalization Source code for Greedy and a generator for random Euclidean instances, provided for download. Participants reported their running time for Greedy on the Test Battery instances.

Machine-Specific Correction Factors
[Figure: machine-specific correction factors; axis ticks at 10^3, 10^4, 10^5, 10^6, 10^7]

A Tour of the Website

Local Optimization: 2-Opt

Basic Scheme
Use a tour construction heuristic to build a starting tour. (Which heuristic?)
While there exists a 2-opt move that yields a shorter tour, (How do we determine this efficiently?)
– Choose one. (Which one?)
– Perform it. (With what data structure?)
Each choice can affect both running time and tour quality.
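The basic scheme can be sketched in a few lines of Python. This is only a naive illustration, assuming cities are points in the plane and the tour is a plain list. The names `tour_length` and `two_opt` are made up for this sketch, and it applies the first improving move it finds, which is just one of the possible answers to "Which one?".

```python
import math

def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def two_opt(tour, pts):
    """Naive 2-opt: scan all pairs of tour edges, apply each improving move
    as soon as it is found, and repeat until a full scan finds none."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share the city tour[0]
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Replace edges (a,b) and (c,d) by (a,c) and (b,d).
                delta = (math.dist(pts[a], pts[c]) + math.dist(pts[b], pts[d])
                         - math.dist(pts[a], pts[b]) - math.dist(pts[c], pts[d]))
                if delta < -1e-12:
                    # Performing the move reverses the segment b..c in the list.
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

Each scan costs quadratic time and each move costs linear time in this version, which is exactly what the following slides set out to avoid.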

Determining the Existence of an Improving 2-Opt Move
Naïve approach: Try all N(N-3)/2 possibilities.
More sophisticated: Observe that, for an improving move, one of the following must be true: d(a,b) > d(b,c) or d(c,d) > d(d,a).
Suppose we consider each ordered pair (t1,t2) of adjacent tour vertices as candidates for the first deleted edge in an improving 2-opt move. Then we may restrict our attention to candidates for the new neighbor t3 of t2 that satisfy d(t2,t3) < d(t1,t2).
If the improving move to the left is not caught when (t1,t2) = (a,b), it will be caught when (t1,t2) = (c,d).
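The observation follows from a one-line argument. The figure is not reproduced here, so the labeling below is an assumption: the move deletes tour edges {a,b} and {c,d} and adds {b,c} and {d,a}.

```latex
% Assumed labeling: edges {a,b}, {c,d} are deleted; edges {b,c}, {d,a} are added.
\begin{align*}
\text{The move is improving} \iff\;& d(a,b) + d(c,d) > d(b,c) + d(d,a). \\
\text{If both } d(a,b) \le d(b,c) \text{ and } d(c,d) \le d(d,a), \text{ then }\;&
  d(a,b) + d(c,d) \le d(b,c) + d(d,a),
\end{align*}
% which contradicts improvement, so at least one of
% d(a,b) > d(b,c) or d(c,d) > d(d,a) must hold.
```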

Sequential Searching
For t1 going counterclockwise around the tour,
  For t2 a tour neighbor of t1,
    For all t3 with d(t2,t3) < d(t1,t2),
      For the unique t4 that will yield a legal 2-opt move,
        Test whether d(t1,t4) + d(t2,t3) is less than d(t1,t2) + d(t3,t4).
        If so, add ⟨(t1,t2),(t4,t3)⟩ to the list of improving moves.
        Otherwise, continue.
Note: For geometric instances where k-d trees have been constructed, we can find the acceptable t3’s using fixed-radius searches from t2 with radius d(t1,t2).
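A minimal sketch of the search for one choice of t1, assuming planar points, a tour stored as a list `tour`, and a lookup `pos` from city to tour position (all hypothetical names). A brute-force scan over all cities stands in for the fixed-radius k-d tree search, which is not shown.

```python
import math

def improving_moves_for(t1, tour, pos, pts):
    """List the improving 2-opt moves whose first deleted edge is (t1, t2).

    Each entry is ((t1, t2), (t4, t3), gain): edges (t1,t2) and (t4,t3) are
    deleted, edges (t2,t3) and (t1,t4) are added.
    """
    n = len(tour)
    moves = []
    for step in (1, -1):                      # t2 = successor, then predecessor
        t2 = tour[(pos[t1] + step) % n]
        d12 = math.dist(pts[t1], pts[t2])
        for t3 in tour:                       # stand-in for a fixed-radius search
            if t3 == t1 or t3 == t2:
                continue
            d23 = math.dist(pts[t2], pts[t3])
            if d23 >= d12:                    # pruning rule: need d(t2,t3) < d(t1,t2)
                continue
            # The unique legal t4 is the neighbor of t3 on the side opposite
            # to the direction in which t2 lies from t1.
            t4 = tour[(pos[t3] - step) % n]
            if t4 == t2:
                continue                      # degenerate: nothing would change
            gain = d12 + math.dist(pts[t3], pts[t4]) - math.dist(pts[t1], pts[t4]) - d23
            if gain > 1e-12:
                moves.append(((t1, t2), (t4, t3), gain))
    return moves
```

For the crossing tour 0, 1, 2, 3 on the unit square with cities at (0,0), (1,1), (1,0), (0,1), calling this with t1 = 0 reports the single move that deletes (0,1) and (2,3).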

Which Improving Move to Make?
– Best possible: Time consuming, not necessarily best choice for the long run.
– Best of those for the current choice of t1: Still not necessarily best in the long run, but significantly faster.
– Best of the first 8 new champions for the current choice of t1: Still faster.
– First found for the current choice of t1: Even faster, but not necessarily best (or fastest) in the long run.

Results [Jon Bentley’s Geometric Code], values listed in order of increasing N = 10^3, 10^4, 10^5, 10^6:
Variant   % Excess over Held-Karp Bound   Running Time in 150 MHz Seconds
Best      4.7, 4.6, 4.5                   0.21, 3.1, 54.9, 2285
8th       4.9, 4.7                        0.20, 2.4, 49.1, 2344
First     6.1, 6.0, 5.8, 5.7              0.17, 2.2, 47.5, 2754

Don’t-Look-Bits
One bit is associated with each city, initially 0.
– If one fails to find an improving move for a given choice of t1, we set the don’t-look-bit for t1 to 1.
– If we find an improving move, we set the don’t-look-bits for t1, t2, t3, and t4 all to 0.
– If a given city’s don’t-look-bit is 1, we do not consider it for t1.
Costs perhaps 0.1% in tour quality, in exchange for a speedup by a factor of 2 or greater.
Enables processing in “queue” order:
– Initially all cities are in the queue.
– When a city has its don’t-look-bit set to 1, it is removed from the queue.
– When a city not in the queue has its don’t-look-bit set to 0, it is added to the end of the queue.
– For the next city to try as t1, we pop off the element at the head of the queue.
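The queue discipline can be sketched as follows. This is an illustrative version under several assumptions: planar points, an array-style tour, first-improvement move selection, and a brute-force scan for t3. The function name `two_opt_dlb` and its helpers are hypothetical. A city stays at the head of the queue until a search from it fails, so "in the queue" and "don’t-look-bit is 0" always coincide.

```python
import math
from collections import deque

def two_opt_dlb(tour, pts):
    """2-opt with don't-look bits, taking candidate t1 cities in queue order."""
    tour = list(tour)
    n = len(tour)
    pos = {c: i for i, c in enumerate(tour)}    # city -> position in tour
    dont_look = {c: False for c in tour}
    queue = deque(tour)                          # c is queued iff its bit is 0

    def d(a, b):
        return math.dist(pts[a], pts[b])

    def reverse(i, j):
        # Reverse the cyclic run of positions i, i+1, ..., j (indices mod n).
        for _ in range(((j - i) % n + 1) // 2):
            a, b = tour[i], tour[j]
            tour[i], tour[j] = b, a
            pos[b], pos[a] = i, j
            i, j = (i + 1) % n, (j - 1) % n

    def find_move(t1):
        for step in (1, -1):                     # t2 = successor, then predecessor
            t2 = tour[(pos[t1] + step) % n]
            d12 = d(t1, t2)
            for t3 in tour:                      # brute-force candidate scan
                if t3 == t1 or t3 == t2 or d(t2, t3) >= d12:
                    continue
                t4 = tour[(pos[t3] - step) % n]
                if t4 != t2 and d12 + d(t3, t4) - d(t1, t4) - d(t2, t3) > 1e-12:
                    return step, t2, t3, t4
        return None

    while queue:
        t1 = queue[0]
        move = find_move(t1)
        if move is None:
            dont_look[t1] = True                 # bit set to 1: leave the queue
            queue.popleft()
            continue
        step, t2, t3, t4 = move
        if step == 1:
            reverse(pos[t2], pos[t4])            # t1 t2..t4 t3 becomes t1 t4..t2 t3
        else:
            reverse(pos[t1], pos[t3])            # t2 t1..t3 t4 becomes t2 t3..t1 t4
        for c in (t2, t3, t4):                   # t1 is still queued with bit 0
            if dont_look[c]:
                dont_look[c] = False
                queue.append(c)
    return tour
```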

Tour Representations
Must maintain a consistent ordering of the tour so that the following operations can be correctly performed.
1. Next(a) and Prev(a): Return the successor/predecessor of city a in the current ordering of the tour.
2. Between(a,b,c): Report whether, if one starts at city a and proceeds forward in the current tour order, one will encounter b before c. (This will be needed for 3-opt.)
3. Flip(a,b,c,d): If b = Next(a) and c = Next(d), update the tour to reflect the 2-opt move in which the tour edges (a,b) and (c,d) are replaced by (b,c) and (a,d). Otherwise, report “Invalid Move”.
See [Fredman, Johnson, McGeoch, & Ostheimer, “Data structures for traveling salesmen,” J. Algorithms 18 (1995), 432-479].

Array Representation
Tour (array of city indices): a b c d e f g h i j k l m n o p q r s t u v w x y z
City (array of tour indices): for each city, its position in Tour
Next(c_i) = Tour[(City[i] + 1) mod N]
Prev(c_i) = Tour[(City[i] - 1) mod N] (analogous)
Between(c_i, c_j, c_k): (Straightforward)

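Array Representation: Flip
Tour:           abcdefghijklmnopqrstuvwxyz
Flip(f,g,q,p):  abcdefponmlkjihgqrstuvwxyz
Flip(x,y,d,c):  azydefponmlkjihgqrstuvwxcb
(Arguments are written in the order of the definition of Flip(a,b,c,d), with b = Next(a) and c = Next(d). The second flip reverses the shorter, wrapped-around segment y z a b c.)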

Array Representation: Costs
Next, Prev: O(1)
Between: Θ(N)
Flip: Θ(N)
Speed-up trick: If the segment to be flipped is longer than N/2, flip its complement.
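A minimal sketch of the array representation, assuming the Flip convention defined earlier (b = Next(a), c = Next(d)). The class name `ArrayTour` and its method names are made up for this sketch. This version answers Between by comparing stored positions, and Flip uses the speed-up trick of reversing whichever side is shorter.

```python
class ArrayTour:
    """Tour stored as an array of cities plus a city -> position index."""

    def __init__(self, cities):
        self.tour = list(cities)
        self.n = len(self.tour)
        self.city = {c: i for i, c in enumerate(self.tour)}

    def next(self, a):
        return self.tour[(self.city[a] + 1) % self.n]

    def prev(self, a):
        return self.tour[(self.city[a] - 1) % self.n]

    def between(self, a, b, c):
        """True if, going forward from a, we meet b no later than c."""
        ib = (self.city[b] - self.city[a]) % self.n
        ic = (self.city[c] - self.city[a]) % self.n
        return ib <= ic

    def _reverse(self, i, j):
        # Reverse the cyclic run of positions i, i+1, ..., j (indices mod n).
        for _ in range(((j - i) % self.n + 1) // 2):
            x, y = self.tour[i], self.tour[j]
            self.tour[i], self.tour[j] = y, x
            self.city[y], self.city[x] = i, j
            i, j = (i + 1) % self.n, (j - 1) % self.n

    def flip(self, a, b, c, d):
        """Replace tour edges (a,b) and (c,d) by (b,c) and (a,d)."""
        if self.next(a) != b or self.next(d) != c:
            raise ValueError("Invalid Move")
        i, j = self.city[b], self.city[d]        # the segment b..d
        if (j - i) % self.n + 1 > self.n // 2:
            # Segment is long: reverse the complementary segment c..a instead.
            i, j = self.city[c], self.city[a]
        self._reverse(i, j)
```

With this convention, reversing g..p in the alphabet tour is `flip('f', 'g', 'q', 'p')`. Flipping the complement yields the same cycle traversed in the opposite direction, which is why callers should use only Next, Prev, and Between to read the tour.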

Problem for Arrays
For random Euclidean instances, 2-opt performs Θ(N) moves and, even if we always flip the shorter segment, the average length of the segment being flipped grows roughly as Θ(N^0.7) [Bentley, 1992].
Doubly-linked lists suffer from the same problems.
Can we do better with other tour representations?
We can in fact do much better (theoretically). By representing the tour using a balanced binary tree, we can reduce the (amortized) time for Between and Flip to Θ(log N) per operation, although the times for Next and Prev increase from constant to that amount.
“Splay Trees” are especially useful in this context (and will be described in the next few slides).
Significant further improvements are unlikely, however:
Theorem [Fredman et al., 1995]. In the cell-probe model of computation, any tour representation must, in the worst case, take amortized time Ω(log N / log log N) per operation.

Binary Tree Representation Cities are contained in a binary tree, with a bit at each internal node to tell whether the subtree rooted at that node should be reversed. (Bits lower down in the tree will locally undo the effect of bits at their ancestors.) To determine the tour represented by such a tree, simply push the reversal bits down the tree until they all disappear. An inorder traversal of the tree will then yield the tour. (To push a reversal bit at node x down one level, interchange the two children of x, complement their reversal bits, and turn off the reversal bit at x.)
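The push-down rule and the inorder read-out can be sketched as below. This is only an illustration, assuming every node stores one city and carries its own reversal bit. The names `Node`, `push_down`, and `tour_of` are hypothetical.

```python
class Node:
    def __init__(self, city, left=None, right=None, rev=False):
        self.city = city
        self.left = left
        self.right = right
        self.rev = rev      # True: the subtree rooted here is to be read reversed

def push_down(x):
    """Push the reversal bit at x down one level, as described on the slide."""
    if x.rev:
        x.left, x.right = x.right, x.left          # interchange the two children
        for child in (x.left, x.right):
            if child is not None:
                child.rev = not child.rev          # complement their bits
        x.rev = False                              # turn off the bit at x

def tour_of(root):
    """Return the represented tour; clears all reversal bits as a side effect."""
    order = []

    def visit(x):
        if x is None:
            return
        push_down(x)
        visit(x.left)
        order.append(x.city)
        visit(x.right)

    visit(root)
    return order
```

For example, a root d with its bit set, whose left child is a subtree holding a, b, c with its own bit set and whose right child is e, represents the tour e, d, a, b, c: the inner bit undoes the outer reversal locally.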

Splay Trees [Sleator & Tarjan, “Self-adjusting binary search trees,” J. ACM 32 (1985), 652-686] Every time a vertex is accessed, it is brought to the root (splayed) by a sequence of rotations (local alterations of the tree that preserve the inorder traversal). Each rotation causes the vertex that is accessed to move upward in the tree, until eventually it reaches the root. The precise operation of a rotation depends on whether the vertex is the right or left child of its parent and whether the parent is the right or left child of its own parent. The change does not depend on any global properties of the subtrees involved, such as depth, etc. All the standard binary tree operations can be implemented to run in amortized worst-case time O(log(N)) using splays. In our Splay Tree tour representation, the process of splaying is made slightly more difficult by the reversal bits. We handle these by preceding each rotation by a step that pushes the reversal bits down out of the affected area. Neither the presence of the reversal bits nor the time needed to clear them affects the amortized time bound for splaying by more than a constant factor.

Splay Tree Tour Operations
Next(a):
1. Splay a to the root of the tree.
2. Traverse down the tree (taking account of reversal bits) to find the successor of a.
3. Splay the successor to the root.
Prev(a): Handled analogously.
Between(a,b,c):
1. Splay b to the root, then a, then c. Note that [Sleator & Tarjan, 1985] shows that no rotation for a vertex x causes any vertex to increase its depth by more than 2. Thus, after these splays, c is the root (level 1), a is no deeper than level 3, and b is no deeper than level 5. They also show that if a is at level 3, then it is either the left child of a left child or the right child of a right child.
2. Clear all the reversal bits from the top 5 levels of the tree.
3. Traverse upward from b in its new position in the tree.
4. The answer is yes if
   – we reach a first and arrive from the right, or
   – we reach c first and arrive from the left.
5. Otherwise, it is no.


Splay Tree Flip(a,b,c,d)
Splay d to the root, then splay b to the root, and push all reversal bits down out of the top three levels.
There are four possibilities (T_i^R represents the subtree T_i with the reversal bit at its root complemented).
[Figure: the four possible arrangements of b, d, and, where present, an intermediate node x at the top of the tree. Depending on the case, the update reverses the path from b to d or the path from d to b.]

Speedups (Lose theoretical guarantees for better performance in practice) No splays for Next and Prev – simply do tree traversals, taking into account the reversal bits. No need to splay b in the Between operation. Instead simply splay a and c, and then traverse up from b until either a or c is encountered (as before). Operation of Flip unchanged. Yields about a 30% speedup.

Advantages of Splay Trees Ease of implementing Flip compared to other balanced binary tree implementations. “Self-Organizing” properties: Cities most involved in the action stay relatively close to the root. And since typically most cities drop out of the action (get their don’t-look-bits set to 1 permanently) fairly early, this can significantly reduce the time per operation. Splay trees start beating arrays for random Euclidean instances on modern computers somewhere between N = 100,000 and N = 316,000. They are 40% faster when N = 1,000,000. For more sophisticated algorithms, like Lin-Kernighan (to be discussed later), the transition point is much earlier: Splay trees are 13 times faster when N = 100,000.

Beating Splay Trees in Practice: The Two-Level Tree
Approximately √N segments of length √N each.

Splay Trees versus Two-Level Trees
Two-level trees are 2-3 times faster for N = 10,000 (not counting preprocessing time), with the advantage declining to 11% at N = 1,000,000.
But does this matter? In 1995, the time for N = 100,000 was 3 minutes versus 5 (Lin-Kernighan). Today it is 2.1 seconds versus 3.8.
What is this “preprocessing”? We switched implementations in order to be able to compare tour representations. See next slide.

The Neighbor-List Implementation
Can handle non-geometric instances:
– TSP in graphs
– X-ray crystallography
– Video compression
– Converted versions of asymmetric TSP instances
Can exploit geometry when it is present.
Because of the trade-offs it makes, it may be 0.4% worse for 2-opt than Bentley’s purely geometric implementation, but it will be substantially faster for sophisticated algorithms like Lin-Kernighan, which otherwise would perform large numbers of fixed-radius searches.

The Neighbor-List Implementation
Basic idea: Precompute, for each city, a list of the k closest other cities, ordered by increasing distance, and store the corresponding distances.
– If we set k = N, we should find tours as good as Bentley’s geometric code, but would take Θ(N^2 log N) preprocessing time and Θ(N^2) space.
– Tradeoff: Take much smaller k (default is k = 20).
– For geometric instances, with a k-d tree constructed, we can compute the list for a given city in time “typically” O(log N + k log k).
– No longer need to do a fixed-radius search for t3 candidates. Merely examine cities on the list for city t2 in order until a city x with d(t2,x) > d(t1,t2) is reached.
– As soon as we find an improving move for a given t1, we perform it and go on to the next choice for t1 (first choice of an improving move rather than best, although given our ordering of t3 candidates, it should tend to be better than a random improving move).
– Requires Θ(kN) space, but this is not a problem on modern computers.
– Also allows variants on the make-up of the neighbor-list that might be useful for non-uniform geometric instances.
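A sketch of the neighbor-list idea, assuming planar points. The lists are built here by brute-force sorting rather than with a k-d tree, so the preprocessing shown is quadratic and meant only to illustrate the data layout. The names `neighbor_lists` and `t3_candidates` are hypothetical.

```python
import math

def neighbor_lists(pts, k=20):
    """For each city, the k closest other cities as (distance, city) pairs,
    ordered by increasing distance. Brute force; a k-d tree would be used
    in practice for geometric instances."""
    n = len(pts)
    lists = []
    for c in range(n):
        ranked = sorted((math.dist(pts[c], pts[x]), x) for x in range(n) if x != c)
        lists.append(ranked[:k])
    return lists

def t3_candidates(t1, t2, lists, pts):
    """Scan t2's neighbor list in order, stopping at the first city that is
    not strictly closer to t2 than t1 is (so t1 itself is never returned)."""
    d12 = math.dist(pts[t1], pts[t2])
    out = []
    for dist_x, x in lists[t2]:
        if dist_x >= d12:
            break
        out.append(x)
    return out
```

Because each list is sorted and the distances are stored, the scan stops after a handful of entries and never computes a distance for a city it rejects.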

Problem with Non-Uniform Geometric Instances Even if k = 80, the nearest neighbor graph (with an edge between two cities if either is on the other’s nearest neighbor list) is not connected.

Quad Neighbors
Pick k/4 nearest neighbors in each quadrant centered at our city c.
If any quadrants have a shortfall, bring the total to k by adding the nearest remaining unselected cities, irrespective of quadrant.
This guarantees that the graph of nearest neighbors will be connected.
For N = 10,000 clustered instances, yielded a 1-3% improvement in tours under 2-opt, with no running time penalty (and no tour penalty for uniform data).
[Figure: quad neighbors, k = 16]
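The quadrant rule can be sketched as follows, again with a brute-force scan standing in for k-d tree queries. The function name `quad_neighbors` and the treatment of points lying exactly on a quadrant boundary are assumptions made for this sketch.

```python
import math

def quad_neighbors(c, pts, k=16):
    """Neighbor list for city c: up to k/4 nearest cities per quadrant around
    c, topped up to k with the nearest remaining cities from any quadrant."""
    cx, cy = pts[c]
    quads = [[], [], [], []]
    for x, (px, py) in enumerate(pts):
        if x == c:
            continue
        # Assumed tie rule: points on an axis go to the >= side.
        q = (0 if px >= cx else 1) + (0 if py >= cy else 2)
        quads[q].append((math.dist(pts[c], pts[x]), x))
    chosen, leftover = [], []
    for members in quads:
        members.sort()
        chosen += members[:k // 4]
        leftover += members[k // 4:]
    leftover.sort()
    chosen += leftover[:k - len(chosen)]     # fill any shortfall
    chosen.sort()                            # final list by increasing distance
    return [x for _, x in chosen]
```

On a city sitting beside a dense cluster, the plain k-nearest list would contain only cluster members, while this list always reaches into each non-empty quadrant.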

One More Thing… Starting Tours
N = 10,000 [Bentley, 1992]

Starting Tour          % Excess over HK   2-opt % Excess   Start Secs   2-opt Secs   Total Secs
Farthest Insertion     13.0               11.9             76           89           165
Farthest Addition+     13.2               11.8             38           52           90
Random Insertion       14.8               12.3             57           72           129
Random Addition        15.2               11.8             16           31           47
Approx. Christofides   14.9               6.7              24           40           64
Greedy                 15.7               5.8              14           30           44
Nearest Neighbor       24.2               8.7              4            27           31

Similar results for Savings under the neighbor-list implementation:
Savings % Excess over HK: 11.8; 2-Opt % Excess with Savings Start: 8.6

Explanation?
1000 runs on a fixed 1000-city instance using randomized versions of Greedy and Savings.
X-axis is % excess for starting tour. Y-axis is % excess after 2-opting.

Estimating Running-Time Growth Rate for 2-Opt (Neighbor-List Implementation)
[Figure: three plots of normalized running time: Microseconds/N, Microseconds/N^1.25, and Microseconds/(N log N)]

Beyond 2-Opt
3-Opt: Look for improving 3-opt moves, where three edges are deleted and we choose the cheapest way to reconnect the segments into a tour. [Includes 2-opt moves as a special case. Naïve implementation is O(N^3) to find an improving move or confirm that none exists.]
2.5-Opt [Bentley, 1992]: When doing a ball search about t2 to find a potential t3 with d(t2,t3) < d(t1,t2), also consider the following three other possible moves:
– Insert t3 in the middle of edge {t1,t2},
– Insert t1 in the middle of the tour edge ending with t3, or
– Insert t1 in the middle of the tour edge beginning with t3.
Note that these are degenerate 3-opt moves.
Or-Opt [Or, 1976]: Special case of 3-opt in which the moves are restricted to simply deleting a chain of 1, 2, or 3 consecutive tour vertices and inserting it elsewhere in the tour, possibly in the reverse direction for chains of 3 vertices. (Time O(N^2) to find an improving move or confirm that none exists.)
But the next theorem suggests that 3-Opt need not take Ω(N^3) in practice.
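To make the Or-opt restriction concrete, here is a deliberately simple sketch that looks for one improving chain move. It assumes planar points and recomputes the full tour length for every candidate, so it is far slower than the O(N^2) bound quoted above. It also tries the reversed chain for every chain length, which is a simplification of this sketch. The names `tour_length` and `or_opt_move` are hypothetical.

```python
import math

def tour_length(tour, pts):
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def or_opt_move(tour, pts):
    """Return a shorter tour obtained by moving one chain of 1, 2, or 3
    consecutive cities elsewhere (in either direction), or None if no such
    move improves the tour."""
    n = len(tour)
    base = tour_length(tour, pts)
    for length in (1, 2, 3):
        if n < length + 3:
            continue
        for i in range(n):
            rot = tour[i:] + tour[:i]            # rotate so the chain comes first
            chain, rest = rot[:length], rot[length:]
            # Position p puts the chain between rest[p-1] and rest[p]; the
            # chain's original place (between rest[-1] and rest[0]) is skipped.
            for p in range(1, len(rest)):
                for seg in (chain, chain[::-1]):
                    cand = rest[:p] + seg + rest[p:]
                    if tour_length(cand, pts) < base - 1e-12:
                        return cand
    return None
```

Calling `or_opt_move` repeatedly until it returns None yields a tour that is locally optimal with respect to these chain moves.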

Partial Sum Theorem
If a sequence $x_1, x_2, \ldots, x_k$ has a positive sum $S > 0$, then there is a cyclic permutation $\pi$ of these numbers, all of whose prefix sums are positive, that is, for all $j$, $1 \le j \le k$, it satisfies
$$\sum_{i=1}^{j} x_{\pi(i)} > 0.$$
Proof: Suppose that our original sequence does not satisfy this constraint. Let $M$ denote the largest value such that $\sum_{i=1}^{j} x_i = -M$ for some $j$, and let $h$ be the largest $j$ such that this holds. We claim that the cyclic permutation that starts with $x_{h+1}$ is our desired permutation. By the maximality of $h$, we must have, for all $j$, $h < j \le k$, $\sum_{i=1}^{j} x_i > -M$, and hence $\sum_{i=h+1}^{j} x_i > 0$. In particular, $\sum_{i=h+1}^{k} x_i = S + M$. Since, by definition of $M$, we have $\sum_{i=1}^{j} x_i \ge -M$ for all $j$, $1 \le j \le h$, the remaining prefix sums are at least $(S + M) - M = S > 0$, so our chosen permutation will have all its prefix sums positive.
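The proof is constructive, and the construction is short enough to write out. The sketch below (the function name `positive_rotation` is made up) finds the last position where the prefix sum reaches its minimum and starts the rotation just after it. Exact arithmetic such as integers is assumed, so that comparing prefix sums for equality is safe.

```python
from itertools import accumulate

def positive_rotation(xs):
    """Return a cyclic rotation of xs whose prefix sums are all positive.

    Requires sum(xs) > 0. Follows the proof: rotate to start just after the
    last index h at which the prefix sum attains its minimum value -M.
    """
    xs = list(xs)
    if sum(xs) <= 0:
        raise ValueError("the sequence must have a positive sum")
    prefix = list(accumulate(xs))
    low = min(prefix)
    if low > 0:
        return xs                      # the original order already works
    # 0-based index of the last minimum; it cannot be the final index,
    # because the final prefix sum is S > 0 >= low.
    h = max(j for j, p in enumerate(prefix) if p == low)
    return xs[h + 1:] + xs[:h + 1]
```

For the sequence 3, -5, 2, -1, 4 the prefix sums are 3, -2, 0, -1, 3, so the rotation starts after the second element and gives 2, -1, 4, 3, -5 with prefix sums 2, 1, 5, 8, 3.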

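[Figure: prefix sums plotted against position, with levels +M, 0, and -M marked and positions 1, h, and k on the horizontal axis; the rotated sequence runs from π(1) to π(k).]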

Topological Issues
(G* will be the value of the best move found so far.)
For each t1 in our neighbor-list implementation, we perform the first improving move found unless it is a 2-opt move, in which case we take the first extension found to a better 3-opt move, and if none is found, perform the 2-opt move.

Topology
Valid choices for t5 are circled, for the cases where t4 precedes t3 (left) or follows t3 (right). [Note: One choice of t6 in the left case, two choices in the right.]
The Between(a,b,c) operation is needed in the second case to tell us which t5’s are valid. (Omitting this case costs about 0.2% in tour quality.)

If G* > 0, this is more restrictive than the Theorem allows, but we’ve already found an improving move for this t1 and so can afford to be aggressive. This is a speed-up trick from [Lin & Kernighan, 1973].
In neighbor-list implementation, perform move and go to next t1.
In neighbor-list implementation, if G* > 0, the current choices of t2, t3, t4 must represent an improving 2-opt move. Perform it and go to next t1.

Results
Tour quality for Neighbor-List 3-opt with k = 20 is equivalent to that for Bentley’s geometric 3-opt (as opposed to 0.4% behind for 2-opt).

Neighbor List Results (2-Level Tree Tour Representation), values listed in order of increasing N = 10^3, 10^4, 10^5, 10^6:
Algorithm    Measure         Values
2-Opt [20]   % Excess        4.9, 5.0, 4.9
             150 MHz Secs*   0.32, 3.8, 56.7, 928
3-Opt [20]   % Excess        3.1, 3.0
             150 MHz Secs*   3.8, 4.6, 66.1, 1054
*Roughly half of the time is spent generating neighbor lists and the starting tour.
Time on 3.06 GHz Intel Core i3 processor at N = 10^6: 25.4 sec (2-opt), 29.5 sec (3-opt)

Next Up 4-Opt Lin-Kernighan and beyond….
