Published by Ally Beavers. Modified over 2 years ago.

1
Algorithmic Frontiers of Doubling Metric Spaces
Robert Krauthgamer, Weizmann Institute of Science
Based on joint works with Yair Bartal, Lee-Ad Gottlieb, Aryeh Kontorovich

2
The Traveling Salesman Problem: Low-dimensionality implies PTAS
Robert Krauthgamer, Weizmann Institute of Science
Joint work with Yair Bartal and Lee-Ad Gottlieb

3
Traveling Salesman Problem (TSP)
- Definition: Given a set of cities (points), find a minimum-length tour that visits all points
- Classic, well-studied NP-hard problem [Karp'72; Papadimitriou-Vempala'06]
- Mentioned in a handbook from 1832!
- Common benchmark for optimization methods
- Many books devoted to TSP
- Numerous variants:
  - Closed/open tour
  - Multiple tours
  - Average visit time (repairman)
  - Etc.
[Figure: optimal tour]

4
Metric TSP
[Figure: minimum spanning tree (MST)]
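The body of this slide did not survive extraction; only the MST label remains. One standard way the MST relates to metric TSP is the tree-doubling heuristic, sketched below as an assumption about what the figure illustrated, not as the speaker's construction. The function names `mst_preorder_tour` and `tour_length` are hypothetical. Visiting the MST in preorder shortcuts a doubled tree walk, so by the triangle inequality the tour costs at most twice the MST weight, and the MST weight is itself at most the optimal tour length.

```python
import math

def mst_preorder_tour(points, dist=math.dist):
    """Build an MST with Prim's algorithm, then return its preorder as a tour.

    Returns (tour, mst_weight), where tour is a list of point indices.
    """
    n = len(points)
    parent = [None] * n
    best = [math.inf] * n          # cheapest known edge into the tree
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    mst_weight = 0.0
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    # Preorder traversal = doubled tree walk with repeated vertices skipped.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour, mst_weight

def tour_length(points, tour, dist=math.dist):
    """Length of the closed tour that visits points in the given index order."""
    n = len(tour)
    return sum(dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n))
```

Only the triangle inequality is used, so the same sketch applies to any metric passed in as `dist`.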

5
Euclidean TSP
- Sanjeev Arora [JACM'98] and Joe Mitchell [SICOMP'99]: Euclidean TSP with fixed dimension admits a PTAS
  - Find a $(1+\varepsilon)$-approximate tour
  - In time $n \cdot (\log n)^{\varepsilon^{-\tilde{O}(\mathrm{dimension})}}$, where $n$ = #points
  - (Extends to other norms)
- They were awarded the 2010 Gödel Prize for this discovery

6
PTAS Beyond Euclidean?
- To achieve a PTAS, two properties were assumed:
  - Euclidean space (at least approximately)
  - Fixed dimension
- Are both these assumptions required?
  - Fixed dimension is necessary: no PTAS for $(\log n)$ dimensions unless P=NP [Trevisan'00]
  - Is Euclidean necessary?
- Consider metric spaces with low intrinsic dimension (not necessarily Euclidean)…

7
Doubling Dimension
- Definition: Ball $B(x,r)$ = all points within distance $r$ from $x$
- The doubling constant $\lambda(M)$ of a metric $M$ is the minimum value $\lambda$ such that every ball can be covered by $\lambda$ balls of half the radius
  - First used by [Assouad'83], algorithmically by [Clarkson'97]
- The doubling dimension is $\mathrm{ddim}(M) = \log_2 \lambda(M)$ [Gupta-K.-Lee'03]
- $M$ is called doubling if its doubling dimension is constant
- Packing property of doubling spaces: a set with diameter $D > 0$ and inter-point distance $\ge a$ contains at most $(D/a)^{O(\mathrm{ddim})}$ points
[Figure: a ball covered by balls of half the radius; here $\lambda \le 7$]
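To make the covering definition concrete, here is a minimal sketch for a finite point set. It greedily covers one ball by closed half-radius balls centered at data points, so it yields only an upper bound on the number of half-radius balls needed, and only for the radii tried. The helper names `half_radius_cover` and `doubling_constant_upper_bound` are hypothetical.

```python
import math

def half_radius_cover(points, center, r, dist=math.dist):
    """Greedily cover B(center, r) by closed balls of radius r/2.

    Centers are chosen among the data points inside the ball; a new
    half-radius ball is opened only when a point is not yet covered.
    """
    ball = [p for p in points if dist(p, center) <= r]
    centers = []
    for p in ball:
        if all(dist(p, c) > r / 2 for c in centers):
            centers.append(p)
    return centers

def doubling_constant_upper_bound(points, radii, dist=math.dist):
    """Largest greedy cover size over all data-point centers and given radii."""
    return max(
        len(half_radius_cover(points, x, r, dist))
        for x in points
        for r in radii
    )
```

The greedy centers are pairwise more than $r/2$ apart, so by the packing property their number is itself bounded by $2^{O(\mathrm{ddim})}$; the estimate is therefore loose by at most that kind of factor.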

8
Applications of Doubling Dimension
- Nearest neighbor search [K.-Lee'04; HarPeled-Mendel'06; Beygelzimer-Kakade-Langford'06; Cole-Gottlieb'06]
- Spanners, routing [Talwar'04; Kleinberg-Slivkins-Wexler'04; Abraham-Gavoille-Goldberg-Malkhi'05; Konjevod-Richa-Xia-Yu'07; Gottlieb-Roditty'08; Elkin-Solomon'12]
- Distance oracles [HarPeled-Mendel'06; Bartal-Gottlieb-Roditty-Kopelowitz-Lewenstein'11]
- Dimension reduction [Bartal-Recht-Schulman'11; Gottlieb-K.'11]
- Machine learning and statistics [Bshouty-Li-Long'09; Gottlieb-Kontorovich-K.'10,'12]
[Figure: weighted graphs G and H]

9
PTAS for Metric TSP?
- Does TSP on doubling metrics admit a PTAS?
  - Arora and Mitchell made strong use of Euclidean properties
  - "Most fascinating problem left open in this area" [James Lee, tcsmath blog, June '10]
- Some attempts:
  - Quasi-PTAS [Talwar'04] (first description of the problem)
  - Quasi-PTAS for TSP with neighborhoods [Mitchell'07; Chan-Elbassioni'11]
  - Subexponential-TAS, under a weaker assumption [Chan-Gupta'08]
- Our result: TSP on doubling metrics admits a PTAS
  - Find a $(1+\varepsilon)$-approximate tour
  - In time $n^{2^{O(\mathrm{ddim})}} \cdot 2^{\varepsilon^{-\tilde{O}(\mathrm{ddim})} \cdot 2^{O(\mathrm{ddim}^2)} \cdot \log^{1/2} n}$
  - Euclidean (to compare): $n \cdot (\log n)^{\varepsilon^{-\tilde{O}(\mathrm{dimension})}}$
- Throughout, think of ddim and $\varepsilon$ as constants

10
Metric Partition
- A quadtree-like hierarchy [Bartal'96, Gupta-K.-Lee'03, Talwar'04]
- At level $i$:
  - Centers are $2^i$-apart, taken in arbitrary order
  - Random radii $R_i \in [2^i, 2 \cdot 2^i]$
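A single level of such a partition can be sketched as follows. This is an illustrative reading of the slide, with several assumptions: centers are picked greedily so that they are at least $2^i$ apart, one random radius is drawn for the whole level (the talk may draw radii per ball), and each point joins the first center, in the arbitrary order, whose ball contains it. The function name `partition_level` is hypothetical.

```python
import math
import random

def partition_level(points, i, rng, dist=math.dist):
    """Partition points at scale 2**i using random-radius balls.

    Returns (centers, radius, clusters), where clusters maps a center
    index to the list of points assigned to that center.
    """
    scale = 2.0 ** i
    # Greedy net: centers are pairwise at least `scale` apart, and every
    # point lies within distance < scale of some center.
    centers = []
    for p in points:
        if all(dist(p, c) >= scale for c in centers):
            centers.append(p)
    # Assumption: a single random radius in [2^i, 2*2^i] for this level.
    radius = rng.uniform(scale, 2 * scale)
    clusters = {k: [] for k in range(len(centers))}
    for p in points:
        # Centers are scanned in order; the first ball containing p wins.
        for k, c in enumerate(centers):
            if dist(p, c) <= radius:
                clusters[k].append(p)
                break
    return centers, radius, clusters
```

Because the radius is at least the net scale, every point falls into some ball, so the clusters always form a partition of the input.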

11
Metric Partition (2)
- A quadtree-like hierarchy [Bartal'96, Gupta-K.-Lee'03, Talwar'04]
- Recursively to level $i-1$:
  - Random radii $R_{i-1} \in [2^{i-1}, 2 \cdot 2^{i-1}]$
- Caveat: $\log(n)$ hierarchical levels suffice
  - Ignore tiny distances $< 1/n^2$

12
Dense Areas
- Key observation: The points (metric space) can be decomposed into sparse areas
- Call a level-$i$ ball "dense" if the local tour weight (i.e. inside the $R_i$-ball) is $\ge R_i/\varepsilon$
- Such a ball can be removed, solving each sub-problem separately
- Cost to join tours is relatively small: only $R_i$

13
Sparsification
- Sparse decomposition: Search the hierarchy bottom-up for dense balls
- Remove a dense ball:
  - The ball is composed of $2^{O(\mathrm{ddim})}$ sparse sub-balls
  - So it is barely dense, i.e. local tour weight $\le 2^{O(\mathrm{ddim})} R_{i-1}/\varepsilon$
- Recurse on the remaining point set
- But how do we know the local weight of the tour in a ball?
  - Can be estimated using the local MST
  - Modulo caveats like "long" edges…
  - $\mathrm{OPT}_{B(u,R)} \le O(\mathrm{MST}(S))$
  - $\mathrm{OPT}_{B(u,3R)} \ge \Omega(\mathrm{MST}(S)) - \varepsilon^{-O(\mathrm{ddim})} R$
- Henceforth, we assume the input is sparse
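The density test can be sketched with the local MST as a stand-in for the local tour weight. This is a simplification under stated assumptions: it ignores the "long edges" caveat and the constant factors in the two MST bounds, and the names `mst_weight` and `is_dense` are hypothetical.

```python
import math

def mst_weight(points, dist=math.dist):
    """Weight of a minimum spanning tree of the points (Prim's algorithm)."""
    if len(points) < 2:
        return 0.0
    # best[k] = cheapest edge from point k to the tree built so far.
    best = {k: dist(points[0], points[k]) for k in range(1, len(points))}
    total = 0.0
    while best:
        k = min(best, key=best.get)
        total += best.pop(k)
        for j in best:
            best[j] = min(best[j], dist(points[k], points[j]))
    return total

def is_dense(points, center, radius, eps, dist=math.dist):
    """Proxy test: is the MST weight inside B(center, radius) >= radius/eps?"""
    local = [p for p in points if dist(p, center) <= radius]
    return mst_weight(local, dist) >= radius / eps
```

A bottom-up sparsification would call `is_dense` on the balls of each level, from the smallest scale upward, and cut out the first dense ball it finds.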

14
Light Tours
- Definition: A tour is $(m,r)$-light on a hierarchy if it enters all cells (clusters)
  - At most $r$ times, and
  - Only via $m$ designated portals
- Choose portals as $(2^i/M)$-net points
  - Then $m = M^{O(\mathrm{ddim})}$
[Figure label: $2^{i-1}/M$]
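The "at most $r$ times" half of the definition can be checked directly by counting the tour edges that cross each cluster boundary. The sketch below covers only that half and deliberately ignores the portal condition; `crossings` and `is_r_light` are hypothetical names, and the tour is a closed sequence of point identifiers.

```python
def crossings(tour, cluster):
    """Number of tour edges with exactly one endpoint inside the cluster."""
    cluster = set(cluster)
    n = len(tour)
    return sum(
        (tour[k] in cluster) != (tour[(k + 1) % n] in cluster)
        for k in range(n)
    )

def is_r_light(tour, clusters, r):
    """True if the closed tour crosses every cluster boundary at most r times."""
    return all(crossings(tour, c) <= r for c in clusters)
```

For the tour `[0, 1, 2, 3, 4, 5]`, the cluster `{1, 2}` is crossed twice (entering at 1, leaving after 2), while `{1, 3}` is crossed four times because the tour enters and leaves it twice.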

15
Optimizing over Light Tours
- Theorem [Arora'98, Talwar'04]: Given a hierarchical partition, a minimum-length $(m,r)$-light tour for it can be computed exactly
  - In time $m^{r \cdot O(\mathrm{ddim})} \cdot n \log n$
  - Via dynamic programming
  - Join tours for small clusters into a tour for the larger cluster
- Typically both $m, r \approx \mathrm{polylog}(n/\varepsilon)$, thus $m^r \approx n^{\mathrm{polylog}\, n}$

16
Better Partitions and Lighter Tours
- Our Theorem: For every (optimal) tour $T$, there is a partition with an $(m,r)$-light tour $T'$ such that
  - $M = \mathrm{ddim} \cdot \log n / \varepsilon$
  - $m = M^{O(\mathrm{ddim})} = (\log n/\varepsilon)^{\tilde{O}(\mathrm{ddim})}$
  - $r = \varepsilon^{-O(\mathrm{ddim})} \log\log n$
  - And $\mathrm{length}(T') \le (1+\varepsilon) \cdot \mathrm{length}(T)$
- If the partition were known, then a tour like $T'$ could be found in time $m^{r \cdot O(\mathrm{ddim})} \cdot n \log n = n \cdot 2^{\varepsilon^{-\tilde{O}(\mathrm{ddim})} \log\log^2 n}$
- It remains to prove the Theorem, and show how to find the partition
[Callouts: "Now $m^r \approx \mathrm{poly}(n)$"; "a bit later"; "after that"]

17
Constructing Light Tours
[Figure label: $2^{i-1}/M$]

18
Constructing Light Tours (2)
- Modify a tour to be $(m,r)$-light [Arora'98, Talwar'04]
- Part II: Focus on $r$ (i.e. number of crossing edges)
- Reduce the number of crossings
  - Patching step: Reroute (almost all) crossings back into the cluster
  - Cost ≈ length of a tour on the patched endpoints ≈ MST of these points
- MST Theorem [Talwar'04]: For a set $S$ of points, $\mathrm{MST}(S) \le \mathrm{diam}(S) \cdot |S|^{1-1/\mathrm{ddim}}$
  - Cost per point $\le \mathrm{diam}(S) / |S|^{1/\mathrm{ddim}}$
[Figure label: diam(S)]

19
Constructing Light Tours (3)
- Modify a tour to be $(m,r)$-light [Arora'98, Talwar'04]
- Part II: Focus on $r$ (i.e. number of crossing edges)
- Reduce the number of crossings
- Expected cost to an edge at level $i-1$:
  - Radius $R_{i-1} \approx 2^{i-1}$
  - $\Pr[\text{edge is patched}] \le \Pr[\text{edge is cut}]$
  - Expected cost $\le (R_{i-1}/r^{1/\mathrm{ddim}}) \cdot (\mathrm{ddim}/R_{i-1}) = \mathrm{ddim}/r^{1/\mathrm{ddim}}$
- As before, want this to be $\le \varepsilon/\log n$ (because we sum over $\log n$ levels)
  - Could take $r = (\mathrm{ddim} \cdot \log n/\varepsilon)^{\mathrm{ddim}}$
  - But the dynamic program runs in time $m^r$: QPTAS! [Talwar'04]
- Challenge: a smaller value for $r$
[Figure label: $2R_{i-1}$]
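The choice of $r$ on this slide follows from solving the per-level budget for $r$. A short worked derivation, with constants suppressed and assuming $m = (\log n/\varepsilon)^{O(\mathrm{ddim})}$ as in the light-tour definition:

$$
\frac{\mathrm{ddim}}{r^{1/\mathrm{ddim}}} \le \frac{\varepsilon}{\log n}
\iff r^{1/\mathrm{ddim}} \ge \frac{\mathrm{ddim} \cdot \log n}{\varepsilon}
\iff r \ge \left(\frac{\mathrm{ddim} \cdot \log n}{\varepsilon}\right)^{\mathrm{ddim}}.
$$

For constant ddim and $\varepsilon$, both $m$ and $r$ are then $\mathrm{polylog}(n)$, so

$$
m^{r} = 2^{r \log m} = 2^{\mathrm{polylog}(n)},
$$

which is quasi-polynomial rather than polynomial in $n$.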

20
Patching in Sparse Areas
- Suppose a tour is $q$-sparse with respect to the hierarchy
  - Every $R$-ball contains weight $qR$ (for all $R = 2^i$)
  - Expectation: a random $R$-ball cuts weight $Rq/R = q$
- A cluster is formed by cuts from many levels
  - Expectation: weight $q$ is cut per level
- If $r = q \cdot 2\log\log n$:
  - Expectation: level $i-1$ patching includes edges cut at much higher levels
  - Charge only the "top" half of the patched edges
  - Each is charged about $2R_{i-1}$
- $\Pr[\text{edge is charged for patching}] \le \Pr[\text{edge is cut at level } i + \log\log n] \le \mathrm{ddim}/(R_{i-1} \log n)$
[Figure label: $R_{i-1}/M$]
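The extra $\log n$ factor in the last bound comes from the radius at the higher level. A worked version, assuming (as in the level-$(i-1)$ calculation for general tours) that an edge is cut at level $j$ with probability at most $\mathrm{ddim}/R_j$, that $R_j \in [2^j, 2 \cdot 2^j]$, and that logarithms are base 2:

$$
\Pr[\text{cut at level } i + \log\log n]
\le \frac{\mathrm{ddim}}{R_{i+\log\log n}}
\le \frac{\mathrm{ddim}}{2^{\,i+\log\log n}}
= \frac{\mathrm{ddim}}{2^{i} \log n}
\le \frac{\mathrm{ddim}}{R_{i-1} \log n},
$$

where the last step uses $R_{i-1} \le 2 \cdot 2^{i-1} = 2^{i}$.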

21
Wrapping Up (Patching Sparse Areas)
- Modify a tour to be $(m,r)$-light [Arora'98, Talwar'04]
- Part II: Focus on $r$ (i.e. number of crossing edges)
- Reduce the number of crossings
- Expected cost at level $i-1$:
  - Expected cost $\le (R_{i-1}/r^{1/\mathrm{ddim}}) \cdot (\mathrm{ddim}/(R_{i-1} \log n)) = \mathrm{ddim}/(\log n \cdot r^{1/\mathrm{ddim}})$
- As before, want this term to be equal to $\varepsilon/\log n$
  - Take $r = (\mathrm{ddim}/\varepsilon)^{\mathrm{ddim}}$
  - Obtain a PTAS!
[Figure label: $2R_{i-1}$]

22
Technical Subtleties
- Outstanding problem: The previous analysis assumed a ball cuts only $q$ edges
  - True in expectation…
  - Not good enough
- Solution: try many hierarchies
  - Choose at random $\log n$ radii for each ball and try all their combinations!
  - WHP, some hierarchy cuts $q$ edges in every ball
  - Drives up the runtime of the dynamic program
[Figure label: $R_{i-1}/M$]

23
Algorithmic Frontiers of Doubling Metrics
Robert Krauthgamer, Weizmann Institute of Science
Joint work with Lee-Ad Gottlieb and Aryeh Kontorovich

24
Large-margin classification in metric spaces [vonLuxburg-Bousquet'04]
- Unknown distribution $D$ of labeled points $(x,y) \in M \times \{-1,1\}$
  - $M$ is a metric space (generalizes $\mathbb{R}^{\dim}$)
  - Labels are $L$-Lipschitz: $|y_i - y_j| \le L \cdot d(x_i, x_j)$ (generalizes margin)
- Resource: Sample of labeled points
- Goal: Build a hypothesis $f: M \to \{-1,1\}$ that has $(1-\varepsilon)$-agreement with $D$
  - Statistical complexity: How many samples are needed?
  - Computational complexity: Running time?
- Extensions:
  - A small fraction of labels are wrong (adversarial noise)
  - Real-valued labels $y \in [-1,1]$ (metric regression)
[Slide label: Machine Learning in Doubling Metrics]
[Figure: hypothesis $f$, label +1, margin $2/L$]

25
Generalization Bounds
- Our approach: Assume $M$ is doubling and use generalized VC-theory [Alon-BenDavid-CesaBianchi-Haussler'97, Bartlett-ShaweTaylor'99]
  - Example: Earthmover distance (EMD) in the plane between sets of size $k$ has $\mathrm{ddim} \le O(k \log k)$
- Standard algorithm: pick a hypothesis that fits all/most observed samples
- Theorem: The class of $L$-Lipschitz functions has fat-shattering dimension $\mathrm{fsdim} \le (c \cdot L \cdot \mathrm{diam}(M))^{\mathrm{ddim}}$.
- Corollary: If $f$ is $L$-Lipschitz and classifies $n$ samples correctly, then WHP $\Pr_D[\mathrm{sgn}(f(x)) \ne y] \le O(\mathrm{fsdim} \cdot (\log n)^2 / n)$.
  - Similarly, if $f$ correctly classifies all but an $\eta$-fraction, then WHP $\Pr_D[\mathrm{sgn}(f(x)) \ne y] \le \eta + O(\mathrm{fsdim} \cdot (\log n)^2 / n)^{1/2}$.
- Bounds incomparable to [vonLuxburg-Bousquet'04]

26
Algorithmic Aspects (noise-free)
- Computing a hypothesis $f$ from the samples $(x_i, y_i)$: [formula not captured in this transcript], where $S^+$ and $S^-$ are the positively and negatively labeled samples
- Lemma (Lipschitz extension): If the labels are $L$-Lipschitz, so is $f$.
- Evaluating $f(x)$ requires solving Nearest Neighbor Search
  - Explains a common classification heuristic, e.g. [Cover-Hart'67]
  - But might require $\Omega(n)$ time…
- We show how to use $(1+\varepsilon)$-Nearest Neighbor Search
  - This can be solved quickly in doubling metrics
  - We prove a similar generalization bound by sandwiching $\mathrm{sgn}(f(x))$
[Figure: hypothesis $f$, label +1, query point "?"]
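The slide's formula for $f$ did not survive extraction. As an illustration only, the sketch below uses one standard Lipschitz extension (the McShane form $f(x) = \min_i \,(y_i + L \cdot d(x, x_i))$), which may differ from the formula on the slide. It agrees with the labels on the samples whenever the labels are $L$-Lipschitz, and it is itself $L$-Lipschitz. The names `lipschitz_extension` and `classify` are hypothetical.

```python
import math

def lipschitz_extension(samples, L, dist=math.dist):
    """Return f(x) = min_i (y_i + L * d(x, x_i)) over the labeled samples.

    `samples` is a list of (point, label) pairs with labels in {-1, +1}.
    Each evaluation scans all samples, i.e. it takes linear time.
    """
    def f(x):
        return min(y + L * dist(x, xi) for xi, y in samples)
    return f

def classify(f, x):
    """Predicted label: the sign of f(x), with ties broken toward +1."""
    return 1 if f(x) >= 0 else -1
```

With one negative sample at $(0,0)$, one positive sample at $(2,0)$ and $L = 1$, the query $(0.5, 0)$ gets $f = -0.5$ and is labeled $-1$. The brute-force scan is exactly the linear-time cost that approximate nearest neighbor search is meant to avoid.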

27
Extensions (noisy case)
1. A small fraction of labels are wrong (adversarial noise)
  - How to compute a hypothesis?
  - Build a bipartite graph (on $S^+ \cup S^-$) of all violations of the Lipschitz condition (an edge between two points at distance $< 2/L$)
  - Compute a minimum vertex cover (or faster: a 2-approximation)
2. Real-valued labels $y \in [-1,1]$ (metric regression)
  - Minimize risk (expected loss) $\mathbb{E}_{x,y} |f(x) - y|$
  - Extend the statistical framework by similar ideas
  - But how to compute a hypothesis?
  - Write an LP: minimize $\sum_i |f(x_i) - y_i|$ subject to $|f(x_i) - f(x_j)| \le L \cdot d(x_i, x_j)$ for all $i, j$
  - Reduce #constraints from $O(n^2)$ to $O(\varepsilon^{-\mathrm{ddim}} n)$ using a $(1+\varepsilon)$-spanner on the $x_i$'s
  - Apply a fast approximate LP solver
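The adversarial-noise step can be sketched as follows: build the violation graph between positive and negative samples, then take both endpoints of a greedily built maximal matching, which is the classical 2-approximation for vertex cover. The quadratic-time edge enumeration and the names `violation_edges` and `matching_vertex_cover` are illustrative assumptions, not the talk's implementation.

```python
import math

def violation_edges(pos, neg, L, dist=math.dist):
    """Pairs (i, j) where pos[i] and neg[j] are closer than 2/L."""
    threshold = 2.0 / L
    return [
        (i, j)
        for i, p in enumerate(pos)
        for j, q in enumerate(neg)
        if dist(p, q) < threshold
    ]

def matching_vertex_cover(edges):
    """2-approximate vertex cover: both endpoints of a maximal matching.

    Returns (cover_pos, cover_neg), the index sets of samples to discard.
    """
    cover_pos, cover_neg = set(), set()
    for i, j in edges:
        # The edge joins the matching only if both endpoints are still free.
        if i not in cover_pos and j not in cover_neg:
            cover_pos.add(i)
            cover_neg.add(j)
    return cover_pos, cover_neg
```

After the covered samples are discarded, every remaining positive/negative pair is at distance at least $2/L$, so the surviving labels are $L$-Lipschitz and the noise-free procedure applies to them.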

28
Conclusion
- General paradigm: low-dim. Euclidean spaces ↔ doubling metric spaces
  - Mathematically: the latter is a different (strictly bigger) family
    - Not even low-distortion embeddings [Laakso'00,'01]
  - For algorithmic efficiency: strong analogy/similarity
    - E.g., nearest neighbor search, distributed computing and networking, combinatorial optimization, machine learning
- Research directions:
  - Other computational tasks or application areas?
    - Particularly in machine learning, data structures
  - Scenarios where the analogy fails?
    - E.g. [Indyk-Naor'05], which uses random projections
  - Other metric models?
    - E.g. hyperbolic…
