Optimization in very large graphs
László Lovász
Eötvös Loránd University, Budapest
December 2012

How is the graph given?
- Graph is HUGE.
- Not known explicitly, not even the number of nodes.
Idealize: define minimum amount of info.

How is the graph given?
Dense case: $cn^2$ edges.
- We can sample a uniform random node a bounded number of times, and see edges between sampled nodes.
Bounded degree ($\le d$):
- We can sample a uniform random node a bounded number of times, and explore its neighborhood to a bounded depth.
“Property testing”: Goldreich-Goldwasser-Ron, Arora-Karger-Karpinski, Rubinfeld-Sudan, Alon-Fischer-Krivelevich-Szegedy, Fischer, Frieze-Kannan, Alon-Shapira

Algorithms for very large graphs
- Parameter estimation: edge density, triangle density, maximum cut
- Property testing: is the graph bipartite? triangle-free? perfect?
- Computing a structure: find a maximum cut, regularity partition, ...

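
To make the parameter-estimation task concrete, here is a minimal sketch of estimating edge density and triangle density from one sample in the dense model. The dictionary `adj` (node to set of neighbors) is a stand-in for the sampling oracle, and the function name and the choice of `k` are assumptions for illustration only.

```python
import random
from itertools import combinations
from math import comb

def estimate_densities(adj, k, rng):
    """Sample k distinct uniform random nodes and return the edge density
    and triangle density of the subgraph they induce (requires k >= 3)."""
    nodes = rng.sample(sorted(adj), k)
    # Count sampled pairs that are adjacent.
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    # Count sampled triples that span a triangle.
    triangles = sum(1 for a, b, c in combinations(nodes, 3)
                    if b in adj[a] and c in adj[a] and c in adj[b])
    return edges / comb(k, 2), triangles / comb(k, 3)
```

Only the edges among the sampled nodes are inspected, which matches the access allowed in the dense case.
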
The maximum cut problem
Maximize the number of edges between the two sides of a cut.
Applications: optimization, statistical mechanics, ...
- NP-hard, even with 6% error (Håstad)
- Polynomial-time computable with 13% error (Goemans-Williamson)

Max cut in dense graphs
How to estimate the density of the maximum cut? (Figure: a cut with many edges.)
Sampling $O(\varepsilon^{-4})$ nodes, the density of the max cut in the induced subgraph is, with high probability, closer than $\varepsilon$ to the density of the max cut in the whole graph. (Alon-Fernandez de la Vega-Kannan-Karpinski)

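
The sampling statement above can be sketched as follows: draw a small sample, solve max cut on the induced subgraph by brute force, and report its density. Normalizing the cut size by $k^2$ is an assumption here (the slide does not spell out the normalization), and `adj`, `max_cut_density` and `estimate_max_cut_density` are hypothetical names.

```python
import random
from itertools import combinations

def max_cut_density(nodes, adj):
    """Brute-force max cut of the subgraph induced by `nodes`,
    divided by k^2 where k = len(nodes) >= 1 (assumed normalization)."""
    nodes = list(nodes)
    k = len(nodes)
    edges = [(i, j) for i, j in combinations(range(k), 2)
             if nodes[j] in adj[nodes[i]]]
    best = 0
    # The last node is fixed on side 0, which halves the search space.
    for mask in range(1 << (k - 1)):
        cut = sum(1 for i, j in edges
                  if ((mask >> i) & 1) != ((mask >> j) & 1))
        best = max(best, cut)
    return best / k**2

def estimate_max_cut_density(adj, k, rng):
    """Estimate the max cut density of the whole graph from k sampled nodes."""
    sample = rng.sample(sorted(adj), k)
    return max_cut_density(sample, adj)
```

The brute-force step is exponential in the sample size only; the sample size depends on the target accuracy and not on the size of the graph.
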
Estimable graph parameters
A graph parameter $f$ can be estimated from samples iff
(a) $f(G(k))$ is convergent as $k \to \infty$, where $G(k)$ is obtained by blowing up every node of $G$ into $k$ nodes;
(b) if $V(G_n) = V(H_n) = [n]$ and $d_\square(G_n, H_n) \to 0$ (cut distance), then $f(G_n) - f(H_n) \to 0$;
(c) if $|V(G_n)| \to \infty$ and $v \in V(G_n)$, then $f(G_n) - f(G_n \setminus v) \to 0$.
(Borgs-Chayes-Lovász-Sós-Vesztergombi)
The density of the maximum cut is estimable.

Nondeterministic parameter estimation
Divine help: coloring nodes and edges, orienting edges.
- $L$: directed, (edge-)colored graph
- $L'$: forget orientation, delete some colors, forget coloring; the “shadow of $L$”
- $f$: parameter defined on colored oriented graphs
- $f'(G) = \max\{f(L) : L' = G\}$; the “shadow of $f$”
If $f$ is estimable, then $f'$ is estimable. (Lovász-Vesztergombi)

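
As an illustration that is not taken from the slide, the density of the maximum cut can be written as a shadow. Take $L = (G, c)$ with a node 2-coloring $c$ (the coloring plays the role of the divine help), and let $f$ be the density of edges whose endpoints get different colors; the normalization by $n^2$ is an assumption.

```latex
f(G, c) = \frac{1}{n^2}\,\bigl|\{\, uv \in E(G) : c(u) \ne c(v) \,\}\bigr|,
\qquad
f'(G) = \max_{c \,:\, V(G) \to \{1,2\}} f(G, c) = \frac{\mathrm{MaxCut}(G)}{n^2}.
```

Since $f$ is a subgraph density of the colored graph, it can be read off a sample, so under this reading the theorem would give estimability of the max cut density again.
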
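Testable graph properties
$\mathcal{P}$: graph property.
$\mathcal{P}$ is testable if for every $\varepsilon > 0$ there are a $k_0 \ge 1$ and a “test property” $\mathcal{P}'$ such that
(a) for every graph $G \in \mathcal{P}$ and every $k \ge k_0$, $\mathbb{G}(k,G) \in \mathcal{P}'$ with probability at least 2/3, and
(b) for every graph $G$ with $d_1(G, \mathcal{P}) > \varepsilon$ and every $k \ge k_0$, $\mathbb{G}(k,G) \in \mathcal{P}'$ with probability at most 1/3.
Here $\mathbb{G}(k,G)$ denotes the subgraph of $G$ induced by $k$ randomly sampled nodes.
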
Testable graph properties
Example: triangle-free.
Removal Lemma (Ruzsa-Szemerédi): for every $\varepsilon > 0$ there is an $\varepsilon' > 0$ such that if $t(K_3, G) < \varepsilon'$, then we can delete $\varepsilon n^2$ edges to get a triangle-free graph.
$G'$: sampled induced subgraph.
- $G'$ not triangle-free $\Rightarrow$ $G$ not triangle-free.
- $G'$ triangle-free $\Rightarrow$ with high probability, $G$ has few triangles.

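
The test described above can be sketched in a few lines: sample $k$ nodes and accept exactly when the induced subgraph has no triangle. The name `looks_triangle_free` and the dictionary `adj` are assumptions; how large $k$ must be for a given $\varepsilon$ comes from the Removal Lemma and is not computed here.

```python
import random
from itertools import combinations

def looks_triangle_free(adj, k, rng):
    """One-sided test: a triangle found in the sample is a triangle of G,
    so a False answer is always correct."""
    nodes = rng.sample(sorted(adj), k)
    for a, b, c in combinations(nodes, 3):
        if b in adj[a] and c in adj[a] and c in adj[b]:
            return False
    # No triangle among the sampled nodes: G has few triangles w.h.p.
    return True
```

A triangle-free graph is never rejected; a graph far from triangle-free is rejected with high probability once $k$ is large enough.
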
Testable graph properties
Every hereditary graph property is testable. (Alon-Shapira)
Hereditary: inherited by induced subgraphs.

Nondeterministically testable graph properties
Divine help: coloring nodes and edges, orienting edges.
- $L$: directed, (edge-)colored graph
- $L'$: forget orientation, delete some colors, forget coloring; the “shadow of $L$”
- $\mathcal{Q}$: property of directed, colored graphs
- $\mathcal{Q}' = \{G' : G \in \mathcal{Q}\}$; the “shadow of $\mathcal{Q}$”
$\mathcal{P}$ is nondeterministically testable if $\mathcal{P} = \mathcal{Q}'$, where $\mathcal{Q}$ is a testable property of colored directed graphs.

Nondeterministically testable graph properties
Examples:
- the maximum cut contains at least $n^2/100$ edges
- the graph contains a spanning subgraph with a testable property $\mathcal{P}$
- we can delete $n^2/100$ edges to get a graph with a testable property $\mathcal{P}$

Nondeterministically testable graph properties
Every nondeterministically testable graph property is testable. (Lovász-Vesztergombi)
“P = NP” for dense property testing.
Proof via graph limit theory: pure existence proof of an algorithm...

Computing a structure: representative set
Representative set of nodes: bounded size, (almost) every node is “similar” to one of the nodes in the set.
When are two nodes similar? Neighbors? Same neighborhood?

Similarity distance of nodes
(Figure: nodes s, t, v, w, u.)
$d_{\mathrm{sim}}$ is a metric, computable in the sampling model.
Representative set $U$:
- for any two nodes $s, t \in U$, $d_{\mathrm{sim}}(s,t) > \varepsilon$;
- for most nodes $s$, $d_{\mathrm{sim}}(U,s) \le \varepsilon$.

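
The formula for the similarity distance is not reproduced in this text. The sketch below assumes the two-step definition common in the graph limits literature, $d_{\mathrm{sim}}(s,t) = \mathbf{E}_v \bigl| \mathbf{E}_w (a_{sw} a_{wv} - a_{tw} a_{wv}) \bigr|$, where $a$ is the adjacency matrix and $v, w$ are uniform random nodes. The function names and the sample size `m` are hypothetical.

```python
import random

def d_sim_exact(adj, s, t):
    """Assumed similarity distance, computed exactly on a small graph:
    E_w[a_sw * a_wv] equals |N(s) & N(v)| / n."""
    n = len(adj)
    total = sum(abs(len(adj[s] & adj[v]) - len(adj[t] & adj[v])) for v in adj)
    return total / n**2

def d_sim_sampled(adj, s, t, m, rng):
    """Estimate of the same quantity from m sampled nodes v,
    each paired with m freshly sampled nodes w."""
    nodes = sorted(adj)
    total = 0.0
    for _ in range(m):
        v = rng.choice(nodes)
        ws = [rng.choice(nodes) for _ in range(m)]
        diff = sum((w in adj[s] and w in adj[v]) - (w in adj[t] and w in adj[v])
                   for w in ws)
        total += abs(diff) / m
    return total / m
```

Under this assumed definition, two nodes with the same neighborhood are at distance 0, which fits the question "Same neighborhood?" raised earlier.
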
Similarity distance of nodes
Every graph contains a representative set with at most … nodes (a bound depending only on $\varepsilon$).
Strong representative set $U$:
- for any two nodes $s, t \in U$, $d_{\mathrm{sim}}(s,t) > \varepsilon$;
- for every node $v$, $d_{\mathrm{sim}}(U,v) \le \varepsilon$.
Every graph contains a strong representative set with at most … nodes. (Alon)

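
One natural way to build such a set, offered here as an assumption since the slides do not say how $U$ is constructed, is a greedy pass: keep a candidate only if it is farther than $\varepsilon$ from every representative chosen so far. The distance is passed in as a function `dist`, so any node distance can be plugged in.

```python
def greedy_representatives(candidates, dist, eps):
    """Greedy maximal eps-separated subset of `candidates`.
    Chosen nodes are pairwise more than eps apart, and every candidate
    ends up within eps of some chosen node."""
    reps = []
    for s in candidates:
        if all(dist(s, u) > eps for u in reps):
            reps.append(s)
    return reps
```

If the candidates are a random sample rather than all nodes, the second guarantee holds only for the sampled nodes, which corresponds to the "for most nodes" version.
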
Representative set and regularity partition
Voronoi diagram = weak regularity partition

Max cut in dense graphs
What answer to expect?
- Cannot list all nodes on one side.
- For any given node, we want to tell on which side of the cut it lies (after some preprocessing).

How to compute a (weak) regularity partition?
- Construct representative set $U$.
- Each node is in the same class as its closest representative.

How to compute the maximum cut?
- Construct representative set $U$.
- Compute $p_{ij}$ = density between Voronoi cells $V_i$ and $V_j$ (use sampling).
- Compute max cut $(U_1, U_2)$ in the complete graph on $U$ with edge-weights $p_{ij}$.
- Each node is on the same side as its closest representative.
(A different algorithm is implicit in Frieze-Kannan.)

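
The steps above can be sketched end to end on a small graph. Several things here are assumptions for illustration: the form of the similarity distance, taking $p_{ij}$ as the number of edges between the cells divided by $n^2$, computing $p_{ij}$ exactly instead of by sampling, and all function names.

```python
def d_sim(adj, s, t):
    """Assumed similarity distance: E_v |E_w[a_sw a_wv] - E_w[a_tw a_wv]|."""
    n = len(adj)
    return sum(abs(len(adj[s] & adj[v]) - len(adj[t] & adj[v])) for v in adj) / n**2

def closest_rep(adj, reps, v):
    """Index of the representative nearest to v (ties go to the earlier one)."""
    return min(range(len(reps)), key=lambda i: d_sim(adj, v, reps[i]))

def build_cut_oracle(adj, reps):
    """Preprocess, then return (side, value): side(v) answers 0 or 1 for a node."""
    n, r = len(adj), len(reps)
    cell = {v: closest_rep(adj, reps, v) for v in adj}   # Voronoi cells
    p = [[0.0] * r for _ in range(r)]
    for u in adj:
        for v in adj[u]:
            # Each edge is visited from both ends, so for i != j
            # p[i][j] ends up as e(V_i, V_j) / n^2.
            p[cell[u]][cell[v]] += 1 / n**2
    best_val, best_mask = -1.0, 0
    for mask in range(1 << r):                           # brute force on U
        val = sum(p[i][j] for i in range(r) for j in range(i + 1, r)
                  if ((mask >> i) & 1) != ((mask >> j) & 1))
        if val > best_val:
            best_val, best_mask = val, mask

    def side(v):
        # A node goes to the same side as its closest representative.
        return (best_mask >> closest_rep(adj, reps, v)) & 1

    return side, best_val
```

After preprocessing, `side(v)` answers the question from the earlier slide: for any given node, on which side of the cut does it lie. In the sampling model the distances and the $p_{ij}$ would be estimated from samples.
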
Algorithms for bounded degree graphs
Bounded degree ($\le d$):
- We can sample a uniform random node a bounded number of times, and explore its neighborhood to a bounded depth.

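
Exploring a neighborhood to bounded depth is a breadth-first search that stops at a fixed radius. In this minimal sketch the callable `neighbors` stands in for the oracle that reveals the neighbors of a node; that interface is an assumption.

```python
from collections import deque

def ball(neighbors, root, depth):
    """Return {node: distance from root} for all nodes within `depth` of root."""
    seen = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if seen[u] == depth:
            continue  # do not explore beyond the allowed radius
        for v in neighbors(u):
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    return seen
```

With maximum degree $d$, the ball of radius $r$ has at most $1 + d + d^2 + \dots + d^r$ nodes, so the work per sampled node does not depend on the size of the graph.
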
Bad examples
- $G_n$: random 3-regular graph
- $F_n$: random 3-regular bipartite graph
- $H_n$: $G_n \cup G_n$
Large girth graphs (except for a small part)
Expander graphs (except for a small part)

Algorithms for bounded degree graphs
- Maximum cut cannot be estimated in this model (random $d$-regular graph vs. random bipartite $d$-regular graph).
- P $\ne$ NP in this model (random $d$-regular graph vs. union of two random $d$-regular graphs).

Algorithms for bounded degree graphs
- Cellular automata (1970’s): network of finite automata
- Distributed computing (1980’s): agent at every node, bounded time
- Constant time algorithms (2000’s): bounded number of nodes sampled, explored at bounded depth

Distributed computing model

Algorithms for bounded degree graphs
- (Almost) maximum matching can be computed in bounded time. (Nguyen-Onak)
- (Almost) maximum flow and (almost) minimum cut can be computed in bounded time. (Csóka)
Annotations: need local random bits; no random bits; need global random bits.

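
To give a feel for a local algorithm that uses local random bits, here is a sketch of a random-priority greedy matching oracle. It is a simplification: it decides membership in a maximal matching (which has at least half the size of a maximum one), not an almost maximum matching, and it is not claimed to be the algorithm behind the result cited above. The names `make_rank` and `in_matching` and the seeded hash standing in for per-edge random bits are assumptions.

```python
import random

def edge_key(u, v):
    """Canonical name of the undirected edge uv."""
    return (u, v) if u <= v else (v, u)

def make_rank(seed):
    """Per-edge pseudo-random priority, standing in for local random bits."""
    def rank(e):
        return random.Random(f"{seed}:{e[0]}:{e[1]}").random()
    return rank

def in_matching(e, neighbors, rank, memo=None):
    """Edge e is matched iff no adjacent edge of lower priority is matched.
    Only edges reachable through decreasing priorities are ever inspected."""
    if memo is None:
        memo = {}
    if e in memo:
        return memo[e]
    key = lambda f: (rank(f), f)           # total order, ties broken by name
    u, v = e
    adjacent = {edge_key(x, y) for x in (u, v) for y in neighbors(x)} - {e}
    result = True
    for f in sorted(adjacent, key=key):
        if key(f) > key(e):
            break                          # higher-priority edges cannot block e
        if in_matching(f, neighbors, rank, memo):
            result = False
            break
    memo[e] = result
    return result
```

The answer for an edge depends only on the priorities of edges reached by chains of decreasing priority, which is what makes the decision local. The recursion is plain Python recursion, so very long decreasing chains would hit the interpreter's recursion limit.
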
Algorithms for bounded degree graphs
Suppose you tell the agents any information about the graph. In the presence of global random bits, this does not help! (Csóka)

Open problems
● Can an (almost) maximum independent node set be computed in a random $d$-regular graph in this model?
● Can an (almost) maximum cut be computed in a random $d$-regular graph in this model?
Known (Bayati-Gamarnik-Tetali 2010):
- $\alpha(G_n)/n$ tends to a limit with probability 1
- $\mathrm{MaxCut}(G_n)/n$ tends to a limit with probability 1