Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Sub-linear Time Algorithms
- This talk:
  - Algorithms that run in less than linear time (cannot read the entire input)
  - No preprocessing (unstructured data)
  - Must be probabilistic and approximate
- For optimization problems:
  - Compute a numerical approximation of the optimum cost (and an implicit representation of an approximate solution?)
- For decision problems:
  - What is approximation for decision problems?

(Graph) Property Testing
- Testing a property P with accuracy ε in the adjacency matrix representation:
  - Given a graph G that has property P, accept with probability > 3/4
  - Given a graph G that is ε-far from property P, accept with probability < 1/4
- ε-far = must change an ε-fraction of the adjacency matrix to get property P (add/remove more than εn^2 edges)

Example [GGR, AK]
- Testing bipartiteness of a given graph G
- Pick (1/ε) polylog(1/ε) vertices and check whether they induce a bipartite graph; if so accept, otherwise reject
- If G is bipartite, then the algorithm accepts with probability 1
- If G is ε-far from bipartite, then whp the algorithm discovers an odd cycle (non-trivial to prove)
- Running time: O((1/ε^2) polylog(1/ε))
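
The sample-and-check tester above can be sketched as follows. This is a minimal illustration, not the algorithm as analyzed in [GGR, AK]: the function names are hypothetical, and `sample_size` is left as a caller-supplied parameter because the slide does not give the constants hidden in (1/ε) polylog(1/ε).

```python
import random
from collections import deque


def induces_bipartite(adj, vertices):
    """Return True if the subgraph of `adj` induced by `vertices` is 2-colorable."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v != u and adj[u][v]:
                    if v not in color:
                        color[v] = 1 - color[u]
                        queue.append(v)
                    elif color[v] == color[u]:
                        return False  # found an odd cycle inside the sample
    return True


def sample_bipartiteness_test(adj, sample_size, rng=random):
    """Accept (True) iff a random vertex sample induces a bipartite subgraph.

    `sample_size` stands in for (1/eps) * polylog(1/eps); choosing it is
    left to the caller (an assumption of this sketch).
    """
    n = len(adj)
    sample = rng.sample(range(n), min(sample_size, n))
    return induces_bipartite(adj, sample)
```

Because any induced subgraph of a bipartite graph is bipartite, this sketch never rejects a bipartite input, which matches the one-sided error claimed on the slide.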

Paleontologist’s approach

Lower Bounds [BT]
- Alon-Krivelevich’s algorithm has one-sided error, is non-adaptive, and has running time (1/ε^2) polylog(1/ε)
- Lower bounds:
  - Ω(1/ε^2) for non-adaptive algorithms
  - Ω(1/ε^1.5) for adaptive algorithms
  - Both results hold even for two-sided error

Two Distributions
- Gfar: every edge exists with probability ε
  - whp it is ε/3-far from bipartite
- Gbip: pick a random partition, then every edge that crosses the partition exists with probability 2ε
- Indistinguishable by non-adaptive algorithms making o(1/ε^2) queries
- Indistinguishable by adaptive algorithms making o(1/ε^1.5) queries
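
A small sketch of how the two distributions could be sampled. The function names are hypothetical, and the sketch assumes that "a random partition" means each vertex picks a side independently and uniformly. Under that assumption a pair of vertices crosses the partition with probability 1/2, so each pair is an edge with marginal probability ε in both distributions.

```python
import random


def sample_gfar(n, eps, rng=random):
    """Each of the n(n-1)/2 vertex pairs is an edge independently with probability eps."""
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < eps:
                adj[u][v] = adj[v][u] = 1
    return adj


def sample_gbip(n, eps, rng=random):
    """Random bipartition; each crossing pair is an edge with probability 2*eps.

    Assumes eps <= 1/2 and independent uniform sides (assumptions of this sketch).
    Returns the adjacency matrix and the hidden partition.
    """
    side = [rng.randrange(2) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if side[u] != side[v] and rng.random() < 2 * eps:
                adj[u][v] = adj[v][u] = 1
    return adj, side
```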

Bounded Degree Graphs
- Testing a property P with accuracy ε in the adjacency lists representation:
  - Given a graph G that has property P, accept with probability > 3/4
  - Given a graph G that is ε-far from property P, accept with probability < 1/4
- ε-far = must change an ε-fraction of the adjacency list entries to get property P (add/remove more than εdn edges)

Bipartiteness [GR]
- Testing bipartiteness
- Repeat polylog n times:
  - Start at a random vertex and take sqrt(n) random walks of length polylog n; if two of them combine to form an odd cycle, reject, otherwise accept
- Analysis:
  - In a graph where you need to remove a constant fraction of the edges to make it bipartite, the algorithm finds an odd cycle

Matching Lower Bound [GR]
- Define two distributions of graphs:
  - Gfar: a random Hamiltonian circuit, plus a random matching (whp 1/100-far from bipartite)
  - Gbip: a random Hamiltonian circuit, plus a random matching conditioned on making the graph bipartite
- Gfar and Gbip are indistinguishable by algorithms of query complexity o(sqrt(n))
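
One way to sample these two bounded-degree distributions is sketched below, for even n. The function names are hypothetical. The sketch relies on the observation that an even Hamiltonian cycle has a unique bipartition (alternating positions), so the matching keeps the graph bipartite exactly when every matching edge joins an even position to an odd position. Matching edges that coincide with cycle edges are kept as parallel edges, a detail the slide does not address.

```python
import random


def random_hamiltonian_cycle(n, rng):
    """Return a random vertex order and the n cycle edges it defines."""
    order = list(range(n))
    rng.shuffle(order)
    return order, [(order[i], order[(i + 1) % n]) for i in range(n)]


def sample_gfar_bounded(n, rng=random):
    """Random Hamiltonian cycle plus a uniformly random perfect matching."""
    assert n % 2 == 0, "this sketch assumes an even number of vertices"
    _, cycle = random_hamiltonian_cycle(n, rng)
    verts = list(range(n))
    rng.shuffle(verts)
    matching = [(verts[i], verts[i + 1]) for i in range(0, n, 2)]
    return cycle, matching


def sample_gbip_bounded(n, rng=random):
    """Same, but the matching only joins even cycle positions to odd ones."""
    assert n % 2 == 0, "this sketch assumes an even number of vertices"
    order, cycle = random_hamiltonian_cycle(n, rng)
    even, odd = order[0::2], order[1::2]
    rng.shuffle(odd)
    matching = list(zip(even, odd))
    side = {v: i % 2 for i, v in enumerate(order)}  # the hidden bipartition
    return cycle, matching, side
```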

Sublinear Time Approximation
- Problems restricted to dense instances:
  - Max CUT and other graph problems can be approximated within (1+ε) in graphs with at least εn^2 edges in time 2^poly(1/ε) [GGR]
  - Max 3SAT can be approximated within (1+ε) in instances with at least εn^3 clauses in time 2^poly(1/ε), and similar results hold for other satisfiability problems [AFKK]

Sub-linear Time Approximation
- Problems on bounded-degree instances
- Minimum spanning tree
  - Given a connected weighted graph of degree d with weights in the range {1,…,w}, can approximate the MST weight within (1+ε) in time about O(dw/ε^2) [Chazelle, Rubinfeld, T]

General Goals
- When looking for polynomial-time algorithms:
  - Several algorithmic techniques of general applicability
  - A general technique to “prove” impossibility (NP-completeness)
- For sublinear-time algorithms:
  - General algorithmic techniques?
  - Impossibility results?

Dense Graphs
- Some general algorithmic results
  - All problems with a certain logical representation are testable in time dependent only on ε [AFKS]
  - All regular languages are testable in time dependent only on ε [AFNS]
- Only one one-sided error algorithm [GT] (pick a random subgraph and check it is consistent with the property)
  - Adaptivity does not help
  - “Only one algorithm” result also for 2-sided error
- Few lower bounds

Bounded-Degree Graphs
- Fewer and less general algorithms
- Some results are different from the dense case
- Adaptivity helps
  - No property is testable with o(sqrt(n)) non-adaptive queries; several problems are testable with O(1) adaptive queries
- 2-sided better than 1-sided for natural monotone properties
  - Property “being a forest” has no o(sqrt(n)) one-sided algorithm, but has an O(1) two-sided algorithm
- Few lower bounds

Testing 3-Colorability
- Easy in adjacency matrix representation
- NP-hard in adjacency list representation
- Only for small enough ε
  - Can find a 3-coloring good for 80% of the edges in a 3-colorable graph using SDP
  - NP-hard to find a 3-coloring good for a 98% (?) fraction of edges
- Implies non-tight, and conditional, lower bound for query complexity

Other problems
- The query complexity of the following problems is equivalent to the query complexity of testing 3-colorability:
  - Testing satisfiability of a 3SAT instance (every variable occurs in O(1) clauses, “adjacency list” representation)
  - Approximating max cut, vertex cover, independent set, ..., in bounded-degree graphs
  - Approximating Max SAT, Max 2SAT, ...
- Lower bound of sqrt(n) for all problems implied by the [GR] lower bound for testing bipartiteness

Some Results from [BOT]
- For one-sided error algorithms:
  - Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are (1/3 - ε)-far
  - Lower bound applies to testing problems that are solvable in polynomial time
- For two-sided error algorithms:
  - For some ε, Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are ε-far

Additional Results
- Unconditionally, algorithms running in time o(n) cannot:
  - Approximate Max 3SAT better than 7/8
  - Approximate Max Cut in bounded-degree graphs better than 16/17
  - ...
- Håstad ’97 proved that the above problems are NP-hard

The 3-Coloring Lower Bound
- Consider first one-sided error algorithms
- It’s enough to find a graph G that is (1/3 - ε)-far from 3-colorable, but in which every subgraph of size < δn is 3-colorable
  - (for every ε there is a δ such that...)
- Then an algorithm of query complexity < δn either accepts G (which is wrong) or rejects some 3-colorable graph (which means the algorithm does not have one-sided error)

The Graph
- Pick a graph of degree O(1/ε^2) at random (pick that many random matchings)
- Then it is (1/3 - ε)-far whp
- But, for some δ, whp, every subgraph induced by k < δn vertices contains < 1.5k edges
- In a minimal non-3-colorable graph, every vertex has degree at least 3
- Every subgraph induced by < δn vertices is 3-colorable [Erdős]
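
The last two bullets combine into a short counting argument, spelled out here as a worked step. Suppose some subgraph on fewer than δn vertices is not 3-colorable, and let H be a minimal non-3-colorable subgraph of it, on k < δn vertices. The symbol H is introduced here only for illustration. Every vertex of H has degree at least 3, since otherwise one could 3-color H minus that vertex and extend the coloring to it. Counting edges then gives a contradiction with the sparsity bound:

```latex
\[
|E(H)| \;=\; \tfrac{1}{2} \sum_{v \in V(H)} \deg_H(v) \;\ge\; \tfrac{1}{2}\cdot 3k \;=\; 1.5\,k,
\qquad\text{but}\qquad
|E(H)| \;\le\; \bigl|E\bigl(G[V(H)]\bigr)\bigr| \;<\; 1.5\,k .
\]
```

So no such H exists, and every subgraph on fewer than δn vertices is 3-colorable.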

Explicit Construction
- Can the previous construction be derandomized?
- For constants d, ε, δ, and for every sufficiently large n, we can explicitly construct a graph
  - on n vertices, with max degree d,
  - ε-far from 3-colorable,
  - in which every subset of δn vertices induces a 3-colorable subgraph

Explicit Construction
- We construct a 3SAT formula such that, for constants k, ε’, δ’:
  - Every variable occurs k times
  - No assignment satisfies more than a 1-ε’ fraction of clauses
  - Every δ’ fraction of clauses is satisfiable
- Then we use a (slightly new) reduction from 3SAT to 3-Coloring

The Formula
- Fix a degree-d expander graph G=(V,E) such that for every cut (S,V-S) at least min{|S|,|V-S|} edges cross the cut (d=14 is enough)
- Have two variables x_uv and x_vu for each edge (u,v)
- For every vertex v have the (3SAT equivalent of the) constraint
  - Σ_u x_uv = 1 + Σ_w x_vw
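
One way to see that the vertex constraints cannot all hold at once is to sum them over all vertices. This sketch reads the constraint as an equation over the integers with 0/1 variables, which is an assumption: the slide does not state the domain, although the later flow argument suggests this reading. Each variable x_ab appears exactly once on the left (in the constraint of b) and once on the right (in the constraint of a), so the double sums cancel:

```latex
\[
\sum_{v \in V} \sum_{u : (u,v) \in E} x_{uv}
\;=\;
\sum_{v \in V} \Bigl( 1 + \sum_{w : (v,w) \in E} x_{vw} \Bigr)
\;=\;
|V| \;+\; \sum_{v \in V} \sum_{w : (v,w) \in E} x_{vw}
\quad\Longrightarrow\quad
0 \;=\; |V| ,
\]
```

which is impossible, so at least one constraint is violated by every assignment.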

Structure of the Analysis
- Impossible to satisfy more than a fraction 1/(d+1) of the constraints
- Can always satisfy half of the constraints
  - define an auxiliary network
  - show that the auxiliary network has no small cut because of expansion
  - then there is a large flow
  - use the large flow to find an assignment for a subset of the constraints

Flow Argument
- Want to satisfy constraints corresponding to vertices in C, with |C| < |V|/2
- Construct flow network with new source s, sink t obtained by collapsing V-C, and vertices in C
- [Figure: flow network with labels s, t, V-C, C]

Flow Argument
- [Figure: cut of the network, with labels s, A, C-A, t, |A| edges, |C-A| edges]
- Every cut has size at least |C|
- There is a 0/1 flow of cost at least |C|
- Interpreted as an assignment, it satisfies all constraints in C

Two-Sided Error Algorithms
- Need to define two distributions of graphs Gcol and Gfar such that:
  - Graphs in Gcol are (almost) always 3-colorable
  - Graphs in Gfar are (almost) always far from 3-colorable
  - To an algorithm of bounded query complexity, Gcol and Gfar look (almost) the same

Main Step
- Define two distributions Dsat and Dfar of instances of E3LIN-2 (systems over GF(2) with 3 variables per equation)
  - Systems in Dsat are always satisfiable
  - Systems in Dfar are (almost) always (1/2-ε)-far from satisfiable
  - To an algorithm of bounded query complexity, Dsat and Dfar look the same
- We get Gcol and Gfar using a reduction from approximate E3LIN-2 to approximate 3-coloring

E3LIN-2
- X1 + X3 + X10 = 0 mod 2
- X2 + X3 + X4 = 1 mod 2
- X1 + X2 + X9 = 0 mod 2
- ...
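
To make the notion of "far from satisfiable" concrete, here is a small sketch that counts the fraction of equations an assignment satisfies. The representation (a triple of variable indices plus a right-hand-side bit) and the function name are choices made for this illustration only.

```python
def fraction_satisfied(equations, assignment):
    """equations: list of ((i, j, k), b), meaning x_i + x_j + x_k = b (mod 2)."""
    satisfied = sum(
        1
        for (i, j, k), b in equations
        if (assignment[i] + assignment[j] + assignment[k]) % 2 == b
    )
    return satisfied / len(equations)


# The three equations shown on the slide (variables indexed from 1).
slide_system = [((1, 3, 10), 0), ((2, 3, 4), 1), ((1, 2, 9), 0)]

all_zero = {i: 0 for i in range(1, 11)}
only_x4 = {**all_zero, 4: 1}  # set X4 = 1, every other variable 0
```

The all-zero assignment violates only the second equation, while setting X4 = 1 satisfies all three.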

Main Building Block
- We show that for every c there is a δ such that there exists a left-hand side with
  - n variables, cn equations, 3 variables per equation, every variable occurs in 3c equations
  - every δn equations are linearly independent
- Pick the left-hand side at random
  - repeat 3c times: pick at random a set of n/3 disjoint triples of variables
- Explicit construction?

Distributions
- The left-hand side is always as before
- In Dsat, we pick a random assignment to the variables, and set the right-hand side consistently
  - always satisfiable
- In Dfar, we pick the right-hand side uniformly at random
  - With high probability, (1/2 - O(1/sqrt(c)))-far
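
The sampling procedure for the left-hand side and for the two right-hand-side distributions can be sketched as follows. The function names are hypothetical, and the sketch assumes n is divisible by 3. It does not check the linear-independence property, which holds only with high probability for a suitable δ.

```python
import random


def sample_left_hand_side(n, c, rng=random):
    """3c rounds; each round partitions the n variables into n/3 disjoint triples."""
    assert n % 3 == 0, "this sketch assumes n is divisible by 3"
    triples = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        triples.extend(tuple(perm[i:i + 3]) for i in range(0, n, 3))
    return triples  # c*n triples; every variable occurs in exactly 3c of them


def sample_dsat(triples, n, rng=random):
    """Right-hand side consistent with a hidden random assignment."""
    hidden = [rng.randrange(2) for _ in range(n)]
    rhs = [hidden[i] ^ hidden[j] ^ hidden[k] for i, j, k in triples]
    return rhs, hidden


def sample_dfar(triples, rng=random):
    """Right-hand side chosen uniformly at random, independent of any assignment."""
    return [rng.randrange(2) for _ in triples]
```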

Indistinguishability
- The two distributions differ only in the right-hand side
  - In Dfar it is uniformly distributed
  - In Dsat it is δn-wise independent
    - Linear independence implies statistical independence
- They look the same to an algorithm that sees fewer than δn equations
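
The step "linear independence implies statistical independence" can be checked by brute force on a toy example. The sketch below enumerates all assignments and tallies the resulting right-hand sides; the helper name and the two small systems are made up for this illustration.

```python
from collections import Counter
from itertools import product


def rhs_distribution(equations, n):
    """Count how often each right-hand-side vector arises over all 2^n assignments."""
    counts = Counter()
    for x in product((0, 1), repeat=n):
        rhs = tuple(sum(x[i] for i in eq) % 2 for eq in equations)
        counts[rhs] += 1
    return counts


# Three linearly independent left-hand sides over GF(2) on 5 variables.
independent = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
independent_counts = rhs_distribution(independent, 5)

# Four left-hand sides that sum to zero over GF(2), hence linearly dependent.
dependent = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]
dependent_counts = rhs_distribution(dependent, 6)
```

For the independent system every one of the 2^3 right-hand sides arises equally often, so a uniformly random assignment induces a uniform right-hand side, exactly as in Dfar. For the dependent system only half of the 2^4 patterns ever occur, which is the kind of correlation a tester could detect.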

Conclusion of the Argument
- No algorithm of “query complexity” o(n) can distinguish satisfiable instances of E3LIN-2 from instances that are (1/2-ε)-far from satisfiable
- For some ε, no algorithm of query complexity o(n) can distinguish 3-colorable graphs from graphs that are ε-far from 3-colorable
- No algorithm of query complexity o(n) can approximate Max 3SAT better than 7/8
- ...

Open Questions
- Show that distinguishing 3-colorable graphs from (1/3-ε)-far graphs requires query complexity Ω(n)
  - we can only prove it for one-sided error
- Show that approximating Max SAT better than ¾ and Max CUT better than ½ requires query complexity Ω(n)
  - we only know Ω(sqrt(n)) [implicit in GR]
  - would “explain” why we need SDP

Some more open questions
- In adjacency matrix representation, most interesting problems are solvable in constant (in ε) time
- For some problems (e.g. testing triangle-freeness) the analysis uses Szemerédi’s regularity lemma, and the constant is hyper-exponential in ε
- Lower bound (1/ε)^(log 1/ε), and only for one-sided error
- Alternative analysis / stronger lower bounds?