Extremal properties of polynomial threshold functions Ryan O’Donnell (MIT / IAS) Rocco Servedio (Columbia)


Representing boolean functions
Complexity theory studies dozens of different representations for boolean functions:
- circuits: boolean, algebraic, threshold; formulas, low-depth variants
- decision trees
- branching programs
- switching networks
- polynomials over various fields
- monotone span programs
- contact networks

Extremal bounds
For each representation, one can ask: what is the “size” of the hardest boolean function, or of a random function?
Often a fairly easy problem: upper bound by a “trivial” construction, lower bound by counting. E.g., for circuit size:
- [Lupanov-58] Every function has a circuit of size $(1+o(1))\,2^n/n$.
- [Shannon-49] Almost every function requires circuits of size $2^n/n$.

Polynomial Threshold Functions
Let $f : \{+1,-1\}^n \to \{+1,-1\}$ be a boolean function, and let $p : \mathbb{R}^n \to \mathbb{R}$ be a multilinear polynomial. We say that $p$ is a polynomial threshold function (PTF) for $f$, or that $p$ sign-represents $f$, if $f(x) = \mathrm{sgn}(p(x))$ for all $x \in \{+1,-1\}^n$.
- See the excellent survey “Slicing the hypercube” [Saks-93].
- PTFs correspond to the circuit class Threshold-Of-Parities.

PTF examples
- AND: $x_1 + x_2 + \cdots + x_n + (n-1)$
- OR: $x_1 + x_2 + \cdots + x_n - (n-1)$
- Majority: $x_1 + x_2 + \cdots + x_n$
- Parity: $x_1 x_2 \cdots x_n$
- $(x_1 \oplus x_2) \vee x_3$: $100\,x_1 x_2 + x_3 - 100$
(In these examples $-1$ plays the role of TRUE and $+1$ the role of FALSE.)
There are two main size measures for PTFs:
- degree: number of variables in the biggest monomial (between $0$ and $n$)
- density: number of monomials (between $1$ and $2^n$)
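The examples above can be checked by brute force over the whole cube. The following is a minimal sketch, not part of the talk; the helper names (`sgn`, `sign_represents`, and the function definitions) are illustrative, and it assumes the convention that $-1$ encodes TRUE and $+1$ encodes FALSE.

```python
from itertools import product
from math import prod

def sgn(v):
    # Sign of a nonzero real; a PTF must never evaluate to 0 on the cube.
    if v == 0:
        raise ValueError("polynomial vanishes on a cube point")
    return 1 if v > 0 else -1

def sign_represents(p, f, n):
    """Return True if f(x) == sgn(p(x)) for every x in {+1,-1}^n."""
    return all(f(x) == sgn(p(x)) for x in product((1, -1), repeat=n))

# Assumed convention: -1 encodes TRUE, +1 encodes FALSE.
def AND(x): return -1 if all(xi == -1 for xi in x) else 1
def OR(x): return -1 if any(xi == -1 for xi in x) else 1
def MAJ(x): return -1 if sum(x) < 0 else 1      # n is odd below, so no ties
def PARITY(x): return prod(x)                   # -1 iff an odd number of TRUE inputs
def XOR_OR(x): return -1 if (x[0] != x[1]) or x[2] == -1 else 1

n = 5
checks = {
    "AND": sign_represents(lambda x: sum(x) + (n - 1), AND, n),
    "OR": sign_represents(lambda x: sum(x) - (n - 1), OR, n),
    "Majority": sign_represents(lambda x: sum(x), MAJ, n),
    "Parity": sign_represents(lambda x: prod(x), PARITY, n),
    "(x1 XOR x2) OR x3": sign_represents(
        lambda x: 100 * x[0] * x[1] + x[2] - 100, XOR_OR, 3),
}
for name, ok in checks.items():
    print(name, ok)
```

The check is exponential in $n$, so it is only meant for small sanity tests of a candidate polynomial.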

Why PTFs?
- natural algebraic model of complexity
- degree upper bounds:
  ⇒ machine learning algorithms [Klivans-S-01, O-S-03]
  ⇒ PP closed under intersection [Beigel-Reingold-Spielman-95]
- simultaneous degree/density lower bounds:
  ⇒ oracle separations (e.g., $\mathrm{P}^{\mathrm{NP}^A} \neq \mathrm{PP}^A$, [Beigel-94])
- degree lower bounds:
  ⇒ quantum decision tree lower bounds

The PTF extremal problem
Also, the PTF extremal problem is interesting!
- Are there functions that require PTF degree $n$?
- Do most functions have PTF degree $\ll n$?
- Does every function have PTF density somewhat smaller than $2^n$?
- Are there functions that require PTF density close to $2^n$?

Results in this talk
In this talk I will discuss two of our results:
1. Degree upper bound: Almost every boolean function has PTF degree at most $n/2 + O(\sqrt{n \log n})$.
2. Density upper bound: Every boolean function has PTF density at most $(1 - 1/O(n))\,2^n$.

Results not in this talk
Definition: We say $p$ is a weak PTF for $f$ if, for all $x \in \{+1,-1\}^n$, either $p(x) = 0$ or $\mathrm{sgn}(p(x)) = f(x)$. (Also, $p$ is not allowed to be identically 0!)
Saks asked whether almost all functions require weak PTF density $(1/2 - \varepsilon)\,2^n$. In fact, we show every function has weak PTF density $o(1)\,2^n$ (Ramsey theory).
We show a couple of other bounds…

PTF Degree Bounds

Degree bounds: previous results
- [Minsky-Papert-68]: Parity and its negation require PTF degree $n$. [Aspnes-Beigel-Furst-Rudich-94] show these are the only such functions.
- [Wang-Williams-91], [ABFR-94]: Conjecture: almost every function has PTF degree $\lfloor n/2 \rfloor$ or $\lceil n/2 \rceil$.
- Lower bound of $\lfloor n/2 \rfloor$ shown by a counting argument [Anthony-92], [Alon-93], based on a result of [Cover-65].

Progress on the upper bound
Towards the upper bound:
- [Razborov-Rudich-94] showed almost every function has PTF degree at most $0.95\,n$.
- [Alon-93] observed that the work of [Gotsman-89] implies a PTF degree upper bound of $0.89\,n$.
We show the conjecture is true up to lower-order terms:
Thm: Almost every function has PTF degree at most $n/2 + O(\sqrt{n \log n})$.

Fourier detour
It’s known that any function $f : \{+1,-1\}^n \to \mathbb{R}$ can be exactly represented as
$f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$,
where the $\hat{f}(S)$’s are real constants and the monomial $x_S$ is $\prod_{i \in S} x_i$. This is known as the Fourier representation.
Parseval’s identity: $\sum_{S \subseteq [n]} \hat{f}(S)^2 = \sum_{x \in \{+1,-1\}^n} f(x)^2 / 2^n$.
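The Fourier representation and Parseval’s identity can be verified numerically on a small example. This is a minimal sketch, not from the talk: the helper names are illustrative, the coefficients are computed by the standard formula $\hat{f}(S) = 2^{-n} \sum_x f(x)\, x_S$, and 3-bit majority (here with $+1$ output when the sum is positive) is just a convenient test function.

```python
from itertools import product, combinations
from math import prod, isclose

def cube(n):
    """All points of {+1,-1}^n."""
    return list(product((1, -1), repeat=n))

def subsets(n):
    """All subsets S of {0,...,n-1}, as sorted tuples of indices."""
    return [S for k in range(n + 1) for S in combinations(range(n), k)]

def monomial(x, S):
    """x_S = product of x_i for i in S (the empty product is 1)."""
    return prod(x[i] for i in S)

def fourier_coefficients(f, n):
    """hat f(S) = average over x of f(x) * x_S, for every S."""
    pts = cube(n)
    return {S: sum(f(x) * monomial(x, S) for x in pts) / len(pts)
            for S in subsets(n)}

n = 3
maj = lambda x: 1 if sum(x) > 0 else -1
coeffs = fourier_coefficients(maj, n)

# The expansion reproduces f exactly at every point of the cube.
reconstructed_ok = all(
    isclose(sum(c * monomial(x, S) for S, c in coeffs.items()), maj(x))
    for x in cube(n))

# Parseval: sum of squared coefficients equals the average of f(x)^2.
parseval_lhs = sum(c * c for c in coeffs.values())
parseval_rhs = sum(maj(x) ** 2 for x in cube(n)) / 2 ** n
print(reconstructed_ok, parseval_lhs, parseval_rhs)
```

For a $\pm 1$-valued function the right-hand side is always 1, which is the form of Parseval used in the density argument.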

Our degree upper bound
We actually show a stronger fact:
Thm: Let $\mathcal{S}$ be any collection of $(1 - 1/n)\,2^n$ monomials. Then almost every function has a PTF over these monomials.
Cor: Almost every function has a PTF of degree $n/2 + O(\sqrt{n \log n})$.
Proof: For each $z \in \{+1,-1\}^n$, let $\delta_z : \{+1,-1\}^n \to \mathbb{R}$ be the “Dirac delta function”: $\delta_z(z) = 2^n$, and $\delta_z(x) = 0$ for $x \neq z$.
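To see how the corollary follows from the theorem, one can take the collection of all monomials of degree at most $d$ and ask how large $d$ must be to cover a $(1 - 1/n)$ fraction of the $2^n$ monomials. The sketch below is an illustration under stated assumptions, not the authors’ calculation: it computes the smallest such $d$ exactly and compares it with the estimate $n/2 + \sqrt{n \ln n / 2}$, which comes from the standard Hoeffding tail bound $\Pr[\mathrm{Bin}(n, 1/2) \ge n/2 + t] \le e^{-2t^2/n}$. The function names are hypothetical.

```python
from math import comb, log, sqrt

def min_degree_covering(n):
    """Smallest d such that the monomials of degree <= d
    number at least (1 - 1/n) * 2^n."""
    target = (n - 1) * 2 ** n          # compare n * count >= (n - 1) * 2^n in integers
    count = 0
    for d in range(n + 1):
        count += comb(n, d)            # number of monomials of degree exactly d
        if n * count >= target:
            return d
    return n

def hoeffding_estimate(n):
    """n/2 + t, where t makes the Hoeffding tail exp(-2 t^2 / n) equal to 1/n."""
    return n / 2 + sqrt(n * log(n) / 2)

for n in (10, 100, 1000):
    print(n, min_degree_covering(n), hoeffding_estimate(n))
```

The exact threshold stays below the estimate, and the gap between it and $n/2$ grows on the order of $\sqrt{n \log n}$.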

Proof sketch continued
Random $\pm 2^n$-valued functions are made by forming $\sum_{z \in \{+1,-1\}^n} f(z)\, \delta_z(x)$, where the $f(z)$’s are coin tosses.
The function $\delta_z(x)$ has a simple Fourier representation: $\delta_z(x) = \sum_{S \subseteq [n]} z_S\, x_S$.
Suppose we “approximate” each $\delta_z$ by deleting the summands outside $\mathcal{S}$: $\delta'_z(x) = \sum_{S \in \mathcal{S}} z_S\, x_S$.

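[Figure: two plots over the cube $\{+1,-1\}^n$ with the point $z$ marked. First plot, $\delta_z(\cdot)$: a single spike of height $2^n$ at $z$ and $0$ elsewhere. Second plot, $\delta'_z(\cdot)$: a spike of height $|\mathcal{S}|$ at $z$ and values within $\pm(1/n)\,2^n$ elsewhere.]
Example expansion: $\delta_z(x) = +1 + x_1 - x_2 + x_3 - x_1 x_2 + x_1 x_3 - x_2 x_3 + \cdots$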

Proof sketch continued
We want to show that for any particular $x$, with very high probability, $\sum_{z \in \{+1,-1\}^n} f(z)\, \delta_z(x)$ and $\sum_{z \in \{+1,-1\}^n} f(z)\, \delta'_z(x)$ have the same sign. (Then union bound over $x$.)
Taking the $z = x$ summand starts the sum off with $|\mathcal{S}|\, f(x) = (1 - 1/n)\,2^n f(x)$: good shape so far.
You get noise terms for all other $z$. But…!
Key point: These are summed with random ± signs, so they get “dampened”.

Proof sketch completed
To show that a random ± sum of the quantities $\{\delta'_z(x) : z \neq x\}$ is small with high probability, the key is to show (a) the numbers are bounded, and (b) the sum of their squares (the variance) is small.
Both come easily: each number is at most $(1/n)\,2^n$ in absolute value, and the sum of the squares is easily calculated exactly using Parseval’s identity: independently of $x$, it equals $(1/n - 1/n^2)\,2^{2n}$. So SD ≈ $(1/\sqrt{n})\,2^n$.
Hence Hoeffding ⇒ $|\text{error}| < 0.5 \cdot 2^n$ with very high probability. This is smaller than the main term $(1 - 1/n)\,2^n$, so the sign of the truncated sum is $f(x)$.
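The three deterministic facts used in this sketch (the size of the main term, the bound on each noise term, and the exact sum of squares) can be checked by brute force for small $n$. The following is a minimal sketch under assumptions that are not from the talk: $n = 4$, and the kept collection is an arbitrary random choice of $(1 - 1/n)\,2^n = 12$ monomials with a fixed seed. The names `kept`, `chi`, `delta_full`, and `delta_trunc` are illustrative.

```python
from itertools import product, combinations
from math import prod
import random

n = 4
N = 2 ** n
cube = list(product((1, -1), repeat=n))
all_monomials = [S for k in range(n + 1) for S in combinations(range(n), k)]

# Hypothetical choice of the kept collection: drop (1/n) * 2^n = 4 monomials at random.
rng = random.Random(0)
kept = rng.sample(all_monomials, N - N // n)

def chi(x, S):
    """Monomial x_S."""
    return prod(x[i] for i in S)

def delta_full(z, x):
    """delta_z(x) via its full Fourier expansion."""
    return sum(chi(z, S) * chi(x, S) for S in all_monomials)

def delta_trunc(z, x):
    """delta'_z(x): the expansion restricted to the kept monomials."""
    return sum(chi(z, S) * chi(x, S) for S in kept)

# The full expansion really is the delta function: 2^n at z, 0 elsewhere.
delta_ok = all(delta_full(z, x) == (N if z == x else 0) for z in cube for x in cube)

# Main term: delta'_x(x) equals the number of kept monomials.
main_term_ok = all(delta_trunc(x, x) == len(kept) for x in cube)

# Each noise term is bounded by the number of dropped monomials, (1/n) * 2^n.
noise_bounded = all(abs(delta_trunc(z, x)) <= N - len(kept)
                    for z in cube for x in cube if z != x)

# Sum of squared noise terms is |kept| * (2^n - |kept|) = (1/n - 1/n^2) * 2^(2n), for every x.
variance_ok = all(
    sum(delta_trunc(z, x) ** 2 for z in cube if z != x) == len(kept) * (N - len(kept))
    for x in cube)

print(delta_ok, main_term_ok, noise_bounded, variance_ok)
```

The concentration step itself (Hoeffding) is probabilistic and only becomes meaningful for larger $n$, so it is not simulated here.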

PTF Density Bounds

Density bounds: previous results
- [Gotsman-89] showed that every boolean function has PTF density at most $2^n - 2^{n/2}$.
- [Saks-93] observed that [Cover-65] implies that almost every boolean function requires PTF density at least $0.11 \cdot 2^n$.
Our thm: Every boolean function has PTF density at most $(1 - 1/O(n))\,2^n$.
We get to omit a $1/O(n)$ fraction of monomials, compared to [G89]’s $1/2^{n/2}$.

Proof sketch: density upper bound
Let $f : \{+1,-1\}^n \to \{+1,-1\}$ be any boolean function. Let $L_1(f) = \sum_{S \subseteq [n]} |\hat{f}(S)|$.
Since $\sum_{S \subseteq [n]} \hat{f}(S)^2 = \sum_{x \in \{+1,-1\}^n} f(x)^2 / 2^n = 1$ (Parseval), by Cauchy-Schwarz, $L_1(f) \le 2^{n/2}$.
[Bruck-Smolensky-92] shows that $f$ always has a PTF of density $2n\, L_1(f)^2$.
So we’re already done unless, say, $L_1(f) \ge (1/n)\,2^{n/2}$.
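The quantity $L_1(f)$ and the Cauchy-Schwarz bound $L_1(f) \le 2^{n/2}$ can be illustrated on two small functions at opposite extremes. This is a minimal sketch, not from the talk: the function names are illustrative, and the two test functions (parity, and inner product mod 2 on two pairs of bits) are chosen here only because their Fourier spectra are easy to reason about.

```python
from itertools import product, combinations
from math import prod

def l1_norm(f, n):
    """L_1(f): sum over all S of |hat f(S)|, computed by brute force."""
    pts = list(product((1, -1), repeat=n))
    total = 0.0
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeff = sum(f(x) * prod(x[i] for i in S) for x in pts) / len(pts)
            total += abs(coeff)
    return total

n = 4

def parity(x):
    # A single monomial: all Fourier weight sits on one coefficient.
    return prod(x)

def inner_product(x):
    # Inner product mod 2 of (b0, b2) with (b1, b3), where +1 -> bit 0 and -1 -> bit 1.
    b = [(1 - xi) // 2 for xi in x]
    return (-1) ** (b[0] * b[1] + b[2] * b[3])

l1_parity = l1_norm(parity, n)
l1_ip = l1_norm(inner_product, n)
cauchy_schwarz_bound = 2 ** (n / 2)

# Density guarantee 2n * L_1(f)^2, as quoted from [Bruck-Smolensky-92] above.
bs_parity = 2 * n * l1_parity ** 2
bs_ip = 2 * n * l1_ip ** 2
print(l1_parity, l1_ip, cauchy_schwarz_bound, bs_parity, bs_ip)
```

Parity has the smallest possible $L_1$, so the quoted density guarantee is far below $2^n$; the inner product function meets the Cauchy-Schwarz bound with all coefficients equal in magnitude, which is exactly the “spread out” case the rest of the proof has to handle.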

Proof sketch continued
If $L_1(f)$ is very close to its upper bound, $2^{n/2}$, then the coefficients of $f$ must be very “spread out”: a handful may be “large,” but almost all must be close to $2^{-n/2}$ in magnitude.
Recall: $f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$. Let $L$ be the set of coefficients that are “small.”
Fix $x$. We show that if you omit a random selection of $(1/O(n))\,2^n$ terms from $L$, the sum of what you omit is smaller than 1 in magnitude with probability $1 - 2^{-2n}$.

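Proof sketch completed
The relevant sum is $\sum_{S \in L} \hat{f}(S)\, x_S$, the part of $f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$ supported on $L$.
We’re adding up $N \approx 2^n$ numbers, $\hat{f}(S)\, x_S$. Each is not much more than $\pm 2^{-n/2} = \pm 1/\sqrt{N}$ in magnitude.
Their mean is very small, around $\pm \log(N)/N$: had we summed over all $S$ we would have gotten $f(x) = \pm 1$, and we omitted few terms.
Hence (Hoeffding) if we sum a random subset of size $N/\log(N)$, the result has magnitude at least 1 with probability at most $1/N^2$.
A union bound over the $2^n$ points $x$ then gives one selection that works for every $x$, and omitting terms whose sum has magnitude below 1 cannot change the sign of $f(x) = \pm 1$.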

Open problems
For the problem of degree, the conjecture of Wang & Williams and ABFR is still open: Is the PTF degree of almost every function as low as $\lceil n/2 \rceil$?
For the problem of density, we’re not even sure where the right answer lies: somewhere between $0.11 \cdot 2^n$ and $(1 - 1/O(n))\,2^n$.
Our conjecture: Almost every function has PTF density $0.5 \cdot 2^n$.