Published by Deja Skillings. Modified about 1 year ago.

1
Extremal properties of polynomial threshold functions
Ryan O’Donnell (MIT / IAS), Rocco Servedio (Columbia)

2
Representing boolean functions
Complexity theory studies dozens of different representations for boolean functions:
- circuits: boolean, algebraic, threshold; formulas, low-depth variants
- decision trees
- branching programs
- switching networks
- polynomials over various fields
- monotone span programs
- contact networks

3
Extremal bounds
For each representation, one can ask: what is the “size” of the hardest boolean function, or of a random function?
Often a fairly easy problem: upper bound by “trivial” construction, lower bound by counting.
E.g., for circuit size:
- [Lupanov-58] Every function has a circuit of size $(1+o(1))\,2^n/n$.
- [Shannon-49] Almost every function requires circuits of size $2^n/n$.

4
Polynomial Threshold Functions
Let $f : \{+1,-1\}^n \to \{+1,-1\}$ be a boolean function. Let $p : \mathbb{R}^n \to \mathbb{R}$ be a multilinear polynomial.
We say that p is a polynomial threshold function (PTF) for f, or p sign-represents f, if $f(x) = \mathrm{sgn}(p(x))$ for all $x \in \{+1,-1\}^n$.
- See the excellent survey “Slicing the hypercube” [Saks-93].
- PTFs correspond to the circuit class Threshold-Of-Parities.

5
PTF examples
In these examples $-1$ encodes TRUE and $+1$ encodes FALSE.
- AND: $x_1 + x_2 + \cdots + x_n + (n-1)$
- OR: $x_1 + x_2 + \cdots + x_n - (n-1)$
- Majority: $x_1 + x_2 + \cdots + x_n$
- Parity: $x_1 x_2 \cdots x_n$
- $(x_1 \oplus x_2) \lor x_3$: $100\,x_1 x_2 + x_3 - 100$
There are two main size measures for PTFs:
- degree: number of variables in the biggest monomial (between 0 and n)
- density: number of monomials (between 1 and $2^n$)
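The examples above can be checked by brute force for small n. The following is a minimal sketch, assuming the convention that $-1$ is TRUE; the helper names (`sgn`, `sign_represents`, `XOR_OR`, and so on) are illustrative and do not come from the talk.

```python
from itertools import product

def sgn(v):
    """Sign of a nonzero number; a PTF may not vanish on any cube point."""
    if v == 0:
        raise ValueError("polynomial vanishes on a cube point")
    return 1 if v > 0 else -1

def sign_represents(p, f, n):
    """True if sgn(p(x)) == f(x) for every x in {+1,-1}^n."""
    return all(sgn(p(x)) == f(x) for x in product((1, -1), repeat=n))

# Target functions, with -1 read as TRUE and +1 as FALSE (assumed convention).
def AND(x):
    return -1 if all(v == -1 for v in x) else 1

def OR(x):
    return -1 if any(v == -1 for v in x) else 1

def MAJ(x):
    return -1 if sum(1 for v in x if v == -1) > len(x) / 2 else 1

def PARITY(x):
    out = 1
    for v in x:
        out *= v
    return out

def XOR_OR(x):
    """(x1 XOR x2) OR x3 on three variables."""
    return -1 if (x[0] * x[1] == -1 or x[2] == -1) else 1

n = 5  # odd, so the Majority polynomial is never 0
checks = {
    "AND": sign_represents(lambda x: sum(x) + (n - 1), AND, n),
    "OR": sign_represents(lambda x: sum(x) - (n - 1), OR, n),
    "MAJ": sign_represents(lambda x: sum(x), MAJ, n),
    "PARITY": sign_represents(PARITY, PARITY, n),
    "XOR_OR": sign_represents(lambda x: 100 * x[0] * x[1] + x[2] - 100, XOR_OR, 3),
}
print(checks)
```

The first three polynomials have degree 1 and density at most n + 1, while the parity polynomial has degree n and density 1, which illustrates that the two size measures can pull in opposite directions.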

6
Why PTFs?
- natural algebraic model of complexity
- degree upper bounds: machine learning algorithms [Klivans-S-01, O-S-03]; PP closed under intersection [Beigel-Reingold-Spielman-95]
- simultaneous degree/density lower bounds: oracle separations (e.g., $P^{NP^A} \neq PP^A$, [Beigel-94])
- degree lower bounds: quantum decision tree lower bounds

7
The PTF extremal problem
Also, the PTF extremal problem is interesting!
- Are there functions that require PTF degree n?
- Do most functions have PTF degree $\ll n$?
- Does every function have PTF density somewhat smaller than $2^n$?
- Are there functions that require PTF density close to $2^n$?

8
Results in this talk
In this talk I will discuss two of our results:
1. Degree upper bound: Almost every boolean function has PTF degree at most $n/2 + O(\sqrt{n \log n})$.
2. Density upper bound: Every boolean function has PTF density at most $(1 - 1/O(n))\,2^n$.

9
Results not in this talk
Def: We say p is a weak PTF for f if, for all $x \in \{+1,-1\}^n$, either p(x) = 0 or sgn(p(x)) = f(x). (Also, p is not allowed to be identically 0!)
Saks asked whether almost all functions require weak PTF density $(1/2 - \epsilon)\,2^n$.
In fact, we show every function has weak PTF density $o(1)\,2^n$ (Ramsey theory).
We show a couple of other bounds…

10
PTF Degree Bounds

11
Degree bounds: previous results
[Minsky-Papert-68]: Parity and its negation require PTF degree n. [Aspnes-Beigel-Furst-Rudich-94] show these are the only such functions.
[Wang-Williams-91], [ABFR-94]: Conjecture: almost every function has PTF degree $\lfloor n/2 \rfloor$ or $\lceil n/2 \rceil$.
Lower bound of $\lfloor n/2 \rfloor$ shown by a counting argument [Anthony-92], [Alon-93] based on a result of [Cover-65].

12
Progress on the upper bound
Towards the upper bound:
- [Razborov-Rudich-94] showed almost every function has PTF degree at most $0.95n$.
- [Alon-93] observed that the work of [Gotsman-89] implies a PTF degree upper bound of $0.89n$.
We show the conjecture is true up to lower-order terms:
Thm: Almost every function has PTF degree at most $n/2 + O(\sqrt{n \log n})$.

13
Fourier detour
It’s known that any function $f : \{+1,-1\}^n \to \mathbb{R}$ can be exactly represented as
$f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$,
where the $\hat{f}(S)$’s are real constants and the monomial $x_S$ is $\prod_{i \in S} x_i$. This is known as the Fourier representation.
Parseval’s identity: $\sum_{S \subseteq [n]} \hat{f}(S)^2 = \sum_{x \in \{+1,-1\}^n} f(x)^2 / 2^n$.
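For small n the Fourier representation can be computed directly from its definition. The sketch below uses the standard formula $\hat{f}(S) = 2^{-n} \sum_x f(x)\, x_S$, which the slide does not state explicitly; the function names and the 3-bit majority example are illustrative choices.

```python
from itertools import combinations, product
from math import prod

def cube(n):
    """All points of {+1,-1}^n."""
    return list(product((1, -1), repeat=n))

def subsets(n):
    """All subsets S of {0,...,n-1}, as sorted tuples of indices."""
    return [S for k in range(n + 1) for S in combinations(range(n), k)]

def monomial(S, x):
    """The monomial x_S = product of x_i over i in S (equals 1 for empty S)."""
    return prod(x[i] for i in S)

def fourier(f, n):
    """Map each subset S to the coefficient (1/2^n) * sum_x f(x) x_S."""
    pts = cube(n)
    return {S: sum(f(x) * monomial(S, x) for x in pts) / 2**n for S in subsets(n)}

def maj3(x):
    """Majority of three +-1 inputs, written as the sign of their sum."""
    return 1 if sum(x) > 0 else -1

coeffs = fourier(maj3, 3)
nonzero = {S: c for S, c in coeffs.items() if c != 0}
parseval_lhs = sum(c * c for c in coeffs.values())
parseval_rhs = sum(maj3(x) ** 2 for x in cube(3)) / 2**3
print(nonzero, parseval_lhs, parseval_rhs)
```

Evaluating the expansion back on every point reproduces `maj3`, which is a quick way to confirm that the representation is exact.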

14
Our degree upper bound
We actually show a stronger fact:
Thm: Let $\mathcal{S}$ be any collection of $(1-1/n)\,2^n$ monomials. Then almost every function has a PTF over these monomials.
Cor: Almost every function has a PTF of degree $n/2 + O(\sqrt{n \log n})$.
Proof: For each $z \in \{+1,-1\}^n$, let $\delta_z : \{+1,-1\}^n \to \mathbb{R}$ be the “Dirac delta function”: $\delta_z(z) = 2^n$, and $\delta_z(x) = 0$ for $x \neq z$.
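The corollary presumably follows from the theorem by taking the collection to be all monomials of degree at most d, for the smallest d that keeps a $(1 - 1/n)$ fraction of them. The sketch below counts monomials exactly and compares that d with $n/2 + \sqrt{n \ln n / 2}$, the value suggested by a standard Hoeffding bound on the binomial tail. The function names and the specific constant are assumptions made for illustration.

```python
from math import ceil, comb, log, sqrt

def min_degree_covering(n):
    """Smallest d such that at least (1 - 1/n) * 2^n monomials have degree <= d."""
    target = (n - 1) * 2**n  # compare n * count >= (n - 1) * 2^n in exact integers
    count = 0
    for d in range(n + 1):
        count += comb(n, d)  # number of multilinear monomials of degree exactly d
        if n * count >= target:
            return d
    return n

def hoeffding_degree(n):
    """n/2 + sqrt(n ln n / 2): beyond this degree the binomial tail is at most 1/n."""
    return n / 2 + sqrt(n * log(n) / 2)

for n in (10, 100, 1000):
    print(n, min_degree_covering(n), round(hoeffding_degree(n), 2))
```

The exact threshold never exceeds the Hoeffding estimate rounded up, which is where the $O(\sqrt{n \log n})$ term in the corollary comes from under this reading.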

15
Proof sketch continued
Random $\pm 2^n$-valued functions are made by forming $\sum_{z \in \{+1,-1\}^n} f(z)\,\delta_z(x)$, where the f(z)’s are coin tosses.
The function $\delta_z(x)$ has a simple Fourier representation: $\delta_z(x) = \sum_{S \subseteq [n]} z_S\, x_S$.
Suppose we “approximate” each $\delta_z$ by deleting the summands outside $\mathcal{S}$: $\delta'_z(x) = \sum_{S \in \mathcal{S}} z_S\, x_S$.
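The delta-function expansion and its truncation can be checked numerically for small n. In this sketch the kept family `kept_S` (every monomial except the single top-degree one) is a hypothetical choice made only to keep the example tiny; the talk allows any collection of the stated size.

```python
from itertools import combinations, product
from math import prod

n = 4
cube = list(product((1, -1), repeat=n))
all_S = [S for k in range(n + 1) for S in combinations(range(n), k)]
kept_S = [S for S in all_S if len(S) < n]  # hypothetical family: drop only x_1 x_2 x_3 x_4

def delta(z, x, family):
    """Sum of z_S * x_S over the monomials S in `family`."""
    return sum(prod(z[i] * x[i] for i in S) for S in family)

z = (1, -1, 1, -1)
full = {x: delta(z, x, all_S) for x in cube}        # 2^n at x == z, 0 elsewhere
truncated = {x: delta(z, x, kept_S) for x in cube}  # |kept_S| at x == z, small elsewhere
print(full[z], truncated[z], max(abs(v) for x, v in truncated.items() if x != z))
```

Away from z, the truncated function equals minus the sum of the deleted monomials, so its magnitude is at most the number of deleted monomials.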

16
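[Figure: plots of $\delta_z(\cdot)$ and $\delta'_z(\cdot)$ over $\{+1,-1\}^n$. $\delta_z$ has height $2^n$ at z and 0 elsewhere; $\delta'_z$ has height $|\mathcal{S}|$ at z and values within $\pm(1/n)\,2^n$ elsewhere. Example expansion: $\delta_z(x) = +1 + x_1 - x_2 + x_3 - x_1 x_2 + x_1 x_3 - x_2 x_3 + \cdots$]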

17
Proof sketch continued
We want to show that for any particular x, with very high probability (w.v.h.p.), $\sum_{z \in \{+1,-1\}^n} f(z)\,\delta_z(x)$ and $\sum_{z \in \{+1,-1\}^n} f(z)\,\delta'_z(x)$ have the same sign. (Then union bound over x.)
Taking the z = x summand starts the sum off with $|\mathcal{S}|\, f(x) = (1-1/n)\,2^n f(x)$: good shape so far.
You get noise terms for all other z. But…! Key point: these are summed with random ± signs, so they get “dampened”.

18
Proof sketch completed
To show that a random ± sum of the quantities $\{\delta'_z(x) : z \neq x\}$ is small w.h.p., the key is to show (a) the numbers are bounded, and (b) the sum of squares (variance) is small.
Both come easily: each number is at most $(1/n)\,2^n$ in absolute value, and the sum of the squares is easily calculated exactly using Parseval’s identity: independently of x, it equals $(1/n - 1/n^2)\,2^{2n}$, so the standard deviation is $\approx (1/\sqrt{n})\,2^n$.
Hence, by Hoeffding, $|\text{error}| < 0.5 \cdot 2^n$ w.v.h.p.
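Both facts used above, the bound on each noise term and the exact sum of squares, can be verified by brute force for a small case. This sketch takes n = 4 and a randomly chosen family of $(1 - 1/n)\,2^n = 12$ monomials; the seed, the family, and the function names are arbitrary illustrative choices. It also counts how often the truncated sum has the right sign for one coin-toss function, with the caveat that the theorem is asymptotic and agreement at n = 4 is not guaranteed.

```python
import random
from itertools import combinations, product
from math import prod

n = 4
cube = list(product((1, -1), repeat=n))
all_S = [S for k in range(n + 1) for S in combinations(range(n), k)]

random.seed(0)  # arbitrary seed, for reproducibility only
family = random.sample(all_S, (n - 1) * 2**n // n)  # 12 of the 16 monomials

def delta_prime(z, x):
    """Truncated delta: sum of z_S * x_S over S in `family`."""
    return sum(prod(z[i] * x[i] for i in S) for S in family)

def noise_sum_of_squares(x):
    """Sum over z != x of delta'_z(x)^2."""
    return sum(delta_prime(z, x) ** 2 for z in cube if z != x)

def agreement_count(f):
    """Number of points x where sum_z f(z) delta'_z(x) has the same sign as f(x)."""
    hits = 0
    for x in cube:
        p = sum(f[z] * delta_prime(z, x) for z in cube)
        if p * f[x] > 0:
            hits += 1
    return hits

# |family| * (2^n - |family|), which equals (1/n - 1/n^2) * 2^(2n) for this family size.
predicted = len(family) * (2**n - len(family))
f = {z: random.choice((1, -1)) for z in cube}  # a coin-toss boolean function
print(predicted, {noise_sum_of_squares(x) for x in cube}, agreement_count(f))
```

The sum of squares comes out the same for every x, matching the "independently of x" remark, and it depends on the family only through its size.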

19
PTF Density Bounds

20
Density bounds: previous results
[Gotsman-89] showed that every boolean function has PTF density at most $2^n - 2^{n/2}$.
[Saks-93] observed that [Cover-65] implies that almost every boolean function requires PTF density at least $0.11 \cdot 2^n$.
Our thm: Every boolean function has PTF density at most $(1 - 1/O(n))\,2^n$.
We get to omit a $1/O(n)$ fraction of monomials, compared to [G89]’s $1/2^{n/2}$.

21
Proof sketch: density upper bound
Let $f : \{+1,-1\}^n \to \{+1,-1\}$ be any boolean function. Let $L_1(f) = \sum_{S \subseteq [n]} |\hat{f}(S)|$.
Since $\sum_{S \subseteq [n]} \hat{f}(S)^2 = 1$ (Parseval), by Cauchy-Schwarz, $L_1(f) \le 2^{n/2}$.
[Bruck-Smolensky-92] shows that f always has a PTF of density $2n\,L_1(f)^2$.
So we’re already done unless, say, $L_1(f) \ge (1/n)\,2^{n/2}$.
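The two extremes of $L_1(f)$ can be seen on small examples. The sketch below computes the spectral norm by brute force for parity, which has a single nonzero coefficient, and for a 4-bit "inner product" function whose coefficients all have magnitude $2^{-n/2}$. Both test functions and all names are illustrative choices, not taken from the talk.

```python
from itertools import combinations, product
from math import prod

def spectral_l1(f, n):
    """L_1(f): sum over all S of |f^(S)|, computed by brute force."""
    pts = list(product((1, -1), repeat=n))
    total = 0.0
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeff = sum(f(x) * prod(x[i] for i in S) for x in pts) / 2**n
            total += abs(coeff)
    return total

def parity(x):
    return prod(x)

def and2(a, b):
    """AND of two +-1 values, with -1 read as TRUE."""
    return -1 if (a == -1 and b == -1) else 1

def inner_product(x):
    """(x1 AND x2) XOR (x3 AND x4) in +-1 notation."""
    return and2(x[0], x[1]) * and2(x[2], x[3])

n = 4
for name, f in (("parity", parity), ("inner_product", inner_product)):
    l1 = spectral_l1(f, n)
    print(name, l1, "Cauchy-Schwarz cap:", 2 ** (n / 2), "2n * L1^2:", 2 * n * l1**2)
```

For parity the quantity $2n\,L_1(f)^2$ is far below $2^n$ once n is moderately large, so the first case of the argument applies; the inner product function meets the Cauchy-Schwarz cap exactly, which is the "spread out" situation the rest of the proof has to handle.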

22
Proof sketch continued
If $L_1(f)$ is very close to its upper bound, $2^{n/2}$, then its coefficients must be very “spread out”: a handful may be “large,” but almost all must be close to $2^{-n/2}$.
Recall: $f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$. Let L be the set of coefficients that are “small.”
Fix x. We show that if you omit a random selection of $(1/O(n))\,2^n$ terms from L, the sum of what you omit is smaller than 1 with probability $1 - 2^{-2n}$.

23
Proof sketch completed
Recall $f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\, x_S$, and consider the terms with $S \in L$. We’re adding up $N \approx 2^n$ numbers $\hat{f}(S)\, x_S$.
- Each is not much more than $\pm 2^{-n/2} = \pm 1/\sqrt{N}$.
- Their mean is very small, around $\pm \log(N)/N$: had we summed over all S we would have gotten f(x) = ±1, and we omitted few terms.
Hence (Hoeffding), if we sum a random subset of size N/log(N), the result has magnitude at least 1 with probability at most $1/N^2$.

24
Open problems
For the problem of degree, the conjecture of Wang & Williams and ABFR is still open: Is the PTF degree of almost every function as low as $\lceil n/2 \rceil$?
For the problem of density, we’re not even sure where the right answer lies: somewhere between $0.11 \cdot 2^n$ and $(1 - 1/O(n))\,2^n$.
Our conjecture: Almost every function has PTF density $0.5 \cdot 2^n$.
