Sparse Kindler-Safra Theorem via agreement theorems

Prahladh Harsha, Tata Institute of Fundamental Research. Joint work with Irit Dinur and Yuval Filmus.

Main Contributions:
Result: a structure theorem for low-degree polynomials on the biased cube.
A "Kindler-Safra"-type structure theorem for the $p$-biased hypercube, where $p$ may be very small: sub-constant, possibly even $p = O(1)/n$.
Proof paradigm: application of (high-dimensional) agreement theorems to proving structure theorems for the $p$-biased hypercube.
High-dimensional agreement theorem: a generalization of direct product testing to larger dimensions.

Boolean functions in the (standard) hypercube
A Boolean function can be viewed as a real-valued function $f: \{0,1\}^n \to \mathbb{R}$.
The space of such functions is spanned by the characters $\{\chi_S\}_{S \subseteq [n]}$: $f = \sum_{S} \hat{f}(S)\, \chi_S$.
$f$ has degree $\le d$ iff $\hat{f}(S) = 0$ for all $|S| > d$.
Basic junta theorem [Nisan-Szegedy '94]: if $f: \{0,1\}^n \to \{0,1\}$ has degree $\le d$, then it depends on $O_d(1)$ variables (that is, it is a junta).
Boolean functions are a basic object in CS, and in complexity we want to understand how different measures relate to each other. A natural approach is to look at $f$ as a real-valued function.
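The degree condition above can be checked by brute force on small examples. The following is a minimal sketch, assuming the standard characters $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$ (the slide does not spell them out); the function names and the two test functions are illustrative only.

```python
from itertools import product, combinations

def fourier_coefficients(f, n):
    """Return {S: hat f(S)}, assuming chi_S(x) = (-1)^(sum of x_i for i in S)."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            # hat f(S) = E_x[f(x) * chi_S(x)] under the uniform measure
            total = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs

def degree(f, n, tol=1e-9):
    """Largest |S| with a non-zero Fourier coefficient (0 for the zero function)."""
    coeffs = fourier_coefficients(f, n)
    return max((len(S) for S, c in coeffs.items() if abs(c) > tol), default=0)

# Hypothetical examples: an AND of two variables and a 3-bit majority.
and_2 = lambda x: x[0] * x[1]
maj_3 = lambda x: int(x[0] + x[1] + x[2] >= 2)

print(degree(and_2, 4), degree(maj_3, 3))
```

Here `and_2` is viewed as a function of four variables but only two of them matter, which is the junta phenomenon in its simplest form.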

Structure theorems, inverse theorems
Structure theorem: if "property" then "structure".
Often an "inverse" of a very easy statement.
Robust (= stability) version of the theorem: if "almost property" then "almost structure".
General question: when does a robust version exist?

Robust versions of junta theorem
What would be a robust version of the basic junta theorem?
[NS]: If $f: \{0,1\}^n \to \{0,1\}$ has degree $\le d$, then it depends on $O(1)$ variables.
Even simpler: if $f: \{0,1\}^n \to \{0,1\}$ has degree $\le 1$, then it is a dictator, an anti-dictator, or constant.
Put the uniform measure on $\{0,1\}^n$ and measure the distance between $f$ and $g$ by $\mathrm{dist}(f,g) = \mathbb{E}_{x \in \{0,1\}^n}\big[(f(x) - g(x))^2\big]$.
$\varepsilon$-close to Boolean, or to low degree, or both? $f$ Boolean and $\varepsilon$-close to $g$ of degree $d$ $\Leftrightarrow$ $g$ has degree $d$ and is $\varepsilon$-close to $f$ Boolean.
[Friedgut-Kalai-Naor]: If $f: \{0,1\}^n \to \mathbb{R}$ has degree $\le 1$ and is $\varepsilon$-close to Boolean, then it is $O(\varepsilon)$-close to a dictator, an anti-dictator, or a constant.
[Bourgain, Kindler-Safra]: If $f: \{0,1\}^n \to \mathbb{R}$ has degree $\le d$ and is $\varepsilon$-close to Boolean, then it is $O(\varepsilon)$-close to a junta.

From Boolean to A-valued
The robust junta theorems hold because low-degree functions are smooth, not spiky (technically, this is proven via hypercontractivity).
FKN: If $f: \{0,1\}^n \to \mathbb{R}$ has degree $\le 1$ and is $\varepsilon$-close to Boolean, then it is $O(\varepsilon)$-close to a dictator.
What if $f$ attains 3 values and not only 2?
Example: $f(x) = x_1 + x_2$ attains the three values 0, 1, 2, yet is not a dictator (but it is still a junta).
Theorem ("A-valued robust junta theorem"): If $f: \{0,1\}^n \to \mathbb{R}$ has degree $\le d$ and is $\varepsilon$-close to $A$-valued, then it is $O(\varepsilon)$-close to a junta ($A \subset \mathbb{R}$ a finite set).
Observe that if $g$ is a junta, then $g$ is $A'$-valued for some other finite set $A'$.
Speaker note: explain why FKN is true; too many coefficients mean that the function is like a Gaussian.

From Boolean to A-valued
Alternative interpretation: Assume $f: \{0,1\}^n \to \mathbb{R}$ has degree $\le d$. If $f$ is $\varepsilon$-close to $A$-valued, then $f$ is $O(\varepsilon)$-close to $A'$-valued ($A' \subset \mathbb{R}$ a finite set).
Parseval's identity implies: $f$ is $O(\varepsilon)$-close to $A'$-valued $\Rightarrow$ $f$ is a junta.

p-biased hypercube
$\mu_p$: the product distribution in which each bit is 1 independently with probability $p$.
$\mu_p(x) := p^{\sum_i x_i} (1-p)^{n - \sum_i x_i}$
The measure concentrates on $x$'s with $\approx pn$ ones.
Studied in various contexts:
Graph properties: sharp threshold phenomena in $G(n,p)$
Reed-Muller decoding from erasures
Hardness of approximation
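As a quick sanity check of the definition of $\mu_p$, the sketch below evaluates the formula on every point of a small cube and confirms that the weights sum to 1 and that the expected number of ones is $pn$. The values of `n` and `p` are arbitrary illustrative choices.

```python
from itertools import product

def mu_p(x, p):
    """Weight of the 0/1 tuple x under the p-biased product measure."""
    ones = sum(x)
    return p ** ones * (1 - p) ** (len(x) - ones)

n, p = 10, 0.1  # illustrative parameters, not taken from the talk
cube = list(product((0, 1), repeat=n))

total_mass = sum(mu_p(x, p) for x in cube)
mean_ones = sum(sum(x) * mu_p(x, p) for x in cube)
print(total_mass, mean_ones)
```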

p-biased: Sharp thresholds
A graph on $n$ vertices can be represented as a string $x \in \{0,1\}^N$ where $N = \binom{n}{2}$.
A graph property is a function $f: \{0,1\}^{\binom{n}{2}} \to \{0,1\}$.
Examples: "the graph is connected", "the graph has a triangle".
Studying a graph property in $G(n,p)$ is like studying $f$ in the $p$-biased hypercube.
Friedgut-Kalai: all monotone graph properties have a narrow threshold.
Friedgut: k-SAT has a sharp threshold.
Observe: $p$ here is very small, e.g. $1/N^c$ for some constant $c$.
Removed: Friedgut's conjecture: monotone functions with low $\mu_p$-influence must have certain structure.

p-biased: decoding from erasures
A recent result [KKMPSU'16] showed that Reed-Muller codes with constant rate achieve capacity for decoding from erasures.
Key component: a structure theorem for monotone Boolean functions.
A different set of works [ASW'15, SSV'16] showed the same for non-constant rates, using very different ideas.
For all in-between rates, we do not know.
To extend further one perhaps needs a better grasp of the behavior at smaller $p$.

p-biased: Hardness of approximation
The Boolean hypercube stars as the long-code gadget in many inapproximability reductions.
The $p$-biased version is used in the hardness of vertex cover, but with $p$ = constant.
Recent works [KMS, DKKMS] use "the Grassmann graph" and introduce structural conjectures about functions on its vertices. This is also related to the "short code graph" [BKS].
The relevant parameters for the conjectures are analogous to $\mu_p$ for very small $p$, $p = O(1)/n$. This was our motivation.
In fact, our result, if true for the Grassmann graph, would come close to proving the conjecture, but not quite.

(Nearly) Boolean low degree functions on the p-biased hypercube
The robust junta theorem applies also to $\mu_p$.
The error deteriorates as $p \to 0$ due to the use of hypercontractivity.
We desire a better dependence on $\epsilon$ even when $p = o(1)$.

(Nearly) Boolean low degree functions on the p-biased hypercube
Consider $f(x) = x_1 + x_2 + \dots + x_s$, where $x_i \in \{0,1\}$ and $s = \sqrt{\epsilon}/p$.
$f$ has degree 1; clearly it is not a junta.
$\Pr[f = 0] = (1-p)^s \approx e^{-\sqrt{\epsilon}} \approx 1 - \sqrt{\epsilon}$
$\Pr[f \ge 2] \approx s^2 p^2 \approx \epsilon$
So $f$ is $\epsilon$-close to Boolean.
$f$ is $\sqrt{\epsilon}$-close to 0, but we want a more refined approximation.
The closest Boolean function is $g(x) = \max(x_1, x_2, \dots, x_s)$, and $\mathrm{dist}(f,g) = O(\epsilon)$.
Filmus'16: If $h$ has degree 1 and is $\mu_p$-close to Boolean, then it looks like $f$.
Want: If $h$ has degree $\le d$ and is $\mu_p$-close to Boolean, then it looks like ???.
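The calculation on this slide can be reproduced exactly, since $f = x_1 + \dots + x_s$ is binomially distributed under $\mu_p$. The sketch below assumes $s = \sqrt{\epsilon}/p$ (the reading consistent with $\Pr[f=0] \approx e^{-\sqrt{\epsilon}}$) and uses illustrative values of $\epsilon$ and $p$ that are not from the talk.

```python
from math import comb, sqrt

eps, p = 0.01, 0.001             # illustrative parameters (assumption)
s = round(sqrt(eps) / p)         # number of summands, s = sqrt(eps) / p

def pmf(k):
    """Pr[f = k] for f = x_1 + ... + x_s under mu_p (a Binomial(s, p) law)."""
    return comb(s, k) * p ** k * (1 - p) ** (s - k)

pr_zero = pmf(0)                                        # compare with 1 - sqrt(eps)
pr_ge2 = 1 - pmf(0) - pmf(1)                            # compare with eps
# g = max(x_1, ..., x_s) equals 1 exactly when f >= 1, so (f - g)^2 = (k - 1)^2 for k >= 1.
dist_f_g = sum(pmf(k) * (k - 1) ** 2 for k in range(2, s + 1))
# Squared distance from f to the constant 0 function is E[f^2].
dist_f_0 = sum(pmf(k) * k ** 2 for k in range(s + 1))

print(pr_zero, pr_ge2, dist_f_g, dist_f_0)
```

With these parameters the distance to $g$ comes out on the order of $\epsilon$, while the distance to the constant 0 is on the order of $\sqrt{\epsilon}$, matching the point of the slide that 0 is too coarse an approximation.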

Looking for structure…
Filmus'16: If $h$ has degree 1 and is $\mu_p$-close to Boolean, then it looks like $f$.
Want: If $h$ has degree $\le d$ and is $\mu_p$-close to Boolean, then it looks like ???.
Naïve guess: maybe there are $\epsilon/p$ coordinates that control the function?
No: $f = x_1 x_2 + x_3 x_4 + \dots + x_{n-1} x_n$ is nearly Boolean for $p = O(1/\sqrt{n})$, yet it depends on all $n$ coordinates.
Note that a random $x \sim \mu_p$ leaves $O(1)$ monomials "alive".
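The counterexample can be quantified with a short calculation. The sketch below assumes $p = c/\sqrt{n}$ for a constant $c$ (an assumption: the exponent is garbled in the transcript, and this is the scaling under which the expected number of alive monomials is a non-trivial constant). The $n/2$ monomials are on disjoint pairs of variables, so the number of alive monomials is Binomial($n/2$, $p^2$).

```python
from math import sqrt

def alive_stats(n, c):
    """Statistics of f = x1*x2 + x3*x4 + ... under mu_p with p = c / sqrt(n)."""
    p = c / sqrt(n)
    m = n // 2                   # number of monomials (disjoint pairs)
    q = p * p                    # probability that a single monomial is alive
    expected_alive = m * q       # equals c^2 / 2, independent of n
    # f is non-Boolean exactly when at least two monomials are alive.
    pr_two_or_more = 1 - (1 - q) ** m - m * q * (1 - q) ** (m - 1)
    return expected_alive, pr_two_or_more

for n in (100, 10_000, 1_000_000):
    print(n, alive_stats(n, c=0.5))
```

The expected number of alive monomials stays at $c^2/2$ as $n$ grows, even though the function depends on every coordinate.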

The monomial expansion
Consider the multilinear expansion in $\{0,1\}$ variables, i.e. $f(x) = \sum_{S} \tilde{f}(S)\, y_S$ where $x_i \in \{0,1\}$ and $y_S = \prod_{i \in S} x_i$
(do not confuse with the Fourier functions: $x_i \in \{-1,1\}$ and $\chi_S = \prod_{i \in S} x_i$).
The monomial expansion is unique, but the $y_S$ functions are not orthogonal.
Filmus'16: Let $f$ be a degree 1 function. If $f$ is close to $\{0,1\}$-valued, then $\tilde{f}$ is close to $\{-1,0,1\}$-valued.
Definition: $f$ is a quantized polynomial if there is a finite set $A$ such that $\tilde{f}(S) \in A$ for every $S \subseteq [n]$.
Do not confuse this with …
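The monomial coefficients can be recovered from the values of $f$ by Möbius inversion over subsets: the coefficient of $\prod_{i \in S} x_i$ is $\sum_{T \subseteq S} (-1)^{|S|-|T|} f(1_T)$. The sketch below is a brute-force illustration for small $n$; the function names and the OR example are hypothetical choices, not from the talk.

```python
from itertools import combinations, product

def monomial_coefficients(f, n):
    """Return {S: coefficient of prod_{i in S} x_i} for the non-zero coefficients."""
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0
            for j in range(k + 1):
                for T in combinations(S, j):
                    x = tuple(1 if i in T else 0 for i in range(n))  # indicator of T
                    total += (-1) ** (k - j) * f(x)
            if total != 0:
                coeffs[S] = total
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the monomial expansion at a 0/1 point x."""
    return sum(c * all(x[i] for i in S) for S, c in coeffs.items())

or_3 = lambda x: max(x)          # OR of three bits
coeffs = monomial_coefficients(or_3, 3)
print(coeffs)
```

For the OR of three bits every coefficient lands in $\{-1, 0, 1\}$, so this function is a quantized polynomial in the sense of the definition above.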

Quantized polynomials
Theorem: Let $f$ be a function with degree $\le d$. If $f$ is $\epsilon$-close under $\mu_p$ to an $A$-valued function, then it is $O(\epsilon)$-close to a quantized polynomial.
For $p = 1/2$ this is the Kindler-Safra robust junta theorem.
There's more: a quantized polynomial $q$ that is nearly Boolean (or $A$-valued) has further structure.

Sparse Juntas
If a quantized polynomial is nearly Boolean:
It must be sparse.
Even after conditioning on a few $x_i = 1$, it must still be sparse.
Consider the hypergraph $H$ on $n$ vertices whose edges are the sets $S$ with a non-zero monomial coefficient in $f$.
$H$ has branching factor $b$ if for all subsets $A \subset [n]$ and integers $r \ge 0$, there are at most $b^r$ hyperedges in $H$ of cardinality $|A| + r$ containing $A$.
Sparse junta = a quantized polynomial with branching factor $1/p$.
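The branching-factor condition can be tested directly on a small hypergraph. The sketch below is a brute-force check of the definition as stated on the slide; the function name and the two example hypergraphs are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def has_branching_factor(edges, b):
    """True if, for every set A and every r >= 0, at most b**r hyperedges
    of cardinality |A| + r contain A."""
    edges = {frozenset(e) for e in edges}
    counts = Counter()           # (A, r) -> number of hyperedges of size |A| + r containing A
    for e in edges:
        for k in range(len(e) + 1):
            for A in combinations(sorted(e), k):
                counts[(frozenset(A), len(e) - k)] += 1
    # Pairs (A, r) that never occur have count 0, so they satisfy the bound trivially.
    return all(c <= b ** r for (_, r), c in counts.items())

star = [{0, i} for i in range(1, 6)]          # five edges through vertex 0
matching = [{0, 1}, {2, 3}, {4, 5}]           # three disjoint edges

print(has_branching_factor(star, 5), has_branching_factor(star, 4))
print(has_branching_factor(matching, 2))
```

The star fails with $b = 4$ because the set $A = \{0\}$ is contained in five hyperedges of size $|A| + 1$.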

Main Theorem: sparse Kindler-Safra Theorem
Theorem (main): Let $f$ be a function of degree $\le d$. If it is $\epsilon$-close under $\mu_p$ to an $A$-valued function, then it is $O(\epsilon)$-close to a sparse junta.
So $f$ is an "empirical" junta: after selecting $x$, the number of monomial coefficients of $f$ that stay "alive" is $O(1)$ [compare to Hatami's pseudo-juntas].
The theorem is tight: {nearly sparse juntas} = {nearly low-degree & $A$-valued}.

Proof
Given $f$ of degree $\le d$, $\epsilon$-close to Boolean.
Earlier structure theorems rely on hypercontractivity. As $p \to 0$, hypercontractivity gets weaker and weaker.
Instead, we will "divide and conquer":
Divide: look at random restrictions of $f$ to small sub-cubes.
Conquer: obtain approximate structure on each subcube.
Reunite: recover a global structure.

"Divide":
Choose a random subset $S \subset [n]$ according to $\mu_{2p}$, and place zeros outside $S$.
Choose a uniform $x \in \{0,1\}^S$.
The resulting string is distributed according to $\mu_p^n$.
This describes $\mu_p^n$ as a convex combination of $\mu_{1/2}^m$, where $m$ is binomially distributed with mean $2pn$.
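The claim that the two-step sampling reproduces $\mu_p^n$ can be verified exactly on a small cube: each coordinate lands in $S$ with probability $2p$ and is then set to 1 with probability $1/2$, independently. The sketch below enumerates every choice of $S$ and every uniform filling, assuming $p \le 1/2$; the parameter values are illustrative.

```python
from itertools import product

def mu_p(x, p):
    """Weight of the 0/1 tuple x under the p-biased product measure."""
    ones = sum(x)
    return p ** ones * (1 - p) ** (len(x) - ones)

def two_step_distribution(n, p):
    """Exact law of x: draw S from mu_{2p}, fill S uniformly, zeros outside S."""
    dist = {}
    for S in product((0, 1), repeat=n):            # S as an indicator vector
        m = sum(S)
        weight_S = (2 * p) ** m * (1 - 2 * p) ** (n - m)
        free = [i for i in range(n) if S[i]]
        for bits in product((0, 1), repeat=m):     # uniform point of {0,1}^S
            x = [0] * n
            for i, bit in zip(free, bits):
                x[i] = bit
            key = tuple(x)
            dist[key] = dist.get(key, 0.0) + weight_S / 2 ** m
    return dist

n, p = 4, 0.2                    # illustrative parameters (need p <= 1/2)
dist = two_step_distribution(n, p)
max_gap = max(abs(dist[x] - mu_p(x, p)) for x in product((0, 1), repeat=n))
print(max_gap)
```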

Proof outline
Let $f|_S$ be the function on $\{0,1\}^S$ obtained by restricting $f$ to inputs that are zero outside $S$.
"Conquer": For typical $S$, $f|_S$ is close to Boolean, so we can apply the "$\mu_{1/2}$ junta theorem" of Kindler and Safra and get a junta $h_S$ that approximates $f|_S$.
"Reunite": Stitch the $h_S$ together into one global function $h: \{0,1\}^n \to \mathbb{R}$ such that typically $h|_S = h_S$.
When can this work? At the very least we require local consistency, i.e. $h_{S_1}|_{S_1 \cap S_2} = h_{S_2}|_{S_1 \cap S_2}$.
But does "local consistency" $\Rightarrow$ "global consistency"?

Local to Global Agreement
Consider the $d = 1$ case.
Each $h_S$ represents a local linear function $h_S: \{0,1\}^S \to \{0,1\}$.
Local agreement: typically, $h_{S_1}|_{S_1 \cap S_2} = h_{S_2}|_{S_1 \cap S_2}$, i.e., for most pairs $S_1$ and $S_2$, the corresponding two linear functions $h_{S_1}$ and $h_{S_2}$ agree.
Global agreement: does there exist a "global" linear function $h: \{0,1\}^n \to \{0,1\}$ such that for most $S$ we have $h|_S = h_S$?
Direct product testing [..., DS]: local agreement $\Rightarrow$ global agreement.
For larger $d$, we need a high-dimensional analogue of this direct product testing.

High dimensional agreement theorem
General $d$.
Each $h_S$ represents a degree $d$ function $h_S: \{0,1\}^S \to \{0,1\}$ (or equivalently a labelled hypergraph with hyperedges of size at most $d$).
Local agreement: typically, $h_{S_1}|_{S_1 \cap S_2} = h_{S_2}|_{S_1 \cap S_2}$, i.e., $\Pr\big[h_{S_1}|_{S_1 \cap S_2} = h_{S_2}|_{S_1 \cap S_2}\big] \ge 1 - \epsilon$.
Global agreement: does there exist a "global" degree $d$ function $h: \{0,1\}^n \to \{0,1\}$ (or equivalently a global hypergraph) such that for most $S$ we have $h|_S = h_S$, i.e., $\Pr_S\big[h|_S = h_S\big] \ge 1 - O(\epsilon)$?
YES.
Furthermore, this global $h$ can be obtained by majority/plurality decoding.
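To make the plurality-decoding step concrete, here is a toy sketch under simplifying assumptions: each local function is stored as a labelled hypergraph (a dict from hyperedge to label, with absent edges read as label 0), and the global label of a hyperedge is the most common label among the local views containing it. The data layout, the function names and the corrupted example are all hypothetical; this illustrates the decoding rule only, not the proof of the agreement theorem.

```python
from collections import Counter
from itertools import combinations

def restrict(h, S):
    """Restriction of a labelled hypergraph h to the hyperedges inside S."""
    return {e: v for e, v in h.items() if e <= S}

def plurality_decode(local):
    """local: {S: {hyperedge: label}}. Returns the plurality-vote global hypergraph."""
    edges = {e for S, h_S in local.items() for e in h_S if e <= S}
    h = {}
    for e in edges:
        # Every local view whose domain contains e votes; a missing edge is a vote for 0.
        votes = Counter(h_S.get(e, 0) for S, h_S in local.items() if e <= S)
        label = votes.most_common(1)[0][0]   # ties are broken by first occurrence
        if label != 0:
            h[e] = label
    return h

# Toy instance: a hidden global hypergraph viewed through all 3-subsets of {0,...,4}.
truth = {frozenset({0, 1}): 1, frozenset({2}): -1}
sets = [frozenset(S) for S in combinations(range(5), 3)]
local = {S: restrict(truth, S) for S in sets}
local[frozenset({0, 1, 2})][frozenset({0, 1})] = 5   # corrupt one local view

decoded = plurality_decode(local)
agreement = sum(restrict(decoded, S) == local[S] for S in sets) / len(sets)
print(decoded == truth, agreement)
```

In this toy instance the decoded hypergraph agrees with every local view except the corrupted one.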

Proof summary
Given $f$ of degree $\le d$, $\epsilon$-close to Boolean.
For typical $S$, $f|_S$ is close to Boolean, so we can apply the "$\mu_{1/2}$ junta theorem" of Kindler and Safra and get a junta $h_S$ that approximates $f|_S$.
Stitch the local juntas together to get a global function $h$ (using the hypergraph agreement theorem).
Prove that $h$ is close to a sparse junta.

Applications
Tail bound for sparse juntas (implies the same for nearly low-degree & $A$-valued functions).
Sparse juntas must be very biased (implies that nearly low-degree & $A$-valued functions must be very biased).

Summarizing
Theorem (main): Let $f$ be a function of degree $\le d$. If it is $\epsilon$-close under $\mu_p$ to an $A$-valued function, then it is $O(\epsilon)$-close to a sparse junta.
Proof via a local-to-global agreement theorem (generalization of direct product testing to larger dimensions).

Thank You
