Erasure-Resilient Property Testing

Sofya Raskhodnikova, Pennsylvania State University
Joint work with Kashyap Dixit (Penn State), Abhradeep Thakurta (Yahoo! Labs → Apple), and Nithin Varma (Penn State)

Goal: study of sublinear algorithms resilient to adversarial erasures in the input
Focus: property testing model [Rubinfeld Sudan 96, Goldreich Goldwasser Ron 98]

A Sublinear-Time Algorithm
A sublinear algorithm is typically modeled as follows. Its input is very long, and the algorithm can access it at arbitrary locations by making queries. In other words, we assume that the algorithm has oracle access to the input.
[Figure: a sublinear-time algorithm querying a few positions of a long input string.]
Is it always reasonable to assume oracle access to the input?

Distributional Assumptions on the Input Access
One way to avoid the oracle assumption is to make distributional assumptions on the input access. Rather than allowing the algorithm to query arbitrary locations, we can assume that locations are sampled independently, according to the uniform or some other distribution, and supplied to the algorithm. This is what is frequently assumed in the learning literature; in the context of property testing, it was first considered by GGR and investigated more thoroughly in subsequent work. In some contexts, it makes sense to consider correlated samples from the input. For example, if the input is an image, it is useful to be able to access uniformly random blocks of pixels. (Meiram Murzabulatov is going to present a poster on testing geometric properties.) Finally, there is a model called active testing that combines sample access with limited oracle access.
[Figure: a sublinear-time algorithm receiving sampled positions of the input.]
Sample-based testers [Goldreich Goldwasser Ron 98, Goldreich Ron 14]: can access only independent (labeled) samples from the domain.
Block-sample-based testers [Berman Murzabulatov Raskhodnikova 16]: can access uniformly random blocks of pixels from the input image.
Active testers [Balcan Blais Blum Yang 12]: get a small sample of domain points and can request labels only on points from the sample.

Our Model: Erasure-Resilient Testing
In this work, we go in a different direction. Instead of making distributional assumptions, we allow the algorithm to query arbitrary locations. However, at some locations the information might be inaccessible, say, because it was lost or withheld for privacy reasons. In this case, the algorithm is told that the symbol is erased (⊥).
[Figure: a sublinear-time algorithm querying an input in which some positions are erased (⊥).]
A ≤ α fraction of the input is erased adversarially before the algorithm runs.
The algorithm does not know in advance what is erased.
Is it possible that the input satisfied the property?

Erasure-Resilient Testing for Property P
Any completion is -far from P Can be completed to satisfy P Erasure-Resilient Testing for Property P Ties together ideas that appeared in property testing with ideas that appeared in ECCs. ⊥ A - B L ? B ? L ? L ? ,𝛼∈(0,1) sublinear-time algorithm Accept with probability ≥𝟐/𝟑 Reject with probability ≥𝟐/𝟑 Don’t care ≤𝛼 fraction of the input is erased adversarially Algorithm does not know in advance what’s erased Is it possible that the input satisfied the property?

Relationship to Previous Models
Erasure-resilient testing falls between tolerant property testing [Parnas Ron Rubinfeld 06] and standard testing.
Standard testing: all input values are present and correct.
Tolerant testing: some input values are adversarially corrupted.
Erasure-resilient testing: some input values are adversarially erased.
[Diagram labels: ε-testable; α-erasure-resiliently ε-testable; (α, α+ε)-tolerantly testable.]

Overview of Our Results
A black box transformation of some simple standard testers to efficient erasure-resilient testers.
Efficient erasure-resilient testers for cases where our transformation does not apply.
A strong separation between testing in the presence of erasures and testing in the absence of erasures.

Our Results: A Black Box Transformation
Makes uniform testers for extendable properties α-erasure-resilient.
Uses the original testers as a black box.
Applies, for example, to:
Monotonicity over poset domains [Fischer Lehman Newman Raskhodnikova Rubinfeld Samorodnitsky 02].
Convexity of black and white images [Berman Murzabulatov Raskhodnikova 16].
Boolean arrays having at most k alternations in values.
Overhead: O(1/(1−α)) factor in query complexity for all α ∈ (0,1).
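
One plausible reading of the transformation, sketched under assumptions: a uniform tester is modeled as a function of a list of (index, value) samples, and the wrapper keeps drawing uniform indices until it has collected the requested number of nonerased ones. The names `collect_nonerased`, `make_erasure_resilient`, and `sample_looks_sorted` are hypothetical, and the sample count the actual transformation needs is not given in the slides. Since each draw is nonerased with probability at least 1−α, the expected number of queries grows by a factor of at most 1/(1−α).

```python
import random
from typing import Callable, List, Optional, Tuple

Sample = Tuple[int, int]                 # (index, value)
Query = Callable[[int], Optional[int]]   # returns None for an erased index


def collect_nonerased(query: Query, n: int, count: int,
                      rng: random.Random) -> Tuple[List[Sample], int]:
    """Draw uniform indices until `count` nonerased samples are found.

    Returns the samples and the number of queries spent.
    Assumes at least one position is nonerased (alpha < 1).
    """
    samples: List[Sample] = []
    queries = 0
    while len(samples) < count:
        i = rng.randrange(n)
        queries += 1
        value = query(i)
        if value is not None:        # discard erased answers
            samples.append((i, value))
    return samples, queries


def make_erasure_resilient(uniform_tester: Callable[[List[Sample]], bool],
                           count: int) -> Callable[[Query, int, random.Random], bool]:
    """Wrap a sample-based tester; the tester itself is used as a black box."""
    def tester(query: Query, n: int, rng: random.Random) -> bool:
        samples, _ = collect_nonerased(query, n, count, rng)
        return uniform_tester(samples)
    return tester


def sample_looks_sorted(samples: List[Sample]) -> bool:
    """Toy uniform tester: accept iff the sampled values are in index order."""
    ordered = sorted(samples)
    return all(a[1] <= b[1] for a, b in zip(ordered, ordered[1:]))
```

The wrapper never inspects how `uniform_tester` works, which is the sense in which the original tester is used as a black box.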

A Limitation of Our Black Box Transformation
Some known standard testers cannot be used as a black box.
Example: testing sortedness of n-element arrays.
Every uniform tester requires Ω(√n) queries.
A better (optimal) tester makes O(log n) queries:
SortednessTest [Ergün Kannan Kumar Rubinfeld Viswanathan 00]
Perform a binary search for a random search index.
Reject if there are violations (i.e., elements out of order).
[Figure: an array with positions 1 to n, an erased entry (⊥), and a random search index.]
All known optimal sortedness testers [EKKRV00, BGJRW09, CS13a] break with just one erasure.
Known optimal testers for monotonicity, the Lipschitz property, and convexity of functions [GGLRS00, DGLRRS99, EKKRV00, F04, CS13a, CS13b, CST14, BRY14, BRY14, CDST15, KMS15, BB16, JR13, CS13a, BRY14, BRY14, CDJS15, PRR03, BRY14] break on a constant number of erasures.
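
A minimal sketch of one iteration of a binary-search-based sortedness test, written as an index-based variant (an assumption; the cited tester may be stated differently). It walks the deterministic binary search path toward a search index `s` and compares `a[s]` with every pivot on the way. The function name is hypothetical.

```python
from typing import Sequence


def binary_search_check(a: Sequence[int], s: int) -> bool:
    """Follow the binary search path to index s; False if a violation is seen."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if mid == s:
            return True              # reached s without seeing a violation
        if mid < s:
            if a[mid] > a[s]:        # pivot left of s must not exceed a[s]
                return False
            lo = mid + 1
        else:
            if a[s] > a[mid]:        # pivot right of s must not be below a[s]
                return False
            hi = mid - 1
    return True                      # not reached when 0 <= s < len(a)


print(binary_search_check([1, 2, 3, 4, 5, 6, 7], 5))  # sorted input: no violation
print(binary_search_check([1, 2, 3, 0, 5, 6, 7], 0))  # pivot 0 at index 3 is below a[0]
```

The pivots here are fixed by the interval endpoints, so the sketch has no rule for what to do when a pivot is erased: with `None` standing in for ⊥, the comparison against the pivot simply fails. That is the sense in which a single erasure at a pivot breaks a test of this shape.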

Our Results for More Challenging Properties
For functions over the line [n]:
α-erasure-resilient testers for monotonicity, the Lipschitz property, and convexity.
O(1/(1−α)) factor query complexity overhead for all α ∈ (0,1).
For functions over the hypergrid [n]^d:
α-erasure-resilient testers for monotonicity and the Lipschitz property.
O(1/(1−α)) factor query complexity overhead for α = O(ε/d).
Hard example for monotonicity and the Lipschitz property: it is not enough to check violations only on axis-parallel lines if α = Ω(ε/d).

Our Results: Separation
There exists a property P and a constant c > 0 such that
standard ε-testing of P can be done with O(1/ε) queries;
α-erasure-resilient testing of P requires Ω(n^c) queries for all constant α and ε.
Erasure-resilient testing is much harder than standard testing in general.

Summary of Our Contributions
A model of property testing that accounts for erasures in the input data.
Black box transformation of some simple standard testers to efficient erasure-resilient testers.
Efficient erasure-resilient testers for more challenging, well-studied properties.
A strong separation between testing in the presence of erasures and testing in the absence of erasures.

In the Rest of This Talk
Standard ε-testing of sortedness uses Θ(log n / ε) queries [Ergün Kannan Kumar Rubinfeld Viswanathan 00, Fischer 01].
Theorem. There exists an α-erasure-resilient ε-tester for sortedness of n-element arrays that makes O(log n / (ε(1−α))) queries for all α, ε ∈ (0,1).

Our Erasure-Resilient Sortedness Tester
Input: ε, α ∈ (0,1); query access to an array.
Repeat Θ(1/ε) times:
  Sample uniformly until you get a nonerased search point s.
  Binary search for s with uniform nonerased split points.
  Reject if there are violations along the search path.
Accept if no violations were found.
[Figure: an example array with erasures (⊥ 9 ⊥ ⊥ 61 37 ⊥ 88), marked with a search point s and split points p₁ and p₂.]
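
The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: erased entries are modeled as `None`, the array has at least one nonerased entry, "sorted" means nondecreasing, and the repetition constant `c = 2` is a hypothetical choice for the Θ(1/ε) loop. Function names are made up for the sketch.

```python
import math
import random
from typing import Callable, Optional, Tuple

Query = Callable[[int], Optional[int]]  # returns None for an erased index


def sample_nonerased(query: Query, lo: int, hi: int,
                     rng: random.Random) -> Tuple[int, int]:
    """Sample uniformly from [lo, hi] until a nonerased index is hit."""
    while True:
        i = rng.randint(lo, hi)
        value = query(i)
        if value is not None:
            return i, value


def erasure_resilient_sortedness_test(query: Query, n: int, eps: float,
                                      rng: random.Random, c: int = 2) -> bool:
    """Return True (accept) or False (reject)."""
    for _ in range(math.ceil(c / eps)):
        s, value_s = sample_nonerased(query, 0, n - 1, rng)   # search point
        lo, hi = 0, n - 1
        while True:
            # Uniform nonerased split point; s itself lies in [lo, hi],
            # so this sampling terminates with probability 1.
            p, value_p = sample_nonerased(query, lo, hi, rng)
            if p == s:
                break
            if p < s:
                if value_p > value_s:     # violation along the search path
                    return False
                lo = p + 1
            else:
                if value_s > value_p:     # violation along the search path
                    return False
                hi = p - 1
    return True


arr = [None, 9, None, None, 37, 61, None, 88]   # hypothetical input
print(erasure_resilient_sortedness_test(arr.__getitem__, len(arr), 0.25,
                                        random.Random(0)))
```

Because split points are resampled until a nonerased one is found, an erased position is never used as a pivot, which is what the deterministic binary search lacks.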

Analysis of the Sortedness Tester
Array is sorted ⟹ tester accepts.
Array is ε-far from sorted ⟹ tester detects a violation with probability ≥ ε in one iteration.
So Θ(1/ε) iterations suffice for success probability ≥ 2/3 (error probability ≤ 1/3).
The expected number of queries per iteration is O(log n / (1−α)):
The tester traverses a uniformly random search path in a random binary search tree.
The number of levels in a random binary search tree is O(log n) w.h.p.
Claim. The expected number of queries to one level of the binary search is O(1/(1−α)).
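
A short worked bound for the repetition count, assuming the iterations are independent and using a hypothetical constant c = 2 for the Θ(1/ε) loop together with the standard inequality 1 − x ≤ e^{−x}:

```latex
\Pr[\text{tester accepts an } \varepsilon\text{-far array}]
  \;\le\; (1-\varepsilon)^{\lceil 2/\varepsilon \rceil}
  \;\le\; (1-\varepsilon)^{2/\varepsilon}
  \;\le\; e^{-2}
  \;<\; \tfrac{1}{3}.
```

Combining the Θ(1/ε) iterations with the per-iteration bound gives the expected total of O(log n / (ε(1−α))) queries.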

Expected Number of Queries in One Iteration
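Mental experiment: first select a random tree and then the search point s.
Fix a level k of the tree and let Q = the number of queries made at level k.
For an interval I at level k, let α_I = the fraction of erasures in I.
Pr[search point s is in I] = (# nonerased points in I) / (total # nonerased points) ≤ |I|(1 − α_I) / (n(1 − α)).
Given s ∈ I, each uniform sample from I is nonerased with probability 1 − α_I, so E[Q | s ∈ I] = 1/(1 − α_I).
Summing over the (disjoint) intervals I in level k:
E[Q] = Σ_I E[Q | s ∈ I] ⋅ Pr[s ∈ I] ≤ Σ_I 1/(1 − α_I) ⋅ |I|(1 − α_I) / (n(1 − α)) = Σ_I |I| / (n(1 − α)) ≤ 1/(1 − α).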
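
The sum in the claim can be checked exactly on a small instance. The sketch below assumes a hypothetical erasure pattern and a hypothetical partition of the array into the intervals of one level; it evaluates the sum of E[Q | s ∈ I] ⋅ Pr[s ∈ I] with exact fractions, using the exact probability (# nonerased in I) / (total # nonerased).

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def expected_queries_at_level(erased: Sequence[bool],
                              intervals: List[Tuple[int, int]]) -> Fraction:
    """Sum over intervals of E[Q | s in I] * Pr[s in I].

    erased[i] is True when position i is erased; intervals are inclusive
    (lo, hi) pairs assumed to be disjoint.
    """
    total_nonerased = sum(not e for e in erased)
    expected = Fraction(0)
    for lo, hi in intervals:
        size = hi - lo + 1
        nonerased = sum(not e for e in erased[lo:hi + 1])
        if nonerased == 0:
            continue                      # s can never fall in this interval
        queries_given_i = Fraction(size, nonerased)        # 1 / (1 - alpha_I)
        prob_s_in_i = Fraction(nonerased, total_nonerased)
        expected += queries_given_i * prob_s_in_i
    return expected


# Hypothetical pattern: half of the 8 positions are erased (alpha = 1/2).
erased = [True, False, True, True, False, False, True, False]
level = [(0, 3), (4, 7)]
print(expected_queries_at_level(erased, level))   # 2, which equals 1 / (1 - alpha)
```

Here the two intervals have very different erasure fractions (3/4 and 1/4), yet the weighted sum still meets the 1/(1−α) bound: heavily erased intervals cost more per visit but are proportionally less likely to contain s.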

What We Proved
Theorem. There exists an α-erasure-resilient ε-tester for sortedness of n-element arrays that makes O(log n / (ε(1−α))) queries for all α, ε ∈ (0,1).

Open Questions and Directions
Erasure-resilience for other models of sublinear algorithms.
Is tolerant testing [PRR06] harder than erasure-resilient testing?
Are there testers resilient to more erasures for monotonicity and the Lipschitz property over hypergrid domains?
Erasure-resilient testers for other properties: linearity, dictatorship, linear threshold functions...
