Tight Bounds for Distributed Functional Monitoring. David Woodruff (IBM Almaden), Qin Zhang (Aarhus University, MADALGO).

Distributed Functional Monitoring
- [Figure: k sites P_1, P_2, P_3, …, P_k with inputs x_1, x_2, x_3, …, x_k, each communicating with a coordinator C; a time axis]
- Static case vs. dynamic case
- Updates in the dynamic case: x_i ← x_i + e_j
- Problems on x_1 + x_2 + … + x_k: sampling, p-norms, heavy hitters, compressed sensing, quantiles, entropy
- Authors: Can, Cormode, Huang, Muthukrishnan, Patt-Shamir, Shafrir, Tirthapura, Wang, Yi, Zhao, many others

Motivation
- Data distributed and stored in the cloud
  - Impractical to put data on a single device
- Sensor networks
  - Communication very power-intensive
- Network routers
  - Bandwidth limitations

Problems
- Which functions f(x_1, …, x_k) do we care about?
- x_1, …, x_k are non-negative length-n vectors
- x = Σ_{i=1}^k x_i
- f(x_1, …, x_k) = |x|_p = (Σ_{i=1}^n x_i^p)^{1/p} (inside this sum, x_i denotes the i-th coordinate of x)
- |x|_0 is the number of non-zero coordinates
- What is the randomized communication cost of these problems? I.e., the minimal cost of a protocol which, for every input, fails with probability < 1/3
- Static case, dynamic case
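
As a small illustration of the quantities defined on this slide, the following Python sketch computes |x|_p for the sum of the site vectors in a centralized way. The function name `p_norm_of_sum` and the example vectors are hypothetical and not part of the talk; the point of the talk is precisely to avoid shipping all vectors to one place.

```python
from typing import Sequence


def p_norm_of_sum(site_vectors: Sequence[Sequence[float]], p: float) -> float:
    """Return |x|_p for x = x_1 + ... + x_k (p = 0 counts non-zero coordinates)."""
    n = len(site_vectors[0])
    # Coordinate-wise sum of the k site vectors.
    x = [sum(v[j] for v in site_vectors) for j in range(n)]
    if p == 0:
        return float(sum(1 for c in x if c != 0))
    return sum(c ** p for c in x) ** (1.0 / p)


# Hypothetical example: two sites, n = 4, so x = (3, 0, 4, 0).
sites = [[3, 0, 0, 0], [0, 0, 4, 0]]
norm0 = p_norm_of_sum(sites, 0)  # number of non-zero coordinates
norm1 = p_norm_of_sum(sites, 1)  # sum of coordinates
norm2 = p_norm_of_sum(sites, 2)  # Euclidean norm
```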

Exact Answers
- An Ω(n) communication bound for computing |x|_p, p ≠ 1
- Reduction from 2-Player Set-Disjointness (DISJ)
  - Alice has a set S ⊆ [n] of size n/4
  - Bob has a set T ⊆ [n] of size n/4 with either |S ∩ T| = 0 or |S ∩ T| = 1
  - Is S ∩ T = ∅?
  - |X ∩ Y| = 1 → DISJ(X, Y) = 1, |X ∩ Y| = 0 → DISJ(X, Y) = 0
  - [KS, R] Ω(n) communication
- Prohibitive for applications

Approximate Answers
- f(x_1, …, x_k) = (1 ± ε) |x|_p
- What is the randomized communication cost as a function of k, ε, and n?
- Ignore log(nk/ε) factors

Previous Results
- Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative)
- |x|_0: Ω(k + ε^-2) and O(k · ε^-2)
- |x|_p: Ω(k + ε^-2)
- |x|_2: O(k^2/ε + k^1.5/ε^3)
- |x|_p, p > 2: O(k^{2p+1} n^{1-2/p} · poly(1/ε))

Our Results
- Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative)
- |x|_0: previously Ω(k + ε^-2) and O(k · ε^-2); now Ω(k · ε^-2)
- |x|_p: previously Ω(k + ε^-2); now Ω(k^{p-1} · ε^-2). Talk will focus on p = 2
- |x|_2: previously O(k^2/ε + k^1.5/ε^3); now O(k · poly(1/ε))
- |x|_p, p > 2: previously O(k^{2p+1} n^{1-2/p} · poly(1/ε)); now O(k^{p-1} · poly(1/ε))
- First lower bounds to depend on the product of k and ε^-2
- Upper bound doesn't depend polynomially on n

Talk Outline
- Lower Bounds
  - Non-zero elements
  - Euclidean norm
- Upper Bounds
  - p-norm

Previous Lower Bounds
- Lower bounds for any p-norm, p ≠ 1
- [CMY] Ω(k)
- [ABC] Ω(ε^-2)
- Reduction from Gap-Orthogonality (GAP-ORT)
  - Alice, Bob have u, v ∈ {0,1}^{ε^-2}, respectively
  - Promise on |Δ(u, v) – 1/(2ε^2)| relative to the threshold 2/ε
  - [CR, S] Ω(ε^-2) communication

Lower Bound for Distinct Elements
- Improve bound to optimal Ω(k · ε^-2)
- Simpler problem: k-GAP-THRESH
  - Each site P_i holds a bit Z_i
  - Z_i are i.i.d. Bernoulli(β)
  - Decide if Σ_{i=1}^k Z_i > βk + (βk)^{1/2} or Σ_{i=1}^k Z_i < βk - (βk)^{1/2}. Otherwise don't care
- Rectangle property: for any correct protocol transcript τ, Z_1, Z_2, …, Z_k are independent conditioned on τ
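
To make the k-GAP-THRESH definition concrete, here is a minimal Python sketch that samples the bits Z_i and evaluates the promise problem. The names `sample_bits` and `gap_thresh`, and the convention of returning `None` in the don't-care region, are illustrative assumptions rather than anything defined in the talk.

```python
import math
import random
from typing import List, Optional


def sample_bits(k: int, beta: float, rng: random.Random) -> List[int]:
    """Draw Z_1..Z_k i.i.d. Bernoulli(beta), one bit per site."""
    return [1 if rng.random() < beta else 0 for _ in range(k)]


def gap_thresh(bits: List[int], beta: float) -> Optional[int]:
    """Return 1 if the sum exceeds beta*k + sqrt(beta*k), 0 if it is below
    beta*k - sqrt(beta*k), and None in the don't-care region."""
    k = len(bits)
    mean = beta * k
    dev = math.sqrt(beta * k)
    total = sum(bits)
    if total > mean + dev:
        return 1
    if total < mean - dev:
        return 0
    return None


# Hypothetical usage: one random instance with k = 100 sites and beta = 1/2.
bits = sample_bits(100, 0.5, random.Random(0))
answer = gap_thresh(bits, 0.5)
```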

A Key Lemma
- Lemma: For any protocol Π which succeeds w.pr. > .9999, the transcript τ is such that w.pr. > 1/2, for at least k/2 different i, H(Z_i | τ) < H(.01β)
- Proof: Suppose τ does not satisfy this
  - With large probability, βk - O((βk)^{1/2}) < E[Σ_{i=1}^k Z_i | τ] < βk + O((βk)^{1/2})
  - Since the Z_i are independent given τ, Σ_{i=1}^k Z_i | τ is a sum of independent Bernoullis
  - Since most H(Z_i | τ) are large, by anti-concentration, both events occur with constant probability: Σ_{i=1}^k Z_i | τ > βk + (βk)^{1/2}, Σ_{i=1}^k Z_i | τ < βk - (βk)^{1/2}
  - So Π can't succeed with large probability

Composition Idea
- [Figure: coordinator C connected to sites P_1, P_2, P_3, …, P_k, with a DISJ instance on each link producing the bits Z_1, Z_2, Z_3, …, Z_k]
- The input to P_i in k-GAP-THRESH, denoted Z_i, is the output of a 2-party Disjointness (DISJ) instance between C and S_i (the i-th site)
  - Let X be a random set of size 1/(4ε^2) from {1, 2, …, 1/ε^2}
  - For each i, if Z_i = 1, then choose Y_i so that DISJ(X, Y_i) = 1, else choose Y_i so that DISJ(X, Y_i) = 0
  - Distributional complexity Ω(1/ε^2) [Razborov]
- Can think of C as a player
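
The composition can be sketched in Python as follows. This is only an illustration under stated assumptions: the universe size m stands for 1/ε^2 and is taken to be a multiple of 4, each Y_i is assumed to have the same size m/4 as X (mirroring the DISJ definition on the Exact Answers slide), and the names `compose_instance` and `disj` are hypothetical. The talk's actual hard distribution may differ in its details.

```python
import random
from typing import List, Set, Tuple


def compose_instance(bits: List[int], m: int,
                     rng: random.Random) -> Tuple[Set[int], List[Set[int]]]:
    """Build X for the coordinator and Y_i for site i so that DISJ(X, Y_i) = Z_i.

    m plays the role of 1/eps^2; |X| = |Y_i| = m/4 is an assumption.
    """
    universe = list(range(1, m + 1))
    size = m // 4
    x = set(rng.sample(universe, size))
    outside = [e for e in universe if e not in x]
    ys = []
    for z in bits:
        if z == 1:
            # Exactly one shared element, the rest drawn from outside X.
            shared = rng.choice(sorted(x))
            y = {shared} | set(rng.sample(outside, size - 1))
        else:
            # No shared element at all.
            y = set(rng.sample(outside, size))
        ys.append(y)
    return x, ys


def disj(x: Set[int], y: Set[int]) -> int:
    """Under the promise |X ∩ Y| ∈ {0, 1}, DISJ(X, Y) equals the intersection size."""
    return len(x & y)


# Hypothetical usage: five sites, universe {1, ..., 16}.
z_bits = [1, 0, 1, 1, 0]
x_set, y_sets = compose_instance(z_bits, 16, random.Random(1))
recovered = [disj(x_set, y) for y in y_sets]
```

Evaluating `disj` on each pair recovers the bits Z_i, which is the sense in which solving k-GAP-THRESH on the composed input forces the protocol to learn about many DISJ instances.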

Putting it All Together
- Key Lemma → For most i, H(Z_i | τ) < H(.01β)
- Since H(Z_i) = H(β) for all i, for most i protocol Π solves DISJ(X, Y_i) with constant probability
- Since the Z_i | τ are independent, solving DISJ requires communication Ω(ε^-2) on each of k/2 copies
- Total communication is Ω(k · ε^-2)
- Can show a reduction:
  - |x|_0 > 1/(2ε^2) + 1/ε if Σ_{i=1}^k Z_i > βk + (βk)^{1/2}
  - |x|_0 < 1/(2ε^2) - 1/ε if Σ_{i=1}^k Z_i < βk - (βk)^{1/2}

Lower Bound for Euclidean Norm
- Improve Ω(k + ε^-2) bound to optimal Ω(k · ε^-2)
- Base problem: Gap-Orthogonality (GAP-ORT(X, Y))
  - Consider uniform distribution on (X, Y)
- We observe an information lower bound for GAP-ORT
- Sherstov's lower bound for GAP-ORT holds for uniform distribution on (X, Y)
- [BBCR] + [Sherstov] → for any protocol Π and t > 0, I(X, Y; Π) = Ω(1/(ε^2 log t)) or Π uses t communication

Information Implications
- By chain rule, I(X, Y; Π) = Σ_{i=1}^{1/ε^2} I(X_i, Y_i; Π | X_{<i}, Y_{<i}) = Ω(ε^-2)
- For most i, I(X_i, Y_i; Π | X_{<i}, Y_{<i}) = Ω(1)
- Maximum Likelihood Principle: non-trivial advantage in guessing (X_i, Y_i)

2-BIT k-Party DISJ
- [Figure: sites P_1, P_2, P_3, …, P_k holding sets T_1, T_2, T_3, …, T_k ⊆ [k^2]]
- Choose a random j ∈ [k^2]; one of four cases holds:
  - j doesn't occur in any T_i
  - j occurs only in T_1, …, T_{k/2}
  - j occurs only in T_{k/2+1}, …, T_k
  - j occurs in T_1, …, T_k
- All j' ≠ j occur in at most one set T_i (assume k ≥ 4)
- We show Ω(k) information cost
- We compose GAP-ORT with a variant of k-Party DISJ

Rough Composition Idea
- [Figure: 1/ε^2 instances of 2-BIT k-party DISJ feeding one GAP-ORT instance]
- Bits X_i and Y_i in GAP-ORT determine the output of the i-th 2-BIT k-party DISJ instance
- An algorithm for approximating the Euclidean norm solves GAP-ORT, and therefore solves most 2-BIT k-party DISJ instances
- Show Ω(k/ε^2) overall information is revealed
  - Information adds (if we condition on enough helper variables)
  - P_i participates in all instances

Algorithm for p-norm
- We get k^{p-1} poly(1/ε), improving k^{2p+1} n^{1-2/p} poly(1/ε) for general p and O(k^2/ε + k^1.5/ε^3) for p = 2
- Our protocol is the first 1-way protocol, that is, all communication is from sites to coordinator
- Focus on Euclidean norm (p = 2) in talk
- Non-negative vectors
- Just determine if Euclidean norm exceeds a threshold θ

The Most Naïve Thing to Do
- x_i is Site i's current vector
- x = Σ_{i=1}^k x_i
- Suppose Site i sees an update x_i ← x_i + e_j
- Send j to the coordinator with a certain probability that depends only on k and θ?

Sample and Send
- [Figure: the vectors of sites P_1, P_2, P_3, …, P_k drawn as rows of a 0/1 matrix with blocks of k ones, sent to coordinator C; the case |x|^2 = k^2 is contrasted with the case |x|^2 = 2k^2]
  - Send each update with probability at least 1/k
  - Communication = O(k), so okay
- Suppose x has k^4 coordinates that are 1, and may have a unique coordinate which is k^2, occurring k times on each site
  - Send update with probability 1/k^2
  - Will find the large coordinate
  - But communication is Ω(k^2)
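
The arithmetic behind the second example can be checked with a short Python sketch. It assumes each update is forwarded independently with probability exactly 1/k^2 and counts one message per forwarded update; the function name `expected_messages` is hypothetical.

```python
from fractions import Fraction
from typing import Dict


def expected_messages(k: int) -> Dict[str, Fraction]:
    """Expected number of forwarded updates when sampling at rate 1/k^2."""
    rate = Fraction(1, k * k)
    light_updates = k ** 4  # k^4 coordinates of value 1: one update each
    heavy_updates = k * k   # the heavy coordinate: k updates on each of k sites
    return {
        "light": light_updates * rate,  # equals k^2
        "heavy": heavy_updates * rate,  # equals 1
    }


# Hypothetical usage with k = 10 sites.
costs = expected_messages(10)
```

For k = 10 the light coordinates account for 100 expected messages while the heavy coordinate accounts for a single one, which is the imbalance the slide points to: the rate that catches the heavy coordinate also lets k^2 light updates through.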

What Is Happening?
- Sampling with probability ≈ 1/k^2 is good to get a few samples from the heavy item
- But all the light coordinates are in the way, making the communication Ω(k^2)
- Suppose we put a barrier of k, that is, sample with probability ≈ 1/k^2 but only send an item if it has occurred at least k times on a site
- Now communication is O(1) and we have found the heavy coordinate
- But light coordinates also contribute to the overall |x|_2 value

Algorithm for Euclidean Norm
- Sample at different scales with different barriers
- Use public coin to create O(log n) groups T_1, …, T_{log n} of the n input coordinates
  - T_z contains n/2^z random coordinates
- Suppose Site i sees the update x_i ← x_i + e_j
  - For each T_z containing j: if (x_i)_j > (θ/2^z)^{1/2}/k, then with probability (2^z/θ)^{1/2} · poly(ε^-1 log n), send (j, z) to the coordinator
- Expected communication Õ(k)
- If a group of coordinates contributes to |x|_2, there is a z for which a few coordinates in the group are sampled multiple times
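
The site-side rule on this slide might be sketched in Python as below. Several pieces are assumptions made only for illustration: the public coin is modeled as a shared integer seed, the unspecified poly(ε^-1 log n) factor is replaced by a caller-supplied parameter `boost`, the sending probability is capped at 1, and the coordinator's decision rule (not given on the slide) is omitted. The names `make_groups` and `Site` are hypothetical.

```python
import math
import random
from typing import Dict, List, Set, Tuple


def make_groups(n: int, seed: int) -> List[Set[int]]:
    """Public coin: all parties derive the same groups T_1..T_{log n} from a seed.
    Group T_z holds floor(n / 2^z) random coordinates."""
    rng = random.Random(seed)
    levels = max(1, int(math.log2(n)))
    return [set(rng.sample(range(n), max(1, n >> z)))
            for z in range(1, levels + 1)]


class Site:
    def __init__(self, k: int, theta: float, groups: List[Set[int]],
                 boost: float, rng: random.Random) -> None:
        self.k = k
        self.theta = theta
        self.groups = groups
        self.boost = boost  # stand-in for poly(1/eps * log n); an assumption
        self.rng = rng
        self.counts: Dict[int, int] = {}  # local vector x_i, stored sparsely

    def update(self, j: int) -> List[Tuple[int, int]]:
        """Apply x_i <- x_i + e_j and return the (j, z) messages to send."""
        self.counts[j] = self.counts.get(j, 0) + 1
        messages = []
        for z, group in enumerate(self.groups, start=1):
            if j not in group:
                continue
            barrier = math.sqrt(self.theta / 2 ** z) / self.k
            prob = min(1.0, math.sqrt(2 ** z / self.theta) * self.boost)
            if self.counts[j] > barrier and self.rng.random() < prob:
                messages.append((j, z))
        return messages


# Hypothetical usage: n = 16 coordinates, k = 2 sites, threshold theta = 4.
groups = make_groups(16, seed=7)
site = Site(k=2, theta=4.0, groups=groups, boost=100.0, rng=random.Random(0))
j = next(iter(groups[0]))
sent = site.update(j)
```

With these toy parameters the barrier is below 1 and the capped probability is 1, so the first update to j is reported once for every group that contains j. Raising θ raises the barrier and lowers the sampling rate, which is the "different scales, different barriers" trade-off.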

Conclusions
- Improved communication lower and upper bounds for estimating |x|_p
- Implies tight lower bounds for estimating entropy, heavy hitters, quantiles
- Implications for data stream model
  - First lower bound for |x|_0 without Gap-Hamming
  - Useful information cost lower bound for Gap-Hamming, or protocol has very large communication
  - Improve Ω(n^{1-2/p}/ε^{2/p}) bound for estimating |x|_p in a stream to Ω(n^{1-2/p}/ε^{4/p})
