Published by Connor Brennan. Modified over 4 years ago.

1
Optimal Space Lower Bounds for All Frequency Moments
David Woodruff
Based on the SODA '04 paper

2
The Streaming Model [AMS96]
Example stream: 0113734 …
Stream of elements $a_1, \ldots, a_q$, each in $\{1, \ldots, m\}$
Want to compute statistics on the stream
Elements arranged in adversarial order
Algorithms are given one pass over the stream
Goal: minimum-space algorithm

3
Frequency Moments
Notation: $q$ = stream size, $m$ = universe size, $f_i$ = # occurrences of item $i$
$k$-th moment: $F_k = \sum_{i=1}^{m} f_i^k$
Why are frequency moments important?
$F_0$ = # of distinct elements
$F_1 = q$
$F_2$ = repeat rate

4
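Applications
Estimating # distinct elements with low space
Estimate selectivity of queries to a DB without an expensive sort
Routers gather # distinct destinations with limited memory
Estimating $F_2$ estimates the size of self-joins
[Figure: the table (Bob, x), (Alice, y), (Bob, z) joined on the name column with the table (Bob, a), (Alice, b), (Bob, c); the result has the rows (Alice, b, y), (Bob, a, x), (Bob, a, z), (Bob, c, x), (Bob, c, z)]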

5
The Best Deterministic Algorithm
Trivial algorithm for $F_k$: store/update $f_i$ for each item $i$, sum $f_i^k$ at the end
Space = $O(m \log q)$: $m$ items $i$, $\log q$ bits to count each $f_i$
Negative results [AMS96]:
Computing $F_k$ exactly => $\Omega(m)$ space
Any deterministic algorithm that outputs $x$ with $|F_k - x| < \epsilon F_k$ must use $\Omega(m)$ space
What about randomized algorithms?
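The trivial algorithm above can be sketched in a few lines. This is only an illustration: the function name `exact_fk` and the use of `collections.Counter` are choices made here, not part of the slides. It keeps one counter per distinct item, which is the $O(m \log q)$ space the slide refers to.

```python
from collections import Counter


def exact_fk(stream, k):
    """Exact k-th frequency moment: one counter f_i per item, then sum f_i^k."""
    counts = Counter()
    for item in stream:          # single pass over the stream
        counts[item] += 1        # update f_i
    # Only items that actually occur are stored, so k = 0 counts distinct items.
    return sum(c ** k for c in counts.values())


if __name__ == "__main__":
    stream = [1, 1, 2, 3, 3, 3]
    print(exact_fk(stream, 0), exact_fk(stream, 1), exact_fk(stream, 2))
```

For the stream 1, 1, 2, 3, 3, 3 the frequencies are 2, 1, 3, so the three printed moments are the number of distinct items, the stream length, and $2^2 + 1^2 + 3^2$.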

6
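Randomized Approximation Algorithms for $F_k$
A randomized algorithm $\epsilon$-approximates $F_k$ if it outputs $x$ s.t. $\Pr[|F_k - x| < \epsilon F_k] > 2/3$
Can $\epsilon$-approximate $F_0$ [BJKST02], $F_2$ [AMS96], and $F_k$ [CK04] for $k > 2$ in space: [bounds not preserved in this transcript] (big-Oh notation suppresses $\mathrm{polylog}(1/\epsilon, m, q)$ factors)
Ideas: hashing with $O(1)$-wise independence; sampling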

7
Example: $F_0$ [BJKST02]
Idea: for a random function $h: [m] \to [0,1]$ and distinct elements $b_1, b_2, \ldots, b_{F_0}$, expect $\min_i h(b_i) \approx 1/F_0$
Algorithm:
Choose a 2-wise independent hash function $h: [m] \to [m^3]$
Maintain the $t = \Theta(1/\epsilon^2)$ distinct smallest values $h(b_i)$
Let $v$ be the $t$-th smallest value
Output $t m^3 / v$ as the estimate for $F_0$
Success probability up to $1 - \delta$ => take the median of $O(\log 1/\delta)$ copies
Space: $O((\log 1/\delta)/\epsilon^2)$
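A minimal sketch of the "keep the $t$ smallest hash values" idea, under stated assumptions: the class name `F0Sketch` is hypothetical, items are assumed to be non-negative integers below the prime $P = 2^{61} - 1$, and the hash is the pairwise independent map $x \mapsto (ax + b) \bmod P$ with range $[1, P]$ standing in for the slide's range $[m^3]$. The median-of-copies step is left out.

```python
import heapq
import random

P = (1 << 61) - 1  # Mersenne prime used as the hash range (assumption, replaces m^3)


class F0Sketch:
    """Keeps the t smallest distinct hash values seen so far."""

    def __init__(self, t, seed=None):
        rng = random.Random(seed)
        self.t = t
        self.a = rng.randrange(1, P)   # a != 0, so the hash is injective on [0, P)
        self.b = rng.randrange(P)
        self.heap = []                 # max-heap (negated values) of the t smallest hashes
        self.members = set()           # hash values currently stored in the heap

    def _h(self, x):
        return (self.a * x + self.b) % P + 1   # value in [1, P]

    def add(self, x):
        v = self._h(x)
        if v in self.members:
            return                               # repeated item: nothing to do
        if len(self.heap) < self.t:
            heapq.heappush(self.heap, -v)
            self.members.add(v)
        elif v < -self.heap[0]:                  # smaller than the current t-th smallest
            evicted = -heapq.heappushpop(self.heap, -v)
            self.members.discard(evicted)
            self.members.add(v)

    def estimate(self):
        if len(self.heap) < self.t:
            return float(len(self.heap))         # fewer than t distinct items: exact count
        v = -self.heap[0]                        # t-th smallest hash value
        return self.t * P / v


if __name__ == "__main__":
    sketch = F0Sketch(t=256, seed=1)
    for x in range(10_000):
        sketch.add(x % 3_000)
    print(sketch.estimate())
```

The state is $t$ hash values plus two hash coefficients, independent of the stream length. Duplicates never change the state, which is why the sketch estimates $F_0$ rather than $F_1$.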

8
Example: $F_2$ [AMS99]
Algorithm:
Choose a 4-wise independent hash function $h: [m] \to \{-1, 1\}$
Maintain $Z = \sum_{i \in [m]} f_i \cdot h(i)$
Output $Y = Z^2$ as the estimate for $F_2$
Correctness: Chebyshev's inequality => $O(1/\epsilon^2)$ space
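The estimator can be sketched as follows. Several details are assumptions made for illustration: the class name `F2Sketch`, the sign hash built from the parity of a random degree-3 polynomial modulo the prime $2^{61} - 1$ (4-wise independent as a polynomial, with a negligible bias when reduced to a sign), and averaging independent copies of $Z^2$, which is one standard way to spend the $O(1/\epsilon^2)$ space that Chebyshev's inequality calls for.

```python
import random

P = (1 << 61) - 1  # prime modulus for the polynomial hash (assumption)


class F2Sketch:
    """Averages independent copies of Z^2, where Z = sum_i f_i * h(i)."""

    def __init__(self, copies, seed=None):
        rng = random.Random(seed)
        # One random cubic polynomial per copy.
        self.coeffs = [[rng.randrange(P) for _ in range(4)] for _ in range(copies)]
        self.z = [0] * copies

    @staticmethod
    def _sign(coeff, i):
        a0, a1, a2, a3 = coeff
        value = (((a3 * i + a2) * i + a1) * i + a0) % P   # Horner evaluation
        return 1 if value & 1 else -1                     # parity -> {-1, +1}

    def add(self, i):
        for j, coeff in enumerate(self.coeffs):
            self.z[j] += self._sign(coeff, i)             # Z_j += h_j(i)

    def estimate(self):
        return sum(z * z for z in self.z) / len(self.z)


if __name__ == "__main__":
    sketch = F2Sketch(copies=400, seed=3)
    for item in [1, 1, 2, 3, 3, 3] * 50:
        sketch.add(item)
    print(sketch.estimate())
```

Each copy stores one counter and four coefficients. With a single distinct item of frequency $f$, every copy holds $Z = \pm f$, so the estimate is exactly $f^2$; with more items the cross terms $f_i f_j h(i) h(j)$ cancel only in expectation.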

9
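Previous Lower Bounds
[AMS96] $\forall k$, $\epsilon$-approximating $F_k$ => $\Omega(\log m)$ space
[Bar-Yossef] $\epsilon$-approximating $F_0$ => $\Omega(1/\epsilon)$ space
[IW03] $\epsilon$-approximating $F_0$ => $\Omega(1/\epsilon^2)$ space if [condition on $\epsilon$ not preserved in this transcript]
Questions:
Does the bound hold for $k \neq 0$?
Does it hold for $F_0$ for smaller $\epsilon$?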

10
Our First Result
Optimal lower bound: $\forall k \neq 1$ and any $\epsilon = \Omega(m^{-1/2})$, $\epsilon$-approximating $F_k$ => $\Omega(\epsilon^{-2})$ bits of space
$F_1 = q$ is trivial in $\log q$ space
$F_k$ is trivial in $O(m \log q)$ space, so need $\epsilon = \Omega(m^{-1/2})$
Technique: reduction from a 2-party protocol for computing the Hamming distance $\Delta(x, y)$
Uses tools from communication complexity

11
Lower Bound Idea
Alice: $x \in \{0,1\}^m$, stream $s(x)$
Bob: $y \in \{0,1\}^m$, stream $s(y)$
$(1 \pm \epsilon)$ $F_k$ algorithm $A$; Alice sends Bob $S$, the internal state of $A$
Bob computes $(1 \pm \epsilon) F_k(s(x) \circ s(y))$ w.p. > 2/3
Idea: if this decides $f(x, y)$ w.p. > 2/3, the space used by $A$ is at least the randomized 1-way communication complexity of $f$
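The reduction can be made concrete with a toy simulation: Alice runs a streaming algorithm on her stream, serializes its internal state as the single message, and Bob resumes from that state on his stream. Everything named here is hypothetical scaffolding: `ExactFk` is an exact counter standing in for the approximate algorithm $A$, and JSON is just one convenient way to measure the message.

```python
import json
from collections import Counter


class ExactFk:
    """Stand-in for algorithm A: exact F_k using per-item counters."""

    def __init__(self, k, counts=None):
        self.k = k
        self.counts = Counter(counts or {})

    def add(self, item):
        self.counts[item] += 1

    def output(self):
        return sum(c ** self.k for c in self.counts.values())

    def dump(self):
        """Serialize the internal state S into a message (bytes)."""
        state = {"k": self.k, "counts": sorted(self.counts.items())}
        return json.dumps(state).encode()

    @classmethod
    def load(cls, message):
        state = json.loads(message.decode())
        return cls(state["k"], {item: c for item, c in state["counts"]})


def char_stream(bits):
    """A stream whose characteristic vector is `bits`."""
    return [i for i, b in enumerate(bits) if b]


def one_way_protocol(alg, alice_stream, bob_stream):
    for item in alice_stream:
        alg.add(item)
    message = alg.dump()                      # the only communication: Alice -> Bob
    resumed = type(alg).load(message)
    for item in bob_stream:
        resumed.add(item)
    return resumed.output(), len(message)     # Bob's answer and the message size


if __name__ == "__main__":
    x, y = [1, 1, 0, 0], [0, 1, 1, 0]
    print(one_way_protocol(ExactFk(2), char_stream(x), char_stream(y)))
```

The message length is the algorithm's state size, so any lower bound on one-way communication for $f$ carries over to the space of $A$.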

12
Randomized 1-way Communication Complexity
Boolean function $f: X \times Y \to \{0,1\}$
Alice has $x \in X$, Bob has $y \in Y$; Bob wants $f(x, y)$
Only one message $m$ is sent, and it must be from Alice to Bob
Communication cost = $\max_{x,y} E_{\text{coins}}[|m|]$
The $\delta$-error randomized 1-way communication complexity $R_\delta(f)$ is the cost of the optimal protocol computing $f$ with probability $\ge 1 - \delta$
OK, but how do we lower bound $R_\delta(f)$?

13
Shatter Coefficients [KNR]
$F = \{f: X \to \{0,1\}\}$ is a function family; each $f \in F$ is a length-$|X|$ bitstring
For $S \subseteq X$, the shatter coefficient $SC(f_S)$ of $S$ is $|\{f|_S\}_{f \in F}|$ = # distinct bitstrings when $F$ is restricted to $S$
$SC(F, p) = \max_{S \subseteq X, |S| = p} SC(f_S)$. If $SC(f_S) = 2^{|S|}$, $S$ is shattered
Treat $f: X \times Y \to \{0,1\}$ as a function family $f_X$:
$f_X = \{f_x(y): Y \to \{0,1\} \mid x \in X\}$, where $f_x(y) = f(x, y)$
Theorem [BJKS]: for every $f: X \times Y \to \{0,1\}$ and every integer $p$, $R_{1/3}(f) = \Omega(\log(SC(f_X, p)))$
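For small domains the shatter coefficient of the family $f_X$ can be computed by brute force, which may help in reading the definition. The function name `shatter_coefficient` is made up for this sketch. Since the members of $f_X$ are functions on $Y$, the sets $S$ being restricted to are subsets of $Y$.

```python
from itertools import combinations, product


def shatter_coefficient(f, X, Y, p):
    """SC(f_X, p): max over S subset of Y with |S| = p of the number of
    distinct restrictions f_x|_S as x ranges over X."""
    best = 0
    for S in combinations(Y, p):
        patterns = {tuple(f(x, y) for y in S) for x in X}   # distinct bitstrings on S
        best = max(best, len(patterns))
    return best


if __name__ == "__main__":
    # Index function: Alice holds a 3-bit string, Bob holds a position.
    X = list(product((0, 1), repeat=3))
    Y = range(3)
    index = lambda x, y: x[y]
    print(shatter_coefficient(index, X, Y, 3))   # all of Y is shattered

    # Equality on a 4-element domain shatters nothing of size 2.
    equal = lambda x, y: int(x == y)
    print(shatter_coefficient(equal, range(4), range(4), 2))
```

In the index example every one of the $2^3$ patterns appears, so the theorem gives a lower bound proportional to $\log 2^3 = 3$ bits; for equality only the patterns 10, 01 and 00 occur on any pair.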

14
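Warmup: $\Omega(1/\epsilon)$ Lower Bound [Bar-Yossef]
Alice input $x \in_R \{0,1\}^m$, $wt(x) = m/2$
Bob input $y \in_R \{0,1\}^m$, $wt(y) = \epsilon m$
$s(x)$, $s(y)$: any streams with characteristic vectors $x$, $y$
PROMISE:
(1) $wt(x \wedge y) = 0$: $f(x,y) = 0$ and $F_0(s(x) \circ s(y)) = m/2 + \epsilon m$
OR
(2) $wt(x \wedge y) = \epsilon m$: $f(x,y) = 1$ and $F_0(s(x) \circ s(y)) = m/2$
$R_{1/3}(f) = \Omega(1/\epsilon)$ [Bar-Yossef] (uses shatter coefficients)
$(1 + \epsilon') m/2 < (1 - \epsilon')(m/2 + \epsilon m)$ for $\epsilon' = \Theta(\epsilon)$
Hence, can decide $f$ => $F_0$ algorithm uses $\Omega(1/\epsilon)$ space
Too easy! Can replace the $F_0$ algorithm with a sampler!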
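The separation inequality on this slide can be worked out explicitly. The notation is an assumption of this note: $\epsilon m$ denotes the weight of Bob's input and $\epsilon'$ the accuracy of the $F_0$ approximator. The two promise cases give $F_0 = m/2$ and $F_0 = m/2 + \epsilon m$, and a $(1 \pm \epsilon')$-approximation tells them apart when the largest possible output in the first case is below the smallest possible output in the second:

```latex
\begin{align*}
(1+\epsilon')\,\frac{m}{2} < (1-\epsilon')\left(\frac{m}{2}+\epsilon m\right)
  &\iff \frac{1}{2}+\frac{\epsilon'}{2} < \frac{1}{2}+\epsilon-\frac{\epsilon'}{2}-\epsilon'\epsilon \\
  &\iff \epsilon'(1+\epsilon) < \epsilon \\
  &\iff \epsilon' < \frac{\epsilon}{1+\epsilon}.
\end{align*}
```

So any accuracy $\epsilon' < \epsilon/(1+\epsilon)$, which is $\Theta(\epsilon)$ for $\epsilon \le 1$, suffices to decide $f$.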

15
Our Reduction: Hamming Distance Decision Problem (HDDP)
Set $t = \Theta(1/\epsilon^2)$
Alice: $x \in \{0,1\}^t$; Bob: $y \in \{0,1\}^t$
Promise problem:
$\Delta(x, y) \le t/2 - \Theta(t^{1/2})$: $f(x,y) = 0$
OR
$\Delta(x, y) > t/2$: $f(x,y) = 1$
Lower bound $R_{1/3}(f)$ via $SC(f_X, t)$, but need a lemma

16
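Main Lemma
[Figure: $S \subseteq \{0,1\}^n$ with the word $y$, the subset $T$, and the remainder $S - T$ marked]
$\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there exist $2^{\Omega(n)}$ good sets $T \subseteq S$, i.e. sets for which $\exists y \in \{0,1\}^n$ s.t.
1. $\forall t \in T$, $\Delta(y, t) \le n/2 - c n^{1/2}$ for some $c > 0$
2. $\forall t \in S - T$, $\Delta(y, t) > n/2$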

17
Lemma Resolves HDDP Complexity
Theorem: $R_{1/3}(f) = \Omega(t) = \Omega(\epsilon^{-2})$
Proof:
Alice gets $y_T$ for a random good set $T$, applying the main lemma with $n = t$
Bob gets a random $s \in S$
Let $f: \{y_T\}_T \times S \to \{0,1\}$
Main Lemma => $SC(f) = 2^{\Omega(t)}$
[BJKS] => $R_{1/3}(f) = \Omega(t) = \Omega(\epsilon^{-2})$
Corollary: $\Omega(1/\epsilon^2)$ space for a randomized 2-party protocol to approximate $\Delta(x, y)$ between inputs
First known lower bound in terms of $\epsilon$!

18
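Back to Frequency Moments
Use an $\epsilon$-approximator for $F_k$ to solve HDDP
Alice: $y \in \{0,1\}^t$, stream $a_y$; Bob: $s \in S \subseteq \{0,1\}^t$, stream $a_s$
Alice runs the $F_k$ algorithm on $a_y$ and passes its state to Bob, who continues on $a_s$
The $i$-th universe element is included exactly once in stream $a_y$ iff $y_i = 1$ ($a_s$ is the same)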

19
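Solving HDDP with $F_k$
Alice/Bob compute an $\epsilon$-approximation to $F_k(a_y \circ a_s)$
$F_k(a_y \circ a_s) = 2^k \, wt(y \wedge s) + 1^k \, \Delta(y, s)$
For $k \neq 1$, [formula not preserved in this transcript]
Conclusion: $\epsilon$-approximating $F_k(a_y \circ a_s)$ decides HDDP, so the space for $F_k$ is $\Omega(t) = \Omega(\epsilon^{-2})$
Alice also transmits $wt(y)$ in $\log m$ space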
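One way to see why knowing $F_k$ and $wt(y)$ is enough: combining $F_k(a_y \circ a_s) = 2^k \, wt(y \wedge s) + \Delta(y, s)$ with the identity $wt(y) + wt(s) = 2 \, wt(y \wedge s) + \Delta(y, s)$ gives $\Delta(y, s) = \big(2^{k-1}(wt(y) + wt(s)) - F_k\big) / (2^{k-1} - 1)$ for $k \neq 1$. This rearrangement is derived here and is not necessarily the expression shown on the original slide. The sketch below checks it on a small example with an exact $F_k$; all function names are hypothetical.

```python
from collections import Counter


def char_stream(bits):
    """Stream containing universe element i exactly once iff bits[i] == 1."""
    return [i for i, b in enumerate(bits) if b]


def exact_fk(stream, k):
    return sum(c ** k for c in Counter(stream).values())


def hamming(y, s):
    return sum(a != b for a, b in zip(y, s))


def delta_from_fk(fk, wt_y, wt_s, k):
    """Recover the Hamming distance from F_k(a_y . a_s), wt(y) and wt(s), k != 1."""
    if k == 1:
        raise ValueError("F_1 is just the stream length and carries no information")
    half = 2 ** (k - 1)          # 0.5 when k == 0
    return (half * (wt_y + wt_s) - fk) / (half - 1)


if __name__ == "__main__":
    y = [1, 0, 1, 1, 0, 0]
    s = [1, 1, 0, 1, 0, 1]
    stream = char_stream(y) + char_stream(s)       # a_y followed by a_s
    for k in (0, 2, 3):
        fk = exact_fk(stream, k)
        print(k, fk, delta_from_fk(fk, sum(y), sum(s), k), hamming(y, s))
```

With an $\epsilon$-approximate $F_k$ in place of the exact one, the recovered distance is only approximate, and the two promise cases of HDDP are what make that approximation sufficient.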

20
Back to the Main Lemma
Recall: show $\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there are $2^{\Omega(n)}$ good sets $T \subseteq S$, i.e. $\exists y \in \{0,1\}^n$ s.t.
1. $\forall t \in T$, $\Delta(y, t) \le n/2 - c n^{1/2}$ for some $c > 0$
2. $\forall t \in S - T$, $\Delta(y, t) > n/2$
Probabilistic method:
Choose $n$ random elements of $\{0,1\}^n$ for $S$
Show an arbitrary $T \subseteq S$ of size $n/2$ is good with probability > $2^{-zn}$ for a constant $z < 1$
Expected # good $T$ is $2^{\Omega(n)}$
So there exists $S$ with $2^{\Omega(n)}$ good $T$

21
Proving the Main Lemma
$T = \{t_1, \ldots, t_{n/2}\} \subseteq S$ arbitrary
Let $y$ be the majority codeword of $T$
What is the probability $p$ that both:
1. $\forall t \in T$, $\Delta(y, t) \le n/2 - c n^{1/2}$ for some $c > 0$
2. $\forall t \in S - T$, $\Delta(y, t) > n/2$
Put $x = \Pr[\forall t \in T, \Delta(y, t) \le n/2 - c n^{1/2}]$
Put $y = \Pr[\forall t \in S - T, \Delta(y, t) > n/2] = 2^{-n/2}$
Independence => $p = xy = x 2^{-n/2}$
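The two conditions can be checked mechanically for a concrete $S$ and $T$ once $y$ is fixed to the columnwise majority word of $T$. This is a sufficient test only, since the lemma merely asks for some $y$ to exist. The helper names and the tie-breaking rule (ties go to 1) are assumptions of this sketch.

```python
from math import sqrt


def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))


def majority_word(T):
    """Columnwise majority of the words in T (ties broken toward 1)."""
    n = len(T[0])
    return tuple(1 if 2 * sum(t[j] for t in T) >= len(T) else 0 for j in range(n))


def is_good(S, T, c):
    """Check both conditions of the main lemma with y = majority word of T."""
    n = len(S[0])
    y = majority_word(T)
    rest = [s for s in S if s not in T]
    close = all(hamming(y, t) <= n / 2 - c * sqrt(n) for t in T)   # condition 1
    far = all(hamming(y, s) > n / 2 for s in rest)                 # condition 2
    return close and far


if __name__ == "__main__":
    T = [(1, 1, 1, 1), (1, 1, 1, 0)]
    S = T + [(0, 0, 0, 1), (0, 0, 0, 0)]
    print(majority_word(T), is_good(S, T, c=0.5))
```

In the toy example $n = 4$ and $c = 0.5$, so words in $T$ must be within distance 1 of $y = 1111$ and the remaining words must be at distance more than 2.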

22
The Matrix Problem
Wlog, assume $y = 1^n$ (recall $y$ is the majority word)
Want to lower bound $\Pr[\forall t \in T, \Delta(y, t) \le n/2 - c n^{1/2}]$
Equivalent to a matrix problem:
$t_1$ -> 101001000101111001
$t_2$ -> 100101011100011110
… 001110111101010101
$t_{n/2}$ -> 101010111011100011
For a random $n/2 \times n$ binary matrix $M$ in which each column has majority 1, what is the probability that each row has $\ge n/2 + c n^{1/2}$ 1s?

23
A First Attempt
A set family $A \subseteq 2^{\{0,1\}^n}$ is monotone increasing if $S_1 \in A$, $S_1 \subseteq S_2$ => $S_2 \in A$
For the uniform distribution on $S \subseteq \{0,1\}^n$ and monotone increasing families $A$, $B$: $\Pr[A \cap B] \ge \Pr[A] \cdot \Pr[B]$ [Kleitman]
First try:
Let $R$ be the event that $M$ has $\ge n/2 + c n^{1/2}$ 1s in each row, and $C$ the event that $M$ has majority 1 in each column
$\Pr[\forall t \in T, \Delta(y, t) \le n/2 - c n^{1/2}] = \Pr[R \mid C] = \Pr[R \cap C]/\Pr[C]$
$M$ is the characteristic vector of a subset of $[.5 n^2]$ => $R$, $C$ monotone increasing
=> $\Pr[R \cap C]/\Pr[C] \ge \Pr[R]\Pr[C]/\Pr[C] = \Pr[R] < 2^{-n/2}$
But we need > $2^{-zn/2}$ for a constant $z < 1$, so this fails…
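Kleitman's inequality can be checked exhaustively on tiny matrices, which makes the role of the row event $R$ and the column event $C$ concrete. The thresholds below are plain integers chosen for illustration (not $n/2 + c n^{1/2}$), and the function names are hypothetical. Both events are monotone increasing because flipping a 0 to a 1 never lowers a row or column sum.

```python
from fractions import Fraction
from itertools import product


def matrices(rows, cols):
    """All 0/1 matrices of the given shape, as lists of row tuples."""
    for bits in product((0, 1), repeat=rows * cols):
        yield [bits[r * cols:(r + 1) * cols] for r in range(rows)]


def rows_heavy(M, r):
    return all(sum(row) >= r for row in M)          # event R: every row has >= r ones


def cols_heavy(M, c):
    return all(sum(col) >= c for col in zip(*M))    # event C: every column has >= c ones


def kleitman_check(rows, cols, r, c):
    """Exact Pr[R], Pr[C], Pr[R and C] under the uniform distribution."""
    total = n_r = n_c = n_rc = 0
    for M in matrices(rows, cols):
        in_r, in_c = rows_heavy(M, r), cols_heavy(M, c)
        total += 1
        n_r += in_r
        n_c += in_c
        n_rc += in_r and in_c
    return Fraction(n_r, total), Fraction(n_c, total), Fraction(n_rc, total)


if __name__ == "__main__":
    pr_r, pr_c, pr_rc = kleitman_check(3, 4, r=3, c=2)
    print(pr_r, pr_c, pr_rc, pr_rc >= pr_r * pr_c)
```

For $2 \times 2$ matrices with both thresholds equal to 1, the three probabilities are $9/16$, $9/16$ and $7/16$, and indeed $7/16 \ge 81/256$.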

24
A Second Attempt
Second try:
$R_1$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the first $m$ rows
$R_2$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the remaining $n/2 - m$ rows
$C$: $M$ has majority 1 in each column
$\Pr[\forall t \in T, \Delta(y, t) \le n/2 - c n^{1/2}] = \Pr[R_1 \cap R_2 \mid C] = \Pr[R_1 \cap R_2 \cap C]/\Pr[C]$
$R_1$, $R_2$, $C$ monotone increasing
=> $\Pr[R_1 \cap R_2 \cap C]/\Pr[C] \ge \Pr[R_1 \cap C]\Pr[R_2]/\Pr[C] = \Pr[R_1 \mid C]\Pr[R_2]$
Want this to be at least $2^{-zn/2}$ for $z < 1$
$\Pr[\sum_i X_i > n/2 + c n^{1/2}] > 1/2 - c(2/\pi)^{1/2}$ [Stirling]
Independence => $\Pr[R_2] > (1/2 - c(2/\pi)^{1/2})^{n/2 - m}$
Remains to show $\Pr[R_1 \mid C]$ is large

25
Computing $\Pr[R_1 \mid C]$
$\Pr[R_1 \mid C]$ = Pr[$M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the first $m$ rows | $C$]
Show $\Pr[R_1 \mid C] > 2^{-zm}$ for a certain constant $z < 1$
Ingredients:
Expect to get $n/2 + \Theta(n^{1/2})$ 1s in each of the first $m$ rows given $C$
Use negative correlation of the entries in a given row => show $n/2 + \Theta(n^{1/2})$ 1s in a given row with good probability, for small enough $c$
A simple worst-case conditioning argument on these first $m$ rows shows they all have $\ge n/2 + c n^{1/2}$ 1s

26
Completing the Proof
Recall: what is the probability $p = xy$, where
1. $x = \Pr[\forall t \in T, \Delta(y, t) \le n/2 - c n^{1/2}]$
2. $y = \Pr[\forall t \in S - T, \Delta(y, t) > n/2] = 2^{-n/2}$
3. $R_1$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the first $m$ rows
4. $R_2$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the remaining $n/2 - m$ rows
5. $C$: $M$ has majority 1 in each column
$x \ge \Pr[R_1 \mid C]\Pr[R_2] \ge 2^{-zm}(1/2 - c(2/\pi)^{1/2})^{n/2 - m}$
Analysis shows $z$ is small, so this is $\ge 2^{-zn/2}$ with $z < 1$
Hence $p = xy \ge 2^{-(z+1)n/2}$
Hence the expected # of good sets is $2^{n - O(\log n)} \cdot p = 2^{\Omega(n)}$
So there exists $S$ with $2^{\Omega(n)}$ good $T$

27
Bipartite Graphs
The matrix problem corresponds to a bipartite graph counting problem:
How many bipartite graphs exist on $n/2$ by $n$ vertices s.t. each left vertex has degree > $n/2 + c n^{1/2}$ and each right vertex has degree > $n/4$ (a majority of its $n/2$ possible neighbors)?

28
Our Result on # of Bipartite Graphs
Bipartite graph count: the argument shows at least $2^{n^2/2 - zn/2 - n}$ such bipartite graphs for a constant $z < 1$
The main lemma shows the # of bipartite graphs on $n + n$ vertices with each vertex of degree > $n/2$ is > $2^{n^2 - zn - n}$
Can replace > with <
Previous known count: $2^{n^2 - 2n}$ [MW, personal comm.]
Follows easily from the Kleitman inequality
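For very small $n$ the count on $n + n$ vertices can be obtained by exhaustive enumeration, identifying a bipartite graph with its $n \times n$ biadjacency matrix. This is a brute-force sanity check only (it visits all $2^{n^2}$ matrices), and the function name is hypothetical.

```python
from itertools import product


def count_high_degree_graphs(n):
    """Number of bipartite graphs on n + n vertices in which every vertex
    has degree > n/2, counted via n x n 0/1 biadjacency matrices."""
    count = 0
    for bits in product((0, 1), repeat=n * n):
        rows = [bits[i * n:(i + 1) * n] for i in range(n)]
        left_ok = all(2 * sum(r) > n for r in rows)            # left degrees
        right_ok = all(2 * sum(c) > n for c in zip(*rows))     # right degrees
        if left_ok and right_ok:
            count += 1
    return count


if __name__ == "__main__":
    for n in (2, 3):
        print(n, count_high_degree_graphs(n), 2 ** (n * n - 2 * n))
```

For $n = 3$ a vertex needs degree at least 2, so each row and column of the matrix may contain at most one 0; the qualifying matrices are the complements of the $3 \times 3$ partial permutation matrices, and there are 34 of them, compared with $2^{n^2 - 2n} = 8$.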

29
Summary
Results:
Optimal $F_k$ lower bound: $\forall k \neq 1$ and any $\epsilon = \Omega(m^{-1/2})$, any $\epsilon$-approximator for $F_k$ must use $\Omega(\epsilon^{-2})$ bits of space
Communication lower bound of $\Omega(\epsilon^{-2})$ for the one-way communication complexity of $(\epsilon, \delta)$-approximating $\Delta(x, y)$
Bipartite graph count: # bipartite graphs on $n + n$ vertices with each vertex of degree > $n/2$ is at least $2^{n^2 - zn - n}$ for a constant $z < 1$
