
Slide 1: Algorithms for Large Data Sets
Ziv Bar-Yossef, Lecture 13, June 22, 2005
http://www.ee.technion.ac.il/courses/049011

Slide 2: Data Stream Space Lower Bounds

Slide 3: Data Streams: Reminder
- The data (n is very large) is streamed through a program that may use only limited memory and must output an approximation of the desired function.
- How do we prove memory lower bounds for data stream algorithms?
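A minimal sketch (not from the slides) of what "stream through the data; use limited memory" means in code: a single left-to-right pass whose state stays far smaller than the input.

```python
# One pass over the stream; the state is just a counter and the current maximum,
# i.e. O(log m + log n) bits, regardless of how long the stream is.
def stream_max_and_count(stream):
    count, maximum = 0, None
    for item in stream:          # each item is seen once and then discarded
        count += 1
        if maximum is None or item > maximum:
            maximum = item
    return maximum, count

# The "data" could be far too large to store, but the state stays tiny.
print(stream_max_and_count(iter([3, 1, 4, 1, 5, 9, 2, 6])))   # (9, 8)
```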

Slide 4: Communication Complexity [Yao 79]
- Alice holds input a and Bob holds input b; they exchange messages m_1, m_2, m_3, m_4, ..., which a referee reads.
- Π(a,b): the "transcript" of protocol Π on (a,b); cost(Π) = Σ_i |m_i|.
- CC(g) = min over protocols Π that compute g of cost(Π).

Slide 5: 1-Way Communication Complexity
- Only two messages are sent: m_1 from Alice, then m_2 from Bob; the referee reads the transcript.
- Π(a,b): the "transcript"; cost(Π) = |m_1| + |m_2|.
- CC^1(g) = min over 1-way protocols Π that compute g of cost(Π).

Slide 6: Data Stream Space Lower Bounds
- f: A^n → B: a function.
- DS(f): the data stream space complexity of f, i.e., the minimum amount of memory used by any data stream algorithm that computes f.
- g: A^(n/2) × A^(n/2) → B, defined by g(x,y) = f(xy).
- CC^1(g): the 1-way communication complexity of g.
- Proposition: DS(f) ≥ Ω(CC^1(g)).
- Corollary: to obtain a lower bound on DS(f), it suffices to prove a lower bound on CC^1(g).

Slide 7: Reduction: 1-Way CC to DS
- Proof of the reduction. Let S = DS(f) and let M be an S-space data stream algorithm for f.
- Lemma: there is a 1-way CC protocol Π for g whose cost is S + log(|B|).
- Proof: Π works as follows:
  - Alice runs M on x.
  - Alice sends the state of M to Bob (S bits).
  - Bob continues the execution of M on y.
  - Bob sends the output of M (log(|B|) bits).
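A sketch of this simulation in code. The dictionary interface (init/update/output) for the streaming algorithm M is hypothetical, chosen only to make the hand-off of M's state explicit; it is not the lecture's notation.

```python
# Any S-space streaming algorithm M = (init, update, output) for f yields a one-way
# protocol for g(x, y) = f(xy): Alice's message is M's state, Bob finishes the run.
def alice(M, x):
    state = M["init"]()
    for item in x:
        state = M["update"](state, item)
    return state                      # the single message: S bits of state

def bob(M, state, y):
    for item in y:
        state = M["update"](state, item)
    return M["output"](state)         # log|B| further bits announce the answer

# Toy instance: M counts distinct elements exactly (its state is large, which is
# precisely what the lower bounds below show is unavoidable).
M = {"init": set, "update": lambda s, a: s | {a}, "output": len}
x, y = [1, 2, 3], [3, 4]
print(bob(M, alice(M, x), y))         # DE(xy) = 4
```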

Slide 8: Distinct Elements: Reminder
- Input: a vector x ∈ {1,2,…,m}^n.
- DE(x) = number of distinct elements of x.
  - Example: if x = (1,2,3,1,2,3), then DE(x) = 3.
- Space complexity of DE:
  - Randomized approximate: O(log m) space.
  - Deterministic approximate: Ω(m) space.
  - Randomized exact: Ω(m) space.
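For reference, a sketch (not part of the original slides) of the obvious exact one-pass algorithm, which uses an m-bit vector and so matches the Ω(m) lower bound up to constants.

```python
# Exact DE in one pass with m bits of state ("seen" is a bit per universe element).
def distinct_elements_exact(stream, m):
    seen = [False] * m
    count = 0
    for a in stream:                   # elements are from {1, ..., m}
        if not seen[a - 1]:
            seen[a - 1] = True
            count += 1
    return count

print(distinct_elements_exact([1, 2, 3, 1, 2, 3], m=3))   # 3, as in the example
```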

Slide 9: The Equality Function
- EQ: X × X → {0,1}, EQ(x,y) = 1 iff x = y.
- Theorem: CC^1(EQ) ≥ Ω(log |X|).
- Proof:
  - Π: any 1-way protocol for EQ.
  - Π_A(x): Alice's message on input x.
  - Π_B(m,y): Bob's message when he receives message m from Alice and holds input y.
  - Π(x,y) = (Π_A(x), Π_B(Π_A(x),y)): the transcript on (x,y).

Slide 10: Equality Lower Bound
- Proof (cont.):
  - Suppose |Π_A(x)| < log|X| for all x ∈ X.
  - Then the number of distinct messages of Alice is < 2^(log|X|) = |X|.
  - By the pigeonhole principle, there exist two distinct inputs x, x' ∈ X such that Π_A(x) = Π_A(x').
  - Therefore, Π(x,x) = Π(x',x).
  - But EQ(x,x) ≠ EQ(x',x). Contradiction.

Slide 11: Combinatorial Designs
- A family of subsets T_1,…,T_k of a universe U such that:
  1. For each i, |T_i| = |U|/4.
  2. For each i ≠ j, |T_i ∩ T_j| ≤ |U|/8.
- Fact: there exist designs of size k = 2^(Ω(|U|)). (They can be obtained from binary error-correcting codes of constant rate and constant relative minimum distance.)
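A small checker for the two design properties, on a toy family (illustration only; the Fact above is about exponentially large families, which this tiny example does not exhibit).

```python
# Verify: every set has size |U|/4, every pairwise intersection has size <= |U|/8.
def is_design(sets, universe_size):
    ok_size = all(len(T) == universe_size // 4 for T in sets)
    ok_int = all(len(sets[i] & sets[j]) <= universe_size // 8
                 for i in range(len(sets)) for j in range(i + 1, len(sets)))
    return ok_size and ok_int

# Hand-made example over U = {1,...,8}: sets of size 2, pairwise intersections <= 1.
U_SIZE = 8
family = [{1, 2}, {3, 4}, {1, 3}]
print(is_design(family, U_SIZE))   # True
```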

Slide 12: Reduction from EQ to DE
- U = {1,2,…,m}.
- X = {T_1,…,T_k}: a design of size k = 2^(Ω(m)).
- EQ: X × X → {0,1}.
- Note (xy is the stream that lists the elements of x and then the elements of y, so DE(xy) = |x ∪ y|):
  - If x = y, then DE(xy) = |x| = m/4.
  - If x ≠ y, then DE(xy) = |x| + |y| − |x ∩ y| ≥ m/4 + m/4 − m/8 = 3m/8.
- Therefore: a deterministic data stream algorithm that approximates DE with space S gives a 1-way protocol for EQ with cost S + O(log m).
- Conclusion: DS(DE) ≥ Ω(m).
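A sketch of this reduction in code, reusing the toy design parameters from the checker above (|T_i| = m/4, intersections ≤ m/8). For simplicity the sketch computes DE exactly, whereas the slide only needs a good enough approximation; a threshold between m/4 and 3m/8 separates the two cases.

```python
# Decide EQ(x, y) from the distinct-element count of the concatenated stream xy.
def eq_via_de(Tx, Ty, m):
    stream = list(Tx) + list(Ty)            # the stream "xy"
    de = len(set(stream))                    # DE(xy) = |Tx U Ty|
    # x = y  ==> DE = m/4;   x != y  ==> DE >= 3m/8, so a threshold separates them.
    return de == m // 4

m = 8
T1, T2 = {1, 2}, {3, 4}                      # two sets from the toy design above
print(eq_via_de(T1, T1, m), eq_via_de(T1, T2, m))   # True False
```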

Slide 13: Randomized Communication Complexity
- Alice and Bob are allowed to use random coin tosses.
- For all inputs x, y, the referee needs to output g(x,y) with probability at least 1 − ε.
- RCC(g) = minimum cost of a randomized protocol that computes g with error ¼.
- RCC^1(g) = minimum cost of a randomized 1-way protocol that computes g with error ¼.
  - What is RCC^1(EQ)?
- RDS(f) = minimum amount of memory used by any randomized data stream algorithm computing f with error ¼.
- Lemma: RDS(f) ≥ Ω(RCC^1(g)).

Slide 14: Set Disjointness
- U = {1,2,…,m}; X = 2^U, identified with {0,1}^m.
- Disj: X × X → {0,1}, Disj(x,y) = 1 iff x ∩ y ≠ ∅.
- Equivalently: Disj(x,y) = OR_{i=1..m} (x_i AND y_i).
- Theorem [Kalyanasundaram-Schnitger 88]: RCC(Disj) ≥ Ω(m).

Slide 15: Reduction from Disj to DE
- U = {1,2,…,m}; X = 2^U; Disj: X × X → {0,1}.
- Note: DE(xy) = |x ∪ y|. Hence:
  - If x ∩ y = ∅, then DE(xy) = |x| + |y|.
  - If x ∩ y ≠ ∅, then DE(xy) < |x| + |y|.
- Therefore: a randomized data stream algorithm that computes DE exactly with space S gives a 1-way randomized protocol for Disj with cost S + O(log m).
- Conclusion: RDS(DE) ≥ Ω(m).
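A sketch of the test the reduction relies on (illustration only; the names are not the slide's): stream x's elements, then y's, and compare the distinct count against |x| + |y|.

```python
# Disj(x, y) = 1 iff the sets intersect, which shows up as DE(xy) < |x| + |y|.
def disj_via_de(x, y):
    de = len(x | y)                  # DE(xy) = |x U y| = |x| + |y| - |x n y|
    return de < len(x) + len(y)

print(disj_via_de({1, 2}, {2, 3}))   # True:  intersecting
print(disj_via_de({1, 2}, {3, 4}))   # False: disjoint
```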

Slide 16: Information Theory Primer
- X: a random variable on U.
- H(X) = the entropy of X = the amount of "uncertainty" in X (in bits).
  - Ex: if X is uniform, then H(X) = log(|U|).
  - Ex: if X is constant, then H(X) = 0.
- H(X | Y) = the conditional entropy of X given Y = the amount of uncertainty left in X after knowing Y.
  - Ex: H(X | X) = 0.
  - Ex: if X, Y are independent, then H(X | Y) = H(X).
- I(X ; Y) = H(X) − H(X | Y) = H(Y) − H(Y | X) = the mutual information between X and Y.
- (Figure: information diagram relating H(X), H(Y), H(X|Y), H(Y|X), I(X;Y), and H(X,Y).)
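A small numeric sketch of these quantities (not lecture code), computing entropy and mutual information from an explicit joint distribution table.

```python
from math import log2

def H(pmf):
    # Shannon entropy of a pmf given as {outcome: probability}.
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, axis):
    # Marginal of a joint pmf {(x, y): probability} onto coordinate `axis`.
    out = {}
    for (x, y), p in joint.items():
        k = x if axis == 0 else y
        out[k] = out.get(k, 0.0) + p
    return out

def mutual_information(joint):
    # I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y)
    return H(marginal(joint, 0)) + H(marginal(joint, 1)) - H(joint)

# X uniform on {0,1} and Y = X: knowing Y removes all uncertainty, so I(X;Y) = 1 bit.
joint = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(joint))   # 1.0
```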

Slide 17: Information Theory Primer (cont.)
- Sub-additivity of entropy: H(X,Y) ≤ H(X) + H(Y), with equality iff X, Y are independent.
- Conditional mutual information: I(X ; Y | Z) = H(X | Z) − H(X | Y,Z).

Slide 18: Information Complexity
- g: A × B → C: a function.
- Π: a communication protocol that computes g.
- μ: a distribution on A × B; (X,Y): a random variable with distribution μ.
- Information cost of Π: icost(Π) = I(X,Y ; Π(X,Y)).
- Information complexity of g: IC_μ(g) = min over protocols Π that compute g of icost(Π).
- Lemma: for any μ, RCC(g) ≥ IC_μ(g).
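A tiny illustration of icost, reusing the mutual_information helper from the primer sketch above. The protocol and distribution are illustrative, not from the slides: if Alice simply sends her whole input, the transcript is X itself, so icost = I(X,Y ; X) = H(X), which is 1 bit under a uniform distribution on {0,1} × {0,1}.

```python
# Keys are ((x, y), transcript) pairs; for the "Alice sends x" protocol the
# transcript equals x, so icost = I(X,Y ; Pi(X,Y)) = H(X) = 1 bit.
mu = {((x, y), x): 0.25 for x in (0, 1) for y in (0, 1)}
print(mutual_information(mu))   # 1.0
```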

Slide 19: Direct Sum for Information Complexity
- We want to:
  - Find a distribution μ on {0,1}^m × {0,1}^m.
  - Show that IC_μ(Disj) ≥ Ω(m).
- Recall that Disj(x,y) = OR_{i=1..m} (x_i AND y_i).
- We will prove a "direct sum" theorem for IC_μ(Disj):
  - Disj is a "sum" of m independent copies of AND.
  - Hence, the information complexity of Disj is m times the information complexity of AND.
- We will define a distribution ν on {0,1} × {0,1} and then set μ = ν^m.
- "Theorem" [Direct Sum]: IC_μ(Disj) ≥ m · IC_ν(AND).
- It would then suffice to prove an Ω(1) lower bound on IC_ν(AND).

Slide 20: Conditional Information Complexity
- We cannot prove the direct sum directly for information complexity.
- Recall: μ is a distribution on {0,1}^m × {0,1}^m, and (X,Y) is a random variable with distribution μ.
- (X,Y) is product if X and Y are independent.
- Z: some auxiliary random variable on a domain S.
- (X,Y) is product conditioned on Z if for every z ∈ S, X and Y are independent conditioned on the event {Z = z}.
- Conditional information complexity of g given Z: CIC_μ(g | Z) = min over protocols Π that compute g of I(X,Y ; Π(X,Y) | Z).
- Lemma: for any μ and Z, IC_μ(g) ≥ CIC_μ(g | Z).

Slide 21: Input Distributions for Set Disjointness
- ν: a distribution on pairs in {0,1} × {0,1}; (U,V): a random variable with distribution ν.
- (U,V) is generated as follows:
  - Choose a uniform bit D.
  - If D = 0, choose U uniformly from {0,1} and set V = 0.
  - If D = 1, choose V uniformly from {0,1} and set U = 0.
- We define μ = ν^m.
- Note:
  - μ is not product.
  - Conditioned on Z = D^m, μ is product.
  - For every (x,y) in the support of μ, Disj(x,y) = 0.
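A sampling sketch of ν and μ = ν^m as described above (illustration only, not the lecture's code), including a check of the last bullet: on the support of μ every coordinate has x_i AND y_i = 0, so Disj(x,y) = 0.

```python
import random

def sample_nu(rng):
    # Returns (U, V, D): the bit D decides which of U, V is free and which is forced to 0.
    d = rng.randint(0, 1)
    if d == 0:
        return rng.randint(0, 1), 0, d
    return 0, rng.randint(0, 1), d

def sample_mu(m, rng):
    # mu = nu^m: m independent copies of nu, one per coordinate.
    triples = [sample_nu(rng) for _ in range(m)]
    x = [u for u, _, _ in triples]
    y = [v for _, v, _ in triples]
    d = [dd for _, _, dd in triples]
    return x, y, d

rng = random.Random(0)
x, y, d = sample_mu(8, rng)
print(any(xi and yi for xi, yi in zip(x, y)))   # False: Disj(x, y) = 0 on the support
```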

Slide 22: Direct Sum for IC [Bar-Yossef, Jayram, Kumar, Sivakumar 02]
- Theorem: CIC_μ(Disj | D^m) ≥ m · CIC_ν(AND | D).
- Proof outline:
  - Decomposition step: I(X,Y ; Π(X,Y) | D^m) ≥ Σ_i I((X_i,Y_i) ; Π(X,Y) | D^m).
  - Reduction step: I((X_i,Y_i) ; Π(X,Y) | D^m) ≥ CIC_ν(AND | D).

Slide 23: Decomposition Step
I(X,Y ; Π(X,Y) | D^m)
  = H(X,Y | D^m) − H(X,Y | Π(X,Y), D^m)
  ≥ Σ_i H(X_i,Y_i | D^m) − Σ_i H(X_i,Y_i | Π(X,Y), D^m)
    (by independence of (X_1,Y_1),…,(X_m,Y_m) and by sub-additivity of entropy)
  = Σ_i I((X_i,Y_i) ; Π(X,Y) | D^m).

Slide 24: Reduction Step
- Want to show: I((X_i,Y_i) ; Π(X,Y) | D^m) ≥ CIC_ν(AND | D).
- I((X_i,Y_i) ; Π(X,Y) | D^m) = Σ_{d_-i} Pr(D_-i = d_-i) · I((X_i,Y_i) ; Π(X,Y) | D_i, D_-i = d_-i).
- A protocol for computing AND(x_i,y_i):
  - For all j ≠ i, Alice and Bob select X_j and Y_j independently using d_j (no communication is needed: given d_j, one of the two coordinates is fixed to 0).
  - Alice and Bob run the protocol Π on X = (X_1,…,X_{i-1},x_i,X_{i+1},…,X_m) and Y = (Y_1,…,Y_{i-1},y_i,Y_{i+1},…,Y_m).
  - Note that Disj(X,Y) = AND(x_i,y_i), since X_j AND Y_j = 0 in every coordinate j ≠ i under the distribution.
  - The icost of this protocol is I((X_i,Y_i) ; Π(X,Y) | D_i, D_-i = d_-i).
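A sketch of this embedding (the names and the run_Pi stand-in for the protocol Π are illustrative, not from the slides): the AND instance is planted in coordinate i, the other coordinates are filled privately from d_-i, and the Disj protocol's answer is returned.

```python
import random

def and_protocol(x_i, y_i, i, d_minus_i, m, run_Pi, rng=random.Random(0)):
    X, Y = [0] * m, [0] * m
    for j, d_j in d_minus_i.items():        # for all j != i, sample using d_j
        if d_j == 0:
            X[j] = rng.randint(0, 1)        # Alice's private coin; Y[j] stays 0
        else:
            Y[j] = rng.randint(0, 1)        # Bob's private coin;   X[j] stays 0
    X[i], Y[i] = x_i, y_i                   # embed the AND instance in coordinate i
    return run_Pi(X, Y)                     # here Disj(X, Y) = AND(x_i, y_i)

# Any correct Disj protocol works; an exact intersection test plays that role below.
run_Pi = lambda X, Y: int(any(a and b for a, b in zip(X, Y)))
d_minus_i = {j: j % 2 for j in range(8) if j != 3}
print(and_protocol(1, 1, 3, d_minus_i, 8, run_Pi))   # 1 = AND(1, 1)
print(and_protocol(1, 0, 3, d_minus_i, 8, run_Pi))   # 0 = AND(1, 0)
```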

Slide 25: End of Lecture 13

