1 Streaming Computation of Combinatorial Objects Ziv Bar-Yossef U.C. Berkeley Omer Reingold AT&T Labs – Research Ronen.

1 Streaming Computation of Combinatorial Objects. Ziv Bar-Yossef, U.C. Berkeley (http://www.cs.berkeley.edu/~zivi); Omer Reingold, AT&T Labs – Research; Ronen Shaltiel, Weizmann Institute of Science; Luca Trevisan, U.C. Berkeley.

2 The Streaming Model [HRR98, AMS96, FKSV99]. Figure: an input stream x_1, x_2, x_3, …, x_n feeds a streaming algorithm with its own memory, which emits an output stream y_1, y_2, y_3, …, y_m. One-way (“online”) access to the input. Sub-linear space. As usual, one-way output.
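
As a minimal illustration of the model above (not an algorithm from the talk), the following Python sketch reads its input strictly left to right, keeps only a constant amount of state, and writes its output left to right. The function name `stream_block_parity` and the choice of block parity as the computed function are assumptions made purely for illustration.

```python
from typing import Iterable, Iterator


def stream_block_parity(bits: Iterable[int], block: int) -> Iterator[int]:
    """Emit the parity of each full block of `block` consecutive input bits.

    Hypothetical example: the state is one parity bit plus a counter
    below `block`, so the space is sub-linear in the input length.
    """
    parity = 0  # running XOR of the current block
    seen = 0    # number of bits read in the current block
    for b in bits:           # one-way ("online") access to the input
        parity ^= b & 1
        seen += 1
        if seen == block:
            yield parity     # one-way output: never revised later
            parity = 0
            seen = 0
```

Because the function is a generator, the caller also receives the output as a stream, for example `list(stream_block_parity([1, 0, 1, 1, 1, 0], 2))`.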

3 Algorithmic Motivation. Computations over massive data sets. Databases: one-pass algorithms for large database relations [AMS96, GM98]. Networking: processing IP packets at ISP routers [FKSV99, FS00, Indyk00]. Information retrieval: processing search engine query logs [CCF02].

4 Complexity-Theoretic Motivation. Randomized algorithms have one-way access to their random bits. Space-bounded randomized algorithms are therefore “streaming algorithms” with respect to their random inputs. De-randomization procedures for space-bounded computations face streaming algorithms as “adversaries”, and sometimes need a “streaming” implementation themselves.

5 Combinatorial Objects. De-randomization primitives: potential building blocks in the de-randomization of space-bounded computations (e.g., RL/BPL); a streaming implementation may be needed then. Objects in the diagram: Extractors, Universal Hash Functions, Dispersers, Error-Correcting Codes. Diagram labels: “Special case”; “Leftover hash lemmas [HILL98,IZ89]”; [M98]; [TZ01]; [T99,TZS01,SU01].

6 Dispersers and Extractors [Sipser 88, NZ96]. Figure: E takes a weak random source x ∈ {0,1}^n and a short random seed r ∈ {0,1}^d, and produces a random-like output y ∈ {0,1}^m. Definition (Disperser / Extractor): ∀ distributions X on {0,1}^n containing k bits of randomness, Disperser: ∀ large enough S ⊆ {0,1}^m, Pr(E(X,U_d) ∈ S) > 0. Extractor: E(X,U_d) is close to uniform. Every extractor is a disperser.
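
For readers who want the definition in symbols, here is one standard way to formalize it. This is a sketch under assumptions that the slide does not state: “k bits of randomness” is read as min-entropy at least k, “close to uniform” as statistical distance at most ε, and “large enough S” as |S| > ε·2^m. The symbols H_∞, Δ and ε are introduced here and do not appear in the talk.

```latex
% Assumptions (not on the slide): H_\infty is min-entropy, \Delta is statistical
% distance, and \varepsilon is an error parameter.
\begin{align*}
\text{Extractor:}\quad & \forall X \text{ with } H_\infty(X) \ge k:\quad
  \Delta\bigl(E(X, U_d),\, U_m\bigr) \le \varepsilon, \\
\text{Disperser:}\quad & \forall X \text{ with } H_\infty(X) \ge k,\
  \forall S \subseteq \{0,1\}^m \text{ with } |S| > \varepsilon\, 2^m:\quad
  \Pr\bigl[E(X, U_d) \in S\bigr] > 0 .
\end{align*}
```

Under this reading, “every extractor is a disperser” follows directly: the uniform distribution hits any S with |S| > ε·2^m with probability greater than ε, so a distribution within distance ε of uniform hits S with positive probability.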

7 Online Dispersers & Extractors. Two types of streaming algorithms for E. All-seed: 1-way input: x; 1-way output: E(x,r), ∀ r ∈ {0,1}^d. Single-seed: two separate 1-way inputs: x, r; 1-way output: E(x,r). Theorem (limitations of deterministic amplification in logspace) [BGW99]: Any all-seed streaming algorithm for a disperser requires Ω(m) space. Matching construction of online “weak extractors”.

8 Online Dispersers & Extractors: Our Results. Theorem 1: Any single-seed streaming algorithm for an extractor requires Ω(m) space. Matching constructions of several online extractors. Theorem 2: A construction of a disperser that admits a single-seed streaming algorithm with poly-log(m) space. Surprising separation of extractors and dispersers, which are otherwise similar in behavior.

9 Universal Hash Functions [CW79]. Definition ((ε-almost) universal hash functions): a family H = { h: {0,1}^n → {0,1}^m } s.t. universal: ∀ x ≠ x’, Pr_h(h(x) = h(x’)) ≤ 1/2^m; ε-almost: ∀ x ≠ x’, Pr_h(h(x) = h(x’)) ≤ ε. Lemma (Leftover hash lemmas [HILL98,IZ89]): For appropriately chosen parameters, if H is an (ε-almost) universal family of hash functions, then E(x,h) = h(x), for h ∈ H, is a “strong” extractor.
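
To make the definition concrete, the sketch below checks the universal property by brute force for one well-known family: all linear maps h_A(x) = A·x over GF(2), with A an m×n binary matrix. The choice of this family and the names `apply_hash` and `collision_probability` are assumptions for illustration; the slide does not say which family the authors have in mind.

```python
from fractions import Fraction
from itertools import product
from typing import Tuple


def apply_hash(rows: Tuple[int, ...], x: int) -> Tuple[int, ...]:
    """h_A(x) = A*x over GF(2). Each row of A and the input x are n-bit ints."""
    return tuple(bin(r & x).count("1") % 2 for r in rows)


def collision_probability(n: int, m: int, x: int, x2: int) -> Fraction:
    """Exact Pr_h[h(x) = h(x2)] over all m-by-n binary matrices (tiny n, m only)."""
    total = 0
    collisions = 0
    for rows in product(range(2 ** n), repeat=m):  # enumerate every matrix A
        total += 1
        if apply_hash(rows, x) == apply_hash(rows, x2):
            collisions += 1
    return Fraction(collisions, total)
```

For n = 3 and m = 2 the enumeration covers 64 matrices, and every pair x ≠ x’ collides with probability exactly 1/2^m = 1/4, which is the bound in the definition.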

10 Online Universal Hash Functions. Streaming algorithm for H: two separate 1-way inputs: h, x; 1-way output: h(x). Theorem 3 (corollary from our Theorem 1): Any streaming algorithm for an ε-almost universal family of hash functions requires Ω(m) space. Theorem [MNT93,BTY94]: Any streaming algorithm for a universal family of hash functions requires Ω(m) space.
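
The Ω(m) bounds above are matched, up to constants, by a simple evaluation strategy for linear hash functions over GF(2). The sketch below is an illustration under assumptions, not a construction taken from the talk: it supposes h is presented as a stream of columns of an m×n binary matrix (each column an m-bit integer), read in step with the bits of x. The name `stream_hash` is hypothetical.

```python
from typing import Iterable


def stream_hash(columns: Iterable[int], x_bits: Iterable[int]) -> int:
    """Evaluate h_A(x) = A*x over GF(2) from two one-way streams.

    `columns` yields the columns of A as m-bit integers (assumed encoding);
    `x_bits` yields the bits of x. The only state is an m-bit accumulator.
    """
    acc = 0
    for col, bit in zip(columns, x_bits):  # each stream is read once, in order
        if bit:
            acc ^= col                     # add this column to the running sum
    return acc                             # the m output bits, packed in an int
```

The accumulator holds m bits, so the space used is m plus whatever is needed to hold one column while it is read.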

11 Online Error-Correcting Codes. Definition (ECC): C: {0,1}^k → {0,1}^n s.t. ∀ w ≠ w’ ∈ {0,1}^k, |C(w) - C(w’)| ≥ d. d: minimum distance; k/n: rate. Encoding streaming algorithm: 1-way input: w ∈ {0,1}^k; 1-way output: C(w). Decoding streaming algorithm: 1-way input: r ∈ {0,1}^n; 1-way output: w for which |C(w) - r| is minimum.

12 Online ECC Lower Bounds. Theorem 4: Both encoding and decoding streaming algorithms for any code require Ω(d · k/n) space (minimum distance times rate). Matching constructions using simple constant-rate “block codes”.
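
To see why block codes stream well, consider the smallest possible example: a 3-fold repetition code applied to each message bit separately. This code is chosen here only for illustration and is not claimed to be the construction from the talk; the names `encode` and `decode` are hypothetical. Its minimum distance is d = 3 and its rate is k/n = 1/3, so d · k/n = 1, and both algorithms below use constant space.

```python
from typing import Iterable, Iterator


def encode(bits: Iterable[int], reps: int = 3) -> Iterator[int]:
    """Streaming encoder: repeat each message bit `reps` times."""
    for b in bits:
        for _ in range(reps):
            yield b


def decode(received: Iterable[int], reps: int = 3) -> Iterator[int]:
    """Streaming decoder: majority vote inside each block of `reps` bits."""
    ones = 0   # number of 1s seen in the current block
    count = 0  # number of bits seen in the current block
    for b in received:
        ones += b
        count += 1
        if count == reps:
            yield 1 if 2 * ones > reps else 0
            ones = 0
            count = 0
```

Each block is decoded independently, so one flipped bit per block is corrected without ever looking back at earlier input.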

13 The Extractor Lower Bound. Theorem: For k ≤ n/2, any single-seed streaming algorithm for an extractor requires Ω(m) space. Notation: E: {0,1}^n × {0,1}^d → {0,1}^m: an extractor; A: a single-seed streaming algorithm for E; S: space used by A. Goal: show that S ≥ m - d.

14 Intuition of the Proof. Two extreme input distributions: X_1: X is uniform on the last k bits of the input and is otherwise fixed. X_2: X is uniform on the first k bits of the input and is otherwise fixed. X_1, X_2 contain k bits of randomness, implying E(X_1,U_d), E(X_2,U_d) are close to uniform. Divide the execution of A into two phases. Phase 1: A reads the first n - k input bits. Phase 2: A reads the last k input bits.

15 Intuition of the Proof (cont.). Case 1: A outputs in Phase 1 at least one bit that depends on the source X ⇒ If X = X_1 (that is, its first n - k bits are fixed), then this output bit is fixed, implying E(X,U_d) is far from uniform. Case 2: A does not output in Phase 1 any bits that depend on the source X ⇒ If X = X_2 (that is, its first k bits are uniform), then after Phase 1, A has ≤ S bits of X’s randomness in memory. Therefore, m ≤ S + d.
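
The counting behind Case 2 can be spelled out as follows. This is a sketch of the intuition rather than the authors' full proof, and it assumes that “close to uniform” means statistical distance at most ε, where ε and Δ are symbols introduced here. Under X_2 the bits read in Phase 2 are fixed, so the whole output is determined by the memory contents after Phase 1 (at most 2^S possibilities) together with the seed (2^d possibilities).

```latex
% Assumption: \varepsilon is the extractor's error (statistical distance);
% it is not named on the slides.
\begin{align*}
|\mathrm{supp}\, E(X_2, U_d)| &\le 2^{S} \cdot 2^{d} = 2^{S+d}, \\
\Delta\bigl(E(X_2, U_d),\, U_m\bigr) &\ge 1 - \frac{|\mathrm{supp}\, E(X_2, U_d)|}{2^{m}}
  \ge 1 - 2^{S+d-m}, \\
\Delta\bigl(E(X_2, U_d),\, U_m\bigr) \le \varepsilon
  \;&\Longrightarrow\; S \ge m - d - \log_2 \frac{1}{1-\varepsilon}.
\end{align*}
```

For constant ε this gives S ≥ m - d - O(1), in line with the stated goal S ≥ m - d.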

16 The Disperser Construction. Theorem: For any t, there is a disperser with seed length t · poly-log(n) and a single-seed streaming algorithm that runs in m/t + O(log n) space. By choosing t = m/poly-log(m), we obtain an online disperser with poly-log(m) space. The theorem exhibits a tradeoff between space and seed length.

17 Outline of the Proof. A t-partition of the input: 1 = i_0 < i_1 < … < i_t = n + 1. Assume X is a “bit-fixing” source: uniform on a set S ⊆ {1, …, n} of k bit positions and fixed otherwise. Figure: the input with the positions of S marked, divided at i_0, i_1, i_2, …, i_{t-1}, i_t. A good t-partition of X: ∀ j, |S ∩ [i_{j-1}, i_j)| = k/t.
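
The condition for a good t-partition is easy to state as code. The sketch below is only a restatement of the definition on this slide, with S taken to be a set of 1-based bit positions (the reading suggested by the intersection with the intervals [i_{j-1}, i_j)); the name `is_good_partition` is hypothetical.

```python
from typing import Sequence, Set


def is_good_partition(free_positions: Set[int], boundaries: Sequence[int]) -> bool:
    """Check |S ∩ [i_{j-1}, i_j)| = k/t for every block j.

    `free_positions` is S, the 1-based positions on which the source is uniform;
    `boundaries` is i_0 < i_1 < ... < i_t with i_0 = 1 and i_t = n + 1.
    """
    t = len(boundaries) - 1
    k = len(free_positions)
    if k % t != 0:
        return False  # k/t must be an integer for an exact split
    for j in range(1, t + 1):
        in_block = sum(1 for p in free_positions
                       if boundaries[j - 1] <= p < boundaries[j])
        if in_block != k // t:
            return False
    return True
```

With n = 6 and S = {1, 2, 5, 6}, the partition 1 < 4 < 7 is good (two free positions per block), while 1 < 2 < 7 is not.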

18 Outline of the Proof (cont.). Figure: the seed is split into a part that chooses a partition and a part that supplies the extractor seeds; each block of the input is fed to an extractor (Ext). Use part of the seed to choose a random partition. With probability > 0, the partition is good. Use the optimal online extractor to extract randomness from each block. Extraction in each block: m/t output bits ⇒ m/t space.
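
The space saving comes from processing the blocks one after another and reusing the same m/t bits of working memory for each. The sketch below shows only that control flow. It is heavily simplified and rests on assumptions: the partition is passed in directly instead of being derived from the seed, and a GF(2) linear map stands in for the “optimal online extractor”, which the slides do not specify. In particular, this stand-in does not achieve the t · poly-log(n) seed length of the theorem. The name `blockwise_extract` is hypothetical.

```python
from typing import Iterable, Iterator, Sequence


def blockwise_extract(x_bits: Iterable[int],
                      boundaries: Sequence[int],
                      block_columns: Sequence[Sequence[int]]) -> Iterator[int]:
    """Run a per-block extractor over consecutive blocks of a one-way input.

    `boundaries` holds 0-based block starts plus the final end index.
    `block_columns[j][p]` is the (m/t)-bit column, packed in an int, applied
    to the p-th bit of block j (stand-in for that block's extractor seed).
    """
    stream = iter(x_bits)                  # one-way access to the input
    for j in range(len(boundaries) - 1):
        acc = 0                            # m/t bits, reused for every block
        length = boundaries[j + 1] - boundaries[j]
        for p in range(length):
            if next(stream):
                acc ^= block_columns[j][p]
        yield acc                          # emit this block's m/t output bits
```

Only one accumulator is live at any time, which is where the m/t term in the space bound comes from; the O(log n) term covers the position counters.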

19 Open Problems. Lower bound for online dispersers: is the tradeoff between seed length and space inherent? Generalize the streaming lower bounds to arbitrary time-space tradeoffs.

20 Thank You!

