Turnstile Streaming Algorithms Might as Well Be Linear Sketches Yi Li Huy L. Nguyen David Woodruff.

Turnstile Streaming Model
Underlying n-dimensional vector $x$ initialized to $0^n$
Long stream of updates $x \leftarrow x + e_i$ or $x \leftarrow x - e_i$ for standard unit vector $e_i$
At the end of the stream, $x \in \{-m, -m+1, \dots, m-1, m\}^n$ for some bound $m \le \mathrm{poly}(n)$
Output an approximation to $f(x)$ whp
Goal: use as little space (in bits) as possible
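To make the model concrete, here is a minimal Python sketch of a turnstile stream being applied to $x$. The `(i, sign)` encoding of updates and the names `apply_stream` and `example_stream` are illustrative assumptions, not notation from the talk. It stores $x$ explicitly, which is exactly the space cost a streaming algorithm tries to avoid.

```python
from typing import Iterable, List, Tuple

# Assumed encoding: (i, +1) means x <- x + e_i, (i, -1) means x <- x - e_i.
Update = Tuple[int, int]


def apply_stream(n: int, stream: Iterable[Update]) -> List[int]:
    """Return the vector x obtained by applying the updates to 0^n."""
    x = [0] * n  # x is initialized to 0^n
    for i, sign in stream:
        if sign not in (1, -1):
            raise ValueError("turnstile updates are +e_i or -e_i only")
        x[i] += sign
    return x


example_stream = [(0, +1), (2, +1), (0, +1), (2, -1), (1, -1)]
final_x = apply_stream(3, example_stream)
print(final_x)  # [2, -1, 0]
```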

Example: Norms
Suppose you want $|x|_p^p = \sum_{i=1}^n |x_i|^p$
Want $Z$ for which $(1-\epsilon)\,|x|_p^p \le Z \le (1+\epsilon)\,|x|_p^p$
Many applications
p = 2
–Geometry, linear algebra
p = 1
–Distances between distributions, network monitoring

Algorithm for 2-Norm
Let $r = 1/\epsilon^2$
Choose an $r \times n$ matrix $A$ of i.i.d. $N(0, 1/r)$ normal random variables (with precision $1/\mathrm{poly}(n)$)
Maintain $Ax$ in the stream
Output $|Ax|_2^2$
Proof: Johnson-Lindenstrauss Lemma
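The three steps can be sketched in plain Python as follows. This is a hedged illustration, not the authors' code: it uses floating-point Gaussians rather than $1/\mathrm{poly}(n)$ precision, stores $A$ explicitly, and the names `make_gaussian_matrix`, `apply_update` and `estimate_l2_squared` are made up for the example. The key point is that an update $\pm e_i$ changes $Ax$ by $\pm$ (column $i$ of $A$), so $x$ itself never needs to be stored (it is kept below only to compare against the true value).

```python
import math
import random


def make_gaussian_matrix(r, n, rng):
    """r x n matrix of i.i.d. N(0, 1/r) entries (standard deviation 1/sqrt(r))."""
    sigma = 1.0 / math.sqrt(r)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(r)]


def apply_update(sketch, A, i, sign):
    """Update the stored vector Ax in place for x <- x + sign * e_i."""
    for row in range(len(A)):
        sketch[row] += sign * A[row][i]


def estimate_l2_squared(sketch):
    """Output |Ax|_2^2 as the estimate of |x|_2^2."""
    return sum(v * v for v in sketch)


rng = random.Random(0)
n, eps = 50, 0.1
r = round(1 / eps ** 2)  # r = 100, ignoring constant factors
A = make_gaussian_matrix(r, n, rng)

sketch = [0.0] * r
x = [0] * n  # kept only to check the estimate
for _ in range(2000):
    i, sign = rng.randrange(n), rng.choice((1, -1))
    x[i] += sign
    apply_update(sketch, A, i, sign)

true_l2_squared = sum(v * v for v in x)
estimate = estimate_l2_squared(sketch)
print(true_l2_squared, estimate)
```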

Algorithm for 1-Norm [Indyk]
Let $r = 1/\epsilon^2$
Choose an $r \times n$ matrix $A$ of i.i.d. Cauchy random variables (with precision $1/\mathrm{poly}(n)$)
Maintain $Ax$ in the stream
Output $\mathrm{median}(|(Ax)_1|, \dots, |(Ax)_r|)$
Proof: 1-stability of Cauchy distribution
–If $C_1, C_2$ are independent Cauchy r.v.s, then $a C_1 + b C_2 \sim (|a| + |b|)\, C_3$ for a Cauchy r.v. $C_3$
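A small Python sketch of the median estimator, under stated assumptions: standard Cauchy samples are drawn as $\tan(\pi(U - 1/2))$ for uniform $U$, the matrix is stored in floating point, and $Ax$ is computed directly rather than update by update (it could be maintained incrementally, since $Ax$ is linear in $x$). By 1-stability each $(Ax)_i$ is distributed as $|x|_1$ times a standard Cauchy, and the median of the absolute value of a standard Cauchy is 1, so the median of the $|(Ax)_i|$ concentrates around $|x|_1$.

```python
import math
import random
import statistics


def cauchy(rng):
    """Standard Cauchy sample via the inverse-CDF method."""
    return math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(1)
n, r = 30, 400  # r plays the role of 1/eps^2, constants ignored
A = [[cauchy(rng) for _ in range(n)] for _ in range(r)]

# A fixed test vector standing in for the end-of-stream x.
x = [rng.randint(-5, 5) for _ in range(n)]

sketch = [sum(A[row][j] * x[j] for j in range(n)) for row in range(r)]
estimate = statistics.median(abs(v) for v in sketch)
true_l1 = sum(abs(v) for v in x)
print(true_l1, estimate)
```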

Common Features
Algorithms for 2-norm and 1-norm have the following form:
1. Choose a random matrix $A$ independent of $x$
2. Maintain $Ax$ in the stream
3. Output a function of $Ax$
Question (?!): does the optimal algorithm for approximating any function in the turnstile model have this form?
Some functions f(x) may be weird: What is x x x 1

Our Results
Yes, up to a factor of log n
Theorem: for computing a relation $f$ for $x \in \{-m, -m+1, \dots, m\}^n$ in the turnstile model, there is a correct (whp) algorithm which:
1. samples an integer matrix $A$ uniformly from $O(n \log m)$ hardwired matrices, independent of $x$
2. outputs a function of $Ax$
Logarithm of the number of states of $Ax$, for $x \in \{-m, -m+1, \dots, m\}^n$, plus amount of randomness, is optimal up to a log n factor

Consequences
$a \in \{0,1\}^n$: create stream $s(a)$
$b \in \{0,1\}^n$: create stream $s(b)$
Lower Bound Technique
1. Run Alg on $s(a)$, transmit state of Alg($s(a)$) to Bob
2. Bob computes Alg($s(a)$, $s(b)$)
3. If Bob solves $g(a,b)$, space complexity of Alg is at least the 1-way communication complexity of $g$

Consequences
$a \in \{0,1\}^n$: create stream $s(a)$
$b \in \{0,1\}^n$: create stream $s(b)$
Our main theorem implies: if Bob can solve $g(a,b)$, then space of Alg is at least the simultaneous communication complexity of $g$
Weaker public-coin model in which Alice and Bob simultaneously send a message to a referee
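Why a linear sketch yields a simultaneous protocol can be seen from linearity alone: $A(a + b) = Aa + Ab$. The Python sketch below is an illustration under assumptions that the slide does not state: the streams $s(a)$ and $s(b)$ simply add $a$ and $b$ coordinate by coordinate, the public coins are modeled as a shared seed, and the $\pm 1$ matrix is an arbitrary stand-in for whatever $A$ the algorithm uses.

```python
import random


def shared_matrix(r, n, seed):
    """Public-coin randomness: both players derive the same A from the seed."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(r)]


def sketch(A, x):
    """Compute Ax over the integers."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


n, r, seed = 8, 3, 42
a = [1, 0, 1, 1, 0, 0, 1, 0]  # Alice's input
b = [0, 1, 1, 0, 0, 1, 1, 1]  # Bob's input
A = shared_matrix(r, n, seed)

# Each player sends one message, without seeing the other's input.
msg_alice = sketch(A, a)
msg_bob = sketch(A, b)

# The referee adds the messages and obtains the sketch of the whole stream.
referee_view = [u + v for u, v in zip(msg_alice, msg_bob)]
single_player_view = sketch(A, [u + v for u, v in zip(a, b)])
print(referee_view == single_player_view)  # True
```

The referee then applies the algorithm's output function to `referee_view`, exactly as a single streaming algorithm would at the end of the stream.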

The log n Factor Loss
Main Theorem: The logarithm of the number of states of $Ax$, as $x$ ranges over $\{-m, -m+1, \dots, m\}^n$, plus the amount of randomness to store $A$, is optimal up to a log n factor
The log n loss is necessary
Consider $f(x) = x_1 \bmod 2$
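A quick count illustrates the gap for this example. A streaming automaton can keep just the parity of $x_1$ (one bit), whereas the obvious linear sketch $Ax = x_1$ over the integers takes $2m + 1$ values, i.e. about $\log m = O(\log n)$ bits. The code below only compares these two specific algorithms; it is not a proof that every linear sketch needs this much, and the function names are hypothetical.

```python
import math


def parity_automaton_states(m):
    """States reached by an automaton that stores only x_1 mod 2."""
    return {x1 % 2 for x1 in range(-m, m + 1)}


def linear_sketch_states(m):
    """States of the sketch Ax with A = e_1^T, i.e. Ax = x_1."""
    return {x1 for x1 in range(-m, m + 1)}


m = 1000
bits_automaton = math.ceil(math.log2(len(parity_automaton_states(m))))
bits_sketch = math.ceil(math.log2(len(linear_sketch_states(m))))
print(bits_automaton, bits_sketch)  # 1 11
```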

Non-Uniformity Restriction
Careful wording: "samples an integer matrix $A$ uniformly from $O(n \log m)$ hardwired matrices, independent of $x$"
Algorithm is non-uniform
–Output of each state for each $A$ is also hardwired
Alternatively, allow the algorithm to use more space to process a stream update, provided it only retains $Ax$ and its randomness
–Regenerate $A$ during each stream update

Comment on the Model
For each random seed, the algorithm is a deterministic automaton with a finite number of states
Main theorem only requires correctness for $x \in \{-m, -m+1, \dots, m\}^n$
It counts the number of states as $x$ varies in this range
While processing the stream, may have $|x|_\infty > m$
The algorithm can't abort if this happens. It must still be correct at the end of the stream for $x \in \{-m, -m+1, \dots, m\}^n$

Related Work
Ganguly
–Deterministic algorithms
–Specific to heavy hitters problem
–Shows algorithm might as well be a linear sketch over the reals
–Dimension lower bound over the reals

Talk Outline
Proof of Main Theorem
1. Reduction to path-independent automata
2. From path-independent automata to linear sketches
Applications and Open Questions

Stream Automaton for Fixed Randomness
[Figure: automaton with a Start state and transitions labeled by updates $+e_i$ and $-e_i$; $0^n$ appears in two different states]
Want each state of the automaton to only depend on $x$, not how it got there

Path-Independent Automaton
Undirected connected graph
Each $x \in \mathbb{Z}^n$ is in a unique state
For each randomness, can we modify the automaton to make it path-independent?
Rule out algorithms that remember how they arrived at a state, e.g., an algorithm that stores the last 5 stream updates
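The distinction can be checked on two toy automata in Python. Both are illustrative assumptions: `run_counter` uses $x$ itself as the state, and `run_with_memory` additionally stores the last 5 updates, as in the example on the slide. Two streams with the same net effect on $x$ must end in the same state for a path-independent automaton.

```python
def net_update(n, stream):
    """Net change to each coordinate after the stream of (i, sign) updates."""
    x = [0] * n
    for i, sign in stream:
        x[i] += sign
    return tuple(x)


def run_counter(n, stream):
    """Path-independent: the state is exactly x."""
    return net_update(n, stream)


def run_with_memory(n, stream, k=5):
    """Not path-independent: the state also records the last k updates."""
    return (net_update(n, stream), tuple(stream[-k:]))


stream_1 = [(0, +1), (1, +1)]
stream_2 = [(1, +1), (0, +1)]  # same net update, different order

print(run_counter(2, stream_1) == run_counter(2, stream_2))          # True
print(run_with_memory(2, stream_1) == run_with_memory(2, stream_2))  # False
```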

Path-Reversible Automaton
Path-reversible: for all states $s$, if $\sigma = (+e_{i_1}, -e_{i_2}, -e_{i_3}, \dots, +e_{i_r})$ is a stream of updates resulting in a state $t$, then from $t$ the stream $\sigma^{-1} = (-e_{i_r}, \dots, +e_{i_3}, +e_{i_2}, -e_{i_1})$ returns us to $s$
[Figure: states s1, s2, s3, s4 with transitions labeled $+e_2$, $-e_1$, $+e_5$, $-e_5$, $+e_1$, $-e_2$]
Path-reversible does not imply path-independent

Strategy
Arbitrary Automaton → Path-Reversible Automaton → Path-Independent Automaton
For a stream $\sigma$, $\mathrm{freq}(\sigma) \in \mathbb{Z}^n$ is the "net update" to each coordinate
Idea:
1. if in a state $s$, and update by a stream $\sigma$ with $\mathrm{freq}(\sigma) = 0$, answers ought to be similar
2. collapse all states $s, s'$ for which $s + \sigma = s'$ and $\mathrm{freq}(\sigma) = 0$ for some stream $\sigma$
Issue: how to define the new output and transition function?

Zero-Frequency Graph
Directed graph $G = (V, E)$
$V$ = states of old automaton $A_{old}$ (for fixed randomness)
$(s,t) \in E$ if there is a stream $\sigma$ with $s + \sigma = t$ and $\mathrm{freq}(\sigma) = 0$
–Finite number of streams to consider
Terminal equivalence class: strongly connected component with no outgoing edge
–Path in $G$ lands in a terminal equivalence class
–States of new automaton $A_{new}$ = terminal equivalence classes
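Terminal equivalence classes are easy to compute on a small explicit graph. The Python sketch below assumes the zero-frequency graph is already given as an adjacency dictionary (the toy graph `G` is invented for illustration) and uses a simple quadratic reachability computation rather than a linear-time SCC algorithm.

```python
def reachable(graph, start):
    """All vertices reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in graph[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def terminal_classes(graph):
    """Strongly connected components with no edge leaving the component."""
    reach = {v: reachable(graph, v) for v in graph}
    classes = set()
    for v in graph:
        # The SCC of v: vertices u with v -> u and u -> v.
        scc = frozenset(u for u in reach[v] if v in reach[u])
        # Terminal exactly when nothing outside the SCC is reachable from v.
        if reach[v] == scc:
            classes.add(scc)
    return classes


# Hypothetical zero-frequency graph: a -> b <-> c, a -> d, d -> d.
G = {"a": ["b", "d"], "b": ["c"], "c": ["b"], "d": ["d"]}
print(terminal_classes(G))  # the classes {b, c} and {d}
```

Here state `a` is collapsed away: every walk from it lands in one of the two terminal classes, which become the states of the new automaton.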

New Transition Function
Suppose we are in terminal equivalence class $C$
Given an update $e_i$
Let $v \in C$ be an arbitrary node
Compute $v + e_i$ using the transition function of $A_{old}$
Walk from $v + e_i$ until a terminal equivalence class $C'$ is reached
–$C'$ is unique: it does not depend on the choice of $v$, and only one terminal equivalence class is reachable
–$A_{new}$ is path-reversible

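[Figure: two terminal equivalence classes; nodes $u$, $v$ in the first, an update $+e_i$ leading toward the second, and zero-frequency streams with $\mathrm{freq}(\sigma) = 0$ and $\mathrm{freq}(\sigma') = 0$]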

Output Function of $A_{new}$
In each terminal equivalence class $C$, sample node $u$ from the stationary distribution of a random walk in $C$ (add self-loops)
–Output of $A_{new}$ on $C$ = Output of $A_{old}$ on $u$
If $v$ is the starting vertex of $A_{old}$,
–take a random walk in $G$ from $v$
–let the starting vertex of $A_{new}$ be the terminal equivalence class $C$ reached
Why is it correct?

Correctness
Let $\Pi$ be an arbitrary distribution on streams $\sigma$
Choose fixed randomness so that $A_{old}$ is correct on $\Pi'$:
–Long sequence of zero streams,
–Followed by $\sigma$ sampled from $\Pi$,
–Followed by long sequence of zero streams
Output of $A_{new}$ on $\Pi$ is statistically close to output of $A_{old}$ on $\Pi'$
$\Rightarrow$ for every $\Pi$ there is a path-reversible $A_{new}$ correct on $\Pi$

Arbitrary Automaton → Path-Reversible Automaton → Path-Independent Automaton
Undirected zero-frequency graph $G$
New automaton states = connected components of $G$
For all $x \in \mathbb{Z}^n$, only one connected component of $G$ contains states containing $x$
Uses path-reversibility $\Rightarrow$ well-defined transition function
Random walk in components to choose outputs

Path Independent Automata and Submodules
Let $o$ be the initial state
$M = \{x \in \mathbb{Z}^n \text{ such that } x \text{ in } o\}$
$0^n \in M$
If $x \in M$, then $-x \in M$
If $x, y \in M$, then $x + y \in M$
$M$ is a free submodule of $\mathbb{Z}^n$ (a lattice)
$M$ has a basis
Any two bases have the same cardinality

Path Independent Automata and Sketches
States of the automaton are elements (cosets) of the quotient module $\mathbb{Z}^n/M$
Space of the automaton is the log of the number of cosets containing an $x \in \{-m, \dots, m\}^n$
Goal: build a sketching algorithm $A \cdot x$
–$A$ is fixed for this automaton
–Space of $A \cdot x$ $\approx$ space of automaton
–Injection from states of automaton to states of $A \cdot x$
–Will replace $\{-m, \dots, m\}^n$ with $\{-m/n, \dots, m/n\}^n$

Smith Normal Form
$\mathbb{Z}^n/M$ examples:
–$\mathbb{Z}^n/(e_1)$ is free. It remembers all but the first coordinate
–$\mathbb{Z}^n/(2e_1, 2e_2, \dots, 2e_n)$ is not free. It remembers coordinate parities
Smith Normal Form: there exists a basis $y_1, \dots, y_n$ of $\mathbb{Z}^n$ for which the generators of $M$ are $q_i \cdot y_i$ for $i = 1, \dots, r$, where $q_i \mid q_{i+1}$ are positive integers, and $r = \mathrm{rank}(M)$
If $q_s = 1$ but $q_{s+1} > 1$, the generators of $\mathbb{Z}^n/M$ are $y_{s+1} + M, \dots, y_n + M$, and $\mathbb{Z}^n/M$ is isomorphic to $\mathbb{Z}/q_{s+1} \oplus \dots \oplus \mathbb{Z}/q_r \oplus \mathbb{Z}^{n-r}$

Counting States
Define the $n \times n$ matrix $B$, where the $i$-th column $B_i$ is the coefficients of $e_i$ in the basis $y_1, \dots, y_n$
State = $Bx \bmod q$, after removing the first $s$ rows
For each $i$, there are $x \ne x' \in \{-m/n, \dots, m/n\}^n$ with
1. $(Bx)_i \ne (Bx')_i \bmod q_i$
2. $(Bx)_j = (Bx')_j \bmod q_j$ for all $j \ne i$
Proof: otherwise delete row $i$
Corollary: # states $\ge 2^{n-s}$
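A tiny worked instance, chosen here for illustration and not taken from the talk: let $n = 2$ and $M = \{x \in \mathbb{Z}^2 : x_1 + x_2 \text{ even}\}$. With the basis $y_1 = (1,1)$, $y_2 = (0,1)$ of $\mathbb{Z}^2$, $M$ is generated by $1 \cdot y_1$ and $2 \cdot y_2$, so $q = (1, 2)$, $s = 1$, and $\mathbb{Z}^2/M \cong \mathbb{Z}/2$. Since $e_1 = y_1 - y_2$ and $e_2 = y_2$, the columns of $B$ are $(1,-1)$ and $(0,1)$. The Python check below verifies by brute force that "$Bx \bmod q$ with the first $s$ rows removed" identifies cosets of $M$.

```python
from itertools import product


def in_M(v):
    """Membership in M = {x in Z^2 : x_1 + x_2 is even}."""
    return (v[0] + v[1]) % 2 == 0


# Column i of B holds the coefficients of e_i in the basis y_1, y_2.
B = [[1, 0],
     [-1, 1]]
q = [1, 2]  # q_1 | q_2, and s = 1 because only q_1 equals 1


def state(x):
    """Bx mod q, keeping only the rows with q_i > 1."""
    Bx = [sum(B[row][col] * x[col] for col in range(2)) for row in range(2)]
    return tuple(Bx[i] % q[i] for i in range(2) if q[i] > 1)


points = list(product(range(-3, 4), repeat=2))
consistent = all(
    (state(x) == state(y)) == in_M((x[0] - y[0], x[1] - y[1]))
    for x in points
    for y in points
)
num_states = len({state(x) for x in points})
print(consistent, num_states)  # True 2
```

Here $n - s = 1$, so the count of 2 states matches the corollary's bound $2^{n-s}$ with equality.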

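Removing Torsion
Let the sketch be $Bx$ without mod $q$
–After reducing entries in $B$ mod $q$
For each old state $Bx \bmod q$, at most $(mn)^{n-s}$ new states
# new states $\le$ (# old states) $\cdot\, (mn)^{n-s}$
Since # old states $\ge 2^{n-s}$, we have $n - s \le \log(\#\text{ old states})$, so
$\log(\#\text{ new states}) \le \log(\#\text{ old states}) + (n-s)\log(mn) \le \log(\#\text{ old states}) \cdot (1 + \log(mn))$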

Handling Large Entries in B
Want $B \in \mathbb{Z}^{(n-s) \times n}$ to have integer entries of value at most $\mathrm{poly}(n)$
Removing states from $M$ outside of $\{-m, \dots, m\}^n$, can assume $q_i \le \exp(\mathrm{poly}(n))$
Take random linear combinations of rows of $B$, reduce each row mod a random prime
Whp, if $Bx \ne By$ before this transformation of $B$, then $Bx \ne By$ after it
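The "random prime" step rests on a standard fingerprinting fact: two distinct integers stay distinct modulo most primes, because their difference has few large prime factors. The Python sketch below illustrates only this one-dimensional fact with made-up values `u` and `v` standing in for a pair of large sketch entries; it is a simplified illustration and not the construction from the paper, which also takes random linear combinations of rows.

```python
import math


def primes_in_range(lo, hi):
    """Primes p with lo <= p <= hi, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(hi) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, hi + 1, p)))
    return [p for p in range(lo, hi + 1) if sieve[p]]


# Two distinct large integers (hypothetical sketch entries).
u = 10**60 + 12345
v = 10**60 + 98765432123456789

primes = primes_in_range(10_000, 20_000)

# A prime p is "bad" if u and v collide mod p, i.e. p divides u - v.
bad = [p for p in primes if u % p == v % p]

# Distinct primes >= 10^4 dividing u - v multiply to at most |u - v|,
# so there can be at most log(|u - v|) / log(10^4) of them.
max_bad = math.floor(math.log(abs(u - v)) / math.log(10_000))
print(len(primes), len(bad), max_bad)
```

A prime drawn uniformly from `primes` therefore separates `u` and `v` except with probability at most `max_bad / len(primes)`.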

Applications and Open Questions
Simpler proof of the $\tilde{\Omega}(n^{1-2/p})$ bit lower bound for estimating $F_p$, $p > 2$
–No communication complexity
Many dimension lower bounds known for sketching norms over the reals
–$F_p$, matrix norms, adaptive sketching
–Do these give turnstile streaming lower bounds with finite precision?
