The Communication and Streaming Complexity of Computing the Longest Common and Increasing Subsequences Xiaoming Sun Tsinghua University David Woodruff.


1 The Communication and Streaming Complexity of Computing the Longest Common and Increasing Subsequences. Xiaoming Sun (Tsinghua University), David Woodruff (MIT)

2 The Problem: Stream of elements $a_1, \ldots, a_n \in \Sigma$. Algorithm given one pass over the stream. Problem: compute the longest increasing subsequence (LIS). In the example stream shown on the slide, the answer is (3, 7).

3 Previous Work: Let $k$ be the length of the LIS of the stream. There exists an algorithm which computes the LIS with $O(k^2 \log |\Sigma|)$ space [LNVZ05]. Trivial $\Omega(k)$ lower bound. Our first result: improve both bounds to a tight $\Theta(k^2 \log(|\Sigma|/k))$.

4 Our Lower Bound: Reduction from the indexing function. Alice holds $x \in \{0,1\}^n$; Bob holds $i \in [n] = \{1, 2, \ldots, n\}$ and must answer: what is $x_i$? Randomized 1-way communication is $\Omega(n)$.

5 Alice holds $x \in \{0,1\}^n$ and constructs a stream $A$; Bob holds $i \in [n] = \{1, 2, \ldots, n\}$ and constructs a stream $B$. What is $x_i$? 1. From LIS(A, B), Bob can get $x_i$. 2. |LIS(A, B)| = $k$, where $k$ is an input parameter.

6 Alice ($x \in \{0,1\}^n$): Alice uses $x$ to create $k-1$ increasing sequences $A_1, \ldots, A_{k-1}$. For each $j$, $A_j$ has length $j$. Each bit of $x$ is encoded in some sequence $A_j$. Every element in $A_{k-1}$ is larger than every element in $A_{k-2}$, every element in $A_{k-2}$ is larger than every element in $A_{k-3}$, etc. Set $A = A_{k-1}, \ldots, A_2, A_1$. [Figure: the sequences $A_1, A_2, \ldots, A_{k-1}$ plotted as value against position in stream.]

7 Bob ($i \in [n]$): Bob uses $i$ to identify $A_j$, the sequence encoding $x_i$. Bob creates an increasing sequence $B$ of length $k-j$. Every element in $B$ is greater than every element of $A_r$ if $r \le j$. [Figure: $B$ plotted together with $A_{j-1}$, $A_j$, $A_{j+1}$ as value against position in stream.]

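8 Alice holds $x \in \{0,1\}^n$ and produces $A = A_{k-1}, \ldots, A_2, A_1$; Bob holds $i \in [n]$ and produces $B$. What is $x_i$? [Figure: $B$ plotted together with $A_{j-1}$, $A_j$, $A_{j+1}$ as value against position in stream.] LIS(A, B) = $A_j, B$, and |LIS(A, B)| = $k$. But $x_i$ is encoded in $A_j$, so Bob recovers $x_i$.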
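The reduction on slides 5 to 8 can be simulated on small inputs. The sketch below is a toy instance, not the encoding from the talk: it assumes a hypothetical scheme in which $A_j$ stores $j$ bits of $x$ (one bit per element, so $n = k(k-1)/2$), a hypothetical block width `W = 4 * k`, and that every element of $B$ is smaller than every element of $A_r$ for $r > j$, as the figure suggests. The helper names (`lis`, `locate`, `alice_stream`, `bob_stream`, `bob_decode`) are made up for illustration, and bit indices are 0-based.

```python
from bisect import bisect_left

def lis(stream):
    """Return one longest strictly increasing subsequence of the stream."""
    lists, lasts = [], []
    for v in stream:
        i = bisect_left(lasts, v)            # number of last elements below v
        new = (lists[i - 1] if i else []) + [v]
        if i == len(lists):
            lists.append(new)
            lasts.append(v)
        else:
            lists[i], lasts[i] = new, v
    return lists[-1] if lists else []

def locate(i):
    """Map a 0-based bit index i to (j, t): bit i is the t-th bit kept in A_j."""
    j = 1
    while i >= j:
        i -= j
        j += 1
    return j, i

def alice_stream(x, k):
    """Build A = A_{k-1}, ..., A_2, A_1, where A_j lives in value block j."""
    W = 4 * k                                # hypothetical block width
    assert len(x) == k * (k - 1) // 2
    blocks, pos = {}, 0
    for j in range(1, k):
        # The t-th element of A_j is even or odd depending on one bit of x.
        blocks[j] = [j * W + 2 * t + x[pos + t] for t in range(j)]
        pos += j
    return [v for j in range(k - 1, 0, -1) for v in blocks[j]]

def bob_stream(i, k):
    """Build B of length k - j, above all of A_j and below all of A_{j+1}."""
    W = 4 * k
    j, _ = locate(i)
    return [j * W + 2 * k + s for s in range(k - j)]

def bob_decode(i, k, answer):
    """Read x_i off the LIS of A followed by B."""
    W = 4 * k
    j, t = locate(i)
    return answer[t] - j * W - 2 * t
```

In this toy instance the LIS of $A$ followed by $B$ is exactly $A_j, B$: an increasing subsequence of $A$ stays inside a single block, and only blocks $r \le j$ can be extended by $B$, giving length at most $r + (k - j) \le k$.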

9 Thus, any streaming algorithm must use $\Omega(n)$ space. But what is $n$? We need to construct $k-1$ increasing sequences that are different for different $x \in \{0,1\}^n$. Assume $|\Sigma|$ is large. Divide $\Sigma$ into $k-1$ blocks of size $|\Sigma|/(k-1)$. Let $A_j$ be a random increasing sequence of length $j$ in block $j$. The space to represent $A_j$ is $\Omega(k \log(|\Sigma|/k))$ for $j > k/2$. Set $n = \Omega(k^2 \log(|\Sigma|/k))$.
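One way to read the bound on slide 9 is as a counting argument. The derivation below is a sketch under the added assumption $|\Sigma| \ge k^3$ (the slide only says the alphabet is large); $m$ denotes the block size and is notation introduced here.

```latex
\begin{align*}
m &= \frac{|\Sigma|}{k-1}, \qquad
\#\{\text{increasing sequences of length } j \text{ in one block}\}
  = \binom{m}{j} \ge \left(\frac{m}{j}\right)^{j}, \\
\log \binom{m}{j} &\ge j \log\frac{m}{j}
  \ge \frac{k}{2} \log\frac{|\Sigma|}{(k-1)^2}
  \ge \frac{k}{2} \log\frac{|\Sigma|}{k^2}
  \qquad \text{for } k/2 < j \le k-1, \\
\log\frac{|\Sigma|}{k^2} &\ge \frac{1}{3}\log|\Sigma|
  \ge \frac{1}{3}\log\frac{|\Sigma|}{k}
  \qquad \text{when } |\Sigma| \ge k^3, \\
n &= \sum_{k/2 < j \le k-1} \Omega\!\left(k \log\frac{|\Sigma|}{k}\right)
   = \Omega\!\left(k^2 \log\frac{|\Sigma|}{k}\right).
\end{align*}
```

The last line uses that roughly $k/2$ of the sequences have length $j > k/2$, and each of them can carry $\Omega(k \log(|\Sigma|/k))$ bits of $x$.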

10 Our Upper Bound: When processing the stream, keep lists A[1], A[2], …, A[k]. A[j] is an increasing subsequence of length j in the stream with minimal last element. Let L[1], L[2], …, L[k] be the last elements of A[1], A[2], …, A[k]. To process item x, find i for which L[i] < x < L[i+1], and replace A[i+1] with A[i], x.
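The update rule on slide 10 can be sketched as follows. This is a minimal, uncompressed version: the function name `streaming_lis` is made up for illustration, indices are 0-based (so `lists[j - 1]` plays the role of A[j] and `lasts[j - 1]` the role of L[j]), and ties are handled by treating an item equal to L[i+1] like one just below it, which is an assumption the slide does not spell out.

```python
from bisect import bisect_left

def streaming_lis(stream):
    """One pass over the stream; returns one longest strictly increasing subsequence."""
    lists = []   # lists[j - 1] is A[j]: an increasing subsequence of length j
    lasts = []   # lasts[j - 1] is L[j]; this list is strictly increasing
    for x in stream:
        # i = number of L[j] strictly below x, i.e. L[i] < x <= L[i+1].
        i = bisect_left(lasts, x)
        candidate = (lists[i - 1] if i > 0 else []) + [x]   # A[i], x
        if i == len(lists):
            # x extends the longest list seen so far.
            lists.append(candidate)
            lasts.append(x)
        else:
            # A[i+1] is replaced by A[i], x, which ends in a smaller element.
            lists[i] = candidate
            lasts[i] = x
    return lists[-1] if lists else []
```

For example, `streaming_lis([5, 3, 4, 8, 6, 7])` returns `[3, 4, 6, 7]`. The sketch stores every list in full, so it corresponds to the naive space bound rather than the compressed one.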

11 So we have k arrays A[1], …, A[k], each of length at most k. Naively, this takes $O(k^2 \log |\Sigma|)$ space. But the A[i] are increasing, so we can compress each list by storing differences. Total space is $O(k^2 \log(|\Sigma|/k))$.
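The compression step on slide 11 can be illustrated by storing each increasing list as its first element followed by successive gaps. The sketch below assumes the alphabet is the positive integers $1, \ldots, |\Sigma|$ and uses Elias gamma code lengths as one possible way to price a gap; the slide does not say which code is used, and the function names are hypothetical.

```python
def to_gaps(seq):
    """First element, then successive differences (all positive for a strictly increasing seq)."""
    return [seq[0]] + [b - a for a, b in zip(seq, seq[1:])]

def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    out, total = [], 0
    for g in gaps:
        total += g
        out.append(total)
    return out

def gamma_bits(g):
    """Length in bits of the Elias gamma code of a positive integer g."""
    return 2 * (g.bit_length() - 1) + 1

def encoded_bits(seq):
    """Total bits needed to store seq as gamma-coded gaps."""
    return sum(gamma_bits(g) for g in to_gaps(seq))
```

The gaps of a list of length $j$ sum to at most $|\Sigma|$, so by concavity of the logarithm the sum of their logarithms is at most $j \log(|\Sigma|/j)$. That gives $O(j \log(|\Sigma|/j) + j)$ bits per list, in line with the $O(k^2 \log(|\Sigma|/k))$ total on the slide.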

12 This talk: First result: a tight space bound for the LIS problem. Second result: tight bounds for longest common subsequence (LCS).

13 LCS Bounds: Problem: Alice has a permutation $\sigma$ of $[N]$, Bob has a permutation $\tau$ of $[N]$. Decide if $|\mathrm{LCS}(\sigma, \tau)| \ge k$. Previous space bound: $\Omega(k)$ [LNVZ05]. Our space bound: $\Omega(N)$ for $3 \le k \le N/2$ (holds for randomized $O(1)$-pass algorithms).

14 LCS Bounds: Why can we only prove $\Omega(N)$ for $3 \le k \le N/2$? If $k = 2$, the problem reduces to an equality test. If $k$ is large, there are at most $O(N^{2(N-k)})$ permutations $\tau$ with $|\mathrm{LCS}(\sigma, \tau)| > k$, so just use an equality test with error $O(1/N^{2(N-k)})$.

15 Our Lower Bound: Padding lemma: if for $k = 3$ the randomized communication complexity is $\Omega(N)$, then it is $\Omega(N)$ for all $k \le N/2$. Proof: just pad each of the inputs by some common subsequence of length $k-3$.
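The padding step can be checked on small inputs. The sketch below assumes one particular padding, appending the fresh symbols $N+1, \ldots, N+\text{extra}$ in increasing order to both permutations; the slide does not specify the padding, and `lcs_length` and `pad` are names made up here.

```python
def lcs_length(a, b):
    """Classic O(|a| * |b|) dynamic program for the LCS length."""
    prev = [0] * (len(b) + 1)
    for u in a:
        cur = [0]
        for j, v in enumerate(b, 1):
            if u == v:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def pad(perm, extra):
    """Append the fresh symbols N+1, ..., N+extra in increasing order."""
    n = len(perm)
    return list(perm) + list(range(n + 1, n + extra + 1))
```

Because the appended symbols are new and occur at the end of both inputs in the same order, the LCS length grows by exactly `extra`. Deciding $|\mathrm{LCS}| \ge 3$ on the original pair is therefore the same as deciding $|\mathrm{LCS}| \ge k$ on the pair padded with `extra = k - 3`.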

16 Remains to show high complexity for $k = 3$. We reduce from disjointness: Alice holds $x \in \{0,1\}^n$, Bob holds $y \in \{0,1\}^n$. Is there an $i$ such that $x_i = y_i = 1$? Randomized multi-way communication is $\Omega(n)$.

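17 Alice holds $x \in \{0,1\}^{N/3}$ and constructs $\sigma$; Bob holds $y \in \{0,1\}^{N/3}$ and constructs $\tau$. Is there an $i$ such that $x_i = y_i = 1$? Want $|\mathrm{LCS}(\sigma, \tau)| \ge 3$ iff $x$ and $y$ intersect, i.e. are not disjoint.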

18 Alice ($x \in \{0,1\}^{N/3}$): Divide $1, \ldots, N$ into $N/3$ groups $G_1 = (1, 2, 3)$, $G_2 = (4, 5, 6)$, …, $G_{N/3} = (N-2, N-1, N)$. Use $x$ to choose $\sigma_1, \ldots, \sigma_{N/3}$, where $\sigma_i$ acts on $G_i$. If $x_i = 0$, $\sigma_i(m+1, m+2, m+3) = (m+1, m+2, m+3)$. If $x_i = 1$, $\sigma_i(m+1, m+2, m+3) = (m+1, m+3, m+2)$. Set $\sigma = \sigma_1, \sigma_2, \ldots, \sigma_{N/3}$.

19 Bob ($y \in \{0,1\}^{N/3}$): Divide $1, \ldots, N$ into $N/3$ groups $G_1 = (1, 2, 3)$, $G_2 = (4, 5, 6)$, …, $G_{N/3} = (N-2, N-1, N)$. Use $y$ to choose $\tau_1, \ldots, \tau_{N/3}$, where $\tau_i$ acts on $G_i$. If $y_i = 0$, $\tau_i(m+1, m+2, m+3) = (m+3, m+2, m+1)$. If $y_i = 1$, $\tau_i(m+1, m+2, m+3) = (m+1, m+3, m+2)$. Set $\tau = \tau_{N/3}, \ldots, \tau_1$.

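20 $\sigma = \sigma_1(G_1)\, \sigma_2(G_2)\, \sigma_3(G_3) \ldots \sigma_{N/3}(G_{N/3})$ and $\tau = \tau_{N/3}(G_{N/3}) \ldots \tau_3(G_3)\, \tau_2(G_2)\, \tau_1(G_1)$. Claim: $|\mathrm{LCS}(\sigma, \tau)| \le 3$. Proof: use the fact that $\mathrm{LCS}(\sigma, \tau)$ intersects at most one $G_i$. Claim: $|\mathrm{LCS}(\sigma, \tau)| = 3$ iff there is some $i$ with $x_i = y_i = 1$. Proof: use the way we defined $\sigma_i$ and $\tau_i$. Thus, we can decide disjointness, so $\Omega(N)$ communication.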
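Both claims on slide 20 can be checked by brute force for small $N$. In the sketch below, `alice_perm` and `bob_perm` build the two permutations from the block rules on slides 18 and 19; the names sigma and tau for the permutations and all function names are assumptions made here for illustration, and the group offset is $m = 3(i-1)$ for group $G_i$ (written `3 * i` with 0-based `i`).

```python
def lcs_length(a, b):
    """Classic O(|a| * |b|) dynamic program for the LCS length."""
    prev = [0] * (len(b) + 1)
    for u in a:
        cur = [0]
        for j, v in enumerate(b, 1):
            if u == v:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def alice_perm(x):
    """sigma: groups in increasing order, each one arranged according to x_i."""
    out = []
    for i, bit in enumerate(x):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if bit else [m + 1, m + 2, m + 3]
    return out

def bob_perm(y):
    """tau: groups in reverse order, each one arranged according to y_i."""
    out = []
    for i in reversed(range(len(y))):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if y[i] else [m + 3, m + 2, m + 1]
    return out

def intersects(x, y):
    """True iff some index i has x_i = y_i = 1."""
    return any(a and b for a, b in zip(x, y))
```

Within one group the four cases give LCS lengths 1 ($x_i = y_i = 0$), 2 (exactly one bit set), and 3 ($x_i = y_i = 1$). Because the groups appear in opposite orders in the two permutations, no common subsequence spans two groups.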

21 Other results: Tight space bounds for computing the LIS length. Generalization to approximate LIS and LCS. Still many gaps here. Example: for approximating the LIS length, we have $\Omega(1/\epsilon)$ and $O(k \log |\Sigma|)$. Recent work [GJKK07] has shown $O(\sqrt{N/\epsilon} \log |\Sigma|)$, but there is still a large gap.

22 Conclusion: First result: a tight bound for the LIS. Second result: an $\Omega(N)$ space bound for the LCS $k$-decision problem for $3 \le k \le N/2$. Other results for approximation problems. Another open question: extend our lower bound for LIS to randomized multi-round protocols.

