Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence. Anna Gal (UT Austin), Parikshit Gopalan (U. Washington).


1 Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence. Anna Gal (UT Austin), Parikshit Gopalan (U. Washington & UT Austin)

2 Data Stream Model of Computation. The input $x_1, x_2, x_3, \dots, x_n$ streams past a small storage. Single pass. Small storage space and update time. Surprisingly powerful [Alon-Matias-Szegedy, …]

3 Estimating Sortedness on Data Streams
Cannot sort efficiently. Can we tell if the data needs to be sorted? [Ajtai-Jayram-Kumar-Sivakumar, Gupta-Zane, Cormode-Muthukrishnan-Sahinalp, LibenNowell-Vee-Zhu, Woodruff-Sun, G.-Jayram-Kumar-Sivakumar]
Measuring sortedness:
– Length of the Longest Increasing Subsequence.
– Ulam/Edit distance.
– Inversion/Kendall Tau distance.

4 $\mathrm{LIS}(\sigma)$: Length of the Longest Increasing Subsequence. Example sequence: 5 7 8 1 4 2 10 3 6 9.

5 $\mathrm{LIS}(\sigma)$: Length of the Longest Increasing Subsequence. Example sequence: 5 7 8 1 4 2 10 3 6 9. Studied in statistics, biology, computer science … [Gusfield, Pevzner, Aldous-Diaconis, …]

6 Prior Work
Exact computation of $\mathrm{LIS}(\sigma)$:
– Patience Sorting [Ross, Mallows]: $O(n)$ space, 1-pass streaming algorithm.
– $\Omega(n)$ space lower bound. [G.-Jayram-Krauthgamer-Kumar07, Woodruff-Sun07]
Approximating $\mathrm{LIS}(\sigma)$:
– Deterministic, $O(\sqrt{n/\varepsilon})$ space, $(1 + \varepsilon)$-approximation. [G.-Jayram-Krauthgamer-Kumar07]
Conjecture [GJKK]: Every 1-pass deterministic algorithm that gives a 1.1-approximation to $\mathrm{LIS}(\sigma)$ requires $\Omega(\sqrt{n})$ space.

7 Our Results
Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1 + \varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$.
– Tight bounds in $n$, $\varepsilon$.
– Proof via direct sum approach.
– Direct sum for maximum communication in the private messages model.
– Separation between communication models.

8 A Communication Problem
Consider the following problem: $t$ players, $t$ numbers each. Goal: Approximate the length of the LIS. Enough to show a lower bound of $\Omega(t)$ on maximum message size.
Example rows from the figure: 1.6 2.8 3.5 4.6; 1.8 2.9 3.7 4.9; 1 2 3.2 4.2.

9 A Communication Problem
Consider the following problem: $t$ players, $t$ numbers each. Goal: Approximate the length of the LIS. Enough to show a lower bound of $\Omega(t)$ on maximum message size.
Example instance (one row per player $P_1, P_2, \dots, P_t$):
  1.8  2.9  3.7  4.9
  1.6  2.8  3.5  4.6
  1.3  2.5  3.3  4.5
  1    2    3.2  4.2

10 A Communication Problem
[GJKK]: Consider the following decision problem (one row per player $P_1, P_2, \dots, P_t$).
No instance:
  1.8  2.9  3.7  4.9
  1.6  2.8  3.5  4.6
  1.3  2.5  3.3  4.5
  1    2    3.2  4.2
Yes instance:
  1.7  2.8  3.4  4.8
  1.6  2.6  3.5  4.6
  1.3  2.5  3.1  4.5
  1.1  2.1  3.9  4.2

11 A Communication Problem
[GJKK]: Consider the following decision problem (instances as on slide 10). No: all columns non-increasing.

12 A Communication Problem
[GJKK]: Consider the following decision problem (instances as on slide 10). No: all columns non-increasing.

13 A Communication Problem
[GJKK]: Consider the following decision problem (instances as on slide 10). No: all columns non-increasing. Yes: some column increasing.

14 A Communication Problem
[GJKK]: Consider the following decision problem (instances as on slide 10). No: all columns non-increasing. Yes: some column increasing.
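The decision problem on these slides can be checked on the two example instances with a short Python sketch. The function names (`lis_length`, `classify`, `stream`) are hypothetical, and the "Yes" threshold of more than t/2 is an assumption borrowed from the later Primitive Problem slide; the slides here only say "some column increasing". The stream is taken to be the rows read in player order.

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence of seq."""
    tails = []  # tails[k]: smallest value ending an increasing subsequence of length k + 1
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def classify(rows):
    """'No' if every column is non-increasing, 'Yes' if some column has an
    increasing subsequence longer than t/2 (assumed threshold), else None."""
    t = len(rows)
    columns = list(zip(*rows))
    if all(col[i] >= col[i + 1] for col in columns for i in range(t - 1)):
        return "No"
    if any(lis_length(col) > t / 2 for col in columns):
        return "Yes"
    return None

def stream(rows):
    """Concatenate the players' rows in order P1, P2, ..., Pt."""
    return [x for row in rows for x in row]

# Example instances transcribed from slide 10.
NO = [[1.8, 2.9, 3.7, 4.9],
      [1.6, 2.8, 3.5, 4.6],
      [1.3, 2.5, 3.3, 4.5],
      [1.0, 2.0, 3.2, 4.2]]
YES = [[1.7, 2.8, 3.4, 4.8],
       [1.6, 2.6, 3.5, 4.6],
       [1.3, 2.5, 3.1, 4.5],
       [1.1, 2.1, 3.9, 4.2]]

print(classify(NO), lis_length(stream(NO)))
print(classify(YES), lis_length(stream(YES)))
```

On these two instances the LIS of the full stream differs between the No and the Yes case, which is why an algorithm that approximates the LIS well enough would also solve the decision problem.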

15 Direct Sum Paradigm
Primitive Problem: on inputs $x_1$ and $y_1$, compute $p(x_1, y_1)$.

16 Direct Sum Paradigm
Direct Sum Problem: on inputs $x_1, \dots, x_n$ and $y_1, \dots, y_n$, compute $\bigvee_i p(x_i, y_i)$.
Can run $n$ copies of the protocol for $p$.
Direct-Sum Question: Is this the best possible? Set-Disjointness, Inner Product, …
Techniques for proving direct-sum theorems: [KN, CSWY, BJKS, SS, …]
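Written out, the direct-sum question compares the cost of the combined problem with the cost of $n$ independent copies. The notation $C(\cdot)$ for the communication cost of the best protocol is introduced here for illustration and is not from the slides; the combining operator is read as a logical OR.

```latex
\[
  f(x_1,\dots,x_n,\; y_1,\dots,y_n) \;=\; \bigvee_{i=1}^{n} p(x_i, y_i),
  \qquad
  C(f) \;\le\; n \cdot C(p).
\]
% The upper bound comes from running the protocol for p once per
% coordinate. A direct-sum theorem would show a matching lower bound,
% C(f) = \Omega(n \cdot C(p)), for the model under consideration.
```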

17 Primitive Problem
One number per player $P_1, P_2, \dots, P_t$.
No instance: 0.7 0.5 0.3 0.2
Yes instance: 0.4 0.5 0.4 0.9

18 Direct Sum of Primitive Problems
Each primitive instance lists one number per player $P_1, P_2, \dots, P_t$.
No: (0.7 0.5 0.3 0.2), (0.9 0.8 0.5 0.2), (0.9 0.6 0.5 0.2), (0.8 0.6 0.3 0.0). All No instances.
Yes: (0.7 0.6 0.3 0.1), (0.8 0.6 0.5 0.1), (0.4 0.5 0.1 0.9), (0.8 0.6 0.5 0.2).

19 Direct Sum of Primitive Problems
Each primitive instance lists one number per player $P_1, P_2, \dots, P_t$.
No: (0.7 0.5 0.3 0.2), (0.9 0.8 0.5 0.2), (0.9 0.6 0.5 0.2), (0.8 0.6 0.3 0.0). All No instances.
Yes: (0.7 0.6 0.3 0.1), (0.8 0.6 0.5 0.1), (0.4 0.5 0.1 0.9), (0.8 0.6 0.5 0.2). One Yes instance.

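20 Direct Sum of Primitive Problems
One row per player $P_1, P_2, \dots, P_t$; each column is a primitive instance.
No instance:
  0.8  0.9  0.7  0.9
  0.6  0.8  0.5  0.6
  0.3  0.5  0.3  0.5
  0.0  0.2  0.2  0.2
Yes instance:
  0.7  0.8  0.4  0.9
  0.6  0.6  0.5  0.6
  0.3  0.5  0.1  0.5
  0.1  0.1  0.9  0.2

21 Direct Sum of Primitive Problems
One row per player $P_1, P_2, \dots, P_t$.
No instance:
  1.8  2.9  3.7  4.9
  1.6  2.8  3.5  4.6
  1.3  2.5  3.3  4.5
  1.0  2.2  3.2  4.2
Yes instance:
  1.7  2.8  3.4  4.9
  1.6  2.6  3.5  4.6
  1.3  2.5  3.1  4.5
  1.1  2.1  3.9  4.2
Techniques for proving direct-sum theorems: [KN, CSWY, BJKS, SS, …]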

22 An Easier Problem [GG]
No instance:
  1.8  2.9  3.7  4.9
  1.8  2.9  3.7  4.9
  1.8  2.9  3.7  4.9
  1.8  2.9  3.7  4.9
Yes instance:
  1.7  2.8  3.4  4.8
  1.7  2.8  3.5  4.8
  1.7  2.8  3.4  4.8
  1.7  2.8  3.9  4.8
Hope: Some player distinguishes between many No instances.

23 BlackBoard Model of One-Way Communication Players speak in order. Every message seen by all. Last player outputs answer.

24 Problem is Easy in the BlackBoard model
No instance: every row is 1.8 2.9 3.7 4.9.
Yes instance: rows 1.7 2.8 3.4 4.8; 1.7 2.8 3.5 4.8; 1.7 2.8 3.4 4.8; 1.7 2.8 3.9 4.8.
BlackBoard protocol with max. communication $2 \log(m)$.

26 Private Messages Model Messages seen by next player only. Suffices for streaming lower bound. Requires non-standard techniques.

27 Private Messages Model
No instance: every row is 1.8 2.9 3.7 4.9.
Yes instance: rows 1.7 2.8 3.4 4.8; 1.7 2.8 3.5 4.8; 1.7 2.8 3.4 4.8; 1.7 2.8 3.9 4.8.
Strong lower bound for maximum communication in the private messages model.
Separation between blackboard and private messages.
Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1 + \varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$.

28 Proof Outline
Step 1: Primitive Problem (one round).
Step 2: Direct-sum Problem (one round).
Then: Multi-round Protocols.

29 Primitive Problem
Example input, one number per player $P_1, P_2, \dots, P_t$: 3.4 3.5 3.4 3.9.
Alphabet of size $m > t$. Yes case: $\mathrm{LIS}(\sigma) > t/2$.
Easy: Bound of $(\log m)/t$ on max communication.
Thm: Max communication is at least $\log(m/t)$.

30 Lower Bound for Primitive Problem
$P_i$'s message is specified by the prefix $x_1 \dots x_i$.
$M_i(a)$: Prefixes where $P_i$ sends the same message as $a \dots a$.
$q_i(a)$: Length of longest IS in $M_i(a)$ ending below $a$.

31 Lower Bound for Primitive Problem
$M_i(a)$: Inputs where $P_i$ sends the same message as $a \dots a$.
$q_i(a)$: Length of longest IS in $M_i(a)$ ending below $a$.
[Figure: plot of $q_i(a)$ against $i$.]
Monotone: $x_1 \dots x_i \in M_i(a) \Rightarrow x_1 \dots x_i a \in M_{i+1}(a)$.
Bounded by $t/2$: Correctness.

32 Lower Bound for Primitive Problem
$M_i(a)$: Inputs where $P_i$ sends the same message as $a \dots a$.
$q_i(a)$: Length of longest IS in $M_i(a)$ ending below $a$.
[Figure: plot of $q_i(a)$ against $i$.]
Map $a$ to the first $i$ such that $q_{i-1}(a) = q_i(a)$. Some $i$ occurs $m/t$ times.
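The counting behind this step can be written out as a short derivation. This is a sketch of the pigeonhole argument as the slides state it; the name $i(a)$ for the index assigned to $a$ and the convention that $q_0(a)$ is defined and nonnegative are assumptions made here for illustration.

```latex
% q_i(a) is monotone in i and bounded by t/2 (correctness), so it cannot
% strictly increase at every one of the t players:
\[
  0 \le q_0(a) \le q_1(a) \le \dots \le q_t(a) \le t/2
  \;\Longrightarrow\;
  \exists\, i \in \{1,\dots,t\} :\; q_{i-1}(a) = q_i(a).
\]
% Assign to each alphabet symbol a the first such index:
\[
  i(a) \;=\; \min\{\, i : q_{i-1}(a) = q_i(a) \,\}.
\]
% There are m symbols and at most t possible values of i(a), so by
% pigeonhole some index i^* is shared by many symbols:
\[
  \bigl|\{\, a : i(a) = i^* \,\}\bigr| \;\ge\; m/t .
\]
```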

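33 Lower Bound for Primitive Problem
Claim: $P_{i-1}$ must distinguish $a \dots a$ from $b \dots b$ from $c \dots c$ ($m/t$ such inputs).
[Figure: players $P_{i-1}$ and $P_i$; the inputs $a \dots a$, $b \dots b$, $c \dots c$ paired with prefixes $x_1 \dots x_{i-1}$, $y_1 \dots y_{i-1}$, $z_1 \dots z_{i-1}$.]
$x_1 < \dots < x_{i-1} = a$
$y_1 < \dots < y_{i-1} = b$
$z_1 < \dots < z_{i-1} = c$

34 Lower Bound for Primitive Problem
Hence $P_{i-1}$ must distinguish $a \dots a$ from $b \dots b$ from $c \dots c$. Gives $\log(m/t)$ lower bound.
[Figure: players $P_{i-1}$ and $P_i$; the prefixes $a \dots a$, $x_1 \dots x_{i-1}$, $b \dots b$, $y_1 \dots y_{i-1}$ and their extensions $a \dots a\,b$, $x_1 \dots x_{i-1}\,b$, $b \dots b\,b$, $y_1 \dots y_{i-1}\,b$.]
$x_1 \le \dots \le x_{i-1} = a \le b$. But $q_i(b) = i-1$. Contradiction.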

35 Lower Bound for General Problem
$M_i(a_1 \dots a_t)$: $i \times t$ prefixes where $P_i$ sends the same message as $(a_1 \dots a_t)^i$.
$q_{i,j}(a_1 \dots a_t)$: Length of longest IS in column $j$ ending at/before $a_j$.
[Figure: an $i \times t$ prefix with rows $x_{1,1}\, x_{1,2} \dots x_{1,t}$ through $x_{i,1}\, x_{i,2} \dots x_{i,t}$, next to the prefix whose every row is $a_1\, a_2 \dots a_t$.]

36 Lower Bound for General Problem
$M_i(a_1 \dots a_t)$: $i \times t$ prefixes where $P_i$ sends the same message as $(a_1 \dots a_t)^i$.
$q_{i,j}(a_1 \dots a_t)$: Length of longest IS in column $j$ ending at/before $a_j$.
[Figure: the vector $q_{i,1}(a), \dots, q_{i,t}(a)$ for $a = a_1 \dots a_t$.]

37 Lower Bound for General Problem
$M_i(a_1 \dots a_t)$: $i \times t$ prefixes where $P_i$ sends the same message as $(a_1 \dots a_t)^i$.
$q_{i,j}(a_1 \dots a_t)$: Length of longest IS in column $j$ ending at/before $a_j$.
[Figure: the vector $q_{i,1}(a), \dots, q_{i,t}(a)$ for $a = a_1 \dots a_t$.]

38 Lower Bound for General Problem
Part I: By pigeonhole, find
1. A good player $P_i$
2. A good set $S \subseteq [t]$ of columns
3. A good set $I \subseteq [m]^t$ of $(m/t)^t$ inputs
[Figure: the vector $q_{i,1}(a), \dots, q_{i,t}(a)$ for $a = a_1 \dots a_t$.]

39 Lower Bound for General Problem
Part II: Show that $P_{i-1}$ distinguishes between the $(m/t)^t$ inputs in $I$.
Gives a lower bound of $\log(|I|) \ge t \log(m/t)$.
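The arithmetic that turns this count into a space bound can be sketched as follows. The choice $m \ge 2t$ is an assumption made here only to get a clean constant, and $n = t^2$ follows from the setup of $t$ players holding $t$ numbers each; constant factors and the dependence on the approximation parameter are not tracked.

```latex
% Distinguishing all inputs of I requires at least log|I| bits:
\[
  \log |I| \;\ge\; \log\bigl((m/t)^t\bigr) \;=\; t \log(m/t).
\]
% With an alphabet of size m >= 2t (illustrative assumption), log(m/t) >= 1:
\[
  t \log(m/t) \;\ge\; t .
\]
% The stream has n = t * t numbers, so t = sqrt(n) and the maximum
% message size, hence the streaming space, is
\[
  \Omega(t) \;=\; \Omega(\sqrt{n}).
\]
```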

40 Lower Bound for Many Rounds
Part I: Messages sent by $P_i$ in round 2 and beyond depend on the entire input. Need to change the definition of $M_i(a_1 \dots a_t)$.

41 Lower Bound for Many Rounds
Part I: Messages sent by $P_i$ in round 2 and beyond depend on the entire input. Need to change the definition of $M_i(a_1 \dots a_t)$.
Part II: Reduce to a 2-player protocol involving $P_{i-1}$ and $P_t$.
Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1 + \varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$.

42 Conclusions
Exact computation of $\mathrm{LIS}(\sigma)$:
– Patience Sorting [Ross, Mallows]
– $O(n)$ space, 1-pass streaming algorithm.
– $\Omega(n)$ space lower bound. [G.-Jayram-Krauthgamer-Kumar, Woodruff-Sun]
Approximating $\mathrm{LIS}(\sigma)$:
– $O(\sqrt{n/\varepsilon})$ space, deterministic 1-pass algorithm. [G.-Jayram-Krauthgamer-Kumar]
– This paper: The bound is tight for deterministic, $O(1)$-pass algorithms.
– [Ergun-Jowhari08]: Different proof.

43 Randomized Complexity of LIS
Problem: Is there a randomized streaming algorithm to approximate the LIS using space $o(n)$?
[Woodruff-Sun]: O(log m) lower bound.
[Chakrabarti]: Randomized private-messages protocol for the direct-sum problem.

44 Prior Work
Exact computation of $\mathrm{LIS}(\sigma)$:
– Patience Sorting [Ross, Mallows]

45 Patience Sorting [Ross,Mallows] Track best inc. seq. of length i, for all i. A[i]: Smallest number ending an IS of length i. Patience Sorting: Dynamic program to compute A[i].
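A minimal Python sketch of the dynamic program on this slide, assuming strictly increasing subsequences and using binary search for the update. The function name `patience_tails` is illustrative, the list is 0-indexed (so `A[i - 1]` plays the role of the slide's A[i]), and this is the standard textbook formulation rather than necessarily the exact variant presented in the talk.

```python
from bisect import bisect_left

def patience_tails(stream):
    """One pass over the stream; returns A, where A[k] is the smallest
    number ending an increasing subsequence of length k + 1."""
    A = []
    for x in stream:
        k = bisect_left(A, x)   # first position whose entry is >= x
        if k == len(A):
            A.append(x)         # x extends the longest subsequence so far
        else:
            A[k] = x            # x is a smaller ending for length k + 1
    return A

A = patience_tails([5, 7, 8, 1, 4, 2, 10, 3, 6, 9])
print(len(A), A)  # length of the LIS and the final table
```

The table can hold as many entries as the LIS is long, which is where the O(n) space of the exact algorithm comes from.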

46 Approximate Patience Sorting [GJKK]
Track best increasing sequence of length i, for all i.
A[i]: Smallest number ending an IS of length i.
Patience Sorting: Dynamic program to compute A[i].
Approx. Patience Sorting: Store A[i] for at most $\sqrt{n}$ values of i.

47 Lower Bounds for approximating the LIS
Conjecture [GJKK]: For some $\varepsilon_0 > 0$, every 1-pass deterministic algorithm that gives a $(1 + \varepsilon_0)$-approximation to $\mathrm{LIS}(\sigma)$ requires $\Omega(\sqrt{n})$ space.
Candidate hard instances (one row per player $P_1, P_2, \dots, P_t$):
  1.8  2.9  3.7  4.9
  1.6  2.8  3.5  4.6
  1.3  2.5  3.3  4.5
  1    2    3.2  4.2

48 Protocol for BlackBoard model
Example instance:
  1.2  2.4  3.6  4.8  5.2  6.4
  1.2  2.4  3.6  4.8  5.2  6.1
  1.2  2.4  3.6  4.8  5.2  6.4
  1.2  2.4  3.6  4.8  5.2  6.4
  1.2  2.4  3.6  4.8  5.2  6.6
  1.2  2.4  3.6  4.8  5.2  6.4

51 Primitive Problem
Example input, one number per player $P_1, P_2, \dots, P_t$: 3.4 3.5 3.4 3.9.
Does the direct sum property hold for this problem?

