Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence. Anna Gal (UT Austin), Parikshit Gopalan (U. Washington).

Presentation transcript:

Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence. Anna Gal (UT Austin), Parikshit Gopalan (U. Washington & UT Austin).

Data Stream Model of Computation. The input $X_1, X_2, X_3, \dots, X_n$ arrives as a stream and the algorithm keeps a small storage. Single pass. Small storage space and update time. Surprisingly powerful [Alon-Matias-Szegedy, …].

Estimating Sortedness on Data Streams. Cannot sort efficiently. Can we tell if the data needs to be sorted? [Ajtai-Jayram-Kumar-Sivakumar, Gupta-Zane, Cormode-Muthukrishnan-Sahinalp, Liben-Nowell-Vee-Zhu, Woodruff-Sun, G.-Jayram-Kumar-Sivakumar]. Measuring sortedness: length of the longest increasing subsequence; Ulam/edit distance; inversions/Kendall tau distance.
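To make two of these measures concrete, here is a minimal offline sketch in Python. It is not a streaming algorithm: both functions keep the whole sequence in memory. The function names are illustrative. The last helper reports n minus the LIS length, the number of elements to delete to leave an increasing sequence, which for permutations is the usual link between the LIS and the Ulam distance.

```python
def count_inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j] (inversion / Kendall tau measure)."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def lis_length(seq):
    """Length of the longest strictly increasing subsequence, by an O(n^2) dynamic program."""
    best = []  # best[j] = length of the longest increasing subsequence ending at position j
    for j, x in enumerate(seq):
        best.append(1 + max((best[i] for i in range(j) if seq[i] < x), default=0))
    return max(best, default=0)


def deletions_to_sorted(seq):
    """Number of elements to delete so that what remains is strictly increasing."""
    return len(seq) - lis_length(seq)
```

For the sequence 3, 1, 2, 5, 4 the inversions are (3,1), (3,2) and (5,4), and one longest increasing subsequence is 1, 2, 5.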

Longest Increasing Subsequence. $\mathrm{LIS}(\sigma)$: length of the longest increasing subsequence of the input sequence $\sigma$. Studied in statistics, biology, computer science, … [Gusfield, Pevzner, Aldous-Diaconis, …].

Prior Work.
Exact computation of $\mathrm{LIS}(\sigma)$:
- Patience Sorting [Ross, Mallows]: $O(n)$ space, 1-pass streaming algorithm.
- $\Omega(n)$ space lower bound [G.-Jayram-Krauthgamer-Kumar07, Woodruff-Sun07].
Approximating $\mathrm{LIS}(\sigma)$:
- Deterministic, $O(\sqrt{n/\varepsilon})$ space, $(1+\varepsilon)$-approximation [G.-Jayram-Krauthgamer-Kumar07].
Conjecture [GJKK]: Every 1-pass deterministic algorithm that gives a 1.1-approximation to $\mathrm{LIS}(\sigma)$ requires $\Omega(\sqrt{n})$ space.

Our Results. Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1+\varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$. Tight bounds in $n$, $\varepsilon$. Proof via direct sum approach. Direct sum for maximum communication in the private messages model. Separation between communication models.

A Communication Problem. Consider the following problem: $t$ players $P_1, P_2, \dots, P_t$, with $t$ numbers each. Goal: approximate the length of the LIS. It is enough to show a lower bound of $\Omega(t)$ on the maximum message size, since the stream has $n = t^2$ numbers and so $t = \sqrt{n}$.

A Communication Problem [GJKK]. Consider the following decision problem, with the inputs arranged as a matrix that has one row per player $P_1, P_2, \dots, P_t$. No: all columns are non-increasing. Yes: some column is increasing.
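The promise problem can be written down as a small checker. This is a sketch under stated assumptions: `rows[i]` is taken to be the list held by player P_{i+1}, and "increasing" is read as strictly increasing from the first player to the last, which simplifies whatever exact condition the slides' figure shows. The function name is illustrative.

```python
def classify_instance(rows):
    """Return "No", "Yes", or None (input outside the promise).

    rows[i] is the list of numbers held by player P_{i+1}; column j collects
    the j-th number of every player, in player order.
    """
    columns = list(zip(*rows))

    def non_increasing(col):
        return all(a >= b for a, b in zip(col, col[1:]))

    def increasing(col):
        # Assumption: "increasing" means strictly increasing down the whole column.
        return all(a < b for a, b in zip(col, col[1:]))

    if all(non_increasing(col) for col in columns):
        return "No"
    if any(increasing(col) for col in columns):
        return "Yes"
    return None
```

A streaming algorithm sees the rows one after another, which is why a protocol in which each player only passes a message forward models it.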

Direct Sum Paradigm. Primitive problem: on inputs $x_1$ and $y_1$, compute $p(x_1, y_1)$.

Direct Sum Paradigm. Direct sum problem: on inputs $x_1, \dots, x_n$ and $y_1, \dots, y_n$, compute $\bigvee_i p(x_i, y_i)$. Can run $n$ copies of the protocol for $p$. Direct-sum question: is this the best possible? Set-Disjointness, Inner Product, … Techniques for proving direct-sum theorems: [KN, CSWY, BJKS, SS, …].
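As a small illustration of the OR-type direct sum, the sketch below builds the "do the two sets intersect" question from n copies of a one-bit AND primitive, which is the standard way Set-Disjointness decomposes. The function names are illustrative and not taken from the slides.

```python
def direct_sum_or(p, xs, ys):
    """OR over coordinates i of the primitive p(xs[i], ys[i])."""
    return any(p(x, y) for x, y in zip(xs, ys))


def and_bit(x, y):
    """Primitive problem: both players hold the bit 1 in this coordinate."""
    return x == 1 and y == 1


def sets_intersect(xs, ys):
    """xs and ys are 0/1 indicator vectors of two sets over the same universe."""
    return direct_sum_or(and_bit, xs, ys)
```

Running one protocol for `and_bit` per coordinate solves the combined problem; the direct-sum question asks whether any protocol can do substantially better than that.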

Primitive Problem. [Figure: a No instance and a Yes instance for players $P_1, P_2, \dots, P_t$.]

Direct Sum of Primitive Problems. [Figure: for players $P_1, P_2, \dots, P_t$, a No input consists of all No instances of the primitive problem; a Yes input contains one Yes instance.] Techniques for proving direct-sum theorems: [KN, CSWY, BJKS, SS, …].

An Easier Problem [GG]. [Figure: No and Yes instances.] Hope: some player distinguishes between many No instances.

Blackboard Model of One-Way Communication. Players speak in order. Every message is seen by all. The last player outputs the answer.

Problem is Easy in the Blackboard Model. [Figure: No and Yes instances.] Blackboard protocol with maximum communication $2\log(m)$.

Private Messages Model. Messages are seen by the next player only. Suffices for the streaming lower bound. Requires non-standard techniques.
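The difference between the two one-way models is only in what each player gets to see. The following toy simulator is a sketch with invented names (`run_one_way`, `running_max`, `WIDTH`); it encodes messages as bit strings so the maximum message size can be measured, and it uses a deliberately simple example protocol (running maximum) rather than anything from the slides.

```python
WIDTH = 8  # assumed fixed message width in bits for the toy protocol


def run_one_way(players, inputs, blackboard):
    """Run one round; return (last message, maximum message length in bits).

    players[i](x, seen) returns player i's message as a bit string. `seen` holds
    every earlier message in the blackboard model, and only the previous
    player's message in the private messages model.
    """
    messages = []
    for player, x in zip(players, inputs):
        seen = list(messages) if blackboard else messages[-1:]
        messages.append(player(x, seen))
    return messages[-1], max(len(m) for m in messages)


def running_max(x, seen):
    """Toy player: forward the largest value seen so far, including its own input."""
    prev = int(seen[-1], 2) if seen else 0
    return format(max(prev, x), f"0{WIDTH}b")
```

Because `running_max` reads only the most recent message, it behaves identically in both models. A protocol that reads older entries of `seen` would be valid on the blackboard only, which is where a separation between the models can come from.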

Private Messages Model. Strong lower bound for maximum communication in the private messages model. Separation between blackboard and private messages. Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1+\varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$.

Proof Outline. Step 1: Primitive problem (one round). Step 2: Direct-sum problem (one round). Step 3: Multi-round protocols.

Primitive Problem. [Figure: No and Yes instances for players $P_1, P_2, \dots, P_t$.] Alphabet of size $m > t$. Yes case: $\mathrm{LIS}(\sigma) > t/2$. Easy: bound of $(\log m)/t$ on maximum communication. Thm: Maximum communication is at least $\log(m/t)$.

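Lower Bound for Primitive Problem. $P_i$'s message is specified by the prefix $x_1 \dots x_i$. $M_i(a)$: prefixes on which $P_i$ sends the same message as on $a \dots a$. $q_i(a)$: length of the longest increasing sequence (IS) in $M_i(a)$ ending below $a$. [Figure: the constant prefix $a \dots a$ next to a prefix $x_1 \dots x_i$, each followed by $a$.]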

Lower Bound for Primitive Problem. View $q_i(a)$ as a function of $i$. Monotone: $x_1 \dots x_i \in M_i(a) \Rightarrow x_1 \dots x_i\, a \in M_{i+1}(a)$. Bounded by $t/2$: by correctness.

Lower Bound for Primitive Problem. Map $a$ to the first $i$ such that $q_{i-1}(a) = q_i(a)$. Some $i$ occurs $m/t$ times.

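Lower Bound for Primitive Problem. Claim: $P_{i-1}$ must distinguish $a \dots a$ from $b \dots b$ from $c \dots c$. [Figure: between $P_{i-1}$ and $P_i$, each of the $m/t$ constant prefixes is paired with an increasing prefix: $a \dots a$ with $x_1 < \dots < x_{i-1} = a$, $b \dots b$ with $y_1 < \dots < y_{i-1} = b$, and $c \dots c$ with $z_1 < \dots < z_{i-1} = c$.]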

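Lower Bound for Primitive Problem. Suppose $P_{i-1}$ does not distinguish $a \dots a$ from $b \dots b$, where $x_1 \le \dots \le x_{i-1} = a \le b$. [Figure: the prefixes $a \dots a$, $x_1 \dots x_{i-1}$, $b \dots b$ and $y_1 \dots y_{i-1}$ at $P_{i-1}$, each extended by $b$ at $P_i$: $a \dots a\,b$, $x_1 \dots x_{i-1}\,b$, $b \dots b\,b$, $y_1 \dots y_{i-1}\,b$.] But $q_i(b) = i-1$. Contradiction. Hence $P_{i-1}$ must distinguish $a \dots a$ from $b \dots b$ from $c \dots c$, which gives a $\log(m/t)$ lower bound.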
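The counting behind the $\log(m/t)$ bound can be spelled out as follows. This is a sketch of the pigeonhole step only; the notation $i(a)$ and $i^\ast$ is introduced here for illustration, and the alphabet is written as $[m]$.

```latex
% i(a): the first index at which q stops growing. It exists because q_i(a)
% is monotone in i and bounded by t/2, so it cannot increase at every step.
\[
  i(a) \;=\; \min\{\, i : q_{i-1}(a) = q_i(a) \,\}, \qquad a \in [m].
\]
% Pigeonhole: m values of a, at most t possible values of i(a).
\[
  \exists\, i^\ast :\quad
  \bigl|\{\, a \in [m] : i(a) = i^\ast \,\}\bigr| \;\ge\; \frac{m}{t}.
\]
% By the claim, P_{i^*-1} sends pairwise distinct messages on these a...a.
\[
  \#\{\text{messages of } P_{i^\ast - 1}\} \;\ge\; \frac{m}{t}
  \quad\Longrightarrow\quad
  \text{maximum message length} \;\ge\; \log\frac{m}{t}.
\]
```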

Lower Bound for General Problem. $M_i(a_1 \dots a_t)$: $i \times t$ prefixes on which $P_i$ sends the same message as on $(a_1 \dots a_t)^i$. $q_{i,j}(a_1 \dots a_t)$: length of the longest IS in column $j$ ending at/before $a_j$. Each input $a = a_1 \dots a_t$ gets the vector $(q_{i,1}(a), \dots, q_{i,t}(a))$. [Figure: an $i \times t$ prefix with rows $x_{1,1}\, x_{1,2} \dots x_{1,t}$ through $x_{i,1}\, x_{i,2} \dots x_{i,t}$, next to the prefix whose rows all equal $a_1\, a_2 \dots a_t$.]

Lower Bound for General Problem. Part I: By pigeonhole, find
1. A good player $P_i$.
2. A good set $S \subseteq [t]$ of columns.
3. A good set $I \subseteq [m]^t$ of $(m/t)^t$ inputs where ...
[Figure: the vector $(q_{i,1}(a), \dots, q_{i,t}(a))$ for the input $a_1 \dots a_t$.]

Lower Bound for General Problem. Part II: Show that $P_{i-1}$ distinguishes between the $(m/t)^t$ inputs in $I$. This gives a lower bound of $\log(|I|) \ge t \log(m/t)$.
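The arithmetic in Part II, written out. The choice $m = 2t$ in the second display is an illustrative assumption (the slides only require $m > t$); it shows how the bound becomes linear in $t$.

```latex
% Distinguishing |I| inputs needs at least log|I| bits in some message.
\[
  |I| \;\ge\; \left(\frac{m}{t}\right)^{t}
  \quad\Longrightarrow\quad
  \log |I| \;\ge\; t \log\frac{m}{t}.
\]
% Illustrative assumption: alphabet size m = 2t, stream length n = t^2.
\[
  m = 2t
  \quad\Longrightarrow\quad
  t \log_2\frac{m}{t} \;=\; t \;=\; \sqrt{n}.
\]
```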

Lower Bound for Many Rounds. Part I: Messages sent by $P_i$ in round 2 and beyond depend on the entire input. Need to change the definition of $M_i(a_1 \dots a_t)$. Part II: Reduce to a 2-player protocol involving $P_{i-1}$ and $P_t$. Thm: Any deterministic $O(1)$-pass algorithm that gives a $(1+\varepsilon)$-approximation to the LIS requires space $\Omega(\sqrt{n/\varepsilon})$.

Conclusions.
Exact computation of $\mathrm{LIS}(\sigma)$:
- Patience Sorting [Ross, Mallows]: $O(n)$ space, 1-pass streaming algorithm.
- $\Omega(n)$ space lower bound [G.-Jayram-Krauthgamer-Kumar, Woodruff-Sun].
Approximating $\mathrm{LIS}(\sigma)$:
- $O(\sqrt{n/\varepsilon})$ space, deterministic 1-pass algorithm [G.-Jayram-Krauthgamer-Kumar].
- This paper: The bound is tight for deterministic, $O(1)$-pass algorithms.
- [Ergun-Jowhari08]: Different proof.

Randomized Complexity of LIS. Problem: Is there a randomized streaming algorithm to approximate the LIS using space $o(\sqrt{n})$? [Woodruff-Sun]: $O(\log m)$ lower bound. [Chakrabarti]: Randomized private-messages protocol for the direct-sum problem.

Prior Work. Exact computation of $\mathrm{LIS}(\sigma)$:
- Patience Sorting [Ross, Mallows]

Patience Sorting [Ross, Mallows]. Track the best increasing sequence (IS) of length i, for all i. A[i]: smallest number ending an IS of length i. Patience Sorting: dynamic program to compute A[i].
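A minimal Python sketch of this dynamic program, assuming strictly increasing subsequences and using binary search from the standard `bisect` module. The function name is illustrative, and the list is 0-based, so the slide's A[i] is stored at index i-1.

```python
from bisect import bisect_left


def lis_length_patience(stream):
    """One pass over `stream`; returns the length of the longest strictly increasing subsequence."""
    A = []  # A[i-1] = smallest number ending an increasing sequence of length i
    for x in stream:
        k = bisect_left(A, x)  # first position whose entry is >= x
        if k == len(A):
            A.append(x)        # x extends the longest sequence found so far
        else:
            A[k] = x           # x is a smaller end for sequences of length k+1
    return len(A)
```

The list A has one entry per achievable length, so in the worst case (an increasing stream) it grows to n entries, matching the O(n) space bound for exact computation.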

Approximate Patience Sorting [GJKK]. Track the best increasing sequence (IS) of length i, for all i. A[i]: smallest number ending an IS of length i. Patience Sorting: dynamic program to compute A[i]. Approximate Patience Sorting: store A[i] for at most $\sqrt{n}$ values of i.

Lower Bounds for Approximating the LIS. Conjecture [GJKK]: For some $\varepsilon_0 > 0$, every 1-pass deterministic algorithm that gives a $(1+\varepsilon_0)$-approximation to $\mathrm{LIS}(\sigma)$ requires $\Omega(\sqrt{n})$ space. Candidate hard instances: [Figure: inputs for players $P_1, P_2, \dots, P_t$.]

Protocol for BlackBoard model

Primitive Problem. [Figure: No and Yes instances for players $P_1, P_2, \dots, P_t$.] Does the direct sum property hold for this problem?