# Estimating the longest increasing sequence in polylogarithmic time. C. Seshadhri (Sandia National Labs). Joint work with Michael Saks (Rutgers University)


1 Estimating the longest increasing sequence in polylogarithmic time. C. Seshadhri (Sandia National Labs). Joint work with Michael Saks (Rutgers University). Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

2 The problem. Given an array f: [n] → N, find (the length of) the Longest Increasing Subsequence (LIS). Rather self-explanatory. By now, a textbook dynamic programming problem: [CLRS 01] Chapter 15.4 (Longest Common Subsequence), Starred Problem 15.4-6. [Schensted 61, Fredman 75]: O(n log n) algorithm. [Figure: an example array.]
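For reference, the O(n log n) bound is usually reached with a binary search over "tails". The sketch below is one common way to write it, not necessarily the formulation in the cited papers; the name `lis_length` and the choice of strictly increasing subsequences are assumptions made here for illustration.

```python
from bisect import bisect_left

def lis_length(f):
    """Length of the longest strictly increasing subsequence of f (assumed convention)."""
    # tails[k] is the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far; tails is always sorted.
    tails = []
    for x in f:
        k = bisect_left(tails, x)   # first position whose tail is >= x
        if k == len(tails):
            tails.append(x)         # x extends the longest subsequence found so far
        else:
            tails[k] = x            # x is a smaller tail for length k + 1
    return len(tails)
```

Each element costs one binary search, which gives O(n log n) overall, but the algorithm still reads every entry of the array.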

3 Too much to read. Array f is extremely large, so we can't read all of it. What can we say about the LIS length if we see very little? |LIS| = LIS length. Read only poly(log n) positions. Obviously randomized. [Figure: the algorithm reads a few entries and outputs "|LIS| is in range [0.4n, 0.6n]".]

4 Uniform sample says nothing. Choose a uniform random sample of poly(log n) size. |LIS| = n/2, but the random sample is almost always increasing. So it is not really that easy to learn about |LIS|… [Figure: example array 2 1 4 3 6 5 8 7 10 9, with sampled entries 4 and 9.]
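A small experiment makes the point concrete. The sketch below builds the swapped-pairs array from the slide and checks whether a sparse uniform sample looks sorted; the helper names (`swapped_pairs`, `sample_is_increasing`) and the sizes in the demo are illustrative assumptions.

```python
import random
from bisect import bisect_left

def lis_length(f):
    """Longest strictly increasing subsequence length, O(n log n)."""
    tails = []
    for x in f:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def swapped_pairs(n):
    """The array 2 1 4 3 6 5 ... (n even): every adjacent pair is swapped."""
    return [i + 2 if i % 2 == 0 else i for i in range(n)]

def sample_is_increasing(f, positions):
    """True if the sampled entries, read in index order, are strictly increasing."""
    values = [f[i] for i in sorted(positions)]
    return all(a < b for a, b in zip(values, values[1:]))

if __name__ == "__main__":
    n = 100_000
    f = swapped_pairs(n)
    rng = random.Random(0)
    positions = rng.sample(range(n), 20)   # sparse uniform sample of indices
    print("|LIS| / n =", lis_length(f) / n)
    print("sample increasing:", sample_is_increasing(f, positions))
```

The sample fails to be increasing only when it contains both entries of one swapped pair, which is unlikely when the sample is tiny compared with n.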

5 Our result. We want the range to be small. [Figure: the algorithm outputs a range within [1, n] that contains |LIS|.]

6 Our result. We want the range to be small. [This work] For any (constant) δ > 0, the algorithm gives an additive δn approximation to |LIS|. Running time is $(1/\delta)^{1/\delta} (\log n)^c$. [Figure: the output range containing |LIS| has width δn.]

7 Our result. We want the range to be small. [This work] For any (constant) δ > 0, the algorithm gives an additive δn approximation to |LIS|. Running time is $(1/\delta)^{1/\delta} (\log n)^c$. [Ailon Chazelle Liu S 03] [Parnas Ron Rubinfeld 03] Previous best: δ = ½. Ad alert! [Figure: the output range of width δn, compared with the earlier range of width n/2.]

8 Our result. We want the range to be small. [This work] For any (constant) δ > 0, the algorithm gives an additive δn approximation to |LIS|. Running time is $(1/\delta)^{1/\delta} (\log n)^c$. We get a (1+δ)-approximation to the distance to monotonicity. Previously the best was factor 2. Ad alert! [Figure: the output range of width δn, compared with the earlier range of width n/2.]

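9 Prelims: the array in space. [Figure: the array entries plotted as points in the plane, with the index on the horizontal axis and the value on the vertical axis.]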

10 Prelims: the array in space. Input is points in the plane, given as an array (LIS is the longest chain in the partial order). [Figure labels: Increasing sequence; Violation.]

11 A hard example. The decision for a point depends on small-scale properties of faraway portions. [Figure: blocks labeled k and 3k, "10 points in each", with captions |LIS| = 4k and |LIS| = 2k.]

12 A hard example. Random samples in neighborhoods of points are identical! Can we really estimate LIS in polylog time? Is it time for some heavy work? I mean, time for lbs (lower bounds). [Figure: blocks labeled k and 3k.]

13 Outline (or lack thereof). Will I show proofs? No. Will I show the algorithm? Maybe. I will try to demonstrate the main insight by a series of thought experiments.

14 The dynamic program. The closest LIS point to the left of n/2 gives the splitter. Find the LIS in each blue region. Piece together! So we break up the original problem into subproblems. [Figure labels: n/2; Splitter; Closest LIS point to left.]

15 The dynamic program. But we don't know the right splitter. So try all possible! Only n different choices. Choose the one that gives the largest sum of LISs: $\max_S \left( |\text{LIS-below-}S| + |\text{LIS-above-}S| \right)$. [Figure labels: n/2; S.]

16 The dynamic program. If you know the LIS in all small boxes, you can build the LIS for bigger boxes. Not the most efficient DP… So our sublinear algo will mimic this process. [Figure label: n/2.]
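The "try every splitter" recursion can be written down directly. The sketch below is an illustration under assumptions made here: boxes are half-open in index and in value rank, subsequences are strictly increasing, and the names `lis_by_splitters` and `best` are hypothetical. It is deliberately the inefficient DP from the slides and is only usable on small arrays.

```python
from functools import lru_cache

def lis_by_splitters(f):
    """LIS length by recursing on boxes and trying every splitter height."""
    # Replace values by ranks 1..m so splitter heights are small integers.
    rank_of = {v: r for r, v in enumerate(sorted(set(f)), start=1)}
    rank = [rank_of[v] for v in f]

    @lru_cache(maxsize=None)
    def best(lo, hi, a, b):
        # Box: indices lo <= i < hi, ranks a < r <= b.
        if lo >= hi or a >= b:
            return 0
        if hi - lo == 1:
            return 1 if a < rank[lo] <= b else 0
        mid = (lo + hi) // 2
        # Splitter s: the left half may use ranks (a, s], the right half (s, b].
        return max(best(lo, mid, a, s) + best(mid, hi, s, b)
                   for s in range(a, b + 1))

    return best(0, len(f), 0, len(rank_of))
```

Any increasing subsequence inside a box splits at `mid` into a left part whose largest rank is some s and a right part that lies strictly above s, so the maximum over s recovers the LIS of the box.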

17 The IP. Is this point on the LIS? Where is the splitter? It is there. The LIS is in the blue region. [Figure labels: n/2; Splitter.]

18 The IP. Where is the splitter? It is there. The LIS is in the blue region. This point is NOT on the LIS. [Figure label: n/2.]

19 The IP. I wish we knew the splitter in that region. It is there. [Figure labels: n/2; 3n/4.]

20 The IP. I think I know what will happen next. You're lucky I'm here. [Figure labels: n/2; 3n/4; 5n/8.]

21 The IP. I think I know what will happen next. You're lucky I'm here. [Figure labels: n/2; 3n/4; 5n/8.]

22 The IP. I think I know what will happen next. You're lucky I'm here. [Figure labels: n/2; 3n/4; 5n/8.]

23 The interactive protocol. If the point stays in the blue region till the very end, then it is good (on the LIS). Otherwise, bad. This takes (log n) steps, with the help of the wizard.

24 The interactive protocol. If the point stays in the blue region till the very end, then it is good (on the LIS). Otherwise, bad. This takes (log n) steps, with the help of the wizard. If we could simulate the wizard…

25 The interactive protocol. If the point stays in the blue region till the very end, then it is good (on the LIS). Otherwise, bad. This takes (log n) steps, with the help of the wizard. If we could simulate the wizard… What?? If you could simulate the wizard, you know the LIS!
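To see the protocol in action, here is a toy simulation in which the wizard is played by a precomputed LIS. That defeats the purpose, as the slide points out, so this is only an illustration. The conventions are assumptions made here: strict increase, a half-open value range (low, high] for the current box, and the hypothetical names `one_lis_indices` and `stays_in_blue`.

```python
from bisect import bisect_left

def one_lis_indices(f):
    """Indices of one longest strictly increasing subsequence of f."""
    tail_vals, tail_idx, prev = [], [], [None] * len(f)
    for i, x in enumerate(f):
        k = bisect_left(tail_vals, x)
        prev[i] = tail_idx[k - 1] if k > 0 else None   # predecessor in the chain
        if k == len(tail_vals):
            tail_vals.append(x)
            tail_idx.append(i)
        else:
            tail_vals[k] = x
            tail_idx[k] = i
    chain, i = [], (tail_idx[-1] if tail_idx else None)
    while i is not None:
        chain.append(i)
        i = prev[i]
    return chain[::-1]

def stays_in_blue(f, i, lis):
    """Follow point i down the halving protocol; the wizard answers from `lis`."""
    lo, hi = 0, len(f)                          # current index range [lo, hi)
    low, high = float("-inf"), float("inf")     # current value range (low, high]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Wizard: height of the closest LIS point left of mid inside the box.
        k = bisect_left(lis, mid)
        s = f[lis[k - 1]] if k > 0 and lis[k - 1] >= lo else low
        if i < mid:
            hi, high = mid, s                   # lower-left blue box
        else:
            lo, low = mid, s                    # upper-right blue box
        if not (low < f[i] <= high):
            return False                        # the point left the blue region
    return True
```

With these conventions the loop runs about log2(n) times, and on the small arrays tried here the surviving points are exactly the indices of the wizard's LIS.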

26 Find a splitter. Finding the splitter may be hard, so try for approximate versions…? But how do we determine the number of LIS points? If very few LIS points are outside blue, this is not a bad splitter. [Figure label: n/2.]

27 Find a splitter. If μ < 1/(100 log n), being ~~against health care~~ conservative is good enough. Total no. of points outside blue < μn. [Figure labels: n/2; Conservative splitter.]

28 Easy to check. Count the fraction of the sample outside blue. poly(log n) samples check this accurately. [Figure label: n/2.]
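The check can be sketched as a plain sampling estimate. Everything named below is an assumption for illustration: the splitter is described by a dividing index `mid` and a height `s`, "blue" means lower-left (index < mid, value <= s) or upper-right (index >= mid, value > s), and the sample size is left to the caller rather than derived from μ.

```python
import random

def outside_blue(f, i, mid, s):
    """True if point (i, f[i]) lies outside both blue boxes of the splitter (mid, s)."""
    return f[i] > s if i < mid else f[i] <= s

def estimate_outside_fraction(f, mid, s, num_samples, rng):
    """Estimate the fraction of points outside blue from uniformly sampled indices."""
    hits = sum(outside_blue(f, rng.randrange(len(f)), mid, s)
               for _ in range(num_samples))
    return hits / num_samples

def looks_conservative(f, mid, s, mu, num_samples, rng):
    """Accept the splitter if the sampled fraction outside blue is below mu."""
    return estimate_outside_fraction(f, mid, s, num_samples, rng) < mu
```

For example, `looks_conservative(list(range(100)), 50, 49, 0.01, 200, random.Random(1))` accepts, since a sorted array has no points outside blue for that splitter.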

29 Getting a conservative splitter. We can sample (log n) different candidates and check which of them ~~disbelieves evolution~~ is conservative. What if no conservative splitter exists? [Figure label: n/2.]

30 A liberal paradise. Choose any line: the no. of points outside is at least μn. So we know that |LIS| < (1-μ)n. Leads to the next idea. Boosting approximations! Given a δ-approx to LIS, can we improve on δ? [Figure label: n/2.]

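31 Boosting approximations. Run the δn-approx on the points in each of the two boxes (with $n_1$ and $n_2$ points). Take the sum of outputs as the total LIS estimate: $|\mathrm{LIS}| = |\mathrm{LIS}_1| + |\mathrm{LIS}_2|$, $\mathrm{Est} = \mathrm{Est}_1 + \mathrm{Est}_2$. $|\mathrm{Est}_1 - \mathrm{LIS}_1| < \delta n_1$ and $|\mathrm{Est}_2 - \mathrm{LIS}_2| < \delta n_2$, so $|\mathrm{Est} - \mathrm{LIS}| < \delta (n_1 + n_2)$. $n_1 + n_2 < (1-\mu)n$, so $|\mathrm{Est} - \mathrm{LIS}| < \delta (1-\mu) n$! [Figure labels: n/2; Real splitter; No. of points outside at least μn.]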

32 Putting it together. Check if each is a conservative splitter. If it is, we've found the right subproblems. Otherwise… [Figure labels: n/2; Conservative splitter?]

33 Putting it together. One of these is close enough to the real splitter. Est(S) = Left-Est(S) + Right-Est(S). Run the δn-approx on the points in each box. [Figure labels: n/2; S.]

34 Putting it together. One of these is close enough to the real splitter. Est(S) = Left-Est(S) + Right-Est(S). Final Estimate = $\max_S \mathrm{Est}(S)$. Run the δn-approx on the points in each box. Looks like a great idea! We go from δn to δ(1-μ)n. Recur to keep improving the approximation. [Figure labels: n/2; S.]

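35 It fails, miserably. As we go up each level, the approx gets better by (1-μ): $\delta_0 = \delta_1 (1-\mu)$. So to get $\delta_0 = 1/4$, how many levels are needed? $1/4 = (1/2)(1-\mu)^t$, so $t \approx 1/\mu$. We have running time at least $2^{1/\mu}$. So μ needs to be > 1/log log n. [Figure: recursion tree of Alg calls with accuracies $\delta_0, \delta_1, \delta_2, \ldots$ down to Alg ½, of depth 1/μ.]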
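The arithmetic behind this slide can be spelled out as follows. This is a back-of-the-envelope sketch that ignores constants, and it assumes each level makes at least two recursive calls (one per box), which is where the base 2 comes from.

```latex
\begin{align*}
\tfrac14 = \tfrac12 (1-\mu)^t
  &\iff t = \frac{\ln 2}{-\ln(1-\mu)} \approx \frac{\ln 2}{\mu}
  \quad \text{(for small } \mu\text{)}, \\
2^{\Theta(1/\mu)} \le (\log n)^{O(1)}
  &\iff \frac{1}{\mu} = O(\log\log n)
  \iff \mu = \Omega\!\left(\frac{1}{\log\log n}\right).
\end{align*}
```

So a polylogarithmic running time forces μ to be at least on the order of 1/log log n.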

36 Find a splitter. If μ < 1/(100 log n), being ~~against health care~~ conservative is good enough. Total no. of points outside blue < μn. [Figure labels: n/2; Conservative splitter.]

37 The basic dichotomy. From P: if we find a splitter, continue the IP (the Interactive Protocol phase); if we cannot find a splitter, enter the Dynamic Programming phase. For IP, we need μ < 1/log n (μn is the error in each level of IP). For DP, we need μ > 1/log log n ((1-μ) is the decrease in approximation).

38 The basic dichotomy. From P: if we find a splitter, continue the IP (the Interactive Protocol phase); if we cannot find a splitter, enter the Dynamic Programming phase. For IP, we need μ < 1/log n (μn is the error in each level of IP). For DP, we need μ > 1/log log n ((1-μ) is the decrease in approximation). [Figure labels: Weaken; Strengthen.]

39 Reducing to smaller DP! Run the δ-approx on all poly(log n) such boxes, to get an LIS estimate inside each box. [Figure label: n/(log n).]

40 Reducing to smaller DP! Run the δ-approx on all poly(log n) such boxes. Use a dynamic program to find the chain with the largest sum of estimates: longest path in a DAG. Can solve in poly(log n) time. [Figure labels: Chain; n/(log n).]
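The chain DP is a standard heaviest-path computation on a DAG of boxes. The sketch below assumes a representation chosen here for illustration: each box is a half-open rectangle [x_lo, x_hi) × [y_lo, y_hi) with x_lo < x_hi, carrying an estimate `est`, and box a may precede box b when a lies entirely to the lower-left of b. The names `Box`, `precedes` and `best_chain` are hypothetical.

```python
from typing import List, NamedTuple

class Box(NamedTuple):
    x_lo: int
    x_hi: int
    y_lo: int
    y_hi: int
    est: float      # LIS estimate for the points inside this box

def precedes(a: Box, b: Box) -> bool:
    """Box a lies entirely to the lower-left of box b."""
    return a.x_hi <= b.x_lo and a.y_hi <= b.y_lo

def best_chain(boxes: List[Box]) -> float:
    """Largest total estimate over chains of boxes (heaviest path in the DAG)."""
    # Sorting by x_lo is a topological order: a predecessor ends (in x)
    # no later than its successor starts, and boxes are non-empty in x.
    order = sorted(boxes, key=lambda b: (b.x_lo, b.y_lo))
    best = [b.est for b in order]           # best chain ending at each box
    for j, bj in enumerate(order):
        for i in range(j):
            if precedes(order[i], bj):
                best[j] = max(best[j], best[i] + bj.est)
    return max(best, default=0.0)
```

With m boxes this takes O(m²) time, so m = poly(log n) boxes keep the DP within poly(log n) time.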

41 Dichotomy theorem. Either it is easy to find the right subproblems, OR one can go from a δ-approx to a $(\delta - \delta^2)$-approx by a (log n)-sized DP.

42 The algorithm, in one slide. From P: if we find a splitter, continue the IP. If we cannot find a splitter, make poly(log n) calls to the δ-approx and solve a DP of poly(log n) size. Overall running time becomes $(\log n)^{1/\delta}$. *&^#\$% miracle that the math works out.
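One way to see where the exponent 1/δ comes from is to iterate the boosting step of the dichotomy theorem. This is a rough reading of the slides, not the paper's analysis, and it assumes the recursion bottoms out at the earlier δ = ½ algorithm.

```latex
\begin{align*}
\delta_{k+1} = \delta_k - \delta_k^2 = \delta_k (1 - \delta_k)
  &\implies \frac{1}{\delta_{k+1}} = \frac{1}{\delta_k} + \frac{1}{1-\delta_k}
  \ge \frac{1}{\delta_k} + 1, \\
\delta_0 = \tfrac12
  &\implies \frac{1}{\delta_k} \ge k + 2
  \implies \delta_k \le \frac{1}{k+2}.
\end{align*}
```

So roughly 1/δ rounds of boosting reach accuracy δ, and if each round multiplies the cost by poly(log n) calls to the previous level, the total is $(\log n)^{O(1/\delta)}$.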

43 The even better version. Don't exactly solve this dynamic program! Use our sublinear algo to approximately solve it in (loglog n) time. Then do it recursively… It's painful. It's all Greek: α β γ δ ε ζ λ μ ξ. We had ν, but got rid of it.

44 What next? Sublinear dynamic programming! We get $(1/\delta)^{1/\delta} (\log n)^c$ time. Can we get (log n)/δ time? Would be extremely cool. Completely optimal. Applications for other dynamic programs? How does one find the right subproblems in sublinear time? Generalize the dichotomy. Longest common subsequence/edit distance…?

45 Ask and you shall know…
