Improvements on the Range-Minimum-Query Problem


Improvements on the Range-Minimum-Query Problem
Johannes Fischer, Volker Heun
Universität München, Institut für Informatik

Introduction

Introduction
Given: array A of size n
Task: preprocess A such that $\mathrm{RMQ}_A(l,r) = \operatorname{argmin}_{l \le i \le r} A[i]$ can be answered efficiently
Break ties to the left
[Figure: example array A with positions 1 to 11 and query bounds l and r; sample queries marked "min ⇒ return 1" and "min ⇒ return 5".]
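To make the query definition concrete, here is a minimal sketch of the unpreprocessed baseline: a linear scan that returns the leftmost position of the minimum. The function name `rmq_naive` and the 0-based indexing are assumptions for illustration; the slides use 1-based positions.

```python
def rmq_naive(A, l, r):
    """Return the position of the minimum of A[l..r] (0-based, inclusive).

    Ties are broken to the left: a later element replaces the current
    best only if it is strictly smaller.
    """
    best = l
    for i in range(l + 1, r + 1):
        if A[i] < A[best]:
            best = i
    return best
```

Each query costs time proportional to r - l + 1, which is what preprocessing is meant to avoid.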

Applications
Lowest Common Ancestors (LCA)
[Figure: example tree with nodes A to K, its node sequence A = I D B E J F K C G H, and an associated array H; remaining labels not recoverable.]

Applications
Longest common extensions of strings (LCE): for suffixes $t_{i..n}$ and $t_{i'..n}$ return $\max\{k : t_{i..k} = t_{i'..k}\}$
RMQs on the LCP-table of suffix array
Other applications:
Document Retrieval (Muthukrishnan SODA'02)
Suffix links in ESA (Abouelhoda et al. WABI'02)
Maximum-Sum Queries (Chen/Chao ISAAC'04)
…
⇒ basic ingredient!
[Figure: strings "abba x" and "abba z" starting at positions i and j.]

Previous Results for RMQ
Berkman/Vishkin FOCS'89: Preprocessing O(n), Query time O(1)
Rediscovered & simplified by Bender/Farach-Colton (LATIN'00)
Reduction Chain: RMQ ⇒ LCA (via Cartesian Tree) ⇒ ±1RMQ (via Euler Tour), solved with the 4-Russians Trick
cf. suffix array vs. suffix tree: text ⇒ suffix tree ⇒ suffix array

Cartesian Tree
Cartesian Tree for A[1,n]:
Root: minimal element of A[1,n] at pos i
Left Child: Cartesian Tree for A[1,i-1]
Right Child: Cartesian Tree for A[i+1,n]
[Figure: example Cartesian Tree over positions 1 to 11; annotation: $O(n^2)$.]
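The recursive definition translates directly into code. The following is a minimal sketch, assuming 0-based positions and a nested-tuple representation `(position, left, right)`; both choices and the name `cartesian_tree` are illustrative, not taken from the slides.

```python
def cartesian_tree(A, lo=0, hi=None):
    """Build the Cartesian Tree of A[lo..hi] straight from the definition.

    Returns None for an empty range, otherwise a tuple
    (position of minimum, left subtree, right subtree).
    """
    if hi is None:
        hi = len(A) - 1
    if lo > hi:
        return None
    # min() returns the first minimal position, so ties go to the left.
    i = min(range(lo, hi + 1), key=lambda k: A[k])
    return (i, cartesian_tree(A, lo, i - 1), cartesian_tree(A, i + 1, hi))
```

Because every level rescans its range for the minimum, this direct version can take quadratic time on sorted input.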

The New Algorithm

Overview
Divide A into blocks $B_1,\dots,B_{n/s}$ of size $s = \log(n)/4$
Answer queries separately:
Long queries that span several blocks (O(1))
Short in-block-queries (O(1))
Return position where overall minimum occurs (O(1))

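Answering Long Queries (B/F-C'00)
Precompute all RMQs that span $2^k$ blocks: $M[i][k]$ = position of min in $B_i,\dots,B_{i+2^k-1}$
Filled in optimal time with dynamic programming
Query: select 2 precomputed entries of M whose block ranges together cover the interval
Size of M: $n/s \cdot \log(n/s) = O(n/\log n \cdot \log(n/\log n)) = O(n)$
[Figure: blocks $B_i,\dots,B_{n/s}$ with the ranges covered by M[i,0], M[i,1], M[i,2], M[i,3].]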
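The table M can be sketched as a standard sparse table. In this simplified sketch each entry of `vals` stands for one block (for instance its minimum value), so positions are block indices rather than positions in A; the names `build_sparse_table` and `query` and the 0-based indexing are assumptions.

```python
def build_sparse_table(vals):
    """M[i][k] = position of the leftmost minimum in vals[i .. i + 2**k - 1]."""
    n = len(vals)
    M = [[i] for i in range(n)]          # ranges of length 1
    k = 1
    while (1 << k) <= n:
        half = 1 << (k - 1)
        for i in range(n - (1 << k) + 1):
            a = M[i][k - 1]              # best of the left half
            b = M[i + half][k - 1]       # best of the right half
            M[i].append(a if vals[a] <= vals[b] else b)
        k += 1
    return M


def query(vals, M, l, r):
    """Leftmost minimum position in vals[l..r] from two overlapping ranges."""
    k = (r - l + 1).bit_length() - 1     # largest k with 2**k <= r - l + 1
    a = M[l][k]
    b = M[r - (1 << k) + 1][k]
    return a if vals[a] <= vals[b] else b
```

Each table entry is derived from two entries of the previous level, which is the dynamic-programming step, and a query touches exactly two entries.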

Answering In-block-queries n/s·s2 =O(n logn) Answering In-block-queries Computing the in-block-queries for all n/s occurring blocks is too much Really necessary? 3 4 2 8 11 10 -5 1 -4 4 4 1 6 1 6 3 5 7 3 5 7 2 2 Fact: B and B‘ have the same answers to all RMQs iff they have the same Cartesian Tree.

Answering In-block-queries
Number of unlabelled binary trees with n nodes: n-th Catalan number $C_n$, with $C_n = O(4^n / n^{3/2})$
Theorem: We can store answers to all in-block-queries in space O(n)
Proof: $O(4^s / s^{3/2}) \cdot s^2 = O(2^{2s} \cdot s^{1/2}) = O(2^{\log(n)/2} \cdot \log^{1/2} n) = O(n^{1/2} \cdot \log^{1/2} n)$
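To get a feel for the bound, the following sketch counts the entries of a table that stores all $s^2$ in-block answers for each of the $C_s$ block types. The names `catalan` and `in_block_table_entries`, the use of base-2 logarithms, and the rounding of s down to an integer are assumptions.

```python
from math import comb, log2


def catalan(n):
    """n-th Catalan number: the number of binary tree shapes with n nodes."""
    return comb(2 * n, n) // (n + 1)


def in_block_table_entries(n):
    """Entries needed for all in-block answers with block size s = log2(n)/4."""
    s = max(1, int(log2(n) / 4))         # assumed rounding: floor, at least 1
    return catalan(s) * s * s
```

With n = 2**32 the block size is s = 8, giving 1430 types and 1430 * 64 entries, far below n.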

Answering In-block-queries
One problem remains: for each block $B_i$ we need to know its type in time O(s)
Type: mapping t from arrays of size s to $\{0,\dots,C_s-1\}$, bijective on Cartesian Trees, with $t(B) = t(B')$ iff B and B' have the same Cartesian Tree
Build Cartesian Tree for each block $B_j$
Give tree a number $0 \le t(B_j) < C_s$

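O(n)-Algo for Cartesian Tree
Let $T_i$ be the Cartesian Tree for B[1,i]
$T_i$ obtained from $T_{i-1}$ as follows:
[Figure: on the rightmost path, x is the lowest node with $B[x] \le B[i]$; all nodes below it on the path are $> B[i]$. Node i becomes the right child of x, and the former right subtree y of x becomes the left subtree of i.]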
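The incremental step can be sketched with a stack holding the rightmost path. This is a minimal illustration, assuming 0-based positions and `-1` for "no node"; the function name and the three returned arrays are illustrative choices.

```python
def cartesian_tree_linear(B):
    """Build the Cartesian Tree of B left to right in linear time.

    Returns (parent, left, right) arrays of positions, with -1 for 'none'.
    """
    n = len(B)
    parent, left, right = [-1] * n, [-1] * n, [-1] * n
    path = []                            # positions on the rightmost path
    for i in range(n):
        last = -1
        # Remove path nodes that are strictly greater than B[i].
        while path and B[path[-1]] > B[i]:
            last = path.pop()
        if last != -1:                   # removed part hangs to the left of i
            left[i] = last
            parent[last] = i
        if path:                         # i is the new right child of x
            right[path[-1]] = i
            parent[i] = path[-1]
        path.append(i)
    return parent, left, right
```

Every position is pushed once and popped at most once, which gives the linear running time.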

Computing the block type
Don't have to calculate tree! Just keep "rightmost path" p on stack
Compute sequence of numbers $l_1,\dots,l_s$: $l_i$ = # nodes deleted from p in step i
$l_1,\dots,l_s$ satisfies "prefix property" $0 \le \sum_{1 \le k \le i} l_k < i$
...because one cannot delete more elements than have been inserted…
… and each element is removed from p at most once!
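The sequence $l_1,\dots,l_s$ can be read off the stack operations without building the tree. A minimal sketch, with the hypothetical name `deletion_counts` and a 0-based result list:

```python
def deletion_counts(B):
    """Return [l_1, ..., l_s]: nodes removed from the rightmost path per step."""
    path = []                            # values on the rightmost path
    counts = []
    for x in B:
        removed = 0
        while path and path[-1] > x:
            path.pop()
            removed += 1
        counts.append(removed)
        path.append(x)
    return counts
```

For the block `[2, 3, 1]` this yields `[0, 0, 2]`, whose prefix sums 0, 0, 2 stay below 1, 2, 3 as the prefix property requires.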

Computing the block type
$l_1,\dots,l_s$ with $\sum_{1 \le k \le i} l_k < i$ corresponds to a path from (s,s) to (0,0)
In step i: go up $l_i$ cells, go one to the left
# paths from (p,q) to (0,0) given by $C_{p,q} = C_{p-1,q} + C_{p,q-1}$ ("ballot numbers"), with $C_{n,n} = C_n$
[Figure: grid of cells (p,q) with $0 \le p \le q \le 3$.]

Computing the block type
Paths with greater numbers than a given path: at some point above it ⇒ add # paths from current cell before going upwards

Computing the block type
Precompute ballot numbers up to $s = \log(n)/4$.
For all blocks $B_j$:
  let S be an empty stack, push(S, -∞)
  q ← s, N ← 0
  for i ← 1,…,s
    while top(S) > $B_j[i]$
      pop(S)
      N ← N + $C_{(s-i),q}$
      q ← q - 1
    push(S, $B_j[i]$)
  return N
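The pseudocode for the block type can be turned into a short runnable sketch. The function names, the 0-based list for the block, and the base case $C_{0,q} = 1$ with $C_{p,q} = 0$ for $p > q$ are assumptions chosen so that $C_{n,n}$ equals the n-th Catalan number.

```python
from itertools import permutations


def ballot_numbers(s):
    """C[p][q] for 0 <= p <= q <= s; entries with p > q stay 0."""
    C = [[0] * (s + 1) for _ in range(s + 1)]
    for q in range(s + 1):
        C[0][q] = 1                      # assumed base case
    for p in range(1, s + 1):
        for q in range(p, s + 1):
            C[p][q] = C[p - 1][q] + C[p][q - 1]
    return C


def block_type(block, C):
    """Number in {0, ..., C_s - 1} identifying the block's Cartesian Tree."""
    s = len(block)
    stack = [float("-inf")]              # sentinel that is never popped
    q, N = s, 0
    for i in range(1, s + 1):            # i is 1-based as in the pseudocode
        x = block[i - 1]
        while stack[-1] > x:
            stack.pop()
            N += C[s - i][q]
            q -= 1
        stack.append(x)
    return N


def types_of_all_permutations(s):
    """Set of types reached by all permutations of size s (sanity check)."""
    C = ballot_numbers(s)
    return {block_type(p, C) for p in permutations(range(s))}
```

For s = 3 the blocks `[2, 1, 3]` and `[3, 1, 2]` share a Cartesian Tree and both get type 3, while `[1, 2, 3]` gets type 0 and `[3, 2, 1]` gets type 4.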

Summary and Outlook
Direct construction algorithm for RMQ: no dynamic data structures, never uses more space than in the end
Not the first… see Alstrup et al. SPAA'02
Our method can be augmented with techniques from Sadakane SODA'02 to give a succinct data structure (2n+o(n) bits) with direct construction algorithm

Any Questions?