Published by Lillian Gotts. Modified over 3 years ago.

1
**Succinct Data Structures for Permutations, Functions and Suffix Arrays**

Ian Munro, University of Waterloo

Joint work with F. Fich, M. He, J. Horton, A. López-Ortiz, S. Srinivasa Rao, Rajeev Raman, and Venkatesh Raman.

How do we encode a permutation (or its generalization, a function; or its specialization, a suffix array) in a small amount of space and still perform queries in constant time?

2
**Permutations: a Shortcut Notation**

Let P be a simple array giving π, so P[i] = π(i). Also let B[i] be a pointer t positions back in (the cycle of) the permutation: $B[i] = \pi^{-t}(i)$. B is defined only for every t-th position in a cycle (t is a constant; ignore cycle-length "round-off").

[Figure: an example array P together with its sparse shortcut array B, where x marks a position with no B entry.]

3
**Representing Shortcuts**

In a cycle there is a B every t positions, but these positions can be in arbitrary order. Which i's have a B, and how do we store it?

Keep a bit vector over all positions: 0 indicates no B, 1 indicates a B. Rank on this vector gives the position of B[i] in the B array.

So: $\pi(i)$ and $\pi^{-1}(i)$ in O(1) time and $(1+\varepsilon)\, n \lg n$ bits.

Theorem: Under a pointer machine model with space $(1+\varepsilon)\, n$ references, we need time $1/\varepsilon$ to answer $\pi$ and $\pi^{-1}$ queries; i.e., this is as good as it gets.
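The shortcut scheme above can be sketched in a few lines. This is a minimal illustration, not the succinct structure itself: positions are 0-based, the rank structure is a plain prefix-sum list standing in for an o(n)-bit rank index, and the names `build_shortcuts` and `inverse` are hypothetical.

```python
def build_shortcuts(pi, t):
    """Mark every t-th element of each cycle and store a pointer back to the
    previously marked element (t steps back, up to cycle-length round-off)."""
    n = len(pi)
    marked = [0] * n
    back = {}
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        cycle = []
        j = start
        while not seen[j]:
            seen[j] = True
            cycle.append(j)
            j = pi[j]
        anchors = cycle[::t]                 # every t-th position in the cycle
        for k, a in enumerate(anchors):
            marked[a] = 1
            back[a] = anchors[k - 1]         # wraps around for k == 0
    # rank[i] = number of marked positions strictly before i (naive stand-in)
    rank = [0] * (n + 1)
    for i in range(n):
        rank[i + 1] = rank[i] + marked[i]
    B = [back[i] for i in range(n) if marked[i]]   # compact array, indexed by rank
    return marked, rank, B


def inverse(pi, marked, rank, B, i):
    """Return pi^{-1}(i) using O(t) forward steps and one back pointer."""
    j = i
    while not marked[j]:                     # walk forward to a marked position
        if pi[j] == i:
            return j
        j = pi[j]
    j = B[rank[j]]                           # jump back behind i
    while pi[j] != i:                        # walk forward to i's predecessor
        j = pi[j]
    return j
```

With t chosen as roughly 1/ε, the B array holds about n/t entries, which is where the extra ε n lg n bits in the bound come from.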

4
**Getting n lg n Bits: an Aside**

This is the best we can do for O(1) operations. But using Benes networks: a 1-Benes network is a 2-input/2-output switch; an (r+1)-Benes network is built from two r-Benes networks … join tops to tops.

[Figure: an (r+1)-Benes network taking inputs 1 2 3 4 5 6 7 8 to outputs 3 5 7 8 1 6 4 2 through two r-Benes networks.]
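To see why a Benes network costs about n lg n bits, one can count its switches. The sketch below assumes the standard recursive construction (an input column and an output column of switches around two half-size networks) and one bit per switch; the function names are hypothetical.

```python
import math


def benes_switches(r):
    """Number of 2x2 switches in an r-Benes network on 2**r inputs."""
    if r == 1:
        return 1                              # a single switch
    column = 2 ** (r - 1)                     # switches in one outer column
    return 2 * column + 2 * benes_switches(r - 1)


def lg_factorial(n):
    """lg(n!), the information-theoretic minimum for a permutation of n items."""
    return math.lgamma(n + 1) / math.log(2)


for r in range(1, 6):
    n = 2 ** r
    print(n, benes_switches(r), round(lg_factorial(n), 1))
```

The recursion solves to $r\,2^r - 2^{r-1} = n \lg n - n/2$ switches for $n = 2^r$, compared with $\lg n! \approx n \lg n - 1.44\,n$.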

5
**A Benes Network Realizing the Permutation (3 5 7 8 1 6 4 2)**

6
What can we do with it? Divide the network into blocks of lg lg n gates and encode their actions in a word, taking advantage of the regularity of the address mechanism. Also modify the approach to avoid the power-of-2 issue.

We can then trace a path in time $O(\lg n / \lg\lg n)$. This is the best time we are able to get for $\pi$ and $\pi^{-1}$ in minimum space.

Observe: this method "violates" the pointer machine lower bound by using "micropointers".

7
**Back to the main track: Powers of π**

Consider the cycles of π: ( )( )( ). Keep a bit vector to indicate the start of each cycle. Ignoring parentheses, view the sequence as a new permutation, ψ.

Note: $\psi^{-1}(i)$ is the position containing i, so we have $\psi$ and $\psi^{-1}$ as before. Use $\psi^{-1}(i)$ to find i, then the bit vector (rank, select) to find $\pi^{k}$ or $\pi^{-k}$.
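A small sketch of this cycle layout follows. It is illustrative only: ψ and ψ⁻¹ are stored as plain lists, the bit vector is replaced by a sorted list of cycle start positions, and `bisect` plays the role of rank/select. Positions are 0-based and the function names are hypothetical.

```python
from bisect import bisect_right


def build_cycle_layout(pi):
    """Write the cycles of pi one after another; record where each cycle starts."""
    n = len(pi)
    psi, starts, seen = [], [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        starts.append(len(psi))              # a 1 in the bit vector
        j = s
        while not seen[j]:
            seen[j] = True
            psi.append(j)
            j = pi[j]
    psi_inv = [0] * n
    for pos, v in enumerate(psi):
        psi_inv[v] = pos                     # position containing v
    starts.append(n)                         # sentinel: end of the last cycle
    return psi, psi_inv, starts


def power(psi, psi_inv, starts, i, k):
    """Return pi^k(i) for any integer k (negative k gives inverse powers)."""
    pos = psi_inv[i]
    c = bisect_right(starts, pos) - 1        # rank: which cycle holds pos
    lo, hi = starts[c], starts[c + 1]        # select: cycle boundaries
    return psi[lo + (pos - lo + k) % (hi - lo)]
```

Because the shift is taken modulo the cycle length, the cost does not depend on k.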

8
**Functions**

Now consider arbitrary functions [n]→[n]. "A function is just a hairy permutation": all tree edges lead to a cycle.

9
Challenges here: essentially write down the components in a convenient order and use the n lg n bits to describe the mapping (as per permutations).

To get $f^{k}(i)$: find the level ancestor (k levels up) in a tree, or go up to the root and apply f the remaining number of steps around a cycle.
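The two cases for $f^{k}(i)$ can be sketched as follows. This is a plain-array illustration, not the succinct representation: the level-ancestor step is a simple parent walk standing in for an O(1) level-ancestor structure, and the names `preprocess` and `apply_power` are hypothetical.

```python
def preprocess(f):
    """Split the functional graph of f into cycles and trees hanging off them."""
    n = len(f)
    state = [0] * n                  # 0 = unvisited, 1 = on current path, 2 = done
    depth = [0] * n                  # distance from a node to its cycle
    root = list(range(n))            # first cycle node reached from a node
    cycles, where = [], {}           # where[v] = (cycle index, offset in cycle)
    for s in range(n):
        path, v = [], s
        while state[v] == 0:
            state[v] = 1
            path.append(v)
            v = f[v]
        if state[v] == 1:            # the walk closed a new cycle at v
            start = path.index(v)
            cyc = path[start:]
            for off, u in enumerate(cyc):
                where[u] = (len(cycles), off)
                state[u] = 2
            cycles.append(cyc)
            path = path[:start]
        for u in reversed(path):     # remaining nodes are tree nodes
            depth[u] = depth[f[u]] + 1
            root[u] = root[f[u]]
            state[u] = 2
    return depth, root, cycles, where


def apply_power(f, pre, i, k):
    """Return f^k(i) for k >= 0."""
    depth, root, cycles, where = pre
    if k <= depth[i]:
        for _ in range(k):           # stand-in for a level-ancestor query
            i = f[i]
        return i
    c, off = where[root[i]]          # go up to the root, then around the cycle
    cyc = cycles[c]
    return cyc[(off + k - depth[i]) % len(cyc)]
```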

10
**Level Ancestors**

There are several level ancestor techniques using O(1) time and O(n) words. Adapt Bender & Farach-Colton to work in O(n) bits. But going the other way …

11
**$f^{-k}$ is a set**

Moving down the tree requires care: $f^{-3}(\ ) = (\ )$.

The trick: report all nodes on a given level of a tree in time proportional to the number of nodes, and don't waste time on trees with no answers.
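For reference, the set $f^{-k}(i)$ can be defined by a naive baseline that descends one level at a time. This sketch only pins down what the query returns; it does not achieve the output-sensitive bound, since it may explore subtrees that contain no node at the target level. The name `preimage_k` is hypothetical.

```python
def preimage_k(f, i, k):
    """Return all x with f^k(x) == i, as a sorted list (naive level-by-level)."""
    children = [[] for _ in f]
    for x, y in enumerate(f):
        children[y].append(x)        # x is one step below y
    level = [i]
    for _ in range(k):
        # every node has exactly one image, so a level never holds duplicates
        level = [x for y in level for x in children[y]]
    return sorted(level)
```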

12
**Final Function Result**

Given an arbitrary function f: [n]→[n], with an $n \lg n + O(n)$ bit representation we can compute $f^{k}(i)$ in O(1) time and $f^{-k}(i)$ in time O(1 + size of answer).

13
**Back to Text … And Suffix Arrays**

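Text T[1..n] over (a,b)*# with a < # < b. There are $2^{n-1}$ such texts; which of the n! suffix arrays are valid?

[Figure: for the text a b b a a b a #, the slide shows a valid suffix array SA and its inverse $SA^{-1}$, and a permutation M that is not a valid suffix array. Why not?]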
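As a concrete reference point, a suffix array under the ordering a < # < b can be built by direct sorting. This is a quadratic-time sketch for small examples, with 1-based suffix positions; the function names are hypothetical.

```python
ORDER = {"a": 0, "#": 1, "b": 2}             # a < # < b


def suffix_array(T):
    """Return the 1-based suffix array of T, a string over {a, b} ending in '#'."""
    return sorted(range(1, len(T) + 1),
                  key=lambda p: [ORDER[c] for c in T[p - 1:]])


def inverse(SA):
    """inv[p - 1] is the 1-based rank of suffix p in SA."""
    inv = [0] * len(SA)
    for rank, p in enumerate(SA, start=1):
        inv[p - 1] = rank
    return inv


print(suffix_array("abbaaba#"))
```

For the text a b b a a b a # this gives SA = [4, 7, 5, 1, 8, 3, 6, 2]. The position of n in SA separates the suffixes starting with a from those starting with b, so the text can be read back from its suffix array.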

14
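**Ascending to Max**

M is a permutation, so $M^{-1}$ is its inverse; i.e., $M^{-1}[i]$ says where i is in M.

Ascending-to-Max: for $1 \le i \le n-2$,

$M^{-1}[i] < M^{-1}[n]$ and $M^{-1}[i+1] < M^{-1}[n]$ $\Rightarrow$ $M^{-1}[i] < M^{-1}[i+1]$

$M^{-1}[i] > M^{-1}[n]$ and $M^{-1}[i+1] > M^{-1}[n]$ $\Rightarrow$ $M^{-1}[i] > M^{-1}[i+1]$

[Figure: one example permutation marked OK and one marked NO.]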

15
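**Non-Nesting**

Non-Nesting: for $1 \le i, j \le n-1$ with $M^{-1}[i] < M^{-1}[j]$,

$M^{-1}[i] < M^{-1}[i+1]$ and $M^{-1}[j] < M^{-1}[j+1]$ $\Rightarrow$ $M^{-1}[i+1] < M^{-1}[j+1]$

$M^{-1}[i] > M^{-1}[i+1]$ and $M^{-1}[j] > M^{-1}[j+1]$ $\Rightarrow$ $M^{-1}[i+1] < M^{-1}[j+1]$

[Figure: one example permutation marked OK and one marked NO.]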

16
**Characterization Theorem for Suffix Arrays on Binary Texts**

Theorem: Ascending-to-Max & Non-Nesting ⇔ Suffix Array.

Corollary: Clean method of breaking SA into segments.

Corollary: Linear time algorithm to check whether SA is valid.
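The two properties can be checked directly from their definitions. The sketch below is a quadratic-time transcription for experimenting with small permutations, not the linear-time algorithm of the corollary; permutations are lists of the values 1..n, and all function names are hypothetical.

```python
def inverse_of(M):
    """inv[v] = 1-based position of value v in M (inv[0] is unused)."""
    inv = [0] * (len(M) + 1)
    for pos, v in enumerate(M, start=1):
        inv[v] = pos
    return inv


def ascending_to_max(M):
    n, inv = len(M), inverse_of(M)
    for i in range(1, n - 1):                       # 1 <= i <= n-2
        a, b, top = inv[i], inv[i + 1], inv[n]
        if a < top and b < top and not a < b:
            return False
        if a > top and b > top and not a > b:
            return False
    return True


def non_nesting(M):
    n, inv = len(M), inverse_of(M)
    for i in range(1, n):                           # 1 <= i, j <= n-1
        for j in range(1, n):
            if inv[i] < inv[j]:
                up = inv[i] < inv[i + 1] and inv[j] < inv[j + 1]
                down = inv[i] > inv[i + 1] and inv[j] > inv[j + 1]
                if (up or down) and not inv[i + 1] < inv[j + 1]:
                    return False
    return True


def is_binary_suffix_array(M):
    """True when M satisfies both properties of the characterization theorem."""
    return ascending_to_max(M) and non_nesting(M)
```

For n = 3, the four permutations [1, 2, 3], [1, 3, 2], [2, 3, 1] and [3, 2, 1] pass, matching the $2^{n-1} = 4$ texts, while [2, 1, 3] and [3, 1, 2] fail the ascending-to-max test.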

17
**Cardinality Queries**

T = a b a a a b b a a a b a a b b #

Remember the lengths of the longest run of a's and of b's. The SA is broken by runs, but not stored explicitly.

[Figure: the suffix array of T broken into runs.]

$B_a$ is a bit vector: if $SA^{-1}[i-1]$ is in an "a" section, store 1 in $B_a[SA^{-1}[i]]$, else 0. Create a rank structure on $B_a$, and similarly on $B_b$. (Note these are reversed except at #.)

Algorithm Count(T, P), where m is the length of P and $n_a$ is the number of a's in T:

s ← 1; e ← n; i ← m
while i > 0 and s ≤ e do
if P[i] = a then s ← $rank_1(B_a, s-1) + 1$; e ← $rank_1(B_a, e)$
else s ← $n_a + 2 + rank_1(B_b, s-1)$; e ← $n_a + 1 + rank_1(B_b, e)$
i ← i − 1
return max(e − s + 1, 0)

Time: O(length of query).
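The counting loop can be tried out with a small sketch. Assumptions made here: the suffix array is built by naive sorting only in order to fill the bit vectors, rank is a prefix-sum list rather than a succinct structure, the b-range is taken to start at position $n_a + 2$ (after the a-suffixes and the # suffix), and the names `build_index` and `count` are hypothetical.

```python
def build_index(T):
    """T is a string over {a, b} ending in '#'. Positions below are 1-based."""
    order = {"a": 0, "#": 1, "b": 2}                 # a < # < b
    n = len(T)
    SA = sorted(range(1, n + 1), key=lambda p: [order[c] for c in T[p - 1:]])

    def prefix_ranks(ch):
        # ranks[j] = number of entries SA[1..j] whose preceding character is ch
        ranks = [0]
        for p in SA:
            ranks.append(ranks[-1] + (p > 1 and T[p - 2] == ch))
        return ranks

    return n, T.count("a"), prefix_ranks("a"), prefix_ranks("b")


def count(index, P):
    """Number of occurrences of pattern P (over {a, b}) in the indexed text."""
    n, n_a, rank_a, rank_b = index
    s, e = 1, n
    i = len(P)
    while i > 0 and s <= e:                          # scan the pattern backwards
        if P[i - 1] == "a":
            s, e = rank_a[s - 1] + 1, rank_a[e]
        else:
            s, e = n_a + 2 + rank_b[s - 1], n_a + 1 + rank_b[e]
        i -= 1
    return max(e - s + 1, 0)


index = build_index("abaaabbaaabaabb#")
print(count(index, "aab"))
```

Each pattern character costs two rank queries, which gives the O(length of query) time when rank takes constant time.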

18
**Listing Queries**

Complex methods. Key idea: for queries of length at least d, index every d-th position, for T and for T reversed. So we have matches for T[i..n] and T[1..i-1]. View these as points in 2-space (Ferragina & Manzini; Grossi & Vitter) and do a range query (Alstrup et al.). A variety of results follow.

19
**General Conclusion**

Interesting, and useful, combinatorial objects can be stored succinctly … O(lower bound) + o() … so that natural queries are performed in O(1) time (or at least very close). This can make the difference between using them and not …
