Succincter Mihai P ă trașcu unemployed ??. Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently Plank, 2005.

Slides:



Advertisements
Similar presentations
Changing Base without Losing Space Yevgeniy DodisMihai P ă trașcuMikkel Thorup STOC10.
Advertisements

Cell-Probe Lower Bounds for Succinct Partial Sums Mihai P ă trașcuEmanuele Viola.
I/O and Space-Efficient Path Traversal in Planar Graphs Craig Dillabaugh, Carleton University Meng He, University of Waterloo Anil Maheshwari, Carleton.
Succinct Representations of Dynamic Strings Meng He and J. Ian Munro University of Waterloo.
Succinct Data Structures for Permutations, Functions and Suffix Arrays
The Dictionary ADT Definition A dictionary is an ordered or unordered list of key-element pairs, where keys are used to locate elements in the list. Example:
A simple example finding the maximum of a set S of n numbers.
Greedy Algorithms Amihood Amir Bar-Ilan University.
BY Lecturer: Aisha Dawood. Heapsort  O(n log n) worst case like merge sort.  Sorts in place like insertion sort.  Combines the best of both algorithms.
A New Compressed Suffix Tree Supporting Fast Search and its Construction Algorithm Using Optimal Working Space Dong Kyue Kim 1 andHeejin Park 2 1 School.
Succinct Indexes for Strings, Binary Relations and Multi-labeled Trees Jérémy Barbay, Meng He, J. Ian Munro, University of Waterloo S. Srinivasa Rao, IT.
Succinct Data Structures Ian Munro University of Waterloo Joint work with David Benoit, Andrej Brodnik, D, Clark, F. Fich, M. He, J. Horton, A. López-Ortiz,
Succinct Representations of Trees S. Srinivasa Rao Seoul National University.
1 Accessing nearby copies of replicated objects Greg Plaxton, Rajmohan Rajaraman, Andrea Richa SPAA 1997.
Wavelet Trees Ankur Gupta Butler University. Text Dictionary Problem The input is a text T drawn from an alphabet Σ. We want to support the following.
Engineering a Sorted List Data Structure for 32 Bit Keys Roman Dementiev Lutz Kettner Jens Mehnert Peter Sanders MPI für Informatik, Saarbrücken.
Generalized Hashing with Variable-Length Bit Strings Michael Klipper With Dan BlandfordGuy Blelloch Original source: D. Blandford and G. E. Blelloch. Storing.
CS Lecture 9 Storeing and Querying Large Web Graphs.
CS728 Lecture 16 Web indexes II. Last Time Indexes for answering text queries –given term produce all URLs containing –Compact representations for postings.
The Encoding Complexity of Two Dimensional Range Minimum Data Structures European Symposium on Algorithms, Inria, Sophia Antipolis, France, September 3,
Data Structures Hashing Uri Zwick January 2014.
Mike 66 Sept Succinct Data Structures: Techniques and Lower Bounds Ian Munro University of Waterloo Joint work with/ work of Arash Farzan, Alex Golynski,
Succinct Representations of Trees
Space Efficient Data Structures for Dynamic Orthogonal Range Counting Meng He and J. Ian Munro University of Waterloo.
Compressed suffix arrays and suffix trees with applications to text indexing and string matching.
Introduction n – length of text, m – length of search pattern string Generally suffix tree construction takes O(n) time, O(n) space and searching takes.
© 2004 Goodrich, Tamassia Tries1. © 2004 Goodrich, Tamassia Tries2 Preprocessing Strings Preprocessing the pattern speeds up pattern matching queries.
Summer School '131 Succinct Data Structures Ian Munro.
Succinct Geometric Indexes Supporting Point Location Queries Prosenjit Bose, Eric Y. Chen, Meng He, Anil Maheshwari, Pat Morin.
Succinct Orthogonal Range Search Structures on a Grid with Applications to Text Indexing Prosenjit Bose, Carleton University Meng He, Unversity of Waterloo.
90-723: Data Structures and Algorithms for Information Processing Copyright © 1999, Carnegie Mellon. All Rights Reserved. 1 Lecture 9: Searching Data Structures.
Succinct Data Structures Ian Munro University of Waterloo Joint work with David Benoit, Andrej Brodnik, D, Clark, F. Fich, M. He, J. Horton, A. López-Ortiz,
Constant-Time LCA Retrieval Presentation by Danny Hermelin, String Matching Algorithms Seminar, Haifa University.
Prof. Amr Goneid, AUC1 Analysis & Design of Algorithms (CSCE 321) Prof. Amr Goneid Department of Computer Science, AUC Part 8. Greedy Algorithms.
15-853:Algorithms in the Real World
Algorithm Analysis CS 400/600 – Data Structures. Algorithm Analysis2 Abstract Data Types Abstract Data Type (ADT): a definition for a data type solely.
1 CS 430: Information Discovery Lecture 4 Files Structures for Inverted Files.
Hashing 8 April Example Consider a situation where we want to make a list of records for students currently doing the BSU CS degree, with each.
GStore: Answering SPARQL Queries Via Subgraph Matching Lei Zou 1, Jinghui Mo 1, Lei Chen 2, M. Tamer Özsu 3, Dongyan Zhao Peking University, 2 Hong.
Succinct Ordinal Trees Based on Tree Covering Meng He, J. Ian Munro, University of Waterloo S. Srinivasa Rao, IT University of Copenhagen.
3.1. Binary Search Trees   . Ordered Dictionaries Keys are assumed to come from a total order. Old operations: insert, delete, find, …
Lecture 8 : Priority Queue Bong-Soo Sohn Assistant Professor School of Computer Science and Engineering Chung-Ang University.
Index construction: Compression of postings Paolo Ferragina Dipartimento di Informatica Università di Pisa Reading 5.3 and a paper.
1Computer Sciences. 2 HEAP SORT TUTORIAL 4 Objective O(n lg n) worst case like merge sort. Sorts in place like insertion sort. A heap can be stored as.
DATA STRUCTURES (CS212D) Overview & Review Instructor Information 2  Instructor Information:  Dr. Radwa El Shawi  Room: 
Stephen Alstrup (U. Copenhagen) Esben Bistrup Halvorsen (U. Copenhagen) Holger Petersen (-) Aussois – ANR DISPLEXITY - March 2015 Cyril Gavoille (LaBRI,
Discrete Methods in Mathematical Informatics Kunihiko Sadakane The University of Tokyo
Succinct Data Structures
Tries 4/16/2018 8:59 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and.
Succinct Data Structures
Tries 07/28/16 11:04 Text Compression
Succinct Data Structures
Succinct Data Structures
Tries 5/27/2018 3:08 AM Tries Tries.
Succinct Data Structures
Succinct Data Structures
Two equivalent problems
Succinct Data Structures: Upper, Lower & Middle Bounds
Data Structures and Algorithms for Information Processing
Tries 9/14/ :13 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and.
13 Text Processing Hongfei Yan June 1, 2016.
Accessing nearby copies of replicated objects
Range-Efficient Counting of Distinct Elements
Discrete Methods in Mathematical Informatics
Succinct Representation of Labeled Graphs
Range-Efficient Computation of F0 over Massive Data Streams
Tries 2/23/2019 8:29 AM Tries 2/23/2019 8:29 AM Tries.
Topic 5: Heap data structure heap sort Priority queue
Succinct Data Structures
Presentation transcript:

Succincter Mihai P ă trașcu unemployed ??

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently Plank, 2005

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1)

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) 0 1 ⅓⅔ “1”“2”“3” “21” “22” “23”

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) block codingn∙log O(n/τ)Õ(τ) redundancy

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) block codingn∙log O(n/τ)Õ(τ) Hmm… If a block uses O(1) redundancy, the encoding must spread information around Hmm… If a block uses O(1) redundancy, the encoding must spread information around

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) block codingn∙log O(n/τ)Õ(τ) succinct data structuresn∙log O(n/lg n)O(1)

Succinct Data Structures “Make data structures use space close to optimum (≈entropy)” dictionaries * classic hash tables use O(n∙lg u) bits * strive for H = log ( u n ) pattern matching * suffix trees use O(n∙lg n) bits * strive for H = n∙lg Σ represent trees with fast navigation (also graphs, etc) * trivial representations use O(n∙lg n) bits * strive for H = lg C n ≈ 2n etc

Succinct Data Structures “Make data structures use space close to optimum (≈entropy)” dictionaries * classic hash tables use O(n∙lg u) bits * strive for H = log ( u n ) pattern matching * suffix trees use O(n∙lg n) bits * strive for H = n∙lg Σ represent trees with fast navigation (also graphs, etc) * trivial representations use O(n∙lg n) bits * strive for H = lg C n ≈ 2n etc Down deep, common technique: * blocks of ε lg n elements stored with redundancy ≤1 * tabulation to handle blocks ⇒ Space ≈ H+O(H/(τlg n)) with time O(τ)

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) block codingn∙log O(n/τ)Õ(τ) succinct data structuresn∙log O(n/lg n)O(1) [Golynski et al ’07, ’08]n∙log O(n/lg 2 n)O(1)

Storing trits Store A[1..n] ∈ {1,2,3} n to retrieve any A[i] efficiently SpaceTime trivial2nO(1) arithmetic codingn∙log 2 3Õ(n) block codingn∙log O(n/τ)Õ(τ) succinct data structuresn∙log O(n/lg n)O(1) [Golynski et al ’07, ’08]n∙log O(n/lg 2 n)O(1) Here…n∙log O(n/lg τ n)O(τ)

Discussion Big new concept: ** use recursion to reduce redundancy ** Many applications: succinct data structures with space O(n/lg c n), ∀ c How am I getting away with improving on a constant?

A Succinct Proof

Spill-Over Encodings Say I want to represent x ∈ X (think X={1,2,3} B ) If |X| is not a power of 2, how to get redundancy <<1 ? New encoding framework: x ➝ M bits + a spill in {1,..,K} Redundancy = wasted entropy = M + lg K – lg |X| … need to approximate lg|X| ≈ integer + lg(integer) M + lg(K-1) < lg|X| < M + lg K ⇒ redundancy ≤ lg K – lg(K-1) = O(1/K)

Recursion {1,2,3} … …… 12..M’ 12..M B=lg n 12..M12 M {1,..,K} B’=(lg n)/(lg K) {1,..,K’}

Analysis choose K, K’, K’’, … = Θ( κ ) redundancy at each node = O(1/ κ ) times ≈n nodes ⇒ O(n/ κ ) bits degree B’, B’’, … = Θ( (lg n) / (lg κ ) ) go up until O(n/ κ ) nodes left, then waste 1 bit/node ⇒ query time τ = O(lg B’ κ ) ⇒ κ =((lg n) / τ) τ ⇒ redundancy n / ((lg n) / τ) τ ≈ n/lg τ n

Succinct(er) Data Structures

A Building Block The “Rank/Select Problem”: Store A[1..n] ∈ {0,1} n subject to: * rank(k): return Σ k i=1 A[i] * select(j): find k such that rank(k)=j [Golynski et al ’07]space n + O(n∙lglg n / lg 2 n), query O(1) [P. FOCS’08]space n + O(n / lg c n), query O(c)

The Algorithm A[2] … 12..M(Σ) X = ( B Σ ) A[1] A[B] Σ Σ {1,.., K(Σ)} 12..M {1,.., K’(Σ)} spill H(Σ) + H(A|Σ) = n ⇒ H(Σ) + lg K(Σ) + M ≈ n ⇒ H(spill) ≈ n – M

The End Questions?