1 A New Point Access Method based on Wavelet Trees
Nieves R. Brisaboa, Miguel R. Luaces, Diego Seco (Database Laboratory, University of A Coruña, A Coruña, Spain)
Gonzalo Navarro (Department of Computer Science, University of Chile, Santiago, Chile)

2 Gramado - SeCoGIS 2009, 11th November, 2009
Outline
- Motivation
- Compressed Data Structures
- PW-tree
- Experiments
- Conclusions and Future Work

4 Motivation
Spatial indexes are a key component in GIS
- Large collections of geographic data
- Geographic operations are very complex
  - Sequential search is not feasible
Spatial index classification (by indexable objects)
- Point Access Methods (PAMs), e.g. the K-d-tree family
- Spatial Access Methods (SAMs), e.g. the R-tree family

5 Motivation
Typical requirements of spatial indexes:
- Dynamic operations: inserts, deletes, updates, …
- Secondary storage management
  - Space consumption is a less important issue
Nowadays, some of these requirements have changed:
- Static data collections are useful in many domains
- Memory hierarchy evolution
  - Reduction of the main memory cost
  - New levels (flash memory)
Our goal is a new point access method with:
- Static geographic data collections
- Main memory operation (compact structure)
- Efficiency similar to classical indexes

7 Compressed Data Structures
Same features as classical data structures, at a low storage cost
Based on two very efficient bit vector operations: rank and select
Rank: $\mathrm{rank}_b(B, i)$ returns the number of times bit $b$ appears in the prefix $B[1, i]$
[Figure: bit vector B]
Example: $\mathrm{rank}_1(B, 6) = 3$

8 Compressed Data Structures
Rank, two examples on the same bit vector B:
[Figure: bit vector B]
- $\mathrm{rank}_1(B, 6) = 3$
- $\mathrm{rank}_0(B, 16) = 10$

9 Compressed Data Structures
Select: $\mathrm{select}_b(B, j)$ returns the position $i$ of the $j$-th appearance of bit $b$ in $B[1, n]$
[Figure: bit vector B]
- $\mathrm{select}_1(B, 2) = 5$
- $\mathrm{select}_0(B, 9) = 14$
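The two operations can be illustrated with a minimal Python sketch. The bit vector drawn on the slides did not survive in the transcript, so the vector `B` below is a hypothetical one, chosen only because it reproduces the four results quoted on slides 7 to 9. Both functions scan the vector linearly for clarity; compressed data structures normally precompute small auxiliary counters so that the scan is avoided.

```python
# Hypothetical 16-bit vector: NOT the one from the slides, just a vector
# that is consistent with the four example results quoted there.
B = [int(c) for c in "0100110010100010"]


def rank(bits, b, i):
    """Number of occurrences of bit b in the prefix bits[1..i] (1-indexed, inclusive)."""
    return bits[:i].count(b)


def select(bits, b, j):
    """1-indexed position of the j-th occurrence of bit b, or None if there is none."""
    seen = 0
    for pos, x in enumerate(bits, start=1):
        if x == b:
            seen += 1
            if seen == j:
                return pos
    return None


print(rank(B, 1, 6), rank(B, 0, 16), select(B, 1, 2), select(B, 0, 9))
# prints: 3 10 5 14
```

Rank and select are inverses in the sense that `rank(B, b, select(B, b, j)) == j` whenever the j-th occurrence exists.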

11 PW-tree
Abstraction
- N points distributed in a two-dimensional space
- Construction of an N x N matrix
- One point for each row i and one for each column j
[Figure: 16 x 16 matrix with one point (o) in each row and each column]

12 PW-tree
Abstraction
- N points distributed in a two-dimensional space
- Construction of an N x N matrix
- One point for each row i and one for each column j
[Figure: 16 x 16 matrix with one point (o) in each row and each column]
Reading the matrix column by column gives the row of each point:

Column  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Row    15  1  4 11 16 12 10 13  8  7  3  5  2 14  6  9
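One way to obtain such a matrix from arbitrary coordinates is to replace each coordinate by its rank in sorted order. The sketch below is an assumption about how the abstraction could be realised, not something the slides spell out: the function name `to_rank_space`, the sample points, and the tie-breaking rule are all hypothetical.

```python
def to_rank_space(points):
    """Map (x, y) points onto an N x N grid with one point per row and per column.

    Assumption: ties are broken by the other coordinate and then by input
    order, so that every row and every column receives exactly one point.
    Returns (rows, ids), where rows[c - 1] is the row of the point placed in
    column c and ids[c - 1] is the index of that point in the input list.
    """
    n = len(points)
    by_x = sorted(range(n), key=lambda k: (points[k][0], points[k][1], k))
    by_y = sorted(range(n), key=lambda k: (points[k][1], points[k][0], k))
    row = {k: r for r, k in enumerate(by_y, start=1)}
    rows = [row[k] for k in by_x]
    return rows, by_x


# Hypothetical sample data
points = [(2.5, 9.0), (0.1, 3.2), (7.7, 0.4), (4.0, 5.5)]
rows, ids = to_rank_space(points)
print(rows, ids)  # prints: [2, 4, 3, 1] [1, 0, 3, 2]
```

The `rows` list plays the role of the Row line in the table above, and `ids` is one possible form of the ordered identifier array mentioned later in the talk.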

13 PW-tree
Wavelet tree construction
[Figure: 16 x 16 matrix and the wavelet tree built over its row sequence]
At the root, rows in [1, 8] are encoded as 0 and rows in [9, 16] as 1; each child node halves its range in the same way.

Node      Rows (in node order)                    Bitmap
[1, 16]   15 1 4 11 16 12 10 13 8 7 3 5 2 14 6 9  1001111100000101
[1, 8]    1 4 8 7 3 5 2 6                         00110101
[9, 16]   15 11 16 12 10 13 14 9                  10100110
[1, 4]    1 4 3 2                                 0110
[5, 8]    8 7 5 6                                 1100
[9, 12]   11 12 10 9                              1100
[13, 16]  15 16 13 14                             1100
[1, 2]    1 2                                     01
[3, 4]    4 3                                     10
[5, 6]    5 6                                     01
[7, 8]    8 7                                     10
[9, 10]   10 9                                    10
[11, 12]  11 12                                   01
[13, 14]  13 14                                   01
[15, 16]  15 16                                   01
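The construction can be reproduced with a short recursive sketch. The row sequence is the one from the slides; the dictionary representation keyed by row range and the function name `build` are choices made here for readability, not the layout used by the authors.

```python
# Row of the point stored in columns 1..16, as given on the slides
ROWS = [15, 1, 4, 11, 16, 12, 10, 13, 8, 7, 3, 5, 2, 14, 6, 9]


def build(seq, lo, hi):
    """Return {(lo, hi): bitmap} for the subtree covering the row range [lo, hi]."""
    if lo == hi:  # a single row needs no bitmap
        return {}
    mid = (lo + hi) // 2
    # 0 marks a row in the lower half [lo, mid], 1 a row in the upper half
    nodes = {(lo, hi): [0 if r <= mid else 1 for r in seq]}
    nodes.update(build([r for r in seq if r <= mid], lo, mid))
    nodes.update(build([r for r in seq if r > mid], mid + 1, hi))
    return nodes


tree = build(ROWS, 1, 16)
# Print the nodes from the widest range to the narrowest
for (lo, hi), bits in sorted(tree.items(), key=lambda kv: (kv[0][0] - kv[0][1], kv[0])):
    print(f"[{lo}, {hi}]", "".join(map(str, bits)))
```

The first line printed is `[1, 16] 1001111100000101`, the root bitmap shown on the slide. Every level stores N bits in total, which is where the N lg N bits of the structure come from.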

14 PW-tree
Obtain the row of the point that is in column 8
[Figure: wavelet tree of slide 13 with the root-to-leaf path highlighted]
Descent from the root, where B, B', B'' and B''' are the bitmaps of nodes [1, 16], [9, 16], [13, 16] and [13, 14]:
1. $\mathrm{rank}_1(B, 8) = 6$
2. $\mathrm{rank}_1(B', 6) = 3$
3. $\mathrm{rank}_0(B'', 3) = 1$
4. $\mathrm{rank}_0(B''', 1) = 1$
Result: row 13
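The descent can be traced with the following sketch, which rebuilds the tree from the slide's row sequence so that it runs on its own. The helper names (`build`, `rank`, `row_of_column`) and the dictionary layout are illustrative assumptions, and `rank` is a plain linear count rather than a constant-time structure.

```python
ROWS = [15, 1, 4, 11, 16, 12, 10, 13, 8, 7, 3, 5, 2, 14, 6, 9]  # rows of columns 1..16


def build(seq, lo, hi):
    """Return {(lo, hi): bitmap} for the subtree covering the row range [lo, hi]."""
    if lo == hi:
        return {}
    mid = (lo + hi) // 2
    nodes = {(lo, hi): [0 if r <= mid else 1 for r in seq]}
    nodes.update(build([r for r in seq if r <= mid], lo, mid))
    nodes.update(build([r for r in seq if r > mid], mid + 1, hi))
    return nodes


def rank(bits, b, i):
    """Occurrences of bit b in bits[1..i] (1-indexed, inclusive)."""
    return bits[:i].count(b)


def row_of_column(tree, c, n):
    """Descend from the root, following the bit found at the current position."""
    lo, hi, pos = 1, n, c
    while lo < hi:
        bits = tree[(lo, hi)]
        b = bits[pos - 1]          # which half does this point belong to?
        pos = rank(bits, b, pos)   # its position inside the chosen child
        mid = (lo + hi) // 2
        lo, hi = (lo, mid) if b == 0 else (mid + 1, hi)
    return lo


TREE = build(ROWS, 1, 16)
print(row_of_column(TREE, 8, 16))  # prints: 13
```

For column 8 the positions visited are 8, 6, 3, 1 and 1, matching the four rank results on the slide.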

15 PW-tree
Obtain the column of the point that is in row 6
[Figure: wavelet tree of slide 13 with the leaf-to-root path highlighted]
Ascent from the leaf, where B''', B'', B' and B are the bitmaps of nodes [5, 6], [5, 8], [1, 8] and [1, 16]:
1. $\mathrm{select}_1(B''', 1) = 2$
2. $\mathrm{select}_0(B'', 2) = 4$
3. $\mathrm{select}_1(B', 4) = 8$
4. $\mathrm{select}_0(B, 8) = 15$
Result: column 15
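The ascent is the mirror image of the descent and can be sketched as follows. As before, the code is a self-contained illustration under assumed names (`build`, `select`, `column_of_row`); it first records which branch leads to the row and then applies select from the bottom up.

```python
ROWS = [15, 1, 4, 11, 16, 12, 10, 13, 8, 7, 3, 5, 2, 14, 6, 9]  # rows of columns 1..16


def build(seq, lo, hi):
    """Return {(lo, hi): bitmap} for the subtree covering the row range [lo, hi]."""
    if lo == hi:
        return {}
    mid = (lo + hi) // 2
    nodes = {(lo, hi): [0 if r <= mid else 1 for r in seq]}
    nodes.update(build([r for r in seq if r <= mid], lo, mid))
    nodes.update(build([r for r in seq if r > mid], mid + 1, hi))
    return nodes


def select(bits, b, j):
    """1-indexed position of the j-th occurrence of bit b."""
    seen = 0
    for pos, x in enumerate(bits, start=1):
        if x == b:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("bit occurs fewer than j times")


def column_of_row(tree, r, n):
    """Find the leaf of row r, then climb to the root with select."""
    lo, hi, path = 1, n, []
    while lo < hi:
        mid = (lo + hi) // 2
        b = 0 if r <= mid else 1
        path.append(((lo, hi), b))  # node visited and branch taken
        lo, hi = (lo, mid) if b == 0 else (mid + 1, hi)
    pos = 1  # each row holds exactly one point
    for node, b in reversed(path):
        pos = select(tree[node], b, pos)
    return pos


TREE = build(ROWS, 1, 16)
print(column_of_row(TREE, 6, 16))  # prints: 15
```

For row 6 the positions produced on the way up are 2, 4, 8 and 15, the same values as the four select results on the slide.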

16 PW-tree
Solve the range query q: {r[12, 16], c[6, 10]}
[Figure: wavelet tree of slide 13 with the positions of each node numbered and the query q highlighted]
Root [1, 16], bitmap B, columns [6, 10]:
- Branch 0, node [1, 8]: pruned, because [1, 8] and [12, 16] are disjoint
- Branch 1, node [9, 16]: $\mathrm{rank}_1(B, 6-1)+1 = 4$ and $\mathrm{rank}_1(B, 10) = 6$, so positions [4, 6]
Node [9, 16], bitmap B', positions [4, 6]:
- Branch 1, node [13, 16]: $\mathrm{rank}_1(B', 4-1)+1 = 3$ and $\mathrm{rank}_1(B', 6) = 3$, so position 3; then $\mathrm{rank}_0(B'', 3) = 1$ and $\mathrm{rank}_0(B''', 1) = 1$ lead to row 13
- Branch 0, node [9, 12]: $\mathrm{rank}_0(B', 4-1)+1 = 2$ and $\mathrm{rank}_0(B', 6) = 3$, so positions [2, 3]; its child [9, 10] is pruned, because [9, 10] and [12, 16] are disjoint, and its child [11, 12] leads to row 12
Result, as (row, column): (13, 8) and (12, 6)
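The traversal can be sketched in Python as below. This is an illustrative reading of the slide, not the authors' implementation: the descent maps the column interval into each child with rank and prunes nodes whose row range misses the query, and the columns of the reported rows are then recovered with a select ascent. All function names are assumptions.

```python
ROWS = [15, 1, 4, 11, 16, 12, 10, 13, 8, 7, 3, 5, 2, 14, 6, 9]  # rows of columns 1..16


def build(seq, lo, hi):
    """Return {(lo, hi): bitmap} for the subtree covering the row range [lo, hi]."""
    if lo == hi:
        return {}
    mid = (lo + hi) // 2
    nodes = {(lo, hi): [0 if r <= mid else 1 for r in seq]}
    nodes.update(build([r for r in seq if r <= mid], lo, mid))
    nodes.update(build([r for r in seq if r > mid], mid + 1, hi))
    return nodes


def rank(bits, b, i):
    return bits[:i].count(b)


def select(bits, b, j):
    seen = 0
    for pos, x in enumerate(bits, start=1):
        if x == b:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("bit occurs fewer than j times")


def column_of_row(tree, r, n):
    lo, hi, path = 1, n, []
    while lo < hi:
        mid = (lo + hi) // 2
        b = 0 if r <= mid else 1
        path.append(((lo, hi), b))
        lo, hi = (lo, mid) if b == 0 else (mid + 1, hi)
    pos = 1
    for node, b in reversed(path):
        pos = select(tree[node], b, pos)
    return pos


def range_query(tree, r1, r2, c1, c2, n):
    """Return the (row, column) pairs with row in [r1, r2] and column in [c1, c2]."""
    rows = []

    def visit(lo, hi, a, b):
        # [a, b] is the interval of positions still inside the column range
        if a > b or hi < r1 or lo > r2:  # empty interval or disjoint rows: prune
            return
        if lo == hi:
            rows.append(lo)
            return
        bits, mid = tree[(lo, hi)], (lo + hi) // 2
        visit(lo, mid, rank(bits, 0, a - 1) + 1, rank(bits, 0, b))
        visit(mid + 1, hi, rank(bits, 1, a - 1) + 1, rank(bits, 1, b))

    visit(1, n, c1, c2)
    return sorted((r, column_of_row(tree, r, n)) for r in rows)


TREE = build(ROWS, 1, 16)
print(range_query(TREE, 12, 16, 6, 10, 16))  # prints: [(12, 6), (13, 8)]
```

The intervals computed along the way are [4, 6] at node [9, 16], position 3 at node [13, 16] and [2, 3] at node [9, 12], which agree with the rank values listed on the slide.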

17 PW-tree
Solve the range query q: {r[12, 16], c[6, 10]}
- Point identifiers must be returned
- An ordered array stores the relation between rows (or columns) and identifiers
- Wavelet tree solutions are used to access this ordered array to obtain the identifiers

Column   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Id      65 45 43 34 78 86 98 10 44 12 14 24 28 99 84 20

Wavelet tree solution: (12, 6) and (13, 8)

18 PW-tree
Two variants of this structure:
- DPW-tree
  - Point identifiers are stored in the same order as the tree leaves
  - The algorithm always needs to reach these leaves
- UPW-tree
  - Point identifiers are stored in the same order as the root node
  - The first tree traversal can be stopped without reaching the leaves
  - A second, ascending traversal is necessary

20 Experiments (space)

Structure   Total (bytes)                                                  Bytes per point
PW-tree     $20N + (N \lg N \times 1.375)/8$                               23.69
R-tree      $20N + 36N/(M-1)$                                              21.24
K-d-tree    $20N + 16\,(2^{h-1} + (N \bmod 2^{\lfloor \lg N \rfloor}))$    36.00

Notes:
- R-tree: $M = 30$ (best experimental performance)
- K-d-tree: $h = \lceil \lg N \rceil$
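The formulas in the table can be evaluated directly, as in the sketch below. The collection size `N = 1_000_000` is a hypothetical value picked for illustration (the slide does not say which N yields the quoted figures), and the function names are assumptions. The R-tree and K-d-tree results do not depend on this choice as long as N is not an exact power of two, whereas the PW-tree result grows with lg N.

```python
import math


def pw_tree_bytes(n):
    # 20 bytes per point plus N lg N bits of bitmaps, scaled by the
    # factor 1.375 that appears in the slide's formula
    return 20 * n + (n * math.log2(n) * 1.375) / 8


def r_tree_bytes(n, m=30):
    # m = 30 is the value reported in the slide's notes
    return 20 * n + 36 * n / (m - 1)


def kd_tree_bytes(n):
    h = math.ceil(math.log2(n))
    return 20 * n + 16 * (2 ** (h - 1) + n % 2 ** math.floor(math.log2(n)))


N = 1_000_000  # hypothetical collection size
for name, total in (("PW-tree", pw_tree_bytes(N)),
                    ("R-tree", r_tree_bytes(N)),
                    ("K-d-tree", kd_tree_bytes(N))):
    print(f"{name}: {total / N:.2f} bytes per point")
```

With this N the R-tree line shows 21.24 and the K-d-tree line shows 36.00, as in the table.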

21 Results (time): Uniform distribution

22 Results (time): Zipf distribution

23 Results (time): Gauss distribution

24 Results (time): North East dataset (123,593 postal addresses)

25 Results (time): Geonames gazetteer (2,693,569 populated places)

27 Conclusions and Future Work
Conclusions:
- A new PAM based on compressed data structures (wavelet tree, rank, select)
  - Two variants (DPW-tree, UPW-tree)
  - Good experimental performance
Future Work:
- Algorithms to solve other queries (k-NN, spatial join)
- Support for dynamic operations
- New spatial compressed data structures:
  - Spatial access methods based on wavelet trees
  - Balanced representation of a K-d-tree

28 A New Point Access Method based on Wavelet Trees Contact: Diego Seco dseco@udc.es

