CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.

Slides:



Advertisements
Similar presentations
CMU SCS : Multimedia Databases and Data Mining Lecture #17: Text - part IV (LSI) C. Faloutsos.
Advertisements

CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - IV Grid files, dim. curse C. Faloutsos.
On Reinsertions in M-tree Jakub Lokoč Tomáš Skopal Charles University in Prague Department of Software Engineering Czech Republic.
CMU SCS : Multimedia Databases and Data Mining Lecture #4: Multi-key and Spatial Access Methods - I C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #19: SVD - part II (case studies) C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.
Spatial and Temporal Data Mining V. Megalooikonomou Spatial Access Methods (SAMs) II (some slides are based on notes by C. Faloutsos)
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #10: Fractals - case studies - I C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture#5: Multi-key and Spatial Access Methods - II C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU
CMU SCS : Multimedia Databases and Data Mining Lecture#3: Primary key indexing – hashing C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
15-826: Multimedia Databases and Data Mining
Improving the Performance of M-tree Family by Nearest-Neighbor Graphs Tomáš Skopal, David Hoksza Charles University in Prague Department of Software Engineering.
1 NNH: Improving Performance of Nearest- Neighbor Searches Using Histograms Liang Jin (UC Irvine) Nick Koudas (AT&T Labs Research) Chen Li (UC Irvine)
Pivoting M-tree: A Metric Access Method for Efficient Similarity Search Tomáš Skopal Department of Computer Science, VŠB-Technical.
ADBIS 2003 Revisiting M-tree Building Principles Tomáš Skopal 1, Jaroslav Pokorný 2, Michal Krátký 1, Václav Snášel 1 1 Department of Computer Science.
CMU SCS : Multimedia Databases and Data Mining Lecture #16: Text - part III: Vector space model and clustering C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals - case studies Part III (regions, quadtrees, knn queries) C. Faloutsos.
VLSH: Voronoi-based Locality Sensitive Hashing Sung-eui Yoon Authors: Lin Loi, Jae-Pil Heo, Junghwan Lee, and Sung-Eui Yoon KAIST
SASH Spatial Approximation Sample Hierarchy
Algorithms for Nearest Neighbor Search Piotr Indyk MIT.
Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.
Spatial Indexing SAMs. Spatial Access Methods PAMs Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree R-tree Variations: R*-tree, Hilbert.
Multimedia DBs. Multimedia dbs A multimedia database stores text, strings and images Similarity queries (content based retrieval) Given an image find.
Scalable and Distributed Similarity Search in Metric Spaces Michal Batko Claudio Gennaro Pavel Zezula.
Multi-dimensional Indexes
Chapter 3: Data Storage and Access Methods
10-603/15-826A: Multimedia Databases and Data Mining SVD - part I (definitions) C. Faloutsos.
Euripides G.M. PetrakisIR'2001 Oulu, Sept Indexing Images with Multiple Regions Euripides G.M. Petrakis Dept.
R-tree Analysis. R-trees - performance analysis How many disk (=node) accesses we’ll need for range nn spatial joins why does it matter?
Spatial Indexing. Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries.
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #6: Spatial Access Methods Part III: R-trees C. Faloutsos.
Metric based KNN indexing Lecturer:Prof Ooi Beng Chin Presenters:Frankie ChanHT Y Tan ZhenqiangHT J.
CMU SCS : Multimedia Databases and Data Mining Lecture #10: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
Spatial Data Management Chapter 28. Types of Spatial Data Point Data –Points in a multidimensional space E.g., Raster data such as satellite imagery,
Multidimensional Indexes Applications: geographical databases, data cubes. Types of queries: –partial match (give only a subset of the dimensions) –range.
M- tree: an efficient access method for similarity search in metric spaces Reporter : Ximeng Liu Supervisor: Rongxing Lu School of EEE, NTU
CMU SCS : Multimedia Databases and Data Mining Lecture #12: Fractals - case studies Part III (quadtrees, knn queries) C. Faloutsos.
Parallel dynamic batch loading in the M-tree Jakub Lokoč Department of Software Engineering Charles University in Prague, FMP.
NM-Tree: Flexible Approximate Similarity Search in Metric and Non-metric Spaces Tomáš Skopal Jakub Lokoč Charles University in Prague Department of Software.
Bin Yao (Slides made available by Feifei Li) R-tree: Indexing Structure for Data in Multi- dimensional Space.
Euripides G.M. PetrakisIR'2001 Oulu, Sept Indexing Images with Multiple Regions Euripides G.M. Petrakis Dept. of Electronic.
A New Spatial Index Structure for Efficient Query Processing in Location Based Services Speaker: Yihao Jhang Adviser: Yuling Hsueh 2010 IEEE International.
CMU SCS : Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.
1 CSIS 7101: CSIS 7101: Spatial Data (Part 1) The R*-tree : An Efficient and Robust Access Method for Points and Rectangles Rollo Chan Chu Chung Man Mak.
DASFAA 2005, Beijing 1 Nearest Neighbours Search using the PM-tree Tomáš Skopal 1 Jaroslav Pokorný 1 Václav Snášel 2 1 Charles University in Prague Department.
Presenters: Amool Gupta Amit Sharma. MOTIVATION Basic problem that it addresses?(Why) Other techniques to solve same problem and how this one is step.
High-Dimensional Data. Topics Motivation Similarity Measures Index Structures.
Similarity Search without Tears: the OMNI- Family of All-Purpose Access Methods Michael Kelleher Kiyotaka Iwataki The Department of Computer and Information.
Indexing Multidimensional Data
Spatial Data Management
SIMILARITY SEARCH The Metric Space Approach
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
Instance Based Learning
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
Liang Jin (UC Irvine) Nick Koudas (AT&T Labs Research)
C. Faloutsos Spatial Access Methods - R-trees
Presentation transcript:

CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos

CMU SCS Copyright: C. Faloutsos (2013)#2 Must-read material Textbook, Chapter 5 Roberto F. Santos Filho, Agma Traina, Caetano Traina Jr., and Christos Faloutsos: Similarity search without tears: the OMNI family of all-purpose access methods ICDE, Heidelberg, Germany, April (code at Similarity search without tears: the OMNI family of all-purpose access methods

CMU SCS Copyright: C. Faloutsos (2013)#3 Outline Goal: ‘Find similar / interesting things’ Intro to DB Indexing - similarity search Data Mining

CMU SCS Copyright: C. Faloutsos (2013)#4 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing spatial access methods –problem dfn –z-ordering –R-trees –misc fractals text

CMU SCS Copyright: C. Faloutsos (2013)#5 SAMs - Detailed outline spatial access methods –problem dfn –z-ordering –R-trees –misc topics grid files dimensionality curse; dim. reduction metric trees fractals text,...

CMU SCS Copyright: C. Faloutsos (2013)#6 Metric trees What if we only have a distance function d(o1, o2)? (Applications?)

CMU SCS Copyright: C. Faloutsos (2013)#7 Metric trees (assumption: d() is a metric: positive; symmetric; triangle inequality) then, we can use some variation of ‘Vantage Point’ trees [Yannilos] many variations (GNAT trees [Brin95], MVP-trees [Ozsoyoglu+]...)

CMU SCS Copyright: C. Faloutsos (2013)#8 Finally: M-trees [Ciaccia, Patella, Zezula, vldb 97] M-trees = ‘ball-trees’: groups in spheres Metric trees

CMU SCS Copyright: C. Faloutsos (2013)#9 Finally: M-trees [Ciaccia, Patella, Zezula, vldb 97] M-trees = ‘ball-trees’: Minimum Bounding spheres Metric trees

CMU SCS Copyright: C. Faloutsos (2013)#10 Search (range and k-nn): like R-trees Split? Metric trees

CMU SCS Copyright: C. Faloutsos (2013)#11 Search (range and k-nn): like R-trees Split? Several criteria: –minimize max radius (or sum radii) –(even: random!) Algorithm? Metric trees

CMU SCS Copyright: C. Faloutsos (2013)#12 Search (range and k-nn): like R-trees Split? Several criteria: –minimize max radius (or sum radii) –(even: random!) Algorithm? eg., similar to the quadratic split of Guttman Metric trees

CMU SCS Copyright: C. Faloutsos (2013)#13 OMNI tree [Filho+, ICDE2001] Metric trees - variations

CMU SCS Copyright: C. Faloutsos (2013)#14 How to turn objects into vectors? (assume that distance computations are expensive; we need to answer range/nn queries quickly) Metric trees - OMNI trees

CMU SCS Copyright: C. Faloutsos (2013)#15 How to turn objects into vectors? A: pick n ‘anchor’ objects; record the distance of each object from them -> n-d vector Metric trees - OMNI trees

CMU SCS Copyright: C. Faloutsos (2013)#16 How to turn objects into vectors? A: pick n ‘anchor’ objects; record the distance of each object from them -> n-d vector Metric trees - OMNI trees

CMU SCS Copyright: C. Faloutsos (2013)#17 How to turn objects into vectors? A: pick n ‘anchor’ objects; record the distance of each object from them -> n-d vector Metric trees - OMNI trees

CMU SCS Copyright: C. Faloutsos (2013)#18 we could put OMNI coordinates in R-tree (or other SAM, or even do seq. scan) and still answer range and nn queries! (see [Filho’01] for details) Metric trees - OMNI trees r anchor1 anchor2 d1 d2

CMU SCS Copyright: C. Faloutsos (2013)#19 Result: faster than M-trees and seq. scanning (especially if distance computations are expensive) Metric trees - OMNI trees r anchor1 anchor2 d1 d2

CMU SCS Copyright: C. Faloutsos (2013)#20 Q1: how to choose anchors? Q2:... and how many? Metric trees - OMNI trees

CMU SCS Copyright: C. Faloutsos (2013)#21 Conclusions for SAMs z-ordering and R-trees for low-d points and regions M-trees & variants for metric datasets beware of the ‘dimensionality curse’

CMU SCS Copyright: C. Faloutsos (2013)#22 References Aurenhammer, F. (Sept. 1991). “Voronoi Diagrams - A Survey of a Fundamental Geometric Data Structure.” ACM Computing Surveys 23(3): Bentley, J. L., B. W. Weide, et al. (Dec. 1980). “Optimal Expected-Time Algorithms for Closest Point Problems.” ACM Trans. on Mathematical Software (TOMS) 6(4): Burkhard, W. A. and R. M. Keller (Apr. 1973). “Some Approaches to Best-Match File Searching.” Comm. of the ACM (CACM) 16(4):

CMU SCS Copyright: C. Faloutsos (2013)#23 References Christian Böhm, Stefan Berchtold, Daniel A. Keim: Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Comput. Surv. 33(3): (2001) Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases Edgar Chávez, Gonzalo Navarro, Ricardo A. Baeza-Yates, José L. Marroquín: Searching in metric spaces. ACM Comput. Surv. 33(3): (2001)Searching in metric spaces

CMU SCS Copyright: C. Faloutsos (2013)#24 References Ciaccia, P., M. Patella, et al. (1997). M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. VLDB. Filho, R. F. S., A. Traina, et al. (2001). Similarity search without tears: the OMNI family of all-purpose access methods. ICDE, Heidelberg, Germany. Friedman, J. H., F. Baskett, et al. (Oct. 1975). “An Algorithm for Finding Nearest Neighbors.” IEEE Trans. on Computers (TOC) C-24:

CMU SCS Copyright: C. Faloutsos (2013)#25 References Fukunaga, K. and P. M. Narendra (July 1975). “A Branch and Bound Algorithm for Computing k-Nearest Neighbors.” IEEE Trans. on Computers (TOC) C-24(7): Shapiro, M. (May 1977). “The Choice of Reference Points in Best-Match File Searching.” Comm. of the ACM (CACM) 20(5):

CMU SCS Copyright: C. Faloutsos (2013)#26 References Shasha, D. and T.-L. Wang (Apr. 1990). “New Techniques for Best-Match Retrieval.” ACM TOIS 8(2): Traina, C., A. J. M. Traina, et al. (2000). Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes. EDBT, Konstanz, Germany.