CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #12: Fractals - case studies Part III (quadtrees, knn queries) C. Faloutsos.

Slides:



Advertisements
Similar presentations
CMU SCS : Multimedia Databases and Data Mining Lecture #17: Text - part IV (LSI) C. Faloutsos.
Advertisements

CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - IV Grid files, dim. curse C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #4: Multi-key and Spatial Access Methods - I C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #19: SVD - part II (case studies) C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #10: Fractals - case studies - I C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture#5: Multi-key and Spatial Access Methods - II C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU
        iDistance -- Indexing the Distance An Efficient Approach to KNN Indexing C. Yu, B. C. Ooi, K.-L. Tan, H.V. Jagadish. Indexing the distance:
CMU SCS : Multimedia Databases and Data Mining Lecture#3: Primary key indexing – hashing C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #25: Multimedia indexing C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #9: Fractals - introduction C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #6: Spatial Access Methods Part III: R-trees C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
15-826: Multimedia Databases and Data Mining
ADBIS 2003 Revisiting M-tree Building Principles Tomáš Skopal 1, Jaroslav Pokorný 2, Michal Krátký 1, Václav Snášel 1 1 Department of Computer Science.
CMU SCS : Multimedia Databases and Data Mining Lecture #16: Text - part III: Vector space model and clustering C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals - case studies Part III (regions, quadtrees, knn queries) C. Faloutsos.
Spatio-Temporal Databases
CMU SCS Data Mining Meets Systems: Tools and Case Studies Christos Faloutsos SCS CMU.
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
Indexing and Data Mining in Multimedia Databases Christos Faloutsos CMU
Spatio-Temporal Databases. Introduction Spatiotemporal Databases: manage spatial data whose geometry changes over time Geometry: position and/or extent.
R-tree Analysis. R-trees - performance analysis How many disk (=node) accesses we’ll need for range nn spatial joins why does it matter?
10-603/15-826A: Multimedia Databases and Data Mining SVD - part I (definitions) C. Faloutsos.
Carnegie Mellon Powerful Tools for Data Mining Fractals, Power laws, SVD C. Faloutsos Carnegie Mellon University.
R-tree Analysis. R-trees - performance analysis How many disk (=node) accesses we’ll need for range nn spatial joins why does it matter?
Spatial Indexing. Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries.
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #6: Spatial Access Methods Part III: R-trees C. Faloutsos.
Introduction to Fractals and Fractal Dimension Christos Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #10: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
School of Computer Science Carnegie Mellon UIUC 04C. Faloutsos1 Advanced Data Mining Tools: Fractals and Power Laws for Graphs, Streams and Traditional.
CMU SCS : Multimedia Databases and Data Mining Lecture #9: Fractals – examples & algo’s C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #26: Compression - JPEG, MPEG, fractal C. Faloutsos.
Indexing for Multidimensional Data An Introduction.
Multidimensional Indexes Applications: geographical databases, data cubes. Types of queries: –partial match (give only a subset of the dimensions) –range.
Carnegie Mellon Carnegie Mellon Univ. Dept. of Computer Science Database Applications C. Faloutsos Spatial Access Methods - z-ordering.
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Data Mining - assoc. rules C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.
R-trees: An Average Case Analysis. R-trees - performance analysis How many disk (=node) accesses we ’ ll need for range nn spatial joins why does it matter?
Spatio-Temporal Databases
DASFAA 2005, Beijing 1 Nearest Neighbours Search using the PM-tree Tomáš Skopal 1 Jaroslav Pokorný 1 Václav Snášel 2 1 Charles University in Prague Department.
The Power-Method: A Comprehensive Estimation Technique for Multi- Dimensional Queries Yufei Tao U. Hong Kong Christos Faloutsos CMU Dimitris Papadias Hong.
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.
Carnegie Mellon Data Mining – Research Directions C. Faloutsos CMU
Similarity Search without Tears: the OMNI- Family of All-Purpose Access Methods Michael Kelleher Kiyotaka Iwataki The Department of Computer and Information.
Strategies for Spatial Joins
15-826: Multimedia Databases and Data Mining
Next Generation Data Mining Tools: SVD and Fractals
Spatial Indexing.
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
R-trees: An Average Case Analysis
15-826: Multimedia Databases and Data Mining
Presentation transcript:

CMU SCS : Multimedia Databases and Data Mining Lecture #12: Fractals - case studies Part III (quadtrees, knn queries) C. Faloutsos

CMU SCS Copyright: C. Faloutsos (2014)2 Must-read Material Alberto Belussi and Christos Faloutsos, Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension Proc. of VLDB, p , 1995 Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension

CMU SCS Copyright: C. Faloutsos (2014)3 Optional Material Optional, but very useful: Manfred Schroeder Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise W.H. Freeman and Company, 1991

CMU SCS Copyright: C. Faloutsos (2014)4 Outline Goal: ‘Find similar / interesting things’ Intro to DB Indexing - similarity search Data Mining Optional

CMU SCS Copyright: C. Faloutsos (2014)5 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing spatial access methods –z-ordering –R-trees –misc fractals –intro –applications text... Optional

CMU SCS Copyright: C. Faloutsos (2014)6 Indexing - Detailed outline fractals –intro –applications disk accesses for R-trees (range queries) dimensionality reduction dim. curse revisited quad-tree analysis [Gaede+] nn queries [Belussi+] Optional ✔ ✔ ✔

CMU SCS Copyright: C. Faloutsos (2014)7 Fractals and Quadtrees Problem: how many quadtree nodes will we need, to store a region in some level of approximation? [Gaede+96] Optional Faloutsos, C. and V. Gaede (Sept. 1996). Analysis of the z- ordering Method Using the Hausdorff Fractal Dimension. VLDB, Bombay, India.

CMU SCS Copyright: C. Faloutsos (2014)8 Fractals and Quadtrees I.e.: Optional

CMU SCS Copyright: C. Faloutsos (2014)9 Fractals and Quadtrees I.e.: level of quadtree # of quadtree ‘blocks’ (= # gray nodes) ? Optional

CMU SCS Copyright: C. Faloutsos (2014)10 Fractals and Quadtrees Datasets: Franconia Brain Atlas Optional

CMU SCS Copyright: C. Faloutsos (2014)11 Fractals and Quadtrees Hint: –assume that the boundary is self-similar, with a given fd –how will the quad-tree (oct-tree) look like? Optional

CMU SCS Copyright: C. Faloutsos (2014)12 Fractals and Quadtrees white black gray Optional

CMU SCS Copyright: C. Faloutsos (2014)13 Fractals and Quadtrees Let p g (i) the prob. to find a gray node at level i. If self-similar, what can we say for p g (i) ? Optional

CMU SCS Copyright: C. Faloutsos (2014)14 Fractals and Quadtrees Let p g (i) the prob. to find a gray node at level i. If self-similar, what can we say for p g (i) ? A: p g (i) = p g = constant Optional

CMU SCS Copyright: C. Faloutsos (2014)15 Fractals and Quadtrees Assume only ‘gray’ and ‘white’ nodes (ie., no volume’) Assume that p g is given - how many gray nodes at level i? Optional

CMU SCS Copyright: C. Faloutsos (2014)16 Fractals and Quadtrees Assume only ‘gray’ and ‘white’ nodes (ie., no volume’) Assume that p g is given - how many gray nodes at level i? A: 1 at level 0; 4*p g (4*p g )* (4*p g )... (4*p g ) i … Optional

CMU SCS Copyright: C. Faloutsos (2014)17 Fractals and Quadtrees I.e.: level of quadtree (‘i’) # of quadtree ‘blocks’ ? (4*p g ) i Optional

CMU SCS Copyright: C. Faloutsos (2014)18 Fractals and Quadtrees I.e.: level of quadtree log(# of quadtree ‘blocks’) ? log[(4*p g ) i ] Optional

CMU SCS Copyright: C. Faloutsos (2014)19 Fractals and Quadtrees Conclusion: Self-similarity leads to easy and accurate estimation level log 2 (#blocks) Optional

CMU SCS Copyright: C. Faloutsos (2014)20 Fractals and Quadtrees Conclusion: Self-similarity leads to easy and accurate estimation level log 2 (#blocks) Optional

CMU SCS Copyright: C. Faloutsos (2014)21 Fractals and Quadtrees Optional

CMU SCS Copyright: C. Faloutsos (2014)22 Fractals and Quadtrees level log(#blocks) Optional

CMU SCS Copyright: C. Faloutsos (2014)23 Fractals and Quadtrees Final observation: relationship between p g and fractal dimension? Optional

CMU SCS Copyright: C. Faloutsos (2014)24 Fractals and Quadtrees Final observation: relationship between p g and fractal dimension? A: very close: (4*p g ) i = # of gray nodes at level i = # of Hausdorff grid-cells of side (1/2) i = r Eventually: D H = 2 + log 2 ( p g ) and, for E-d spaces: D H = E + log 2 ( p g ) Optional

CMU SCS Copyright: C. Faloutsos (2014)25 Fractals and Quadtrees for E-d spaces: D H = E + log 2 ( p g ) Sanity check: - point in 2-d: D H = 0 p g = ?? -line in 2-d: D H = 1 p g = ?? -plane in 2-d: D H =2 p g = ?? -point in 3-d: D H = 0 p g = ?? Optional

CMU SCS Copyright: C. Faloutsos (2014)26 Fractals and Quadtrees for E-d spaces: D H = E + log 2 ( p g ) Sanity check: - point in 2-d: D H = 0 p g = 1/4 - line in 2-d: D H = 1 p g = 1/2 -plane in 2-d: D H =2 p g = 1 -point in 3-d: D H = 0 p g = 1/8 Optional

CMU SCS Copyright: C. Faloutsos (2014)27 Fractals and Quadtrees Final conclusions: self-similarity leads to estimates for # of z- values = # of quadtree/oct-tree blocks close dependence on the Hausdorff fractal dimension of the boundary Optional

CMU SCS Copyright: C. Faloutsos (2014)28 Indexing - Detailed outline fractals –intro –applications disk accesses for R-trees (range queries) dimensionality reduction dim. curse revisited quad-tree analysis [Gaede+] nn queries [Belussi+] Optional ✔ ✔ ✔ ✔

CMU SCS Copyright: C. Faloutsos (2014)29 NN queries Q: in NN queries, what is the effect of the shape of the query region? [Belussi+95] r L inf L2L2 L1L1

CMU SCS Copyright: C. Faloutsos (2014)30 NN queries Q: in NN queries, what is the effect of the shape of the query region? that is, for L2, and self-similar data: log(d) log(#pairs-within(<=d)) r L2L2 D2D2

CMU SCS Copyright: C. Faloutsos (2014)31 NN queries Q: What about L 1, L inf ? log(d) log(#pairs-within(<=d)) r L2L2 D2D2

CMU SCS Copyright: C. Faloutsos (2014)32 NN queries Q: What about L 1, L inf ? A: Same slope, different intercept log(d) log(#pairs-within(<=d)) r L2L2 D2D2

CMU SCS Copyright: C. Faloutsos (2014)33 NN queries Q: What about L 1, L inf ? A: Same slope, different intercept log(d) log(#neighbors)

CMU SCS Copyright: C. Faloutsos (2014)34 NN queries Q: what about the intercept? Ie., what can we say about N 2 and N inf r L2L2 N 2 neighbors r L inf N inf neighbors volume: V 2 volume: V inf SKIP Optional

CMU SCS Copyright: C. Faloutsos (2014)35 NN queries Consider sphere with volume V inf and r’ radius r L2L2 N 2 neighbors r L inf N inf neighbors volume: V 2 volume: V inf r’ SKIP Optional

CMU SCS Copyright: C. Faloutsos (2014)36 NN queries Consider sphere with volume V inf and r’ radius (r/r’)^E = V 2 / V inf (r/r’)^D 2 = N 2 / N 2 ’ N 2 ’ = N inf (since shape does not matter) and finally: SKIP Optional

CMU SCS Copyright: C. Faloutsos (2014)37 NN queries ( N 2 / N inf ) ^ 1/D 2 = (V 2 / V inf ) ^ 1/E SKIP Optional

CMU SCS Copyright: C. Faloutsos (2014)38 NN queries Conclusions: for self-similar datasets Avg # neighbors: grows like (distance)^D 2, regardless of query shape (circle, diamond, square, e.t.c. ) Optional

CMU SCS Copyright: C. Faloutsos (2014)39 Indexing - Detailed outline fractals –intro –applications disk accesses for R-trees (range queries) dimensionality reduction dim. curse revisited quad-tree analysis [Gaede+] nn queries [Belussi+] –Conclusions Optional

CMU SCS Copyright: C. Faloutsos (2014)40 Fractals - overall conclusions self-similar datasets: appear often powerful tools: correlation integral, NCDF, rank-frequency plot intrinsic/fractal dimension helps in –estimations (selectivities, quadtrees, etc) –dim. reduction / dim. curse (later: can help in image compression...)

CMU SCS Copyright: C. Faloutsos (2014)41 References 1. Belussi, A. and C. Faloutsos (Sept. 1995). Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension. Proc. of VLDB, Zurich, Switzerland. 2. Faloutsos, C. and V. Gaede (Sept. 1996). Analysis of the z- ordering Method Using the Hausdorff Fractal Dimension. VLDB, Bombay, India. 3. Proietti, G. and C. Faloutsos (March 23-26, 1999). I/O complexity for range queries on region data stored using an R- tree. International Conference on Data Engineering (ICDE), Sydney, Australia.