Selectivity Estimation for Optimizing Similarity Query in Multimedia Databases IDEAL 2003 Paper review.

1 Selectivity Estimation for Optimizing Similarity Query in Multimedia Databases IDEAL 2003 Paper review

2 Query optimization in traditional database
- Query: find the employees who are aged 30-40 and work for the Engineering Faculty
- Running time of different execution plans depends on
  - Number of employees aged 30-40
  - Number of employees who work for the Engineering Faculty
- Task: estimate these numbers in advance and select the best execution plan (selectivity estimation)
- Statistics are stored in the database (metadata)

3 Techniques: one dimension
- Parametric – unrealistic
- Curve fitting – negative value problem
- Sampling – large overhead
- Non-parametric (histogram technique) – widely used
[Figure: histogram over age, axis ticks from 10 to 50 in steps of 5]

4 Problem in multimedia database
- (Color = ‘red’) ^ (Shape = ‘round’)
- Color, shape – feature vectors
- Multi-dimensional
- Number of buckets increases exponentially with dimension
  - 1d – 5
  - 2d – 25
  - 3d – 125
  - 4d – 625
- Histogram technique fails

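5 Previous Work – SIGMOD 99
- Use DCT to compress information of histogram
- 2D example
- Store DCT coefficients

Histogram values:
10 15 13
14 20 16
 9 13 11

DCT coefficients:
40.33 -2.86 -5.42
 2.04 -0.50 -0.29
-6.84 -0.29  1.167

[A second histogram value / DCT coefficient pair appears on the slide; its cell boundaries are not recoverable from the transcript: 0000-2 200 / -0.33 1.63 0.47 -1.23 0.00 1.18 -0.58 1.33]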
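The slide does not say which DCT normalization is used. The 3x3 coefficients shown are consistent with the orthonormal two-dimensional DCT-II, so the following definition is an inference from the numbers rather than something stated in the deck. Here h(x, y) is the histogram value in bucket (x, y) and N is the number of buckets per dimension.

```latex
C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1}
         h(x,y)\,
         \cos\frac{(2x+1)\,u\,\pi}{2N}\,
         \cos\frac{(2y+1)\,v\,\pi}{2N},
\qquad
\alpha(k) =
\begin{cases}
  \sqrt{1/N} & k = 0 \\
  \sqrt{2/N} & k > 0
\end{cases}
```

As a check against the example, the nine histogram values sum to 121, so C(0,0) = 121 / 3 ≈ 40.33, which matches the first coefficient.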

6 Reconstruction of histogram value

Histogram values:
10 12 30 15 15
24 36 81 10 30
 9 40 42 23 20
18 13 35 60 70
10 15 34 43 60

DCT coefficients (after DCT):
151.0000 -39.1747 -25.2604 -11.2001  24.9442
-24.0137  42.2456 -24.0044  -8.9490  16.2098
-15.0187 -14.2921  15.5469   9.8779   0.0979
 -9.2651 -19.4490  19.4228  16.7544 -20.1256
-27.0396  12.9394  -4.5979  -4.8360 -17.5469

Retained coefficients (after zone sampling):
151.0000 -39.1747 -25.2604 -11.2001  24.9442
-24.0137  42.2456 -24.0044  -8.9490   0
-15.0187 -14.2921  15.5469   0        0
 -9.2651 -19.4490   0        0        0
-27.0396   0        0        0        0

Reconstructed histogram values (after IDCT):
 1.6184 17.9451 34.7007 10.3893 17.3465
31.0059 44.7644 57.9655 25.3113 21.9529
11.2059 25.0820 47.2188 25.3614 25.1319
12.6449 19.9000 49.6779 49.1996 64.5775
14.5248  8.3085 32.4371 40.7383 65.9913
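The DCT, zone sampling, and IDCT pipeline on this slide can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the orthonormal DCT-II (which reproduces the slide's coefficients) and a triangular zone that keeps coefficient (u, v) when u + v <= 4, as the pattern of zeros on the slide suggests. The function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds alpha(k) * cos((2x+1) k pi / 2n)."""
    rows = []
    for k in range(n):
        alpha = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([alpha * math.cos((2 * x + 1) * k * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(h):
    """2D DCT of a square matrix: C = T H T^T."""
    t = dct_matrix(len(h))
    return matmul(matmul(t, h), transpose(t))

def idct2(c):
    """Inverse 2D DCT: H = T^T C T (T is orthogonal)."""
    t = dct_matrix(len(c))
    return matmul(matmul(transpose(t), c), t)

def zone_sample(c, max_sum):
    """Keep coefficient (u, v) only when u + v <= max_sum; zero the rest."""
    n = len(c)
    return [[c[u][v] if u + v <= max_sum else 0.0 for v in range(n)]
            for u in range(n)]

# Histogram values from the slide.
hist = [
    [10, 12, 30, 15, 15],
    [24, 36, 81, 10, 30],
    [9, 40, 42, 23, 20],
    [18, 13, 35, 60, 70],
    [10, 15, 34, 43, 60],
]

coeffs = dct2(hist)             # 25 coefficients
kept = zone_sample(coeffs, 4)   # 15 of the 25 coefficients survive
recon = idct2(kept)             # approximate histogram

for row in recon:
    print(" ".join(f"{value:8.4f}" for value in row))
```

Only the 15 retained coefficients need to be stored, and because the whole first row and first column of coefficients are kept, the row and column totals of the reconstructed histogram equal those of the original.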

7 Selectivity estimation

25  9 13  2 10
23  6 19 14 10
28 10  3 17  8
22 26 13 14 21
30 16  2 20 19
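To make the idea concrete, here is a small sketch of estimating the result size of a rectangular range query from a bucket grid like the one above. It works on the raw bucket counts and assumes values are spread uniformly inside each bucket and that buckets have unit width; both are assumptions made for illustration, and the SIGMOD 99 approach estimates from the stored DCT coefficients instead. The names `GRID` and `estimate_count` are hypothetical.

```python
# Bucket counts from the slide; bucket (i, j) is assumed to cover
# the unit square [i, i+1) x [j, j+1).
GRID = [
    [25, 9, 13, 2, 10],
    [23, 6, 19, 14, 10],
    [28, 10, 3, 17, 8],
    [22, 26, 13, 14, 21],
    [30, 16, 2, 20, 19],
]

def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def estimate_count(hist, lo, hi):
    """Estimated number of tuples inside the rectangle lo..hi.

    Each bucket contributes its count times the fraction of the bucket
    covered by the query (uniform-within-bucket assumption).
    """
    total = 0.0
    for i, row in enumerate(hist):
        wi = overlap(i, i + 1, lo[0], hi[0])
        if wi == 0.0:
            continue
        for j, count in enumerate(row):
            total += count * wi * overlap(j, j + 1, lo[1], hi[1])
    return total

total_tuples = sum(map(sum, GRID))                    # 380
aligned = estimate_count(GRID, (1, 1), (3, 3))        # four whole buckets
partial = estimate_count(GRID, (0.5, 0), (1.5, 1))    # two half buckets
print(aligned, aligned / total_tuples)                # 38.0 0.1
print(partial)                                        # 24.0
```

The ratio of the estimated count to the total number of tuples is the selectivity that the optimizer would compare across candidate plans.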

8 Current Work - IDEAL 2003
- Extend the range query from hyper-cube to hyper-sphere
- Model the hyper-sphere as a combination of hyper-cubes
- Task
  - Find a combination of hyper-cubes to represent the hyper-sphere
  - Find the area of overlap

9 Generate combination of hyper-cubes

10-15 [Figure-only slides; no transcript text]

16 Overlap of hyper-cube with hyper-sphere
- Monte-Carlo method
  - Generate uniformly distributed random points inside the hyper-cube
  - Count the number of points within the hyper-sphere
  - Use the ratio to estimate the area of overlap
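The Monte-Carlo steps above can be sketched as follows. This is a minimal illustration under assumed conventions: the hyper-cube is an axis-aligned box given by its lower and upper corners, and the function name, sample count, and seed are hypothetical choices, not values from the paper.

```python
import random

def overlap_fraction(cube_lo, cube_hi, center, radius, samples=20000, seed=0):
    """Estimate the fraction of a box that lies inside a sphere.

    Draws uniform random points in the box and counts how many fall
    within `radius` of `center`.
    """
    rng = random.Random(seed)
    r2 = radius * radius
    hits = 0
    for _ in range(samples):
        d2 = 0.0
        for lo, hi, c in zip(cube_lo, cube_hi, center):
            x = rng.uniform(lo, hi)
            d2 += (x - c) ** 2
        if d2 <= r2:
            hits += 1
    return hits / samples

def box_volume(cube_lo, cube_hi):
    """Volume (area in 2D) of the axis-aligned box."""
    volume = 1.0
    for lo, hi in zip(cube_lo, cube_hi):
        volume *= hi - lo
    return volume

# Unit square against a unit circle centred on one corner:
# the exact overlap is a quarter disc, pi / 4.
lo, hi = (0.0, 0.0), (1.0, 1.0)
fraction = overlap_fraction(lo, hi, center=(0.0, 0.0), radius=1.0)
area = fraction * box_volume(lo, hi)
print(f"estimated overlap area: {area:.3f}")
```

The standard error of the estimate shrinks with the square root of the number of samples, so the sample count trades accuracy against estimation cost.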

17 Generate uniformly distributed points inside a hyper-sphere
- Accept / Reject method
  - Generate points within the hyper-cube
  - Accept those that fall within the hyper-sphere
- Greedy method
  - Generate θ uniformly in [0, 2π]
  - Generate r according to F^-1(U(0,1))
[Figure: a point at polar coordinates (r, θ)]
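Both sampling methods can be illustrated in two dimensions. The slide does not define F; the sketch below assumes it is the cumulative distribution of the radius of a uniformly distributed point in a disc of radius R, F(r) = (r / R)^2, so that F^-1(u) = R * sqrt(u). The function names are hypothetical.

```python
import math
import random

def sample_disc_reject(radius, rng):
    """Accept / reject: draw from the bounding square until a point lands in the disc."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            return x, y

def sample_disc_polar(radius, rng):
    """Direct method: uniform angle, radius drawn through the inverse CDF.

    Assumes F(r) = (r / radius)^2, hence r = radius * sqrt(U).
    Using r = radius * U instead would crowd points near the centre.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    r = radius * math.sqrt(rng.random())
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(0)
reject_points = [sample_disc_reject(1.0, rng) for _ in range(20000)]
polar_points = [sample_disc_polar(1.0, rng) for _ in range(20000)]

def inner_fraction(points, radius):
    """Share of points within half the radius; about 0.25 for a uniform disc."""
    return sum(1 for x, y in points if math.hypot(x, y) <= radius / 2) / len(points)

print(inner_fraction(reject_points, 1.0), inner_fraction(polar_points, 1.0))
```

Accept / reject wastes more samples as the dimension grows, because the sphere fills a shrinking share of its bounding cube (π/4 in 2D, π/6 in 3D), which is one reason to prefer a direct method.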

18 Experiment

