
1 A Statistical Approach to Speed Up Ranking/Re-Ranking
Hong-Ming Chen (hc2599@columbia.edu)
Advisor: Professor Shih-Fu Chang

2 Outline
– Flow chart of the overall work
– The idea of using a statistical approach for re-ranking
– By feature-location relationships: O(n^2) time complexity
– By orientation relationships: O(n) time complexity
– The re-ranking accuracy is as good as RANSAC's
– Experimental result evaluation

3 Flow Chart 1 – ranking components construction
Dataset: Ukbench [1]
– Code book: built by hierarchical k-means [1][2]
– Bag-of-Words histograms of the database images
– Query image -> Bag-of-Words histogram of the query image
– Respond with the top-N result
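To make the ranking stage concrete, the sketch below (not the author's code) builds a Bag-of-Words histogram and retrieves the top-N database images. It assumes the SIFT descriptors have already been quantized to visual-word IDs; the use of NumPy and of histogram intersection as the similarity is an illustrative choice, since the actual system may instead use tf-idf weighting over the vocabulary tree.

```python
import numpy as np

def bow_histogram(word_ids, vocab_size):
    """L1-normalized Bag-of-Words histogram from quantized SIFT visual-word IDs."""
    hist = np.bincount(np.asarray(word_ids), minlength=vocab_size).astype(np.float64)
    s = hist.sum()
    return hist / s if s > 0 else hist

def rank_top_n(query_hist, db_hists, n=10):
    """Rank database images by histogram intersection with the query (larger = more similar)."""
    scores = np.minimum(db_hists, query_hist).sum(axis=1)  # intersect each database row with the query
    order = np.argsort(-scores)                            # indices sorted by descending similarity
    return order[:n], scores[order[:n]]
```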

4 Flow Chart 2 – re-ranking components construction
– Respond with the top-N result
– Re-rank by RANSAC [3]
– Re-rank by the proposed statistical approach
– Result evaluation

5 1. Feature Locations Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For an ideal similar image pair:
– Only translation, rotation and scaling
– The ratio of corresponding distance pairs should be constant.
(Figure: matched points P1a, P1b in Image A and their counterparts in Image B, with corresponding distances dist1 and dist2)

6 1. Feature Locations Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For a similar image pair with a view-angle difference:
– Translation, rotation and scaling
– Local geometric distortion and wrong feature-point matches
– The ratio of corresponding distance pairs is nearly constant.
(Figure: matched points P1a, P1b in Image A and their counterparts in Image B, with corresponding distances dist1 and dist2)
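A hedged sketch of the feature-location criterion described above: for every pair of matches, the distance between the two points in image A is divided by the distance between their counterparts in image B, and the mean and variance of the resulting ratios are computed (the mean tracks the relative scale, the variance tracks matching error). Function and variable names are illustrative, not from the author's implementation.

```python
import numpy as np

def distance_ratio_stats(pts_a, pts_b, eps=1e-9):
    """pts_a, pts_b: (n, 2) arrays of matched keypoint locations in image A and image B.
    Returns the mean and variance of the corresponding distance-ratio distribution."""
    pts_a, pts_b = np.asarray(pts_a, float), np.asarray(pts_b, float)
    n, ratios = len(pts_a), []
    for i in range(n):
        for j in range(i + 1, n):                      # all O(n^2) pairs of matches
            da = np.linalg.norm(pts_a[i] - pts_a[j])   # distance between two points in image A
            db = np.linalg.norm(pts_b[i] - pts_b[j])   # distance between their matches in image B
            if db > eps:
                ratios.append(da / db)                 # constant under a pure similarity transform
    if not ratios:
        return float("nan"), float("nan")
    ratios = np.asarray(ratios)
    return ratios.mean(), ratios.var()
```

A small variance suggests the top-N candidate really is a similar image; a large variance suggests geometric inconsistency or wrong matches.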

7 Example
ukbench00000 vs. ukbench00001
Mean = 0.85, Variance = 0.017, total number of match points: 554
Mean: scaling; Variance: matching error (the smaller, the better)

8 1. Feature Locations Relationship
Assumption after observation:
– A similar image pair: a distance-ratio distribution with small variance
– A dissimilar image pair: a distance-ratio distribution with large variance

9 Analysis of feature locations relationship
Relationship between the number of match pairs and the average variance for similar vs. dissimilar image pairs.
Red: dissimilar image pairs; Blue: similar image pairs

10 2. Feature Orientation Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For similar image pairs:
– The rotation degree of P1a -> P1b should be EQUAL to the rotation degree of P2a -> P2b
(Figure: matched points in Image A and Image B)

11 Example
ukbench00000 vs. ukbench00001
The orientation histogram is shifted by about pi/4 (the rotation angle is about 50 degrees).
Distance measured by histogram intersection.
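The slides name two ingredients: the per-match rotation angle and a histogram-intersection distance. The sketch below implements both, assuming the "orientation histogram" is built from per-match orientation differences; this is one plausible reading rather than a confirmed detail of the author's method, and the bin count is an illustrative parameter.

```python
import numpy as np

def orientation_difference_histogram(theta_a, theta_b, bins=36):
    """theta_a, theta_b: orientations (radians) of matched SIFT features in images A and B.
    Histogram of per-match rotation angles; for a similar pair it peaks near the true rotation."""
    diff = np.mod(np.asarray(theta_b) - np.asarray(theta_a), 2 * np.pi)
    hist, _ = np.histogram(diff, bins=bins, range=(0.0, 2 * np.pi))
    hist = hist.astype(np.float64)
    s = hist.sum()
    return hist / s if s > 0 else hist

def histogram_intersection_distance(h1, h2):
    """Distance = 1 - intersection of two L1-normalized histograms (0 means identical)."""
    return 1.0 - np.minimum(h1, h2).sum()
```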

12 2. Feature orientation Relationship Assumption after observation: – A similar image pair: small orientation histogram distance – A dissimilar image pair: large orientation histogram distance

13 Analysis of Feature Orientation Relationship
Relationship between the number of match pairs and the average orientation-intersection difference for similar vs. dissimilar image pairs.
Red: dissimilar image pairs; Blue: similar image pairs

14 Why do I zoom in on the small-match-number portion of the diagrams?

15 Dataset and features discussion
Ukbench dataset analysis:
– 2550 classes, 4 images/class
– Similar image pair combinations: C(4, 2) * 2550 = 15300 pairs
A high percentage of similar image pairs have only a small number of match points (with the default ratio-test value = 0.6).
The re-ranking criterion should therefore perform well especially when only a small number of match points is available.

Match points #  Accumulated #  %        Match points #  Accumulated #  %
0               602            3.9%     6               3278           21.4%
1               1190           7.8%     7               3555           23.2%
2               1733           11.3%    8               3812           24.9%
3               2236           14.6%    9               4046           26.4%
4               2613           17.1%    10              4297           28.1%
5               2982           19.5%    20              5934           38.8%

16 Comparison of the two re-ranking approaches

Similar image pairs:
Match point #  Variance of scaling distribution (mean / var)  Orientation histogram difference (mean / var)
3              32.236 / 90377.651                             0.602 / 0.027
4              79.073 / 1028033.066                           0.604 / 0.029
5              198.830 / 7229360.856                          0.595 / 0.019
10             27.822 / 26219.275                             0.609 / 0.011
overall        18.207 / 235756.236                            0.422 / 0.032

Dissimilar image pairs:
Match point #  Variance of scaling distribution (mean / var)  Orientation histogram difference (mean / var)
3              498.641 / 55598860.926                         0.610 / 0.035
4              772.344 / 266541945.753                        0.641 / 0.030
5              882.084 / 205772324.251                        0.657 / 0.025
10             1937.780 / 303821731.998                       0.685 / 0.024
overall        495.669 / 92963999.421                         0.614 / 0.030

17 Comparison of the two re-ranking approaches (table as on slide 16)
The variance of the scaling-distribution variance is high, even though its mean is quite distinctive.

18 Comparison of the two re-ranking approaches (table as on slide 16)
The variance of the orientation histogram difference is very small (relative to its mean value) and stable.

19 Comparison of the two re-ranking approaches (table as on slide 16)
Overall, the orientation histogram difference can clearly separate similar and dissimilar image pairs, because of the large gap between its mean values and its quite small variance.

20 Comparison of the two re-ranking approaches (table as on slide 16)
When there are more than 5 match points, the orientation histogram difference can roughly separate similar and dissimilar image pairs.

21 Comparison of the two re-ranking approaches (table as on slide 16)
When there are more than 10 match points, the orientation histogram difference can clearly separate similar and dissimilar image pairs.

22 Experimental results discussion
1. The impact of K values (number of cluster centers)

                   K=1000  K=4096  K=10000  K=50625  K=100000
Recall = 1 (33%)   0.722   0.758   0.781    0.818    0.808
Recall = 2 (66%)   0.544   0.585   0.614    0.640    0.645
Recall = 3 (100%)  0.360   0.401   0.431    0.459    0.460

23 Experimental results discussion
2. The impact of looking up the code book by different approaches:
– A. By tracing the vocabulary tree [1]: efficient, but the result is not optimal
– B. By scanning the whole code book: very slow, but guarantees an optimal BoW result with respect to the K centers

                   K=1000 (by tree)  K=1000 (direct)  K=10000 (by tree)  K=10000 (direct)
Recall = 1 (33%)   0.722             0.750            0.781              0.815
Recall = 2 (66%)   0.544             0.575            0.614              0.658
Recall = 3 (100%)  0.360             0.390            0.431              0.470
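A minimal sketch contrasting the two lookup strategies, assuming a hierarchical k-means tree stored as nested nodes, each holding a small centroid array; this is purely illustrative and is not the vocabulary-tree code of [1] or the VLFeat API [2].

```python
import numpy as np

def quantize_by_scan(desc, leaf_centers):
    """Exhaustive scan over all K leaf centers: very slow, but returns the truly nearest word."""
    return int(np.argmin(np.linalg.norm(leaf_centers - desc, axis=1)))

def quantize_by_tree(desc, node):
    """Greedy descent of a hierarchical k-means tree.
    node: {'centers': (branch, 128) array, 'children': list of child nodes or leaf word ids}.
    Cost is O(branching * depth), but the greedy path may miss the globally nearest word."""
    while isinstance(node, dict):
        best = int(np.argmin(np.linalg.norm(node['centers'] - desc, axis=1)))
        node = node['children'][best]
    return node  # a leaf: the visual-word id
```

The greedy descent explains the gap in the table above: it only compares against a few centers per level, so the word it returns is not guaranteed to be the globally nearest one.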

24 K=1000, re-rank depth = 20

                   Ground truth  Rotation  Scale var + rotation  RANSAC  Scale var  Original
Recall = 1 (33%)   0.837         0.782     0.780                 0.773   0.754      0.722
Recall = 2 (66%)   0.664         0.600     0.600                 0.591   0.583      0.544
Recall = 3 (100%)  0.455         0.407     0.404                 0.401   0.398      0.360

25 K=50625, re-rank depth = 20

                   Ground truth  Rotation  Scale var + rotation  RANSAC  Scale var  Original
Recall = 1 (33%)   0.921         0.849     0.846                 0.845   0.813      0.818
Recall = 2 (66%)   0.769         0.688     0.685                 0.675   0.665      0.640
Recall = 3 (100%)  0.557         0.502     0.497                 0.493   0.487      0.459

26 Experimental result -- all (re-rank depth = 20)

Distribution (scale var):
                   K=1000  K=4096  K=10000  K=50625
Recall = 1 (33%)   0.754   0.780   0.799    0.813
Recall = 2 (66%)   0.583   0.617   0.647    0.665
Recall = 3 (100%)  0.398   0.430   0.456    0.487

Rotation:
                   K=1000  K=4096  K=10000  K=50625
Recall = 1 (33%)   0.782   0.810   0.827    0.849
Recall = 2 (66%)   0.600   0.635   0.665    0.688
Recall = 3 (100%)  0.407   0.441   0.469    0.502

Original (before re-ranking):
                   K=1000  K=4096  K=10000  K=50625  K=100000
Recall = 1 (33%)   0.722   0.758   0.781    0.818    0.808
Recall = 2 (66%)   0.544   0.585   0.614    0.640    0.645
Recall = 3 (100%)  0.360   0.401   0.431    0.459    0.460

27
RANSAC:
                   K=1000  K=4096  K=10000  K=50625
Recall = 1 (33%)   0.773   0.803   0.821    0.845
Recall = 2 (66%)   0.591   0.628   0.656    0.675
Recall = 3 (100%)  0.401   0.435   0.463    0.493

Ground truth:
                   K=1000  K=4096  K=10000  K=50625
Recall = 1 (33%)   0.837   0.869   0.900    0.921
Recall = 2 (66%)   0.664   0.704   0.733    0.769
Recall = 3 (100%)  0.455   0.493   0.526    0.557

Dist + Rot (scale var + rotation):
                   K=1000  K=4096  K=10000  K=50625
Recall = 1 (33%)   0.780   0.808   0.826    0.846
Recall = 2 (66%)   0.600   0.633   0.662    0.685
Recall = 3 (100%)  0.404   0.438   0.465    0.497

28 Time Complexity Analysis
RANSAC: O(Kn):
– K: number of random subsets tried
– n: input data size
– No upper bound on the time it takes to compute the parameters
Distribution of the feature-location distance relationship:
– O(n^2): the distribution consists of all pairwise distance relationships
– O(n): when n (the number of match points) is large enough, we can subsample a "reliable enough" number of pairs to form the distribution
Distance of orientation histograms of matched SIFT features:
– O(n): to generate the rotation-angle histograms of matched SIFT features
– Constant time to compute each rotation angle
– Only a small overhead relative to the match-point search
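A hedged sketch of the O(n) subsampling variant mentioned above: instead of enumerating all O(n^2) match pairs, a fixed number of random pairs is drawn to estimate the distance-ratio distribution. The sample count of 200 is an illustrative assumption, not a value from the slides.

```python
import numpy as np

def subsampled_ratio_stats(pts_a, pts_b, num_samples=200, rng=None, eps=1e-9):
    """Estimate the distance-ratio distribution from random pairs of matches: O(num_samples)."""
    rng = np.random.default_rng() if rng is None else rng
    pts_a, pts_b = np.asarray(pts_a, float), np.asarray(pts_b, float)
    n = len(pts_a)
    i = rng.integers(0, n, size=num_samples)
    j = rng.integers(0, n, size=num_samples)
    keep = i != j                                       # drop degenerate pairs
    da = np.linalg.norm(pts_a[i[keep]] - pts_a[j[keep]], axis=1)
    db = np.linalg.norm(pts_b[i[keep]] - pts_b[j[keep]], axis=1)
    valid = db > eps
    if not np.any(valid):
        return float("nan"), float("nan")
    ratios = da[valid] / db[valid]
    return ratios.mean(), ratios.var()
```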

29 Future work
We have:
– 1. Scale information
– 2. Orientation information
– 3. Translation is trivial to find
– A good initial guess for precise homography matrix estimation?
Apply the current approach to quantized SIFT features:
– Using a code word to represent an interest point, rather than the full 128-dimensional vector
– This moves from an exact 1-to-1 mapping to a many-to-many mapping. I have tried to solve this problem, but there are no satisfying results at this stage.

30 Reference
[1] D. Nistér and H. Stewénius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 2161-2168, June 2006.
[2] VLFeat, http://www.vlfeat.org/
[3] Peter Kovesi, Centre for Exploration Targeting, School of Earth and Environment, The University of Western Australia, http://www.csse.uwa.edu.au/~pk/research/matlabfns/
[4] David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60(2), 2004.

