Published by Marianna Houston. Modified over 9 years ago.
1
MIT CSAIL Vision interfaces
Approximate Correspondences in High Dimensions
Kristen Grauman*, Trevor Darrell
MIT CSAIL
(*) UT Austin…
2
Key challenges: robustness
Illumination, object pose, clutter, viewpoint, intra-class appearance, occlusions
3
Key challenges: efficiency
Thousands to millions of pixels in an image
3,000-30,000 human recognizable object categories
Billions of images indexed by Google Image Search
18 billion+ prints produced from digital camera images in 2004
295.5 million camera phones sold in 2005
4
Local representations
Describe component regions or patches separately:
Superpixels [Ren et al.]
Shape context [Belongie et al.]
Maximally Stable Extremal Regions [Matas et al.]
Geometric Blur [Berg et al.]
SIFT [Lowe]
Salient regions [Kadir et al.]
Harris-Affine [Schmid et al.]
Spin images [Johnson and Hebert]
5
How to handle sets of features?
Each instance is an unordered set of vectors
The number of vectors varies from instance to instance
6
Partial matching
Compare sets by computing a partial matching between their features.
7
Pyramid match overview
[Figure label: optimal partial matching]
8
Computing the partial matching
For sets with O(m) features of dimension d:
Optimal matching: O(m^3)
Greedy matching: O(m^2 log m)
Pyramid match: O(dmL)
9
Pyramid match overview
Pyramid match measures similarity of a partial matching between two sets:
Place a multi-dimensional, multi-resolution grid over the point sets
Consider points matched at the finest resolution where they fall into the same grid cell
Approximate the optimal similarity with the worst-case similarity within the pyramid cell
No explicit search for matches!
10
Pyramid match [Grauman and Darrell, ICCV 2005]
Approximate partial match similarity: a sum over levels i of
(measure of difficulty of a match at level i) × (number of newly matched pairs at level i)
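The slide's equation did not survive extraction. As a reference point, the weighted sum it describes can be written as below, following the cited ICCV 2005 paper. The symbols are introduced here for illustration: H_i(X) is the level-i histogram of set X, \mathcal{I} is histogram intersection, and the weight 1/2^i assumes bins whose side length doubles at each level. The intersection at level -1 is taken to be zero.

```latex
K_\Delta\bigl(\Psi(X), \Psi(Y)\bigr) = \sum_{i=0}^{L} w_i \, N_i,
\qquad w_i = \frac{1}{2^{i}},
\qquad N_i = \mathcal{I}\bigl(H_i(X), H_i(Y)\bigr) - \mathcal{I}\bigl(H_{i-1}(X), H_{i-1}(Y)\bigr)
```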
11
Pyramid extraction
Histogram pyramid: level i has bins of side length 2^i
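A minimal sketch of pyramid extraction, assuming non-negative coordinates below a known `diameter` and bins of side length 2^i at level i. The function name `build_pyramid` and the sparse `Counter` representation are illustrative choices, not taken from the slides, and the random grid shifts used in the published method are left out.

```python
from collections import Counter
from math import ceil, log2


def build_pyramid(points, diameter):
    """Return one sparse histogram per level i = 0..L.

    Level i uses axis-aligned bins of side length 2**i. Only non-empty
    bins are stored, keyed by their integer grid coordinates.
    Assumes every coordinate lies in [0, diameter).
    """
    num_levels = ceil(log2(diameter)) + 1  # coarsest bin spans the full range
    pyramid = []
    for level in range(num_levels):
        side = 2 ** level
        histogram = Counter(
            tuple(int(coord) // side for coord in point) for point in points
        )
        pyramid.append(histogram)
    return pyramid


pyramid = build_pyramid([(0, 0), (1, 1), (5, 6)], diameter=8)
```

Storing only occupied bins keeps each histogram's size bounded by the number of points in the set, which matters once the feature dimension grows.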
12
Counting matches
Histogram intersection
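Histogram intersection sums the bin-wise minimum of two histograms, which counts how many points can be paired within shared bins. A small sketch, assuming sparse histograms stored as mappings from bin index to count (the name `histogram_intersection` is illustrative):

```python
from collections import Counter


def histogram_intersection(hist_a, hist_b):
    """Sum of bin-wise minimum counts over bins occupied in both histograms."""
    return sum(
        min(count, hist_b[bin_id])
        for bin_id, count in hist_a.items()
        if bin_id in hist_b
    )


a = Counter({(0,): 3, (1,): 1})
b = Counter({(0,): 2, (2,): 5})
overlap = histogram_intersection(a, b)  # only bin (0,) is shared: min(3, 2)
```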
13
Example pyramid match
14
Example pyramid match
15
Example pyramid match
16
Example pyramid match
[Figure labels: pyramid match, optimal match]
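The figures for these example slides are not available in the transcript, so here is a small 1-D illustration instead. It is a sketch under stated assumptions: bins of side 2^i at level i, weight 1/2^i per newly matched pair, non-negative coordinates below `diameter`, and no random grid shifts. The name `pyramid_match` and the sample points are made up for the example.

```python
from collections import Counter
from math import ceil, log2


def pyramid_match(xs, ys, diameter):
    """Weighted count of new matches found at each pyramid level."""
    score = 0.0
    previous_matches = 0
    for level in range(ceil(log2(diameter)) + 1):
        side = 2 ** level
        hist_x = Counter(tuple(int(c) // side for c in p) for p in xs)
        hist_y = Counter(tuple(int(c) // side for c in p) for p in ys)
        # Histogram intersection: points that share a bin at this level.
        matches = sum(min(n, hist_y[b]) for b, n in hist_x.items())
        # Only pairs not already counted at a finer level are new.
        score += (matches - previous_matches) / side
        previous_matches = matches
    return score


xs = [(0,), (3,), (6,)]
ys = [(1,), (3,)]
score = pyramid_match(xs, ys, diameter=8)
# Level 0 (side 1): the two points at 3 match, weight 1.
# Level 1 (side 2): 0 and 1 now share a bin, one new match, weight 1/2.
# Levels 2 and 3 add nothing, since set Y has no unmatched points left.
# score == 1.5
```

The point at 6 stays unmatched, which is what makes this a partial matching: the smaller set determines how many pairs can form.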
17
Approximating the optimal partial matching
Randomly generated, uniformly distributed point sets with m = 5 to 100, d = 2
18
PM preserves rank…
19
and is robust to clutter…
20
Learning with the pyramid match
Kernel-based methods:
  Embed data into a Euclidean space via a similarity function (kernel), then seek linear relationships among the embedded data
  Efficient, with good generalization
  Include classification, regression, clustering, dimensionality reduction,…
Pyramid match forms a Mercer kernel
21
Category recognition results
ETH-80 data set
[Table: complexity of each kernel: pyramid match, match kernel [Wallraven et al.]]
[Plots: time (s) and accuracy against mean number of features]
22
Category recognition results
Pyramid match kernel over spatial features with quantized appearance
[Figure: results by time of publication (2004, 6/05, 12/05, 3/06, 6/06); annotations: 0.002 s / match, 5 s / match]
23
Vocabulary-guided pyramid match
But a rectangular histogram may scale poorly with input dimension…
Build a data-dependent histogram structure…
New vocabulary-guided PM [NIPS 06]:
Hierarchical k-means over the training set
Irregular cells; record the diameter of each bin
VG pyramid structure takes O(k^L) storage; stored once
Individual histograms are still stored sparsely
24
Vocabulary-guided pyramid match
Tune pyramid partitions to the feature distribution
Accurate for d > 100
Requires an initial corpus of features to determine the pyramid structure
Small cost increase over uniform bins: kL distances against bin centers to insert points
[Figure panels: uniform bins, vocabulary-guided bins]
25
Vocabulary-guided pyramid match
n_ij(X): count for histogram X, level i, cell j
w_ij: weight for histogram X, level i, cell j, chosen as either
  (1) ≈ diameter of the cell, or
  (2) ≈ d_ij(X) + d_ij(Y), where d_ij(H) = max distance of H's points in cell (i, j) to the cell center
c_h(n): child h of node n, e.g. c_2(n_11)
Score: sum over cells of w_ij × (# matches in cell j at level i − # matches in its children), i.e. weight × # new matches at level i
Mercer kernel; upper bound
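The weighted sum over cells can be sketched as follows. This is only an illustration under assumptions: the tree is given as a `children` mapping, the per-node counts and weights are supplied by the caller, and the names `vg_pyramid_match`, `counts_x`, `counts_y`, `weights`, and `children` are hypothetical. How the weights are derived from bin diameters or point distances is left outside the sketch.

```python
def vg_pyramid_match(counts_x, counts_y, weights, children):
    """Sum over nodes of weight * (matches in node - matches in its children)."""
    score = 0.0
    for node, weight in weights.items():
        # Matches available in this cell: bin-wise minimum of the two counts.
        here = min(counts_x.get(node, 0), counts_y.get(node, 0))
        # Matches already formed inside the cell's child bins.
        below = sum(
            min(counts_x.get(child, 0), counts_y.get(child, 0))
            for child in children.get(node, ())
        )
        score += weight * (here - below)
    return score


# Toy two-level tree: a root cell with two child cells "a" and "b".
children = {"root": ["a", "b"]}
weights = {"root": 0.25, "a": 1.0, "b": 1.0}
counts_x = {"root": 2, "a": 2}
counts_y = {"root": 2, "b": 2}
score = vg_pyramid_match(counts_x, counts_y, weights, children)
# No child cell is shared, so both matches are new at the root: 0.25 * 2
```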
26
Results: Evaluation criteria
Quality of match scores: How similar are the rankings produced by the approximate measure to those produced by the optimal measure?
Quality of correspondences: How similar is the approximate correspondence field to the optimal one?
Object recognition accuracy: Used as a match kernel over feature sets, what is the recognition output?
27
Match score quality
ETH-80 images, sets of SIFT features
[Plot panels: uniform bin pyramid match and vocabulary-guided pyramid match, at d=8 and d=128]
Dense SIFT (d=128); k=10, L=5 for VG PM; PCA for low-dim feats
28
Match score quality
ETH-80 images, sets of SIFT features
29
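Spearman correlation
Correlation coefficient that measures how well two ordinal rankings agree:
R = 1 − 6 Σ D_i^2 / (N (N^2 − 1))
where D_i is the difference between the rank value of item i in the true ordering and the corresponding rank assigned by the approximate ordering, and N is the number of ranked items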
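For concreteness, the standard Spearman rank correlation can be computed as below. This sketch assumes both inputs are rank lists without ties; the function name is illustrative.

```python
def spearman_rank_correlation(true_ranks, approx_ranks):
    """Spearman coefficient for two tie-free rankings of the same items."""
    n = len(true_ranks)
    if n != len(approx_ranks) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 items")
    # Sum of squared differences between corresponding ranks.
    squared_diffs = sum((t - a) ** 2 for t, a in zip(true_ranks, approx_ranks))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


same = spearman_rank_correlation([1, 2, 3], [1, 2, 3])      # identical order
reversed_ = spearman_rank_correlation([1, 2, 3], [3, 2, 1])  # opposite order
```

A value of 1 means the approximate measure ranks items exactly as the optimal measure does, and -1 means the order is fully reversed.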
30
Bin structure and match counts
Data-dependent bins allow more gradual distance ranges
[Plot panels: d=3, d=8, d=13, d=68, d=113, d=128]
31
Approximate correspondences
Use pyramid intersections to compute smaller explicit matchings.
32
Approximate correspondences
Use pyramid intersections to compute smaller explicit matchings.
[Figure labels: optimal per bin, random per bin]
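One way to read the "random per bin" variant is sketched below: walk the pyramid from fine to coarse and, inside each bin, pair up the points that are still unmatched in random order. This is an assumption-laden illustration, not the authors' implementation. It uses uniform bins of side 2^i and non-negative coordinates below `diameter`; the name `approximate_correspondences` and the `seed` parameter are hypothetical. An "optimal per bin" variant would replace the shuffle with an optimal assignment restricted to the points in each bin.

```python
import random
from collections import defaultdict
from math import ceil, log2


def approximate_correspondences(xs, ys, diameter, seed=0):
    """Return (index_in_xs, index_in_ys, level) triples, finest level first."""
    rng = random.Random(seed)
    free_x = set(range(len(xs)))
    free_y = set(range(len(ys)))
    pairs = []
    for level in range(ceil(log2(diameter)) + 1):
        side = 2 ** level
        # Group the still-unmatched points of each set by bin at this level.
        bins = defaultdict(lambda: ([], []))
        for ix in sorted(free_x):
            bins[tuple(int(c) // side for c in xs[ix])][0].append(ix)
        for iy in sorted(free_y):
            bins[tuple(int(c) // side for c in ys[iy])][1].append(iy)
        for in_x, in_y in bins.values():
            rng.shuffle(in_x)
            rng.shuffle(in_y)
            # zip stops at the shorter list, so min(|in_x|, |in_y|) pairs form.
            for ix, iy in zip(in_x, in_y):
                pairs.append((ix, iy, level))
                free_x.discard(ix)
                free_y.discard(iy)
    return pairs


pairs = approximate_correspondences([(0,), (3,), (6,)], [(1,), (3,)], diameter=8)
```

Each per-bin matching involves only the few points that share that bin, which is why the explicit matchings stay small compared with matching the full sets.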
33
Correspondence examples
34
Approximate correspondences
ETH-80 images, sets of SIFT descriptors
35
Approximate correspondences
ETH-80 images, sets of SIFT descriptors
36
Impact on recognition accuracy
VG-PMK as kernel for SVM
Caltech-4 data set
SIFT descriptors extracted at Harris and MSER interest points
37
Sets of features elsewhere
Diseases as sets of gene expressions
Documents as bags of words
Methods as sets of instructions