# Ranking Metrics and Evaluation Measures Jie Yu, Qi Tian, Nicu Sebe Presented by Jie Yu.

## Presentation on theme: "Ranking Metrics and Evaluation Measures Jie Yu, Qi Tian, Nicu Sebe Presented by Jie Yu."— Presentation transcript:

Ranking Metrics and Evaluation Measures Jie Yu, Qi Tian, Nicu Sebe Presented by Jie Yu

Outline Introduction Introduction Distance Metric Analysis Distance Metric Analysis Boosting Distance Metrics Boosting Distance Metrics Experiments and Analysis Experiments and Analysis Conclusions and Discussions Conclusions and Discussions

Introduction Similarity Evaluation: Similarity Evaluation: An organizing process in which individuals classify objects, form concepts, and make generalizations In automated programs, similarity evaluation can be achieved by ranking the distances between objects. The similarity between two vectors is often determined by computing the distance between them using certain distance metric. The similarity between two vectors is often determined by computing the distance between them using certain distance metric. Euclidean Distance (Sum of Squared Distance, L 2 )Euclidean Distance (Sum of Squared Distance, L 2 ) Manhattan Distance (Sum of Absolute Distance - L 1 )Manhattan Distance (Sum of Absolute Distance - L 1 ) Which metric to use and why? Is there any better metric?

Distance Metric and Maximum Likelihood Maximum likelihood theory allows us to relate a data distribution to a distance metric Maximum likelihood theory allows us to relate a data distribution to a distance metric The problem of finding the right measure for the distance comes down to the maximization of the similarity probability. Given a specific distribution, the distribution mean can be decided according to Maximum Likelihood theory Given a specific distribution, the distribution mean can be decided according to Maximum Likelihood theory is equivalent to is equivalent to M-Estimator: Any estimate  defined by a minimization problem of the form is called an M-estimate.M-Estimator: Any estimate  defined by a minimization problem of the form is called an M-estimate. implying the additive distance modelimplying the additive distance model In similarity estimation, the  could be the query data y.In similarity estimation, the  could be the query data y.

Distance Metric and Maximum Likelihood Given a specific distribution, the distance metric d can be decided. Given a specific distribution, the distance metric d can be decided. According to Maximal Likelihood the M-estimator can be solved by According to Maximal Likelihood the M-estimator can be solved by => => Exponential distribution:Exponential distribution: L 1 => MedianL 1 => Median Gaussian distribution:Gaussian distribution: L 2 => Arithmetic MeanL 2 => Arithmetic Mean

Distance Metric and Maximum Likelihood New distance metrics are found by doing reverse engineering on harmonic and geometric mean estimation. New distance metrics are found by doing reverse engineering on harmonic and geometric mean estimation. The distance function can be derived from a mean estimation given the restriction that The distance function can be derived from a mean estimation given the restriction that

Distance Metric and Maximum Likelihood Further study on generalized harmonic and geometric mean gives us more distance metrics

Distance Metric and Maximum Likelihood

Comparison of Distance Functions

Robust Distance Metric For a specific distribution a distance metric may fit it best. For a specific distribution a distance metric may fit it best. In content-based image retrieval feature elements are extracted for different statistical properties associated with entire digital images, or perhaps with specific region of interest, e.g.: In content-based image retrieval feature elements are extracted for different statistical properties associated with entire digital images, or perhaps with specific region of interest, e.g.: Color : color moment and color histogram FeatureColor : color moment and color histogram Feature Texture : wavelet featureTexture : wavelet feature Shape: water-filling featureShape: water-filling feature Images are represented by a concatenated feature vector of different feature elements. Images are represented by a concatenated feature vector of different feature elements. Distance between two image feature vectors are often evaluated by isotropic distances such as L 1 or L 2.Distance between two image feature vectors are often evaluated by isotropic distances such as L 1 or L 2. Assumption: The feature elements from heterogeneous sources have the same distribution, e.g. Gaussian.Assumption: The feature elements from heterogeneous sources have the same distribution, e.g. Gaussian. The above assumption is often inappropriate.The above assumption is often inappropriate.

Robust Distance Metric Related Work Mahananobis distance Mahananobis distance d= (x i -y i ) T W(x i -y i )d= (x i -y i ) T W(x i -y i ) Solving for W involves estimation of d 2 parametersSolving for W involves estimation of d 2 parameters Assumes Gaussian distributionAssumes Gaussian distribution Sensitive to sample set sizeSensitive to sample set size Relevant Component Analysis (RCA) Relevant Component Analysis (RCA) Applying the estimated distance along with equivalence constraints Assumes Gaussian distributionAssumes Gaussian distribution

Boosting Distance Metric An anisotropic and heterogeneous distance metric may be more suitable for estimating the similarity between features. An anisotropic and heterogeneous distance metric may be more suitable for estimating the similarity between features. we propose a boosted distance metric for similarity estimation where similarity function for certain class of samples can be estimated by a generalization of different distance metrics on selected feature elements. we propose a boosted distance metric for similarity estimation where similarity function for certain class of samples can be estimated by a generalization of different distance metrics on selected feature elements. In particular, we use AdaBoost with decision stumps and our distance metric analysis to estimate the similarity. In particular, we use AdaBoost with decision stumps and our distance metric analysis to estimate the similarity.

Boosting Distance Metric Training set D consists of pair-wise distance vector d between two samples given a specific distance metric m. Training set D consists of pair-wise distance vector d between two samples given a specific distance metric m. The corresponding label for the distance vector is defined as follows: The corresponding label for the distance vector is defined as follows: A weak classifier is defined by a distance metric m on a feature element f with estimated parameter(s) θ A weak classifier is defined by a distance metric m on a feature element f with estimated parameter(s) θ

Boosting Distance Metric Algorithm

Theoretical Analysis Advantages: Advantages: The similarity estimation only uses a small set of elements that is most useful for similarity estimation.The similarity estimation only uses a small set of elements that is most useful for similarity estimation. For each element the distance metric that best fits its distribution is learnt.For each element the distance metric that best fits its distribution is learnt. it adds effectiveness and robustness to the classifier when we have a small training set compared to the number of dimensions.it adds effectiveness and robustness to the classifier when we have a small training set compared to the number of dimensions. Since the training iteration T is usually much smaller than the original data dimension, the boosted distance metric works as a non-linear dimension reduction technique, which keeps the most important elements to similarity judgment. It could be very helpful to overcome the small sample set problem. Since the training iteration T is usually much smaller than the original data dimension, the boosted distance metric works as a non-linear dimension reduction technique, which keeps the most important elements to similarity judgment. It could be very helpful to overcome the small sample set problem. The proposed method is general and can be plugged into many similarity estimation techniques, such as widely used K-NN. The proposed method is general and can be plugged into many similarity estimation techniques, such as widely used K-NN.

Experiments and Analysis Are the new metrics better than L 1 and L 2 ? Are the new metrics better than L 1 and L 2 ? Stereo matchingStereo matching Motion trackingMotion tracking Is boosted distance metric better than single distance metric? Is boosted distance metric better than single distance metric? Benchmark datasetsBenchmark datasets Image retrievalImage retrieval

Experiments and Analysis: Stereo Matching Stereo matching: to find correspondences between entities in images with overlapping scene content. Stereo matching: to find correspondences between entities in images with overlapping scene content. The images are taken from cameras at different viewpoints.The images are taken from cameras at different viewpoints. The intensities of corresponding pixels are different.The intensities of corresponding pixels are different. A stereo matcher based on the distance between different image regions is implemented.A stereo matcher based on the distance between different image regions is implemented. The optimal metric in this case will give the most accurate stereo matching performance.The optimal metric in this case will give the most accurate stereo matching performance. Two standard stereo data sets, Castle and Tower, from Carnegie Mellon University are used for training and testing. Two standard stereo data sets, Castle and Tower, from Carnegie Mellon University are used for training and testing. 28 reference points in each frame

Experiments and Analysis: Stereo Matching Distance-based Stereo Matcher Distance-based Stereo Matcher Objective: To match a template (5X5) defined around one point from an image with the templates around points in the other images in order to find similarityObjective: To match a template (5X5) defined around one point from an image with the templates around points in the other images in order to find similarity We search the templates centered at a 7X7 zone around the reference points.We search the templates centered at a 7X7 zone around the reference points. Using the distance metrics we discussed, we obtain sum of distances between pixel values of two templates.Using the distance metrics we discussed, we obtain sum of distances between pixel values of two templates. If the reference point is in the predicted template in the next frame, we consider that we have a hit, otherwise, we have a miss.If the reference point is in the predicted template in the next frame, we consider that we have a hit, otherwise, we have a miss.

Experiments and Analysis: Stereo Matching Chi-Square Test Chi-Square Test To verify if the ground truth distance matches the modeled distribution most accuratelyTo verify if the ground truth distance matches the modeled distribution most accurately Chi-Square tests are run on the distances between ground truth and predicted templates.Chi-Square tests are run on the distances between ground truth and predicted templates. The smaller the value the more similar are the distributions.The smaller the value the more similar are the distributions. The parameters p, q and r for generalized harmonic and geometric distance are tested from -5 to 5 with step size 0.1 The parameters p, q and r for generalized harmonic and geometric distance are tested from -5 to 5 with step size 0.1 Generalized geometric (0.0239) L 2 (0.0378)

Experiments and Analysis: Stereo Matching L 1 and L 2 are outperformed by several metrics. The generalized geometric distance gives the best accuracy. According to Chi- Square test it also fits the distribution better.

Experiments and Analysis: Motion Tracking In this experiment distance metric analysis is tested on motion tracking application. In this experiment distance metric analysis is tested on motion tracking application. To tracing moving facial expressions.To tracing moving facial expressions. A video sequence containing 19 images on a moving head in a static backgroundA video sequence containing 19 images on a moving head in a static background For each image in this video sequence, there are 14 points given as ground truth.For each image in this video sequence, there are 14 points given as ground truth. The motion tracking algorithm between the test frame and another frame performs template matching to find the best match in a template around a central pixel.The motion tracking algorithm between the test frame and another frame performs template matching to find the best match in a template around a central pixel. In searching for the corresponding pixel, we examine a region of width and the height of 7 pixels centered at the position of the pixel in the previous frame.In searching for the corresponding pixel, we examine a region of width and the height of 7 pixels centered at the position of the pixel in the previous frame.

Experiments and Analysis: Motion Tracking This figure shows the prediction errors vs. the step between frames compared. L 1 and L 2 are not the best choice. Generalized geometric distance gives the best performance.

Boosted Distance Metric on Benchmark Database we compare the performance of our boosted distance metric with several well-known traditional metrics. we compare the performance of our boosted distance metric with several well-known traditional metrics. L 1,L 2,RCA,Mah, Mah-CL 1,L 2,RCA,Mah, Mah-C The last 3 needs parameter estimation.The last 3 needs parameter estimation. RCA-CD, Mah-D, Mah-CDRCA-CD, Mah-D, Mah-CD Diagonal matrix is used to avoid small sample size problemDiagonal matrix is used to avoid small sample size problem Simple AdaBoost with Decision Stump and C4.5Simple AdaBoost with Decision Stump and C4.5

Boosted Distance Metric on Benchmark Database Database: 15 data sets from UCI repository Database: 15 data sets from UCI repository The first 4 datasets have large dimensionality. The first 4 datasets have large dimensionality. 20% of data used as training while 80% as testing. 20% of data used as training while 80% as testing. Average on results of 100 runs. Average on results of 100 runs.

Boosted Distance Metric on Benchmark Database For traditional metrics only the one gives the best performance is shown in the table.

Boosted Distance Metric on Benchmark Database Our proposed error metric performs the best on 13 out of 15 data sets.

Boosted Distance Metric in Image Retrieval The boosted distance metric performs an element selection that is highly discriminant for similarity estimation The boosted distance metric performs an element selection that is highly discriminant for similarity estimation it doesn’t suffer from the small sample set problem as LDA and other dimension reduction techniques.it doesn’t suffer from the small sample set problem as LDA and other dimension reduction techniques. To evaluate the performance, we tested the boosted distance metric on image classification against some state-of-the-art dimension reduction techniques: To evaluate the performance, we tested the boosted distance metric on image classification against some state-of-the-art dimension reduction techniques: PCA, LDA, NDA and plain Euclidean distance in the original feature space using simple NN classifierPCA, LDA, NDA and plain Euclidean distance in the original feature space using simple NN classifier

Boosted Distance Metric in Image Retrieval Two data sets are used: Two data sets are used: A subset of MNIST data set containing similar hand-written 1’s and 7’sA subset of MNIST data set containing similar hand-written 1’s and 7’s A gender recognition database containing facial images from the AR database and the XM2TVS databaseA gender recognition database containing facial images from the AR database and the XM2TVS database The dimension of the feature for both databases is 784 while the size of training set is fixed to 200 which is small compared to the dimensionality of feature.The dimension of the feature for both databases is 784 while the size of training set is fixed to 200 which is small compared to the dimensionality of feature.

Boosted Distance Metric in Image Retrieval For the boosted distance metric, iteration T is equivalent to the dimensionality we reduced to.

Discussions Our main contribution is to provide a general guideline for designing robust distance estimation that could adapt data distributions automatically. Our main contribution is to provide a general guideline for designing robust distance estimation that could adapt data distributions automatically. Novel distance metrics deriving from harmonic, geometric mean and their generalized forms are presented and discussed. Novel distance metrics deriving from harmonic, geometric mean and their generalized forms are presented and discussed. The creative component of our work is to start from an estimator and perform reverse engineering to obtain a metric.The creative component of our work is to start from an estimator and perform reverse engineering to obtain a metric. Some of the proposed metrics cannot be translated into a known probabilistic modelSome of the proposed metrics cannot be translated into a known probabilistic model

Discussions In similarity estimation the feature elements are often from heterogeneous sources. The assumption that the feature has a unified isotropic distribution is invalid. In similarity estimation the feature elements are often from heterogeneous sources. The assumption that the feature has a unified isotropic distribution is invalid. Unlike traditional anisotropic distance metric, our proposed method does not make any assumption on the feature distribution. Instead it learns distance metrics on each element to capture the underlying feature structure.Unlike traditional anisotropic distance metric, our proposed method does not make any assumption on the feature distribution. Instead it learns distance metrics on each element to capture the underlying feature structure. We examined the new metrics on several applications in computer vision, and the estimation of similarity can be significantly improved by the proposed distance metric analysis. We examined the new metrics on several applications in computer vision, and the estimation of similarity can be significantly improved by the proposed distance metric analysis.

Reference 1.M. Zakai, “General distance criteria,” IEEE Trans. on Information Theory, pp. 94- 95, January 1964. 2.M. Swain and D. Ballard, “Color indexing,” Intl. Journal Computer Vision, vol. 7, no.1, pp. 11-32, 1991. 3.R. M. Haralick, et al, “Texture features for image classification,” IEEE Trans. on Sys. Man and Cyb., 1990. 4.B. M. Mehtre, et al., “Shape measures for content based image retrieval: a comparison”, Information Proc. Management, 33(3):319-337, 1997. 5.R. Haralick and L. Shapiro, Computer and Robot Vision II, Addison-Wesley, 1993. 6.R. E. Schapire, Y. Singer, “Improved boosting using confidence-rated predictions,” Machine Learning 37 (3) (1999) 297–336. 7.C. Domeniconi, J. Peng, D. Gunopulos, “Locally adaptive metric nearestneighbor classification,” IEEE Trans. PAMI 24 (9) (2002) 1281–1285. 8.J. Peng, et al., “LDA/SVM driven nearest neighbor classification,” IEEE Proc. CVPR, 2001, pp. 940–942. 9.E. P. Xing, A. Y. Ng, M. I. Jordan, S. Russell, “Distance metric learning, with application to clustering with side-information.”, Proc. NIPS, 2003, pp. 505–512

Reference 10.A. Bar-Hillel, T. Hertz, N. Shental, D. Weinshall, “Learning distance functions using equivalence relations,” Proc. ICML, 2003, pp. 11–18. 11.T. Hertz, A. Bar-Hillel, D. Weinshall, “Learning distance functions for image retrieval,” IEEE Proc. CVPR, 2004, pp. 570–577. 12.P. J. Huber, Robust Statistics, John Wiley & Sons, 1981. 13.L. Tang, et al., “Performance evaluation of a facial feature tracking algorithm,” Proc. NSF/ARPA Workshop: Performance vs. Methodology in Computer Vision, 1994. 14.Y. LeCun, et al., MNIST database, http://yann.lecun.com/exdb/mnist/. 15.A. Martinez, R. Benavente, “The AR face database,” Tech. Rep. 24, Computer Vision Center (1998). 16.J. Matas, et al, “Comparison of face verification results on the xm2vts database,” Proc. ICPR, 1999, pp. 858–863.

Download ppt "Ranking Metrics and Evaluation Measures Jie Yu, Qi Tian, Nicu Sebe Presented by Jie Yu."

Similar presentations