Lecture 19 Representation and Description II: regional descriptors, principal components, representation with Matlab, feature selection
Regional descriptors: area, perimeter, compactness, topological descriptors, texture
Simple descriptors. Area = the number of pixels in the region. Perimeter = the length of its boundary. Compactness = (perimeter)^2 / area.
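These descriptors can be computed directly from a binary region mask; a minimal sketch (counting boundary pixels is one of several common perimeter approximations):

```python
import numpy as np

def simple_descriptors(mask):
    """Area, perimeter, and compactness of a binary region.
    Perimeter is approximated by the number of boundary pixels,
    i.e. region pixels with at least one 4-neighbour outside the region."""
    mask = mask.astype(bool)
    area = int(mask.sum())
    padded = np.pad(mask, 1)
    # A pixel is interior if all four 4-neighbours belong to the region
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~interior
    perimeter = int(boundary.sum())
    compactness = perimeter ** 2 / area
    return area, perimeter, compactness

square = np.zeros((7, 7), dtype=bool)
square[1:6, 1:6] = True            # 5x5 solid square
area, perim, comp = simple_descriptors(square)
```

For the 5x5 square this gives area 25, perimeter 16, and compactness 10.24; among continuous shapes of a given area, a disc minimizes compactness.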
Topological descriptors: features that do not change under deformation. E = C - H, where E is the Euler number, C the number of connected regions, and H the number of holes.
Example: for regions described by straight-line segments (polygonal networks), V - Q + F = C - H = E, where V is the number of vertices, Q the number of edges, and F the number of faces. In the example: 7 - 11 + 2 = 1 - 3 = -2.
Example
Texture description. Texture features measure smoothness, coarseness, and regularity. Three approaches: statistical, structural, and spectral. The statistical approach yields characterizations of texture as smooth, coarse, grainy, etc. The structural approach deals with arrangements of image primitives, such as regularly spaced parallel lines. The spectral approach is based on properties of the Fourier spectrum and detects global periodicity in an image by identifying high-energy, narrow peaks in the spectrum.
Statistical approaches
Example
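The statistical approach typically characterizes texture by moments of the intensity histogram; the specific measures sketched below (smoothness, uniformity, entropy) are standard choices assumed here, not taken from the slide:

```python
import numpy as np

def histogram_texture(gray):
    """Statistical texture measures from the intensity histogram
    of an 8-bit image (moment-based measures)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                      # normalized histogram
    levels = np.arange(256)
    mean = (levels * p).sum()
    var = ((levels - mean) ** 2 * p).sum()
    # Normalize variance by the squared intensity range before
    # computing the smoothness measure R (R = 0 for constant regions)
    R = 1 - 1 / (1 + var / 255 ** 2)
    uniformity = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return {"mean": mean, "std": np.sqrt(var),
            "smoothness": R, "uniformity": uniformity, "entropy": entropy}

flat = np.full((8, 8), 128, dtype=np.uint8)                      # constant patch
noisy = np.random.default_rng(0).integers(0, 256, (8, 8), dtype=np.uint8)
m_flat, m_noisy = histogram_texture(flat), histogram_texture(noisy)
```

A constant patch has zero smoothness measure, zero entropy, and maximal uniformity; a noisy patch scores high entropy and low uniformity.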
Co-occurrence matrix and descriptors
Co-occurrence with large distances. Example: the correlation descriptor as a function of the offset.
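A minimal numpy sketch of building a co-occurrence matrix for a given offset and reading descriptors from it (contrast, energy, and homogeneity are shown; the slide's correlation descriptor is computed from the same normalized matrix):

```python
import numpy as np

def cooccurrence(gray, dx, dy, levels):
    """Grey-level co-occurrence matrix for a non-negative offset (dx, dy):
    G[i, j] counts pixel pairs with gray[r, c] == i and gray[r+dy, c+dx] == j."""
    a = gray[:gray.shape[0] - dy, :gray.shape[1] - dx]
    b = gray[dy:, dx:]
    G = np.zeros((levels, levels), dtype=int)
    np.add.at(G, (a.ravel(), b.ravel()), 1)
    return G

def descriptors(G):
    p = G / G.sum()                 # normalized co-occurrence matrix
    i, j = np.indices(G.shape)
    contrast = ((i - j) ** 2 * p).sum()
    energy = (p ** 2).sum()
    homogeneity = (p / (1 + np.abs(i - j))).sum()
    return contrast, energy, homogeneity

stripes = np.tile([0, 1], (4, 4))        # vertical stripes of period 2
G1 = cooccurrence(stripes, 1, 0, 2)      # offset 1: always lands on the other level
G2 = cooccurrence(stripes, 2, 0, 2)      # offset 2: always lands on the same level
c1, _, _ = descriptors(G1)
c2, _, _ = descriptors(G2)
```

The dependence on the offset is visible already in this toy example: contrast is 1 at offset 1 and 0 at offset 2, mirroring the periodicity of the texture.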
Structural approaches: texture patterns generated by grammar (rewriting) rules. Example 1: the rule S -> aS, where a represents a circle. Example 2: the rules S -> aS, S -> bA, A -> cA, A -> c, A -> bS, S -> a, where b represents a circle down and c a circle on the left.
Spectral approaches: consider the Fourier transform F(u, v) of the region, or represent the spectrum in polar coordinates as S(r, theta). Ring descriptor: S(r) = sum over theta of S_theta(r). Ray descriptor: S(theta) = sum over r of S_r(theta).
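A sketch of the ring and ray descriptors computed by binning the magnitude spectrum in polar coordinates; the bin counts and the choice to zero out the DC term are assumptions:

```python
import numpy as np

def spectral_descriptors(img, nr=16, nt=18):
    """Ring descriptor S(r) and ray descriptor S(theta) of the
    Fourier magnitude spectrum, via polar binning around the centre."""
    F = np.fft.fftshift(np.abs(np.fft.fft2(img)))
    h, w = F.shape
    cy, cx = h // 2, w // 2
    F[cy, cx] = 0.0                                  # drop the DC term
    y, x = np.indices(F.shape)
    r = np.hypot(y - cy, x - cx)
    # Fold angles into [0, pi): the spectrum of a real image is symmetric
    t = np.mod(np.arctan2(y - cy, x - cx), np.pi)
    r_bin = np.minimum((r / r.max() * nr).astype(int), nr - 1)
    t_bin = np.minimum((t / np.pi * nt).astype(int), nt - 1)
    S_r = np.bincount(r_bin.ravel(), weights=F.ravel(), minlength=nr)
    S_t = np.bincount(t_bin.ravel(), weights=F.ravel(), minlength=nt)
    return S_r, S_t

# Vertical stripes are periodic along x, so the spectral energy
# concentrates on the horizontal frequency axis (angle bin 0).
stripes = np.tile([0.0, 1.0], (32, 16))
S_r, S_t = spectral_descriptors(stripes)
```

Narrow peaks in S(theta) reveal the dominant orientation of a periodic texture, while peaks in S(r) reveal its dominant spatial period.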
Principal components. Suppose each pixel of an image has n components (e.g. n = 3 for an RGB image), written as a vector x = (x1, x2, ..., xn)^T. The mean vector is m = E{x}, estimated from K samples as m = (1/K) sum over k of x_k. The covariance matrix is C = E{(x - m)(x - m)^T}, which is real and symmetric.
Eigenvalues, eigenvectors, and the Hotelling transformation: with A the matrix whose rows are the eigenvectors of C ordered by decreasing eigenvalue, the Hotelling transform is y = A(x - m).
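A minimal numpy sketch of the Hotelling transform y = A(x - m); variable names and the toy data are illustrative:

```python
import numpy as np

def hotelling(X):
    """Hotelling (principal-components) transform: y = A (x - m), where the
    rows of A are the eigenvectors of the covariance matrix of X,
    ordered by decreasing eigenvalue."""
    m = X.mean(axis=0)
    C = np.cov(X - m, rowvar=False)
    vals, vecs = np.linalg.eigh(C)        # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    A = vecs[:, order].T
    Y = (X - m) @ A.T                     # each row is y = A (x - m)
    return Y, vals[order], m

rng = np.random.default_rng(1)
# Correlated 2-D cloud with a strong direction along (1, 1)
X = rng.normal(size=(500, 1)) @ np.array([[3.0, 3.0]]) + rng.normal(size=(500, 2))
Y, eigvals, m = hotelling(X)
```

After the transform the components are uncorrelated, and the variances of the new components equal the eigenvalues.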
Example 2: let x = (x1, x2)^T be the coordinates of the pixels of a region or a boundary. The eigenvalues of the coordinate covariance matrix are descriptors insensitive to rotation and translation, and their ratio is insensitive to size.
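A quick check of the rotation claim: the eigenvalues of the coordinate covariance matrix are unchanged when the region's coordinates are rotated and translated (the region and angle below are illustrative):

```python
import numpy as np

# Pixel coordinates of a 21x7 rectangular region
ys, xs = np.mgrid[0:7, 0:21]
P = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

def coord_eigenvalues(P):
    """Eigenvalues of the covariance matrix of pixel coordinates,
    sorted in decreasing order."""
    C = np.cov(P, rowvar=False)
    return np.sort(np.linalg.eigvalsh(C))[::-1]

theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
lam_orig = coord_eigenvalues(P)
lam_rot = coord_eigenvalues(P @ R.T + 100.0)   # rotated and translated copy
```

The covariance transforms as R C R^T under rotation, which leaves its eigenvalues unchanged; translation cancels out when the mean is subtracted.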
Lecture 19 Part II: Feature selection. This is the selection of a subset of features from a larger pool of available features. The goal is to select those that are rich in discriminatory information with respect to the classification problem at hand.
Some housekeeping techniques. Outlier removal: an outlier is a point that lies far away from the mean of the corresponding random variable; e.g. for normally distributed data, a threshold of 1, 2, or 3 standard deviations is used to define outliers. Data normalization: restrict the values of all features to predetermined ranges, e.g. transform to a standard normal distribution, map the range to [-1, 1], or use softmax scaling, x_hat = 1 / (1 + exp(-(x - mean)/(r*sigma))), where r is a user-defined parameter.
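The three normalization options can be sketched as follows (function names and the example data are illustrative):

```python
import numpy as np

def zscore(x):
    """Standardize to zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()

def to_range(x, lo=-1.0, hi=1.0):
    """Linearly map the observed range of x onto [lo, hi]."""
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

def softmax_scale(x, r=2.0):
    """Softmax scaling: squash y = (x - mean)/(r*std) into (0, 1).
    r is the user-defined parameter mentioned on the slide."""
    y = (x - x.mean()) / (r * x.std())
    return 1.0 / (1.0 + np.exp(-y))

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # one extreme value
z, t, s = zscore(x), to_range(x), softmax_scale(x)
```

Note how softmax scaling is nearly linear for values close to the mean but compresses extreme values, so a single outlier cannot dominate the feature's range.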
Informative or not, by hypothesis testing. A feature is either informative or not, and statistical tests are commonly used to decide. The idea is to test whether the mean values of a feature in two classes differ significantly. H0 (null hypothesis): the mean values of the feature in the two classes are equal. H1 (alternative hypothesis): the mean values of the feature in the two classes are different.
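A sketch of the statistic behind this idea, using Welch's unequal-variance t statistic on toy class samples (a complete test would compare the statistic against a t distribution to obtain a p-value):

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic for a two-sample test with unequal variances:
    large |t| suggests the class means differ significantly."""
    va = a.var(ddof=1) / len(a)
    vb = b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

class_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
class_b = class_a + 10.0                        # clearly shifted mean
class_c = np.array([2.0, 3.0, 1.0, 5.0, 4.0])  # same mean as class_a

t_diff = welch_t(class_a, class_b)   # far from 0: reject H0
t_same = welch_t(class_a, class_c)   # 0: no evidence against H0
```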
Receiver operating characteristic: measures the overlap between the pdfs describing the data distributions in the two classes. This overlap is quantified by an area associated with the curve, the AUC (area under the receiver operating curve).
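The AUC can be computed without tracing the curve, via the rank-sum identity: it equals the probability that a randomly chosen sample from one class scores higher than a randomly chosen sample from the other. The toy scores below are illustrative:

```python
import numpy as np

def auc(neg, pos):
    """AUC via the rank-sum (Mann-Whitney) identity:
    the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

neg = np.array([0.1, 0.2, 0.3, 0.4])     # scores of class 0 samples
pos = np.array([0.35, 0.5, 0.6, 0.7])    # scores of class 1 samples
a = auc(neg, pos)
```

An AUC of 1 means the class pdfs do not overlap at all; 0.5 means the feature carries no discriminatory information.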
Fisher's discriminant ratio. Fisher's discriminant ratio (FDR) is commonly employed to quantify the discriminatory power of individual features between two equiprobable classes: FDR = (mu1 - mu2)^2 / (sigma1^2 + sigma2^2).
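A direct implementation of the FDR for a single feature, on toy data chosen so that one feature separates the classes and the other does not:

```python
import numpy as np

def fdr(a, b):
    """Fisher's discriminant ratio of one feature between two classes:
    (mu_a - mu_b)^2 / (var_a + var_b)."""
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

# Feature 1: class means are well separated relative to the spread
f1_class_a = np.array([0.0, 1.0, 0.0, 1.0])
f1_class_b = np.array([5.0, 6.0, 5.0, 6.0])

# Feature 2: identical class means, no discriminatory power
f2_class_a = np.array([0.0, 1.0, 0.0, 1.0])
f2_class_b = np.array([0.0, 1.0, 1.0, 0.0])

fdr_good = fdr(f1_class_a, f1_class_b)
fdr_bad = fdr(f2_class_a, f2_class_b)
```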
Combination of features: the divergence between classes i and j, d_ij = (1/2) trace(S_i^-1 S_j + S_j^-1 S_i - 2I) + (1/2) (m_i - m_j)^T (S_i^-1 + S_j^-1) (m_i - m_j), where S_i is the covariance matrix and m_i the mean vector of class i (the standard form for Gaussian class models).
Bhattacharyya Distance and Chernoff Bound
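The slide gives no formulas; the standard closed form of the Bhattacharyya distance for two Gaussian class models is sketched below, together with the Chernoff bound on the Bayes error, P_e <= sqrt(P1*P2) * exp(-B):

```python
import numpy as np

def bhattacharyya(m1, S1, m2, S2):
    """Bhattacharyya distance between two Gaussian class models
    (standard closed form, assumed here since the slide omits it)."""
    S = (S1 + S2) / 2
    d = m1 - m2
    term1 = d @ np.linalg.solve(S, d) / 8
    term2 = 0.5 * np.log(np.linalg.det(S) /
                         np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return term1 + term2

m1, m2 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
S = np.eye(2)                       # equal identity covariances
B = bhattacharyya(m1, S, m2, S)
# Chernoff bound on the Bayes error for equiprobable classes (P1 = P2 = 1/2)
bound = 0.5 * np.exp(-B)
```

With equal covariances the log-determinant term vanishes and B reduces to one eighth of the Mahalanobis distance between the class means.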
Measures based on scatter matrices: J1 = trace(S_m)/trace(S_w), J2 = |S_m|/|S_w|, and J3 = trace(S_w^-1 S_m), where S_w is the within-class and S_m the mixture scatter matrix. Large values of J1, J2, and J3 indicate that data points in the respective feature space have small within-class variance and large between-class distance.
Feature subset selection: reduce the number of features by discarding the less informative ones, using scalar feature selection; then consider the features that survive the previous step in different combinations, in order to keep the "best" combination.
Feature ranking. 1. Rank the features in descending order according to some class-separability criterion C. 2. Let i1 be the index of the best feature; compute the cross-correlation coefficients rho(i1, j) between the top-ranked feature and each of the remaining features. 3. The index i2 of the second most important feature is i2 = argmax over j != i1 of { a1*C(j) - a2*|rho(i1, j)| }, with weights a1, a2. 4. In general, at step k: i_k = argmax over remaining j of { a1*C(j) - (a2/(k-1)) * sum over r = 1..k-1 of |rho(i_r, j)| }.
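A sketch of this ranking procedure; the weights a1, a2 and the use of the sample cross-correlation matrix follow the standard scalar-selection recipe and are assumptions where the slide is silent:

```python
import numpy as np

def rank_features(X, scores, a1=0.5, a2=0.5):
    """Scalar feature selection: pick the best-scoring feature first, then
    repeatedly pick the feature j maximising
        a1*C(j) - (a2/k) * sum of |rho(i, j)| over the k already-selected i,
    so that highly correlated (redundant) features are penalised."""
    rho = np.corrcoef(X, rowvar=False)         # feature cross-correlations
    selected = [int(np.argmax(scores))]
    remaining = set(range(X.shape[1])) - set(selected)
    while remaining:
        k = len(selected)
        best = max(remaining,
                   key=lambda j: a1 * scores[j]
                   - a2 / k * sum(abs(rho[i, j]) for i in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

f0 = np.array([1.0, 2.0, 3.0, 4.0])
f1 = 2 * f0                          # perfectly correlated with f0
f2 = np.array([1.0, -1.0, 1.0, -1.0])
X = np.stack([f0, f1, f2], axis=1)
scores = np.array([1.0, 0.9, 0.8])   # toy separability scores C(j)
order = rank_features(X, scores)
```

Although feature 1 has the second-best score, it is a perfect copy of feature 0, so the cross-correlation penalty pushes it behind the less correlated feature 2.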
Feature vector selection: find the "best" combination of features. Examining all possible combinations of the m features is optimal but expensive, so suboptimal searching techniques are used instead, e.g. sequential forward selection (SFS).
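A sketch of SFS with a toy criterion, chosen so that the greedy search misses the best pair (illustrating why SFS is suboptimal); the criterion and values are purely illustrative:

```python
def sfs(n_features, evaluate, k):
    """Sequential forward selection: greedily grow the feature subset,
    at each step adding the feature that maximises the criterion."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k:
        best = max(remaining, key=lambda j: evaluate(selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

value = [3.0, 2.9, 2.8]              # individual merit of each feature

def evaluate(subset):
    """Toy subset criterion: sum of merits, with a redundancy penalty
    whenever feature 0 is combined with any other feature."""
    score = sum(value[j] for j in subset)
    if 0 in subset and len(subset) > 1:
        score -= 2.0 * (len(subset) - 1)
    return score

chosen = sfs(3, evaluate, 2)         # greedy pick: [0, 1]
```

SFS commits to feature 0 because it is best on its own, and ends with score 3.9, while the exhaustive optimum {1, 2} scores 5.7; this is the price paid for avoiding the exponential search.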