
1 Modification of Correlation Kernels in SVM, KPCA and KCCA in Texture Classification
Yo Horikawa, Kagawa University, Japan

2 ・ Support vector machine (SVM)
・ Kernel principal component analysis (kPCA)
・ Kernel canonical correlation analysis (kCCA)
with modified versions of correlation kernels → invariant texture classification.
Compare the performance of the modified correlation kernels and of the kernel methods.

3 Support vector machine (SVM)
Sample data: x_i (1 ≤ i ≤ n), belonging to class c_i ∈ {-1, 1}.
The SVM learns a discriminant function for test data x:
  d(x) = sgn(∑_{i=1}^{n'} α_i c_i k(x, x_si) + b)
α_i and b are obtained by solving a quadratic programming problem.
Kernel function: the inner product of nonlinear maps φ(x): k(x_i, x_j) = φ(x_i)・φ(x_j)
Support vectors x_si (1 ≤ i ≤ n' ≤ n): a subset of the sample data.
The feature extraction process is carried out implicitly in the SVM through the kernel function and the support vectors.
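As an illustrative sketch (not the author's code), an SVM of this form can be trained directly on a precomputed Gram matrix, which is how a correlation kernel would be plugged in; scikit-learn's SVC with kernel='precomputed' is assumed here, and C = 100 matches the soft-margin value quoted on slide 14.

```python
import numpy as np
from sklearn.svm import SVC

def train_svm_with_kernel(K_train, labels, C=100.0):
    """Fit an SVM from an n x n Gram matrix K_train[i, j] = k(x_i, x_j)."""
    clf = SVC(C=C, kernel="precomputed")
    clf.fit(K_train, labels)
    return clf

def predict_svm_with_kernel(clf, K_test):
    """K_test[t, i] = k(x_test_t, x_train_i); returns predicted class labels."""
    return clf.predict(K_test)
```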

4 Kernel principal component analysis (kPCA)
Principal components for the nonlinear map φ(x_i) are obtained through the eigenproblem:
  Φv = λv   (Φ: kernel matrix, Φ_ij = φ(x_i)・φ(x_j) = k(x_i, x_j))
Let v_r = (v_r1, …, v_rn)^T (1 ≤ r ≤ R ≤ n) be the eigenvectors in non-increasing order of the corresponding non-zero eigenvalues λ_r, normalized so that λ_r v_r^T v_r = 1.
The rth principal component u_r for a new data point x is obtained by
  u_r = ∑_{i=1}^{n} v_ri φ(x_i)・φ(x) = ∑_{i=1}^{n} v_ri k(x_i, x)
Classification methods, e.g., the nearest-neighbor method, can then be applied in the principal component space (u_1, …, u_R).
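A minimal numpy sketch of the procedure on this slide, assuming the kernel matrix is supplied directly (no centering step is applied, since none is mentioned here); the function names are illustrative only.

```python
import numpy as np

def kpca_fit(K, n_components):
    """Eigendecompose the kernel matrix K (Phi on the slide) and return
    eigenvectors scaled so that lambda_r * v_r . v_r = 1."""
    eigvals, eigvecs = np.linalg.eigh(K)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # non-increasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-10                         # keep non-zero eigenvalues
    lam = eigvals[keep][:n_components]
    V = eigvecs[:, keep][:, :n_components]
    return V / np.sqrt(lam)                        # enforce lambda * v.v = 1

def kpca_project(V, k_new):
    """u_r = sum_i v_ri k(x_i, x) for a new sample; k_new[i] = k(x_i, x)."""
    return k_new @ V
```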

5 Kernel canonical correlation analysis (kCCA)
Pairs of feature vectors of sample objects: (x_i, y_i) (1 ≤ i ≤ n).
kCCA finds projections (canonical variates) (u, v) that yield the maximum correlation between φ(x) and θ(y):
  (u, v) = (w_φ・φ(x), w_θ・θ(y)),  w_φ = ∑_{i=1}^{n} f_i φ(x_i),  w_θ = ∑_{i=1}^{n} g_i θ(y_i)
where f^T = (f_1, …, f_n) and g^T = (g_1, …, g_n) are the eigenvectors of a generalized eigenvalue problem built from the kernel matrices
  Φ_ij = φ(x_i)・φ(x_j),  Θ_ij = θ(y_i)・θ(y_j)
and the n×n identity matrix I (regularization).
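The generalized eigenvalue problem itself appears as an equation image on the original slide and is not reproduced in this transcript; the sketch below therefore uses one common regularized kCCA formulation (an assumption, with gamma playing the role of the regularization parameter quoted on slide 14).

```python
import numpy as np
from scipy.linalg import eigh

def kcca_eigvectors(Phi, Theta, gamma=0.1):
    """Solve a regularized kCCA generalized eigenproblem for the n x n kernel
    matrices Phi and Theta; returns eigenvector blocks f and g (columns),
    ordered by decreasing canonical correlation."""
    n = Phi.shape[0]
    Z = np.zeros((n, n))
    A = np.block([[Z, Phi @ Theta],
                  [Theta @ Phi, Z]])
    B = np.block([[Phi @ Phi + gamma * np.eye(n), Z],
                  [Z, Theta @ Theta + gamma * np.eye(n)]])
    rho, W = eigh(A, B)                  # symmetric-definite generalized solver
    W = W[:, np.argsort(rho)[::-1]]      # largest correlations first
    return W[:n], W[n:]                  # f-part and g-part of each eigenvector
```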

6 Application of kCCA to classification problems
Use an indicator vector as the second feature vector y = (y_1, …, y_nc) corresponding to x:
  y_c = 1 if x belongs to class c, y_c = 0 otherwise  (n_c: the number of classes)
No nonlinear map θ is applied to y.
A total of n_c − 1 eigenvectors f_r = (f_r1, …, f_rn) (1 ≤ r ≤ n_c − 1) corresponding to non-zero eigenvalues are obtained.
Canonical variates u_r (1 ≤ r ≤ n_c − 1) for a new object (x, ?) are calculated by
  u_r = ∑_{i=1}^{n} f_ri φ(x_i)・φ(x) = ∑_{i=1}^{n} f_ri k(x_i, x)
Classification methods can then be applied in the canonical variate space (u_1, …, u_{nc−1}).
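A short sketch of this setup: build the indicator view (so Θ = Y Yᵀ), project samples onto the canonical variates, and classify a new sample by nearest neighbor in that space, which is the classifier used on slide 14. The function names here are illustrative, not from the original work.

```python
import numpy as np

def indicator_matrix(labels, n_classes):
    """One-hot indicator vectors y for each sample; Theta = Y @ Y.T."""
    Y = np.zeros((len(labels), n_classes))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def canonical_variates(F, K_rows):
    """u_r = sum_i f_ri k(x_i, x): project samples onto the n_c - 1 variates.
    F: (n, n_c-1) eigenvectors from kCCA; K_rows: (m, n) values k(x_i, x)."""
    return K_rows @ F

def nearest_neighbor_classify(U_train, labels, U_test):
    """1-NN in the canonical variate space (u_1, ..., u_{n_c-1})."""
    labels = np.asarray(labels)
    d = np.linalg.norm(U_test[:, None, :] - U_train[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]
```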

7 Correlation kernel
The kth-order autocorrelation of data x_i(t):
  r_xi(t_1, t_2, …, t_{k-1}) = ∫ x_i(t) x_i(t+t_1) … x_i(t+t_{k-1}) dt
The inner product between r_xi and r_xj is calculated with the kth power of the (2nd-order) cross-correlation function:
  r_xi・r_xj = ∫ {cc_xi,xj(t_1)}^k dt_1,  cc_xi,xj(t_1) = ∫ x_i(t) x_j(t+t_1) dt
The explicit calculation of the autocorrelation values is avoided, so high-order autocorrelations become tractable at practical computational cost.
・ Linear correlation kernel: K(x_i, x_j) = r_xi・r_xj
・ Gaussian correlation kernel: K(x_i, x_j) = exp(−μ|r_xi − r_xj|^2) = exp(−μ(r_xi・r_xi + r_xj・r_xj − 2 r_xi・r_xj))
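A direct numpy sketch of this identity for 1-D signals; the lag range and normalization are illustrative choices here, mirroring the 2-D version on the next slide.

```python
import numpy as np

def correlation_kernel_1d(xi, xj, k, max_lag):
    """r_xi . r_xj = sum_{t1} cc(t1)^k, with cc(t1) = sum_t xi(t) xj(t+t1) / n."""
    n = len(xi)
    cc = np.array([np.dot(xi[:n - t1], xj[t1:]) / n for t1 in range(max_lag)])
    return np.sum(cc ** k) / max_lag
```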

8 Calculation of correlation kernels r_xi・r_xj for 2-dimensional image data x(l, m) (1 ≤ l ≤ L, 1 ≤ m ≤ M)
・ Calculate the cross-correlations between x_i(l, m) and x_j(l, m):
  cc_xi,xj(l_1, m_1) = ∑_{l=1}^{L−l_1} ∑_{m=1}^{M−m_1} x_i(l, m) x_j(l+l_1, m+m_1) / (LM)  (0 ≤ l_1 ≤ L_1−1, 0 ≤ m_1 ≤ M_1−1)
・ Sum up the kth power of the cross-correlations:
  r_xi・r_xj = ∑_{l_1=0}^{L_1−1} ∑_{m_1=0}^{M_1−1} {cc_xi,xj(l_1, m_1)}^k / (L_1 M_1)
[Diagram: an L×M image x_i(l, m) is cross-correlated with x_j(l+l_1, m+m_1) over a lag window of size L_1×M_1; the kth powers of the resulting values are summed over the lags.]
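A minimal 2-D implementation of these two steps as a plain double loop over lags, with L_1 = M_1 = 10 as in the experiments on slide 14; this is an illustrative sketch, not the author's code.

```python
import numpy as np

def cross_correlation_map(xi, xj, max_lag=10):
    """cc(l1, m1) = sum_{l,m} xi(l, m) xj(l+l1, m+m1) / (L*M), 0 <= l1, m1 < max_lag."""
    L, M = xi.shape
    cc = np.zeros((max_lag, max_lag))
    for l1 in range(max_lag):
        for m1 in range(max_lag):
            cc[l1, m1] = np.sum(xi[:L - l1, :M - m1] * xj[l1:, m1:]) / (L * M)
    return cc

def correlation_kernel_2d(xi, xj, k, max_lag=10):
    """r_xi . r_xj: mean of the kth power of the cross-correlation map."""
    cc = cross_correlation_map(xi, xj, max_lag)
    return np.sum(cc ** k) / cc.size
```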

9 Problem of correlation kernels
As the order k of the correlation kernel increases, generalization ability and robustness are lost:
  r_xi・r_xj = ∑_{t_1} (cc_xi,xj(t_1))^k → δ_{i,j}  (k → ∞)
For test data x (≠ x_i): r_xi・r_x ≈ 0.
In kCCA, Φ = I and Θ is a block matrix, so the eigenvectors take the form f = (p_1, …, p_1, p_2, …, p_2, …, p_C, …, p_C) (f_i = p_c if x_i ∈ class c).
For sample data, the canonical variates lie on a line through the origin corresponding to their class:
  u_xi = (r_xi・r_xi) p_c,  p_c = (p_c,1, …, p_c,C−1), if x_i ∈ class c
For test data: u_x ≈ 0.
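A small self-contained demonstration of this degeneracy on synthetic "textures" (four noisy copies of one random image, an assumption made purely for illustration): after normalizing the diagonal to 1, the off-diagonal entries of the correlation-kernel matrix shrink toward 0 as k grows, which is the r_xi・r_xj → δ_ij behaviour described above.

```python
import numpy as np

def corr_kernel(a, b, k, max_lag=10):
    # compact restatement of the correlation kernel of slide 8
    L, M = a.shape
    cc = np.array([[np.sum(a[:L - l, :M - m] * b[l:, m:]) / (L * M)
                    for m in range(max_lag)] for l in range(max_lag)])
    return np.sum(cc ** k) / cc.size

rng = np.random.default_rng(0)
base = rng.standard_normal((50, 50))
imgs = [base + 0.3 * rng.standard_normal((50, 50)) for _ in range(4)]  # similar "textures"

for k in (2, 4, 10):
    K = np.array([[corr_kernel(a, b, k) for b in imgs] for a in imgs])
    K /= np.sqrt(np.outer(np.diag(K), np.diag(K)))     # normalize the diagonal to 1
    print(k, np.round(K, 3))   # off-diagonal entries fall toward 0 as k increases
```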

10 Fig. A. Scatter diagrams of the canonical variates (u1, u2) and (u3, u1) for the Test 1 data of texture images from the Brodatz album in kCCA. Plotted are squares (■) for D4, crosses (×) for D84, circles (●) for D5 and triangles (Δ) for D92.
(a) linear kernel, (b) Gaussian kernel, (c) 2nd-order correlation kernel, (d) 3rd-order correlation kernel, (e) 4th-order correlation kernel, (f) 10th-order correlation kernel.
For most test data, u ≈ 0.

11 Modification of correlation kernels
・ The kth root of the kth-order correlation kernel in the limit k → ∞ is related to the max norm, which corresponds to the L_p norm ||x||_p = {∑|x_i|^p}^{1/p} in the limit p → ∞. The max norm corresponds to the peak response of a matched filter, which maximizes the SNR, and is therefore expected to be robust. The correlation kernel can thus be modified with its kth root, taking its sign into account.
・ A difference between the even- and odd-order correlations is that the odd-order autocorrelations are blind to sinusoidal signals and to random signals with symmetric distributions. This is because a change in the sign of the original data (x → −x) changes the sign of the odd-order autocorrelations but not of the even-order ones. In the correlation kernel, this appears as the parity of the power to which the cross-correlations are raised. The absolute values of the cross-correlations might therefore be used instead.

12 Proposed modified autocorrelation kernels
・ L_p norm kernel (P): sgn(cc_xi,xj(l_1, m_1)) |∑_{l_1,m_1} {cc_xi,xj(l_1, m_1)}^k|^{1/k}
・ Absolute kernel (A): ∑_{l_1,m_1} |cc_xi,xj(l_1, m_1)|^k
・ Absolute L_p norm kernel (AP): |∑_{l_1,m_1} {cc_xi,xj(l_1, m_1)}^k|^{1/k}
・ Absolute L_p norm absolute kernel (APA): |∑_{l_1,m_1} |cc_xi,xj(l_1, m_1)|^k|^{1/k}
・ Max norm kernel (Max): max_{l_1,m_1} cc_xi,xj(l_1, m_1)
・ Max norm absolute kernel (MaxA): max_{l_1,m_1} |cc_xi,xj(l_1, m_1)|
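A sketch of these six variants computed from a cross-correlation map cc (e.g., the cross_correlation_map sketch under slide 8). The argument of the sign factor in the P kernel is ambiguous in this transcript; here it is taken from the cross-correlation value of largest magnitude, which is one possible reading, and k is assumed to be an integer.

```python
import numpy as np

def modified_correlation_kernels(cc, k):
    """The six modified kernels of slide 12, from a cross-correlation map cc."""
    peak = cc.flat[np.argmax(np.abs(cc))]        # cc value of largest magnitude
    s_signed = np.sum(cc ** k)                   # sum of signed kth powers
    s_abs = np.sum(np.abs(cc) ** k)              # sum of absolute kth powers
    return {
        "P":    np.sign(peak) * np.abs(s_signed) ** (1.0 / k),  # Lp norm kernel
        "A":    s_abs,                                          # absolute kernel
        "AP":   np.abs(s_signed) ** (1.0 / k),                  # absolute Lp norm kernel
        "APA":  s_abs ** (1.0 / k),                             # absolute Lp norm absolute kernel
        "Max":  np.max(cc),                                     # max norm kernel
        "MaxA": np.max(np.abs(cc)),                             # max norm absolute kernel
    }
```

Any of these values can be assembled into a Gram matrix over the sample images and passed to the precomputed-kernel SVM of the slide 3 sketch, or used as Φ in the kPCA and kCCA sketches.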

13 Classification experiment
4-class classification problems with SVM, kPCA and kCCA.
Original images: 512×512 pixels (256 gray levels) from the VisTex database and the Brodatz album.
Sample and test images: 50×50 pixels, cut from the original images with random shift, scaling, rotation and Gaussian noise (100 images each).
Fig. 1. Texture images. Table 1. Sample and test sets.

14 Kernel functions K(x_i, x_j)
Linear kernel: x_i・x_j
Gaussian kernel: exp(−μ||x_i − x_j||^2)
Correlation kernels: r_xi・r_xj (C2-10)
Modified correlation kernels: (P2-10, A3-7, AP3-7, APA3-7, Max, MaxA)
Range of correlation lags: L_1 = M_1 = 10 (in 50×50 pixel images)
The simple nearest-neighbor classifier is used in the principal component space (u_1, …, u_R) for kPCA and in the canonical variate space (u_1, …, u_{C−1}) for kCCA.
Parameter values are chosen empirically (soft margin: C = 100, regularization: γ_x = γ_y = 0.1).

15 Fig. 2. Correct classification rates (CCR (%)) in SVM.

16 Fig. 3. Correct classification rates (CCR (%)) in kPCA.

17 Fig. 4. Correct classification rates (CCR (%)) in kCCA.

18 Comparison of the performance
Correct classification rates (CCRs) of the correlation kernels (C2-10) of odd or higher orders are low.
With the modification, the L_p norm kernels (P2-10) and the absolute kernels (A3-7) give high CCRs even for higher orders and for odd orders, respectively.
Their combinations (AP3-7, APA3-7) and the max norm kernels (Max, MaxA) also show good performance.
Table 2. Highest correct classification rates.

19 Summary
Modified versions of the correlation kernels are proposed.
・ Applying the L_p norm and the max norm → the poor generalization of the higher-order correlation kernels is improved.
・ Using the absolute values → the inferior performance of the odd-order correlation kernels relative to even orders, due to their blindness to sinusoidal or symmetrically distributed signals, is also improved.
SVM, kPCA and kCCA with the modified correlation kernels show good performance in texture classification experiments.

