Using Support Vector Machines to Enhance the Performance of Bayesian Face Recognition. IEEE Transactions on Information Forensics and Security. Zhifeng Li, Xiaoou Tang, Dept. of Information Engineering, Chinese University of Hong Kong

Outline
- Introduction
- Bayesian SVM
  - SVM
  - Bayesian Analysis
  - Bayesian SVM
- Two-Stage Clustering-Based Classification
  - Hierarchical Agglomerative Clustering (HAC)
  - Two-Stage SVM
- Adaptive Clustering Bayesian SVM
- Adaptive Clustering Multilevel Subspace SVM Algorithm
- Experiments
- Conclusion

Introduction
- Face recognition has been one of the most challenging computer vision research topics.
- Existing face recognition techniques: Eigenface, Fisherface, the Bayesian algorithm.
- Support vector machines (SVMs) improve the classification performance of PCA and LDA subspace features.
- An SVM finds a single hyperplane that separates two classes of vectors.

Introduction
- SVM vs. face recognition: binary vs. multiclass.
- The multiclass classification has to be reduced to a combination of binary SVMs.
- Several strategies address this: one-versus-all and pairwise.
- Either way, a large number of SVMs have to be trained.

Introduction
- Bayesian method
  - Converts the multiclass face recognition problem into a two-class classification problem (intrapersonal vs. extrapersonal variation), which makes it suitable for applying the SVM directly.
  - A single hyperplane, however, may not be enough.

SVM
[Figure: two classes of points plotted against Var 1 and Var 2, with the margin width marked around the separating hyperplane.]
IDEA 1: Select the separating hyperplane that maximizes the margin!

SVM
[Figure: the same two-class plot, with the margin bounded by the hyperplanes w·x + b = k and w·x + b = -k.]
The width of the margin is 2k / ||w||.
So the problem is: maximize 2k / ||w|| subject to w·x_i + b ≥ k for every x_i in class 1 and w·x_i + b ≤ -k for every x_i in class 2.

SVM
[Figure: the same plot, rescaled so that k = 1.]
There is a choice of scale and units for the data such that k = 1. Then the problem becomes: maximize 2 / ||w|| subject to w·x_i + b ≥ 1 for class 1 and w·x_i + b ≤ -1 for class 2.

SVM  If class 1 corresponds to 1 and class 2 corresponds to -1, we can rewrite  as  So the problem becomes: or

Bayesian Face Recognition
Two classes of face-difference variation are modelled: intrapersonal (differences between images of the same person) and extrapersonal (differences between images of different people).
The ML (maximum-likelihood) similarity between any two images is equated with the intrapersonal likelihood of their difference Δ: S(Δ) = P(Δ | Ω_I).
P(Δ | Ω_I) is obtained by PCA-based density estimation.

PCA-based Density Estimation
Perform PCA and factorize the density into two (orthogonal) Gaussian subspaces, an M-dimensional principal subspace and its (N - M)-dimensional orthogonal complement:
P(Δ | Ω) ≈ [ exp(-½ Σ_{i=1}^{M} y_i² / λ_i) / ((2π)^{M/2} Π_{i=1}^{M} λ_i^{1/2}) ] · [ exp(-ε²(Δ) / (2ρ)) / (2πρ)^{(N-M)/2} ]
Solve for the residual variance ρ that gives the minimal KL-divergence for the orthogonal subspace; the optimum is the average of its eigenvalues, ρ = (1 / (N - M)) Σ_{i=M+1}^{N} λ_i. (A numpy sketch of this estimate follows.)
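
As an illustration of this two-part estimate, the following numpy sketch (an assumption, not the paper's code) computes the log-likelihood of a difference vector under the factorized density: a Mahalanobis term in the M-dimensional principal subspace plus an isotropic residual term with variance ρ set to the average of the remaining eigenvalues.

import numpy as np

def pca_log_likelihood(deltas, x, M):
    """Approximate log P(x | Omega) with the factorization above: an M-dimensional
    principal subspace plus an isotropic residual with variance rho."""
    N = deltas.shape[1]
    mean = deltas.mean(axis=0)
    cov = np.cov(deltas - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    d = x - mean
    y = eigvecs[:, :M].T @ d                       # coefficients in the principal subspace
    rho = eigvals[M:].mean()                       # optimal (minimal-KL) residual variance
    eps2 = d @ d - y @ y                           # residual reconstruction error

    log_principal = -0.5 * np.sum(y**2 / eigvals[:M]) \
                    - 0.5 * (M * np.log(2 * np.pi) + np.sum(np.log(eigvals[:M])))
    log_residual = -eps2 / (2 * rho) - 0.5 * (N - M) * np.log(2 * np.pi * rho)
    return log_principal + log_residual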

Bayesian SVM
- Build an intrapersonal variation set and an extrapersonal variation set of image difference vectors.
- Project and whiten all the image difference vectors in the intrapersonal subspace, and use these two sets of vectors to train the SVM that generates the decision function f(·).
- For testing, compute the face difference vector, and then project and whiten it in the same way.
- The classification decision is made by the sign of f(·) on the whitened difference vector: intrapersonal (same person) if positive, extrapersonal otherwise. (A sketch of the training step follows.)
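
A minimal sketch of this training step, assuming numpy arrays of intrapersonal and extrapersonal difference vectors and an intrapersonal eigen-basis estimated as in the previous slide; the RBF kernel is an assumption, not necessarily the paper's choice.

import numpy as np
from sklearn.svm import SVC

def whiten(diffs, eigvecs, eigvals, M):
    # Project difference vectors onto the leading M intrapersonal eigenvectors
    # and scale each coordinate by 1 / sqrt(lambda_i) (whitening).
    return (diffs @ eigvecs[:, :M]) / np.sqrt(eigvals[:M])

def train_bayesian_svm(intra_diffs, extra_diffs, eigvecs, eigvals, M):
    Xi = whiten(intra_diffs, eigvecs, eigvals, M)   # intrapersonal variation set
    Xe = whiten(extra_diffs, eigvecs, eigvals, M)   # extrapersonal variation set
    X = np.vstack([Xi, Xe])
    y = np.hstack([np.ones(len(Xi)), -np.ones(len(Xe))])
    return SVC(kernel="rbf").fit(X, y)

# Testing: whiten the probe-gallery difference the same way and take the sign of
# svm.decision_function(...) as the intrapersonal / extrapersonal decision.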

Two-Stage Clustering-Based Classification
- Problems with the above methods:
  - One-versus-all approach: too many SVMs.
  - Direct Bayesian SVM: too many samples for one SVM.
- Goal: find a solution that balances the two extremes.
- When training an SVM, the most important region is the one around the decision hyperplane, so partition the gallery data into clusters.
- Method:
  - Use the Bayesian SVM to estimate the similarity matrix.
  - Use HAC to group similar faces into clusters.

Hierarchical Agglomerative Clustering
- Basic process of the HAC:
  1) Initialize a set of clusters.
  2) Find the nearest pair of clusters, i.e., the pair with the largest similarity measure, and merge them into a new cluster. Estimate the similarity measure between the new cluster and all the other clusters.
  3) Repeat step 2 until the stopping rule is satisfied.
- Different strategies used in each step lead to different designs of the HAC algorithm (a sketch of the basic loop follows).
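
A plain Python sketch of this loop, assuming a precomputed pairwise similarity matrix (e.g., built from the Bayesian SVM scores); average-link merging and a target cluster count as the stopping rule are assumptions.

import numpy as np

def hac(similarity, num_clusters):
    """Agglomerative clustering over a precomputed similarity matrix,
    following the three steps above."""
    n = similarity.shape[0]
    clusters = [[i] for i in range(n)]            # step 1: one cluster per sample
    while len(clusters) > num_clusters:           # step 3: stop at the target count
        best, pair = -np.inf, None
        for a in range(len(clusters)):            # step 2: most similar pair of clusters
            for b in range(a + 1, len(clusters)):
                s = np.mean([similarity[i, j] for i in clusters[a] for j in clusters[b]])
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]   # merge the nearest pair
        del clusters[b]
    return clusters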

Two-Stage SVM
- The similarity measure between two images is taken from their whitened face difference vector (the distance in the whitened intrapersonal subspace).
- Perform one-versus-all Bayesian SVM training within each cluster.
- During testing (sketched below):
  - Compute the whitened face difference vectors between the probe and the gallery classes.
  - Find the class that gives the smallest distance, and select its cluster.
  - Perform one-versus-all SVM classification within that cluster.
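
The two test steps might look like the following sketch, assuming whitened feature vectors, precomputed gallery class centers, a class-to-cluster assignment produced by HAC, and one trained one-versus-all SVM per class within each cluster (all hypothetical inputs).

import numpy as np

def two_stage_classify(probe, class_centers, cluster_of_class, cluster_svms):
    # Stage 1: the class whose whitened center is nearest to the probe selects the cluster.
    dists = np.linalg.norm(class_centers - probe, axis=1)
    cluster = cluster_of_class[int(np.argmin(dists))]

    # Stage 2: the one-versus-all SVMs restricted to that cluster make the final decision.
    scores = {cls: svm.decision_function(probe[None, :])[0]
              for cls, svm in cluster_svms[cluster].items()}
    return max(scores, key=scores.get)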

Adaptive Clustering Bayesian SVM
- Method:
  1) Use the Bayesian algorithm to find the cluster that is most similar to the test face.
  2) Use the one-versus-all algorithm to reclassify the face within this cluster.
- The one-versus-all Bayesian SVMs are trained in the training stage, and are then used to reclassify only the faces in the new cluster.

Adaptive Clustering Multilevel Subspace SVM
- Detailed algorithm for the first stage (a rough sketch follows this list):
  1) Divide the original face vector into K feature slices. Project each feature slice onto its PCA subspace, computed from the training set of that slice, and adjust the PCA dimension to remove most of the noise.
  2) Compute the intrapersonal subspace using the within-class scatter matrix in the reduced PCA subspace, and adjust the dimension of the intrapersonal subspace to reduce the intrapersonal variation.
  3) For the L individuals in the gallery, compute their training-data class centers. Project all of the class centers onto the intrapersonal subspace, and then normalize the projections by the intrapersonal eigenvalues to obtain the whitened feature vectors.
  4) Apply PCA to the whitened class centers to compute the final discriminant feature vectors.
  5) Combine the extracted discriminant feature vectors from all slices into a new feature vector.
  6) Apply PCA to the new feature vector to remove redundant information across the multiple slices. The features with large eigenvalues are selected to form the final feature vector for recognition.
- The second stage is the same as that of the adaptive clustering Bayesian SVM.
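
A rough numpy sketch of this first-stage feature extraction, assuming the face vectors have already been cut into K slices; all subspace dimensions (pca_dims, intra_dims, final_dim) and the exact per-slice processing details are assumptions rather than the paper's settings.

import numpy as np

def multilevel_subspace_features(slices_train, labels, slices_gallery,
                                 pca_dims, intra_dims, final_dim):
    per_slice = []
    for k, (Xtr, Xg) in enumerate(zip(slices_train, slices_gallery)):
        # 1) PCA on this slice, keeping pca_dims[k] components to suppress noise.
        mean = Xtr.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xtr - mean, full_matrices=False)
        P = Vt[:pca_dims[k]].T
        Ztr, Zg = (Xtr - mean) @ P, (Xg - mean) @ P

        # 2) Intrapersonal subspace from the within-class scatter in the PCA space.
        Sw = np.zeros((pca_dims[k], pca_dims[k]))
        for c in np.unique(labels):
            Zc = Ztr[labels == c]
            Sw += (Zc - Zc.mean(axis=0)).T @ (Zc - Zc.mean(axis=0))
        evals, evecs = np.linalg.eigh(Sw)
        V, lam = evecs[:, -intra_dims[k]:], evals[-intra_dims[k]:]

        # 3) Project gallery class centers and whiten by the intrapersonal eigenvalues.
        centers = np.array([Ztr[labels == c].mean(axis=0) for c in np.unique(labels)])
        Wc = (centers @ V) / np.sqrt(lam)
        Wg = (Zg @ V) / np.sqrt(lam)

        # 4) PCA on the whitened class centers gives this slice's discriminant features.
        mu = Wc.mean(axis=0)
        _, _, Vt2 = np.linalg.svd(Wc - mu, full_matrices=False)
        per_slice.append((Wg - mu) @ Vt2.T)

    # 5) Concatenate the per-slice features, then 6) PCA once more to drop redundancy.
    F = np.hstack(per_slice)
    muF = F.mean(axis=0)
    _, _, VtF = np.linalg.svd(F - muF, full_matrices=False)
    return (F - muF) @ VtF[:final_dim].T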

Experiments

Conclusion
- The direct Bayesian-based SVM is too simple: it tries to separate two complex subspaces with just one hyperplane.
- To improve the recognition performance, the one-versus-all, HAC-based, and adaptive clustering Bayesian-based SVMs are further developed.
- The experimental results clearly demonstrate the superiority of the new algorithms over traditional subspace methods.

Eigenfaces
Projects all the training faces onto a universal eigenspace to “encode” variations (“modes”) via principal components analysis (PCA).
Uses inverse distance as a similarity measure for matching & recognition. (A minimal sketch follows.)
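
As a quick illustration (not tied to this paper), a minimal eigenface matcher with scikit-learn: encode faces in the universal PCA eigenspace and match each probe to the nearest training face; the number of components is an arbitrary assumption.

import numpy as np
from sklearn.decomposition import PCA

def eigenface_match(train_faces, train_labels, probe_faces, n_components=50):
    pca = PCA(n_components=n_components).fit(train_faces)   # universal eigenspace
    train_codes = pca.transform(train_faces)
    probe_codes = pca.transform(probe_faces)
    # Nearest-neighbour matching: smallest distance = highest similarity.
    dists = np.linalg.norm(probe_codes[:, None, :] - train_codes[None, :, :], axis=2)
    return np.asarray(train_labels)[np.argmin(dists, axis=1)]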

Fisherfaces
- Eigenfaces attempt to maximize the scatter of the training images in face space.
- Fisherfaces attempt to maximize the between-class scatter while minimizing the within-class scatter.
- In other words, they move images of the same face closer together, while moving images of different faces further apart.
- Fisher Linear Discriminant: W_opt = argmax_W |Wᵀ S_B W| / |Wᵀ S_W W| (a sketch follows).
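
A short sketch of the Fisher criterion for reference, computing the between-class and within-class scatter matrices and the leading discriminant directions; the pseudo-inverse and the omission of the usual PCA pre-reduction step are simplifications, not the paper's procedure.

import numpy as np

def fisher_directions(X, labels, n_components):
    classes = np.unique(labels)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))                      # between-class scatter
    Sw = np.zeros((d, d))                      # within-class scatter
    for c in classes:
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
        Sw += (Xc - mc).T @ (Xc - mc)
    # W_opt = argmax |W' Sb W| / |W' Sw W|  ->  generalized eigenproblem Sb w = lambda Sw w.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:n_components]]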