Detecting Faces in Images: A Survey

Ming-Hsuan Yang, Member, IEEE, David J. Kriegman, Senior Member, IEEE, and Narendra Ahuja, Fellow, IEEE
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002

Face Detection
Given a single image, identify all image regions that contain a face, regardless of its 3D position, orientation, and lighting conditions.
Categorize and evaluate the different algorithms.

Methods to Detect/Locate Faces
Knowledge-based methods: encode human knowledge of what constitutes a typical face (usually the relationships between facial features).
Feature invariant approaches: aim to find structural features of a face that exist even when the pose, viewpoint, or lighting conditions vary.
Template matching methods: several standard patterns are stored to describe the face as a whole or the facial features separately.
Appearance-based methods: the models (or templates) are learned from a set of training images that capture the representative variability of facial appearance.

Appearance-Based Methods
Learn appearance “templates” from examples in images, using statistical analysis and machine learning.
Train a classifier using positive (and usually negative) examples of faces.
Issues: representation, preprocessing, training the classifier, search strategy, post-processing, view-based modeling.

Bayesian Classifier
The image or feature vector is a random variable x.
Because x is high-dimensional, p(x|..) is multimodal.
There are no natural parameterized forms.
Use an empirically validated parametric or non-parametric approximation.

Appearance-based Methods: Classifiers
Neural network: multilayer perceptrons
Principal Component Analysis (PCA), Factor Analysis
Mixture of PCA, mixture of factor analyzers
Support vector machine (SVM)
Distribution-based method
Naïve Bayes classifier
Hidden Markov model
Sparse network of winnows (SNoW)
Kullback relative information
Inductive learning: C4.5
AdaBoost
…

Eigenfaces
Face images can be linearly encoded using a modest number of basis images [Kirby and Sirovich].
Principal Component Analysis (PCA): minimize the mean square error between the projection of the training images onto this subspace and the original images.
Each m x n image is treated as a vector of length m*n. There are N samples and K basis vectors (the eigenfaces), with K << N.

Eigenfaces for recognition
Matthew Turk and Alex Pentland, J. Cognitive Neuroscience, 1991

Linear subspaces
Classification can be expensive: it is a big search problem (e.g., nearest neighbors) or requires storing large PDFs.
Suppose the data points (the orange points in the slide's figure) lie close to a line.
Idea: fit a line; the classifier measures distance to the line.
Convert x into v1, v2 coordinates.
What does the v2 coordinate measure? Distance to the line. Use it for classification: it is near 0 for orange points.
What does the v1 coordinate measure? Position along the line. Use it to specify which orange point it is.
CSE 576, Spring 2008 Face Recognition and Detection

Dimensionality reduction
We can represent the orange points with only their v1 coordinates, since the v2 coordinates are all essentially 0.
This makes it much cheaper to store and compare points.
The saving matters even more for higher-dimensional problems.

Linear subspaces
Consider the variation along direction v among all of the orange points.
What unit vector v minimizes the variance? What unit vector v maximizes it?
Solution: v1 is the eigenvector of A with the largest eigenvalue, and v2 is the eigenvector of A with the smallest eigenvalue.
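
The 2D case can be worked through directly. The sketch below is only an illustration: the transcript does not define A, so the code assumes A is the scatter matrix of the mean-centered points, and the function name `principal_directions_2d` is made up for this example.

```python
import math

def principal_directions_2d(points):
    """Fit the principal axes of 2D points.

    Assumption: A is the scatter matrix sum((x - mean)(x - mean)^T).
    Returns (mean, v1, v2, lam1, lam2) with lam1 >= lam2.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the symmetric 2x2 scatter matrix A = [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points)
    b = sum((p[0] - mx) * (p[1] - my) for p in points)
    c = sum((p[1] - my) ** 2 for p in points)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    root = math.hypot((a - c) / 2, b)
    lam1, lam2 = half_trace + root, half_trace - root
    # Orientation of the axis of largest variance.
    theta = 0.5 * math.atan2(2 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return (mx, my), v1, v2, lam1, lam2

mean, v1, v2, lam1, lam2 = principal_directions_2d([(0, 0), (1, 1), (2, 2), (3, 3)])
print(v1, lam1, lam2)
```

For points lying exactly on a line, the smaller eigenvalue is zero, which matches the slide's remark that the v2 coordinates are essentially 0.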

Principal component analysis
Suppose each data point is N-dimensional. The same procedure applies.
The eigenvectors of A define a new coordinate system: the eigenvector with the largest eigenvalue captures the most variation among the training vectors x, and the eigenvector with the smallest eigenvalue has the least variation.
We can compress the data using the top few eigenvectors. This corresponds to choosing a “linear subspace”: points are represented on a line, plane, or “hyper-plane”.
These eigenvectors are known as the principal components.

The space of faces
An image is a point in a high-dimensional space: an N x M image is a point in R^(NM).
We can define vectors in this space as we did in the 2D case.

Dimensionality reduction
The set of faces is a “subspace” of the set of images.
We can find the best subspace using PCA.
This is like fitting a “hyper-plane” to the set of faces, spanned by vectors v1, v2, ..., vK; any face is approximated by a point on this hyper-plane.

Eigenfaces
PCA extracts the eigenvectors of A, which gives a set of vectors v1, v2, v3, ...
Each vector is a direction in face space. What do these look like?

Projecting onto the eigenfaces
The eigenfaces v1, ..., vK span the space of faces.
A face is converted to eigenface coordinates by projecting it onto each eigenface.
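
The projection formula did not survive in this transcript. A minimal sketch of the usual form is given here, assuming the eigenfaces are orthonormal and that the mean face is subtracted first (both are assumptions, as are the function names `project` and `reconstruct`).

```python
def project(x, mean, eigenfaces):
    """Eigenface coordinates a_i = v_i . (x - mean) for each eigenface v_i."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(vj * cj for vj, cj in zip(v, centered)) for v in eigenfaces]

def reconstruct(coeffs, mean, eigenfaces):
    """Approximate the image as mean + sum_i a_i * v_i."""
    out = list(mean)
    for a, v in zip(coeffs, eigenfaces):
        for j, vj in enumerate(v):
            out[j] += a * vj
    return out

# Toy 3-pixel "images" with K = 2 orthonormal basis vectors.
mean = [1.0, 1.0, 1.0]
eigenfaces = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
coeffs = project([3.0, 4.0, 5.0], mean, eigenfaces)
print(coeffs, reconstruct(coeffs, mean, eigenfaces))
```

The reconstruction loses whatever part of the image lies outside the span of the K eigenfaces, which is the quantity a detector can use to decide whether the input looks like a face at all.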

Recognition with eigenfaces
Algorithm
1. Process the image database (a set of images with labels): run PCA to compute the eigenfaces, then calculate the K coefficients for each image.
2. Given a new image x to be recognized, calculate its K coefficients.
3. Detect whether x is a face.
4. If it is a face, who is it? Find the closest labeled face in the database (nearest neighbor in K-dimensional space).

Choosing the dimension K
[Plot: eigenvalues against index i, for i from K up to NM]
How many eigenfaces to use? Look at the decay of the eigenvalues.
The eigenvalue tells you the amount of variance “in the direction” of that eigenface, so ignore eigenfaces with low variance.
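
One common way to turn "look at the decay of the eigenvalues" into a rule is to keep enough eigenfaces to cover a fixed fraction of the total variance. The sketch below assumes that rule; the 0.95 default and the name `choose_k` are illustrative choices, not values from the slides.

```python
def choose_k(eigenvalues, keep=0.95):
    """Smallest K whose top-K eigenvalues cover `keep` of the total variance."""
    vals = sorted(eigenvalues, reverse=True)
    total = sum(vals)
    running = 0.0
    for k, lam in enumerate(vals, start=1):
        running += lam
        if running >= keep * total:
            return k
    return len(vals)

print(choose_k([50, 30, 15, 4, 1], keep=0.9))
```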

Distribution-Based Methods
[Sung and Poggio, 94]
Learn the distribution of image patterns of one object class from positive and negative examples.
Distribution-based models for face and non-face patterns: each 19x19 image is a 361-D vector.
K-means clustering: 6 face clusters and 6 non-face clusters.
Each cluster is a multidimensional Gaussian with a mean and covariance matrix.
A multilayer perceptron is used as the classifier.

[Sung and Poggio, 94]

[Sung and Poggio, 94]
Masking: reduce the unwanted background noise in a face pattern.
Illumination gradient correction: find the best-fit brightness plane and subtract it from the pattern to reduce heavy shadows caused by extreme lighting angles.
Histogram equalization: compensate for imaging effects due to changes in illumination and different camera input gains.

Distance Metrics [Sung and Poggio, 94]
Compute the distances of a sample to all the face and non-face clusters.
Within-subspace distance (D1): Mahalanobis distance of the projected sample to the cluster center.
Distance to the subspace (D2): distance of the sample to the subspace.
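
The two distances can be sketched for a single cluster as follows. This is a simplified illustration that assumes an orthonormal basis for the cluster's principal subspace and uses a plain Mahalanobis distance without any normalization terms; the function name and argument layout are hypothetical.

```python
import math

def cluster_distances(x, center, basis, eigenvalues):
    """Return (D1, D2) of sample x relative to one cluster.

    basis: orthonormal vectors spanning the cluster's subspace (assumed).
    eigenvalues: variance of the cluster along each basis vector.
    """
    d = [xi - ci for xi, ci in zip(x, center)]
    # Coordinates of the centered sample inside the subspace.
    coeffs = [sum(bj * dj for bj, dj in zip(b, d)) for b in basis]
    # D1: Mahalanobis distance of the projected sample to the center.
    d1 = math.sqrt(sum(c * c / lam for c, lam in zip(coeffs, eigenvalues)))
    # D2: Euclidean distance from the sample to its projection.
    residual_sq = sum(dj * dj for dj in d) - sum(c * c for c in coeffs)
    d2 = math.sqrt(max(residual_sq, 0.0))
    return d1, d2

print(cluster_distances([3, 4, 12], [0, 0, 0], [[1, 0, 0], [0, 1, 0]], [9, 4]))
```

Repeating this for every face and non-face cluster gives the vector of distance measurements that is fed to the classifier.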

[Sung and Poggio, 94] Distance measure

[Sung and Poggio, 94]
Feature vector for each sample: a vector of distance measurements to all clusters.
A multilayer perceptron classifier is trained from a database of 47,316 patterns, of which 4,150 are faces.
Face examples are easy to collect. Non-face examples are hard, because it is difficult to get a representative sample.
Bootstrap method: selectively add images to the training set as training progresses.

Face and Non-Face Exemplars
Positive examples: get as much variation as possible. Manually crop and normalize each face image to a standard size (e.g., 19 x 19). Create virtual examples [Sung and Poggio 94].
Negative examples: a fuzzy idea, since any image that does not contain a face qualifies. This is a large image subspace. Use bootstrapping [Sung and Poggio 94].

Creating Virtual Positive Examples
A simple and very effective method.
Randomly mirror, rotate, translate, and scale face samples by small amounts.
This increases the number of training examples and makes the detector less sensitive to alignment error.
[Figure: randomly mirrored, rotated, translated, and scaled faces, Sung & Poggio 94]

Bootstrapping [Sung and Poggio, 94]
1. Start with a small set of non-face examples in the training set.
2. Train an MLP classifier with the current training set.
3. Run the learned face detector on a sequence of random images. Collect all the non-face patterns that the current system wrongly classifies as faces (i.e., false positives).
4. Add these non-face patterns to the training set.
5. Go to Step 2, or stop if satisfied.
This improves system performance greatly.
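
The bootstrapping loop can be sketched independently of the classifier. In the example below the MLP is replaced by a toy one-dimensional threshold trainer so the block stays runnable; `bootstrap_negatives`, `train_threshold`, and the scalar "patterns" are all stand-ins invented for illustration.

```python
def bootstrap_negatives(faces, initial_nonfaces, background_pool, train, max_rounds=10):
    """Grow the non-face set with the current classifier's false positives.

    train(faces, nonfaces) must return a function pattern -> bool (True = face).
    background_pool holds patterns known to contain no faces.
    """
    negatives = list(initial_nonfaces)
    classifier = train(faces, negatives)
    for _ in range(max_rounds):
        false_positives = [p for p in background_pool
                           if classifier(p) and p not in negatives]
        if not false_positives:
            break  # nothing left to learn from this pool
        negatives.extend(false_positives)
        classifier = train(faces, negatives)
    return classifier, negatives

def train_threshold(faces, nonfaces):
    """Toy stand-in for the MLP: threshold halfway between the two classes."""
    threshold = (max(nonfaces) + min(faces)) / 2
    return lambda p: p > threshold

clf, negs = bootstrap_negatives([8, 9, 10], [0], [1, 2, 5, 6, 7], train_threshold)
print(sorted(negs), clf(9), clf(7))
```

Only the patterns the current detector gets wrong are added, so the negative set concentrates near the decision boundary instead of sampling the whole non-face space.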

Probabilistic Visual Learning method based on density estimation
B. Moghaddam and A. Pentland
PCA decomposition: the principal subspace and its orthogonal complement (discarded in standard PCA).
Learn local features with a multivariate Gaussian or a mixture of Gaussians.
Detect by maximum likelihood, using both the distance from feature space and the distance in feature space.

Mixture of Factor Analyses
[Yang et al. 00]
Factor Analysis (FA): a generative method that performs clustering and dimensionality reduction within each cluster.
Models the covariance structure of high-dimensional data using a small number of latent variables.
Similar to PCA, but with differences: the data density is normalized along the principal component subspace, and the model is robust to independent noise in the features.
Able to detect faces under wide variations.

Mixture of Factor Analyses
[Yang et al. 00]
Use a mixture model to detect faces in different poses.
Use EM to estimate all the parameters in the mixture model.
See also [Moghaddam and Pentland 97] on using a probabilistic Gaussian mixture for object localization.

Fisher’s Linear Discriminant
[Yang et al. 00]
Projects the high-dimensional image space to a low-dimensional space.
Provides a better projection than PCA for pattern classification, since it aims to find the most discriminant projection direction.
Outperforms the Eigenface method on several databases.

Fisher’s Linear Discriminant
[Yang et al. 00]
Given a set of unlabeled face and non-face samples:
Apply a Self-Organizing Map (SOM) to cluster faces and non-faces, and thereby obtain labels for the samples.
Apply FLD to find the optimal projection matrix for maximal separation.
Estimate the class-conditional density for detection.
[Diagram: SOM (face/non-face prototypes generated by SOM), FLD, class-conditional density, maximum likelihood estimation]

Neural Networks
Feasibility of training a system to capture the complex class-conditional density of face patterns.
Hierarchical neural networks [Agui et al. 1992]:
First stage: two parallel subnetworks whose inputs are intensity values from the original image and intensity values from the image filtered with a 3x3 Sobel filter.
Second stage: inputs are the outputs from the subnetworks and extracted feature values.
Works when the faces have the same size.

Convolutional neural networks
Vaillant et al.
Examples of face and non-face images: 20x20 pixels.
Two neural networks:
A: trained to find approximate locations of faces at some scale (selects candidates).
B: trained to determine the exact position of faces at some scale (verifies).

Multilayer Perceptron
[Burel and Carel, 94]
Training: compress the examples using a SOM; a multilayer perceptron then learns them for face/background classification.
Detection: scan each image at various resolutions, normalize each location and size to a standard size, and classify the normalized window with the MLP.

Autoassociative network
With multiple layers, an autoassociative network performs nonlinear principal component analysis.
Different autoassociative networks handle different views: one detects frontal-view faces, and one detects faces turned up to 60° to the left or right.
A gating network assigns weights to the frontal and side face detectors in an ensemble of autoassociative networks.

Probabilistic Decision-Based Neural Network (PDBNN)
[Lin et al. 1997]
Similar to a radial basis function network, with modified learning rules and a probabilistic interpretation.
Extract feature vectors based on intensity and edge information from a region that contains the eyebrows, eyes, and nose.
Feed the two vectors to the PDBNN and use a fusion of the outputs to classify.

Multilayer Neural Network
Rowley et al.
Train multiple multilayer perceptrons with different receptive fields [Rowley and Kanade 96].
Merge the overlapping detections within one network.
Train an arbitration network to combine the results from different networks.
Requires finding the right neural network architecture (number of layers, hidden units, etc.) and parameters (learning rate, etc.).

Neural Network-Based Detector
H. Rowley, S. Baluja, and T. Kanade

Dealing with Multiple Detects
Merging overlapping detections within one network [Rowley and Kanade 96]

Dealing with Multiple Detects
Arbitration among multiple networks:
AND operator
OR operator
Voting
Arbitration network
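
The first three arbitration rules can be illustrated with set operations. This sketch assumes each network reports detections as hashable window tuples and that detections from different networks match exactly; real systems typically match windows within a small neighborhood. The function name `arbitrate` and the `(x, y, size)` tuple layout are hypothetical.

```python
from collections import Counter

def arbitrate(detections_per_network, rule="vote"):
    """Combine detections from several networks.

    detections_per_network: one iterable of (x, y, size) windows per network.
    rule: "and" keeps windows found by all networks, "or" keeps windows found
    by any network, "vote" keeps windows found by a strict majority.
    """
    sets = [set(d) for d in detections_per_network]
    if rule == "and":
        return set.intersection(*sets)
    if rule == "or":
        return set.union(*sets)
    if rule == "vote":
        counts = Counter(w for s in sets for w in s)
        return {w for w, c in counts.items() if c > len(sets) / 2}
    raise ValueError(f"unknown rule: {rule}")

nets = [
    {(10, 10, 20), (50, 50, 20)},
    {(10, 10, 20)},
    {(10, 10, 20), (90, 90, 20)},
]
print(arbitrate(nets, "and"), len(arbitrate(nets, "or")), arbitrate(nets, "vote"))
```

AND lowers the false-positive rate at the cost of missed faces, OR does the opposite, and voting sits between the two.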

Support Vector Machines
A paradigm for training polynomial function, neural network, or radial basis function (RBF) classifiers.
Most methods for training a classifier (e.g., Bayesian, neural networks, RBF) are based on minimizing the training error.
SVMs instead operate on structural risk minimization, which minimizes an upper bound on the expected generalization error.

Support Vector Machines
Find the optimal separating hyperplane, which is constructed from support vectors [Vapnik 95].
Maximize the distance between the hyperplane and the data points closest to it (large margin classifier).
Formulated as a quadratic programming problem.
Kernel functions are used for nonlinear SVMs.

SVM-Based Face Detector
[Osuna et al. 97]
Adopts an architecture similar to [Sung and Poggio 94], with an SVM classifier.
Pros: good recognition rate with theoretical support.
Cons: time consuming in training and testing; the right kernel must be picked.

SVM-Based Face Detector: Issues
Training: solve a complex quadratic optimization problem. Speed-up: Sequential Minimal Optimization (SMO) [Platt 99].
Testing: the number of support vectors may be large, which means many kernel computations. Speed-up: reduced set of support vectors [Romdhani et al. 01].
Variants: component-based SVM [Heisele et al. 01], which learns components and their geometric configuration and is less sensitive to pose variation.

Sparse Network of Winnows (SNoW)
Yang et al. 00
A sparse network of linear functions that uses the Winnow update rule.
An on-line, mistake-driven algorithm with a multiplicative learning rule.
Attribute (feature) efficiency: the allocation of nodes and links is data driven, and complexity depends on the number of active features.
Allows tasks to be combined hierarchically.

Sparse Network of Winnows (SNoW)
Yang et al. 00
Multiplicative weight update algorithm.
Pros: on-line feature selection [Yang et al. 00]; detects faces with different features and expressions, in different poses, and under different lighting conditions.
Cons: needs a more powerful feature representation.
Has similar performance, but is computationally more efficient.
Has also been applied to object recognition [Yang et al. 02].

Naive Bayes Classifier
Schneiderman and Kanade, 98
Estimate the joint probability of local appearance and position at multiple resolutions.
Local patterns are more distinctive; for example, intensity patterns around the eyes are much more distinctive.
Learn the distribution by parts using a Naïve Bayes classifier. This provides better estimation of the conditional density functions, and a functional form of the posterior probability that captures the joint statistics of local appearance and position.

Schneiderman and Kanade, 98
At each scale, a face image is decomposed into 4 subregions.
The subregions are then projected to a lower-dimensional space (PCA) and quantized into a finite set of patterns.
The statistics of each projected subregion are estimated from the projected samples to encode local appearance.

Schneiderman and Kanade, 98
Apply the Bayes decision rule.
Further decompose the appearance into space, frequency, and orientation.
See also the wavelet representation for general object recognition [H. Schneiderman and T. Kanade, 00].

Detecting Faces in Different Poses
Schneiderman and Kanade, 98
Extended to detect faces in different poses with multiple detectors; each detector specializes in one view: frontal, left pose, or right pose.
[Mikolajczyk et al. 01] extend this to detect faces from side pose to frontal view.

Experimental Results Schneiderman and Kanade, 98
Able to detect profile faces [Schneiderman and Kanade 98].
Extended to detect cars [Schneiderman and Kanade 00].

Hidden Markov Model
Assumptions of an HMM: patterns can be characterized as a parametric random process, and the parameters can be estimated in a precise, well-defined manner.
Developing an HMM:
The hidden states need to be decided.
Learn the transition probabilities between states from examples, where each example is represented as a sequence of observations.
Maximize the probability of observing the training data by adjusting the parameters (Viterbi segmentation method and Baum-Welch algorithm).

Hidden Markov Model: Face Pattern
A face pattern consists of several regions (eye, nose, mouth, forehead, chin).
Observe these regions in an appropriate order (top to bottom, left to right).
The aim is to associate facial regions with the states of a continuous-density hidden Markov model.

Hidden Markov Model for Face Localization
Observation vectors: scan the window vertically with P pixels of overlap.
Five hidden states.
The boundaries between strips of pixels are represented by probabilistic transitions between states.

Information-Theoretical Approach
Contextual constraints in a face pattern are specified over a small neighborhood of pixels.
A Markov random field (MRF) is a convenient and consistent way to model context-dependent entities such as image pixels and correlated features. This is achieved by characterizing mutual influences using conditional MRF distributions.
Using Kullback relative information, select the Markov process that maximizes the information-based discrimination between the two classes, and apply it to detection.

Elements of Information Theory
T. Cover and J. Thomas, 91
Probability functions: p(x), the template is a face; q(x), the template is a non-face.
A training database is used to estimate the distributions, using histograms.
Face: 100 individuals x 9 views.
Non-face: non-face templates.

Select the most informative pixels (MIP): maximize the Kullback relative information between p(x) and q(x).
The MIP distribution focuses on the eye and mouth regions and avoids the nose area.
Use the MIP to obtain linear features for classification and representation [Fukunaga and Koontz].
Detect faces: pass a window over the input image and compute the distance from face space (DFFS) [Pentland et al, 94]. If DFFS-Face < DFFS-Nonface, a face is assumed to exist within the window.
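
Kullback relative information between two discrete distributions, and a per-pixel ranking based on it, can be sketched as follows. The ranking treats each pixel independently, which is a simplifying assumption made for this example and not necessarily how the cited work selects pixels; both function names are hypothetical.

```python
import math

def kullback_divergence(p, q, eps=1e-12):
    """Kullback relative information sum_i p_i * log(p_i / q_i).

    eps guards against empty histogram bins in q.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def most_informative_pixels(face_hists, nonface_hists, count):
    """Indices of the `count` pixels whose face and non-face intensity
    histograms differ most (pixels treated independently, an assumption)."""
    scores = [kullback_divergence(p, q)
              for p, q in zip(face_hists, nonface_hists)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:count]

face = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]      # one histogram per pixel
nonface = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
print(most_informative_pixels(face, nonface, 2))
```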

Colmenarez and Huang, 97
Apply Kullback relative information to maximize the information-based discrimination between positive and negative examples of faces.
A family of discrete Markov processes models the face and background patterns and is used to estimate the probability model.
Learning becomes an optimization: select the Markov process that maximizes the information-based discrimination between the two classes.

Object Detection Using Hierarchical MRF and MAP Estimation
Qian and Huang, 97
Combine view-based and model-based approaches.
Use a visual-attention algorithm to reduce the search space by selecting important image regions.
Detect faces in the selected regions with a combination of template matching and feature matching, using a hierarchical Markov random field and maximum a posteriori (MAP) estimation.

Inductive Learning
Learning by example: a system tries to induce a general rule from a set of observed instances.
Algorithms: ID3 (Quinlan, 1986), C4.5 (Quinlan, 1993), FOIL (Quinlan, 1990).

Detection of Human Faces Using Decision Trees
J. Huang et al. 96
Learn a decision tree from positive and negative examples of face patterns.
Training example: an 8x8 pixel window represented by a vector of 30 attributes, composed of the entropy, mean, and standard deviation of the pixel intensity values.
C4.5 builds a classifier as a decision tree: leaves indicate class identity, and nodes specify tests to perform on a single attribute.
The learned decision tree is then used to decide whether a face exists in the input example.
Results: localization accuracy rate of 96% on a set of 2,340 frontal face images in the FERET data set.

Learning the Human Face Concept from Black and White Pictures
N. Duta and A.K. Jain, ICPR, 1998.
Learn the face concept using Mitchell’s Find-S algorithm.
The distribution of face patterns P(x|face) can be approximated by a set of Gaussian clusters.
For a face instance, apply the Find-S algorithm to learn the thresholding distance such that faces and nonfaces can be differentiated.
Several distinct characteristics:
First, it does not use negative (nonface) examples.
Second, only the central portion of a face is used for training.
Third, feature vectors consist of images with 32 intensity levels or textures, while some other methods use full-scale intensity values as inputs.
Detection rate of 90 percent on the first CMU data set.

Face Databases
The training process is essential, and so are benchmark data sets.
Face image databases:
The FERET database consists of monochrome images taken in different frontal views and in left and right profiles.
It is used to assess the strengths and weaknesses of different face recognition approaches.
Since each image consists of an individual on a uniform and uncluttered background, it is not suitable for face detection benchmarking.

Turk and Pentland
ftp://whitechapel.media.mit.edu/pub/images/
16 people. Images are taken in frontal view with slight variability in head orientation (tilted upright, right, and left) on a cluttered background.

AT&T Cambridge Laboratories
Formerly known as the Olivetti database.
10 images for each of 40 distinct subjects.
Varying time, lighting, facial expression, and facial details.

Harvard Database
Cropped, masked frontal face images, taken under a wide variety of light sources.
Used to study face recognition under varying illumination conditions.

Yale Face Database
5,760 single-light-source images of 10 subjects, each seen under 576 viewing conditions (9 poses x 64 illumination conditions).
For every subject in a particular pose, an image with ambient (background) illumination was also captured, so the total number of images is in fact 5,850 (5,760 + 90).
The total size of the compressed database is about 1 GB.

M2VTS Multimodal Database
Developed for access control experiments using multimodal inputs.
Contains sequences of face images of 37 people. Five sequences for each subject were taken over one week.
Each image sequence contains images from right profile (-90 degrees) to left profile (90 degrees), while the subject counts from “0” to “9” in their native language.

UMIST Database
564 images of 20 people with varying pose.
The images of each subject cover a range of poses from right profile to frontal views.

Purdue AR Database
A. Martinez and R. Benavente, 1998
3,276 color images of 126 people (70 males and 56 females) in frontal view.
Designed for face recognition experiments under several mixing factors, such as facial expressions, illumination conditions, and occlusions. It has also been applied to image and video indexing and retrieval.
All the faces appear with different facial expressions (neutral, smile, anger, and scream), illumination (left light source, right light source, and sources from both sides), and occlusion (wearing sunglasses or a scarf).
Taken during two sessions separated by two weeks, with the same camera setup under tightly controlled conditions of illumination and pose.

Face Image Databases
The abovementioned databases are designed mainly to measure the performance of face recognition methods and, thus, each image contains only one individual.
They are best utilized as training sets rather than test sets.

Benchmark Test Sets
K.-K. Sung and T. Poggio, 96 & 98
First set: 301 frontal and near-frontal mugshots of 71 different people. High-quality digitized images with a fair amount of lighting variation.
Second set: 23 images with a total of 149 face patterns. Most of these images have complex backgrounds, with faces taking up only a small amount of the image area.

Samples of Sung and Poggio 98
Some images are scanned from newspapers and, thus, have low resolution.
Most faces in the images are upright and frontal, though some appear in different poses.

Database by Rowley et al.
130 images with a total of 507 frontal faces.
Also includes the 23 images of the second data set used by [Sung and Poggio, 1998].
Most images contain more than one face on a cluttered background.
A good test set for assessing algorithms that detect upright frontal faces.

Database by Rowley et al.
Some images contain hand-drawn cartoon faces.
Most images contain more than one face, and the face size varies significantly.

Another Database by Rowley et al.
For detecting 2D faces with frontal pose and rotation in the image plane.
50 images with a total of 223 faces, of which 210 are at angles > 10 degrees.

Profile Views Database
Schneiderman and Kanade, 00
208 images.
Each image contains faces with facial expressions and in profile views.

Kodak Face Database
A common test bed for direct benchmarking of face detection and recognition algorithms.
300 digital photos, captured in a variety of resolutions.
Face size ranges from as small as 13x13 pixels to as large as 300x300 pixels.

Test Sets for Face Detection

Performance Evaluation
Performance of several appearance-based face detection methods on two standard data sets: Test Set 1 (125 images with 483 faces) and Test Set 2 (23 images with 136 faces).
The methods were not all tested on the same test set.

Experimental Results
Appearance-based face detection methods: the number and variety of training examples have a direct effect on the classification performance.

More Issues
Training time and execution time.
The number of scanning windows varies a lot.
Different criteria are adopted in reporting the detection rates: a loose criterion may declare all the faces as “successful” detections, while a stricter one would declare most of them as nonfaces.

More Issues
The evaluation criteria may and should depend on the purpose of the detector.
Required computational resources, particularly time and memory.

A Collection of Sample Face Detection Codes and Evaluation Tools

Detecting Faces in Images: A Survey
Provides a comprehensive survey of research on face detection.
Provides some structural categories for the methods described in over 150 papers.
It is imprudent to explicitly declare which methods indeed have the lowest error rates.
The community needs to consider systematic performance evaluation more seriously.

Challenging and Interesting Problem
The class of faces admits a great deal of shape, color, and albedo variability due to differences in individuals, nonrigidity, facial hair, glasses, and makeup.
Images are formed under variable lighting and 3D pose and may have cluttered backgrounds.

Robust real-time face detection
Paul A. Viola and Michael J. Jones, Intl. J. Computer Vision 57(2), 137–154, 2004 (originally in CVPR’2001)
(slides adapted from Bill Freeman, MIT 6.869, April 2005)

Scan classifier over locations and scales

“Learn” classifier from data
Training data: 5000 faces (frontal) and 10^8 non-faces.
Faces are normalized for scale and translation.
Many variations: across individuals, illumination, and pose (rotation both in plane and out).

Characteristics of algorithm
Feature set (…is huge: about 16M features).
Efficient feature selection using AdaBoost.
New image representation: the integral image.
Cascaded classifier for rapid detection.
Fastest known face detector for gray scale images.

Image features
“Rectangle filters”, similar to Haar wavelets.
Differences between sums of pixels in adjacent rectangles.

Integral Image
Partial sum: each entry holds the sum of all pixels above and to the left.
The sum over any rectangle D is obtained from the integral image values at its four corners: D = 1 + 4 - (2 + 3).
Also known as summed area tables [Crow84] and boxlets [Simard98].
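
The four-corner rule D = 1 + 4 - (2 + 3) can be illustrated with a short sketch. The padding convention (an extra zero row and column) and the helper names `integral_image`, `rect_sum`, and `two_rect_feature` are choices made for this example only.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four lookups: 1 + 4 - (2 + 3)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] + ii[top][left] - ii[top][right] - ii[bottom][left]

def two_rect_feature(ii, top, left, height, width):
    """Rectangle filter: left half minus right half (width assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2), two_rect_feature(ii, 0, 0, 2, 2))
```

Once the integral image is built in one pass, every rectangle sum costs the same four lookups regardless of the rectangle's size, which is what makes evaluating many rectangle filters per window affordable.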

Huge library of filters

Constructing the classifier
A perceptron yields a sufficiently powerful classifier.
Use AdaBoost to efficiently choose the best features: add a new h_i(x) at each round.
Each h_i(x) is a “decision stump” with threshold θ on the filter value x: it outputs a = E_w(y [x < θ]) below the threshold and b = E_w(y [x > θ]) above it.

Constructing the classifier
For each round of boosting:
Evaluate each rectangle filter on each example.
Sort the examples by filter values.
Select the best threshold for each filter (minimum error), using the sorted order to quickly scan for the optimal threshold.
Select the best filter/threshold combination.
The weight is a simple function of the error rate.
Reweight the examples.
(There are many tricks to make this more efficient.)
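
One round of this procedure can be sketched as follows. The sketch uses the standard discrete AdaBoost weight update with labels in {+1, -1}, which may differ in detail from the variant used in the paper; the function names and the `features[f][i]` layout (value of filter f on example i) are assumptions made for the example.

```python
import math

def best_stump(values, labels, weights):
    """Scan sorted values for the threshold with minimum weighted error.

    Returns (error, threshold, polarity). Polarity +1 predicts "face" when
    value > threshold; polarity -1 predicts "face" when value <= threshold.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    total_neg = sum(w for w, y in zip(weights, labels) if y == -1)
    pos_below = neg_below = 0.0
    best = (float("inf"), None, 1)
    for rank, i in enumerate(order):
        if labels[i] == 1:
            pos_below += weights[i]
        else:
            neg_below += weights[i]
        # A threshold cannot separate examples with identical values.
        if rank + 1 < len(order) and values[order[rank + 1]] == values[i]:
            continue
        err_above = pos_below + (total_neg - neg_below)
        err_below = neg_below + (total_pos - pos_below)
        for err, pol in ((err_above, 1), (err_below, -1)):
            if err < best[0]:
                best = (err, values[i], pol)
    return best

def boost_round(features, labels, weights):
    """Pick the best filter/threshold, compute its vote, reweight examples."""
    err, f, thr, pol = min(
        (best_stump(vals, labels, weights) + (idx,) for idx, vals in enumerate(features)),
        key=lambda t: t[0],
    )[0:1] + (None, None, None)  # placeholder, replaced below
    # Explicit search kept simple and readable:
    best = None
    for idx, vals in enumerate(features):
        e, t, p = best_stump(vals, labels, weights)
        if best is None or e < best[0]:
            best = (e, idx, t, p)
    err, f, thr, pol = best
    err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0)
    alpha = 0.5 * math.log((1 - err) / err)    # stump weight from error rate
    new_w = []
    for i, y in enumerate(labels):
        pred = pol if features[f][i] > thr else -pol
        new_w.append(weights[i] * math.exp(-alpha * y * pred))
    z = sum(new_w)
    return (f, thr, pol, alpha), [w / z for w in new_w]

stump, w = boost_round([[1, 2, 3, 4]], [-1, 1, -1, 1], [0.25] * 4)
print(stump[:3], w)
```

After the round, the single misclassified example carries half of the total weight, so the next round is pushed toward a filter that handles it.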

Good reference on boosting
Friedman, J., Hastie, T. and Tibshirani, R. Additive Logistic Regression: a Statistical View of Boosting.
“We show that boosting fits an additive logistic regression model by stagewise optimization of a criterion very similar to the log-likelihood, and present likelihood based alternatives. We also propose a multi-logit boosting procedure which appears to have advantages over other methods proposed so far.”

Trading speed for accuracy
Given a nested set of classifier hypothesis classes.
Computational risk minimization.

Speed of face detector (2001)
Speed is proportional to the average number of features computed per sub-window.
On the MIT+CMU test set, an average of 9 features (out of 6061) are computed per sub-window.
On a 700 MHz Pentium III, a 384x288 pixel image takes about 0.067 seconds to process (15 fps).
Roughly 15 times faster than Rowley-Baluja-Kanade and 600 times faster than Schneiderman-Kanade.

Sample results

Summary (Viola-Jones)
Fastest known face detector for gray images.
Three contributions with broad applicability:
The cascaded classifier yields rapid classification.
AdaBoost is an extremely efficient feature selector.
Rectangle features plus the integral image can be used for rapid image analysis.

Face detector comparison
Informal study by Andrew Gallagher, CMU, for CMU Learning-Based Methods in Vision, Spring 2007.
For Viola-Jones, the OpenCV implementation was used (<2 sec per image).
For Schneiderman and Kanade, Object Detection Using the Statistics of Parts [IJCV’04], the demo was used (~10-15 seconds per image, including web transmission).

[Figure panels: Schneiderman-Kanade, Viola-Jones]

Example-based Caricature Generation with Exaggeration
Lin Liang (1), Hong Chen (2), Ying-Qing Xu (1), Heung-Yeung Shum (1)
(1) Microsoft Research, Asia; (2) Xi’an Jiaotong University, China

Labeled feature points
The training data include 92 pairs of original facial images and the corresponding exaggerated caricatures drawn by an artist.

System Framework

Exaggerated Caricature

[Figure panels: original image, unexaggerated sketch, exaggerated caricature, caricature applied to the image, caricature by the artist]
