
1
Face Recognition
Ying Wu (yingwu@ece.northwestern.edu)
Electrical and Computer Engineering, Northwestern University, Evanston, IL
http://www.ece.northwestern.edu/~yingwu

2
Recognizing Faces? Key challenges: lighting and view (pose) variation.

3
Outline Bayesian Classification Principal Component Analysis (PCA) Fisher Linear Discriminant Analysis (LDA) Independent Component Analysis (ICA)

4
Bayesian Classification Classifier & Discriminant Function Discriminant Function for Gaussian Bayesian Learning and Estimation

5
Classifier & Discriminant Function
Discriminant functions g_i(x), i = 1, …, C
Classifier: assign x to class ω_i if g_i(x) > g_j(x) for all j ≠ i
Decision boundary: the set where g_i(x) = g_j(x)
The choice of discriminant function is not unique: any monotonically increasing transformation of g_i gives the same decisions.

6
Multivariate Gaussian p(x) ~ N(μ, Σ)
–The principal axes (the directions) are given by the eigenvectors of Σ
–The lengths of the axes (the uncertainty) are given by the eigenvalues of Σ

7
Mahalanobis Distance
r² = (x − μ)ᵀ Σ⁻¹ (x − μ)
The Mahalanobis distance is a distance normalized by the covariance: all points on the same ellipsoidal contour of the Gaussian are equidistant from μ.
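As a small sketch (with NumPy and illustrative μ, Σ values that are not from the slides), the squared Mahalanobis distance can be computed like this:

```python
import numpy as np

# Hypothetical 2-D Gaussian parameters, for illustration only
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 2.0]])

def mahalanobis(x, mu, Sigma):
    """Squared Mahalanobis distance r^2 = (x - mu)^T Sigma^{-1} (x - mu)."""
    d = x - mu
    # solve() avoids forming Sigma^{-1} explicitly
    return float(d @ np.linalg.solve(Sigma, d))

x = np.array([3.0, 3.0])
r2 = mahalanobis(x, mu, Sigma)   # = 8/7 for these values
```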

8
Whitening
–Find a linear transformation y = Aᵀx (a rotation and scaling) such that the covariance becomes the identity matrix, i.e., the uncertainty of each component is the same
–If p(x) ~ N(μ, Σ), then p(y) ~ N(Aᵀμ, AᵀΣA)
–With the eigen-decomposition Σ = ΦΛΦᵀ, choosing A = ΦΛ^(−1/2) gives AᵀΣA = I
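A minimal sketch of the whitening transform A = ΦΛ^(−1/2), using NumPy and a hypothetical covariance (the numbers are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated Gaussian
Sigma = np.array([[4.0, 1.0],
                  [1.0, 2.0]])
mu = np.zeros(2)
X = rng.multivariate_normal(mu, Sigma, size=100000)

# Whitening: A = Phi Lambda^{-1/2}, where Sigma = Phi Lambda Phi^T
lam, Phi = np.linalg.eigh(Sigma)
A = Phi @ np.diag(lam ** -0.5)

Y = X @ A            # y = A^T x, applied to each sample (row)
C = np.cov(Y.T)      # sample covariance of y: close to the identity
```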

9
Discriminant Functions for the Gaussian
The minimum-error-rate classifier uses g_i(x) = ln p(x|ω_i) + ln P(ω_i); for p(x|ω_i) ~ N(μ_i, Σ_i):
g_i(x) = −½(x − μ_i)ᵀΣ_i⁻¹(x − μ_i) − (d/2) ln 2π − ½ ln |Σ_i| + ln P(ω_i)

10
Case I: Σ_i = σ²I
The discriminant function is linear: g_i(x) = w_iᵀx + w_i0, with w_i = μ_i/σ² and w_i0 = −μ_iᵀμ_i/(2σ²) + ln P(ω_i)
Boundary: the hyperplane where g_i(x) = g_j(x)
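A sketch of this Case I classifier with g_i(x) = μ_iᵀx/σ² − μ_iᵀμ_i/(2σ²) + ln P(ω_i), using NumPy and hypothetical means and equal priors (illustrative values, not from the slides):

```python
import numpy as np

# Hypothetical setup: Sigma_i = sigma^2 I, equal priors
sigma2 = 1.0
mus = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]  # illustrative class means

def g(x, mu, sigma2, log_prior=0.0):
    """Linear discriminant g_i(x) = mu^T x / sigma^2 - mu^T mu / (2 sigma^2) + ln P."""
    return (mu @ x) / sigma2 - (mu @ mu) / (2.0 * sigma2) + log_prior

def classify(x):
    """Pick the class with the largest discriminant value."""
    return int(np.argmax([g(x, mu, sigma2) for mu in mus]))
```

With equal priors this reduces to a nearest-mean rule; the boundary is the hyperplane x₁ = 2 midway between the two means.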

11
Example
Assume P(ω_i) = P(ω_j). The decision boundary g_i(x) = g_j(x) reduces to
(μ_i − μ_j)ᵀ(x − ½(μ_i + μ_j)) = 0,
the hyperplane through the midpoint of μ_i and μ_j, perpendicular to the line joining them.

12
Case II: Σ_i = Σ (shared covariance)
The decision boundary is still linear: g_i(x) = μ_iᵀΣ⁻¹x − ½μ_iᵀΣ⁻¹μ_i + ln P(ω_i)

13
Case III: Σ_i arbitrary
The decision boundaries are no longer linear, but hyperquadrics (hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, …)!

14
Bayesian Learning
Learning means “training”, i.e., estimating some unknowns from training data
WHY?
–It is very difficult to specify these unknowns by hand
–Hopefully, these unknowns can be recovered from collected examples

15
Maximum Likelihood Estimation
Collected examples D = {x_1, x_2, …, x_n}
Estimate the unknown parameters θ so that the data likelihood is maximized
Likelihood: p(D|θ) = Π_k p(x_k|θ)
Log-likelihood: l(θ) = Σ_k ln p(x_k|θ)
ML estimate: θ̂ = argmax_θ l(θ)
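For a Gaussian, the ML estimates have a closed form: the sample mean and the (1/n-normalized) sample variance. A sketch with NumPy and a hypothetical ground-truth distribution (illustrative values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 1-D Gaussian with known ground truth, for illustration
true_mu, true_sigma = 2.0, 3.0
D = rng.normal(true_mu, true_sigma, size=50000)

# ML estimates maximizing the Gaussian log-likelihood
mu_hat = D.mean()
sigma2_hat = ((D - mu_hat) ** 2).mean()   # note the 1/n factor, not 1/(n-1)
```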

16
Case I: μ unknown, Σ known
μ̂ = (1/n) Σ_k x_k (the sample mean)

17
Case II: both μ and Σ unknown
μ̂ = (1/n) Σ_k x_k,  Σ̂ = (1/n) Σ_k (x_k − μ̂)(x_k − μ̂)ᵀ
This generalizes Case I to the full parameter set.

18
Bayesian Estimation
Collected examples D = {x_1, x_2, …, x_n}, drawn independently from a fixed but unknown distribution p(x)
Bayesian learning uses D to determine p(x|D), i.e., to learn a p.d.f.
p(x) is unknown, but has a parametric form with parameters θ ~ p(θ)
Difference from ML: in Bayesian learning, θ is not a value but a random variable, and we need to recover its distribution p(θ|D), rather than a single value.

19
Bayesian Estimation
p(x|D) = ∫ p(x|θ) p(θ|D) dθ
This follows from the total probability rule: p(x|D) is a weighted average of p(x|θ) over all θ
If p(θ|D) peaks very sharply about some value θ*, then p(x|D) ≈ p(x|θ*)

20
The Univariate Case
Assume μ is the only unknown: p(x|μ) ~ N(μ, σ²)
μ is a random variable with prior p(μ) ~ N(μ_0, σ_0²), i.e., μ_0 is the best guess of μ, and σ_0 its uncertainty
p(μ|D) is also Gaussian for any number of training examples

21
The Univariate Case
μ_n, the best guess for μ after observing n examples:
μ_n = (nσ_0² / (nσ_0² + σ²)) x̄_n + (σ² / (nσ_0² + σ²)) μ_0
σ_n², the uncertainty of this guess after observing n examples:
σ_n² = σ_0²σ² / (nσ_0² + σ²)
[figure: p(μ|x_1,…,x_n) for n = 1, 5, 10, 30]
p(μ|D) becomes more and more sharply peaked as more examples are observed, i.e., the uncertainty decreases.
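These update formulas can be sketched directly in NumPy; the prior parameters below are hypothetical, chosen only to show the posterior variance shrinking with n:

```python
import numpy as np

# Hypothetical prior over the unknown mean: mu ~ N(mu0, s0^2);
# the likelihood variance sigma^2 is assumed known
mu0, s0, sigma = 0.0, 2.0, 1.0

def posterior(xs, mu0, s0, sigma):
    """Posterior N(mu_n, s_n^2) over mu after observing samples xs."""
    n = len(xs)
    xbar = float(np.mean(xs)) if n else 0.0
    s_n2 = (s0**2 * sigma**2) / (n * s0**2 + sigma**2)
    mu_n = (n * s0**2 * xbar + sigma**2 * mu0) / (n * s0**2 + sigma**2)
    return mu_n, s_n2

rng = np.random.default_rng(2)
xs = rng.normal(1.5, sigma, size=30)
mu_n, s_n2 = posterior(xs, mu0, s0, sigma)   # posterior after 30 samples
```

With no data the posterior equals the prior; as n grows, s_n² → 0 and μ_n is pulled toward the sample mean.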

22
The Multivariate Case

23
PCA and Eigenface Principal Component Analysis (PCA) Eigenface for Face Recognition

24
PCA: motivation
Pattern vectors are generally confined within some low-dimensional subspace
Recall the basic idea of the Fourier transform:
–A signal is composed of a linear combination of a set of basis signals with different frequencies.

25
PCA: idea
Project each sample x_k onto a unit direction e through the mean m: x_k ≈ m + a_k e, where a_k = eᵀ(x_k − m)
Choose e to minimize the reconstruction error, equivalently, to maximize the variance of the projections a_k.

26
PCA
To maximize eᵀSe subject to ‖e‖ = 1, select e as the eigenvector of the scatter matrix S corresponding to the largest eigenvalue λ_max.

27
Algorithm
Learning the principal components from {x_1, x_2, …, x_n}:
–compute the mean m = (1/n) Σ_k x_k
–compute the scatter matrix S = Σ_k (x_k − m)(x_k − m)ᵀ
–eigen-decompose S and keep the eigenvectors with the largest eigenvalues
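A minimal sketch of this procedure in NumPy (the 2-D toy data below are hypothetical, generated mostly along one direction so the leading component is easy to check):

```python
import numpy as np

def pca(X, d):
    """Learn the top-d principal components from the rows of X (n samples x N dims)."""
    m = X.mean(axis=0)
    Xc = X - m
    S = Xc.T @ Xc                  # scatter matrix
    lam, E = np.linalg.eigh(S)     # eigenvalues in ascending order
    E = E[:, ::-1][:, :d]          # keep the top-d eigenvectors
    return m, E

rng = np.random.default_rng(3)
# Hypothetical data concentrated along the direction [3, 1]
t = rng.normal(size=(500, 1))
X = t @ np.array([[3.0, 1.0]]) + 0.1 * rng.normal(size=(500, 2))

m, E = pca(X, 1)   # E[:, 0] should align with [3, 1] (up to sign)
```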

28
PCA for Face Recognition
Training data D = {x_1, …, x_M}
–Dimension: stacking the pixels together makes a vector of dimension N
–Preprocessing: cropping, normalization
These faces should lie in a “face” subspace
Questions:
–What is the dimension of this subspace?
–How do we identify this subspace?
–How do we use it for recognition?

29
Eigenface
The eigenface approach: M. Turk and A. Pentland, 1991

30
An Issue
In general, N >> M
However, S, the covariance matrix, is N×N!
Difficulties:
–S is ill-conditioned: Rank(S) ≤ M << N
–Eigen-decomposing an N×N matrix is computationally infeasible

31
Solution I:
Do the eigenvalue decomposition on AᵀA, which is only an M×M matrix:
AᵀAv = λv
AAᵀAv = λAv
To see this clearly: (AAᵀ)(Av) = λ(Av)
i.e., if v is an eigenvector of AᵀA, then Av is the eigenvector of AAᵀ corresponding to the same eigenvalue!
Note: of course, you need to normalize Av to make it a unit vector.
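This trick can be sketched in NumPy; the dimensions below (N = 1000 pixels, M = 20 images) and the random data matrix are hypothetical stand-ins for mean-subtracted face vectors:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 1000, 20                    # N pixels >> M images (illustrative sizes)
A = rng.normal(size=(N, M))        # columns: hypothetical mean-subtracted faces

# Eigen-decompose the small MxM matrix A^T A instead of the NxN matrix A A^T
lam, V = np.linalg.eigh(A.T @ A)
U = A @ V                          # each column Av is an eigenvector of A A^T
U /= np.linalg.norm(U, axis=0)     # normalize each Av to unit length
```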

32
Solution II:
You can simply use the SVD (singular value decomposition)
A = [x_1 − m, …, x_M − m]
A = UΣVᵀ
–A: N×M
–U: N×M, UᵀU = I
–Σ: M×M diagonal
–V: M×M, VᵀV = VVᵀ = I
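The same eigenfaces fall out of a thin SVD directly; again the sizes and data are hypothetical placeholders for real face vectors:

```python
import numpy as np

rng = np.random.default_rng(5)
N, M = 1000, 20
X = rng.normal(size=(M, N))          # M hypothetical face images, N pixels each
m = X.mean(axis=0)
A = (X - m).T                        # N x M matrix [x_1 - m, ..., x_M - m]

# Thin SVD: A = U S V^T; the columns of U are the eigenfaces
U, s, Vt = np.linalg.svd(A, full_matrices=False)
```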

33
Fisher Linear Discrimination LDA PCA+LDA for Face Recognition

34
When does PCA fail?
PCA keeps the directions of largest variance, but the direction of largest variance is not necessarily the most discriminative one: two classes can be well separated along a low-variance direction that PCA discards.

35
Linear Discriminant Analysis
Find an optimal linear mapping W that:
–captures the major differences between classes and discounts irrelevant factors
–clusters the data of each class in the mapped space

36
Within/between-class scatters
Within-class scatter: S_W = Σ_i Σ_{x∈D_i} (x − m_i)(x − m_i)ᵀ
Between-class scatter: S_B = Σ_i n_i (m_i − m)(m_i − m)ᵀ
where m_i is the mean of class i, n_i its sample count, and m the overall mean.
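These two scatters can be sketched in NumPy; a useful sanity check (used in the test) is that they sum to the total scatter S_T = Σ_k (x_k − m)(x_k − m)ᵀ. The data below are hypothetical:

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class S_W and between-class S_B scatter for labeled rows of X."""
    m = X.mean(axis=0)
    d = X.shape[1]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_W += (Xc - mc).T @ (Xc - mc)           # spread around the class mean
        diff = (mc - m).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)         # spread of class means
    return S_W, S_B

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                    # hypothetical samples
y = rng.integers(0, 3, size=100)                 # hypothetical labels
S_W, S_B = scatter_matrices(X, y)
```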

37
Fisher LDA
Find the projection w that maximizes the Fisher criterion
J(w) = (wᵀS_B w) / (wᵀS_W w),
i.e., large between-class separation relative to within-class spread.

38
Solution I
If S_W is not singular, you can simply do an eigenvalue decomposition of S_W⁻¹S_B.

39
Solution II
Noticing:
–S_B w is in the direction of m_1 − m_2 (WHY? For two classes, S_B = (m_1 − m_2)(m_1 − m_2)ᵀ, so S_B w = ((m_1 − m_2)ᵀw)(m_1 − m_2))
–We only care about the direction of the projection, not its scale
We have w ∝ S_W⁻¹(m_1 − m_2).
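The closed-form two-class direction w ∝ S_W⁻¹(m_1 − m_2) is a one-liner in NumPy; the two Gaussian classes below are hypothetical, separated along [1, 1]:

```python
import numpy as np

rng = np.random.default_rng(6)
# Two hypothetical Gaussian classes separated along the [1, 1] direction
X1 = rng.normal(size=(200, 2)) + np.array([2.0, 2.0])
X2 = rng.normal(size=(200, 2)) - np.array([2.0, 2.0])

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Fisher direction: w proportional to S_W^{-1} (m1 - m2)
w = np.linalg.solve(S_W, m1 - m2)
w /= np.linalg.norm(w)
```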

40
Multiple Discriminant Analysis
For C classes, find the mapping W that maximizes |WᵀS_B W| / |WᵀS_W W|; since Rank(S_B) ≤ C − 1, there are at most C − 1 discriminant directions.

41
Comparing PCA and LDA
PCA finds the directions of maximum variance (best for representation and reconstruction); LDA finds the directions of maximum class separation (best for discrimination).

42
MDA for Face Recognition
Under lighting variation, PCA alone does not work well (why? the largest-variance directions are dominated by illumination changes, not identity)
Solution: PCA + MDA

43
Independent Component Analysis The cross-talk problem ICA

44
Cocktail party
Two independent sources s_1(t) and s_2(t) are mixed by an unknown matrix A into three observed signals x_1(t), x_2(t) and x_3(t).
Can you recover s_1(t) and s_2(t) from x_1(t), x_2(t) and x_3(t)?

45
Formulation
X = AS
Both the mixing matrix A and the sources S are unknown! Can you recover both A and S from X?

46
The Idea
Consider y = wᵀx = wᵀAs = zᵀs: y is a linear combination of the {s_i}
Recall the central limit theorem! A sum of even two independent r.v.s is more Gaussian than the original r.v.s
So zᵀs should be more Gaussian than any of the {s_i}
In other words, zᵀs becomes least Gaussian when it is in fact equal to one of the {s_i}
Amazing!
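This "mixtures are more Gaussian" effect can be demonstrated numerically with excess kurtosis as the non-Gaussianity measure (one common choice; FastICA-style contrast functions are another). The uniform sources and mixing matrix below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50000
# Two hypothetical independent non-Gaussian (uniform) sources
S = rng.uniform(-1.0, 1.0, size=(2, n))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])          # hypothetical mixing matrix
X = A @ S                           # observed mixtures

def kurtosis(y):
    """Excess kurtosis: 0 for a Gaussian, -1.2 for a uniform source."""
    y = (y - y.mean()) / y.std()
    return (y ** 4).mean() - 3.0

# Each mixture is "more Gaussian" (smaller |kurtosis|) than the sources
k_src = [abs(kurtosis(s)) for s in S]
k_mix = [abs(kurtosis(x)) for x in X]
```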

47
Face Recognition: Challenges
View (pose) and lighting variation remain the key difficulties.
