
1 Computer Vision – Lecture 9
Subspace Representations for Recognition. Bastian Leibe, RWTH Aachen.

2 Course Outline
Image Processing Basics
Segmentation & Grouping
Recognition: Global Representations, Subspace Representations
Object Categorization I: Sliding-Window-based Object Detection
Local Features & Matching
Object Categorization II: Part-based Approaches
3D Reconstruction
Motion and Tracking

3 Announcements
Exercise sheet 4 will be made available this afternoon:
Subspace methods for object recognition [today's topic]
Sliding-window object detection [next Thursday's topic]
The exercise will be Thursday; submit your results by Wednesday night.
Application: face detection & identification

4 Recap: Appearance-Based Recognition
Basic assumption: objects can be represented by a set of images ("appearances"). For recognition, it is sufficient to just compare the 2D appearances; no 3D model is needed. This was a fundamental paradigm shift in the 90's.

5 Recap: Recognition Using Histograms
Histogram comparison: match the histogram of the test image against the histograms of the known objects (figure: test image vs. known objects).

6 Recap: Comparison Measures
Vector space interpretation: Euclidean distance, Mahalanobis distance
Statistical motivation: Chi-square, Bhattacharyya
Information-theoretic motivation: Kullback-Leibler divergence, Jeffreys divergence
Histogram motivation: Histogram intersection
Ground distance: Earth Mover's Distance (EMD)
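To make these measures concrete, here is a minimal NumPy sketch of two of them (chi-square and histogram intersection) for normalized histograms; the function names and the epsilon guard are my own choices, not from the lecture.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two normalized histograms (smaller = more similar)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def intersection(h1, h2):
    """Histogram intersection (larger = more similar; 1 for identical normalized histograms)."""
    return np.sum(np.minimum(h1, h2))

# Example: two 8-bin histograms
h1 = np.random.rand(8); h1 /= h1.sum()
h2 = np.random.rand(8); h2 /= h2.sum()
print(chi_square(h1, h2), intersection(h1, h2))
```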

7 Recap: Recognition Using Histograms
Simple algorithm:
Build a set of histograms H = {h_i} for each known object (more exactly, for each view of each object).
Build a histogram h_t for the test image.
Compare h_t to each h_i ∈ H using a suitable comparison measure.
Select the object with the best matching score, or reject the test image if no object is similar enough.
This is a "nearest-neighbor" strategy.
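A minimal sketch of this nearest-neighbor strategy, reusing the chi_square function from the previous sketch; the rejection threshold value is an arbitrary placeholder, not from the lecture.

```python
import numpy as np

def recognize(h_test, histograms, labels, threshold=0.5):
    """Return the label of the most similar stored histogram, or None if nothing is close enough."""
    dists = [chi_square(h_test, h) for h in histograms]
    best = int(np.argmin(dists))
    return labels[best] if dists[best] < threshold else None
```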

8 Recap: Histogram Backprojection
"Where in the image are the colors we're looking for?"
Query: object with histogram M. Given: image with histogram I.
Compute the "ratio histogram" R (each bin of M divided by the corresponding bin of I, clipped at 1). R reveals how important an object color is, relative to the current image.
Project the values back into the image (i.e., replace the image values by the values of R that they index).
Convolve the result image with a circular mask to find the object.
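A minimal sketch of backprojection for a single-channel 8-bit image, assuming the standard ratio histogram R = min(M/I, 1); the bin computation, the epsilon guard, and the box filter standing in for the circular mask are illustrative choices.

```python
import numpy as np
from scipy.ndimage import uniform_filter  # box filter used here as a stand-in for the circular mask

def backproject(image, M, I, n_bins=32):
    """Replace each pixel by the ratio-histogram value of its bin, then smooth to localize the object."""
    R = np.minimum(M / np.maximum(I, 1e-10), 1.0)                      # ratio histogram
    bins = (image.astype(float) / 256 * n_bins).astype(int).clip(0, n_bins - 1)
    b = R[bins]                                                        # backprojected image
    return uniform_filter(b, size=15)                                  # local evidence for the object
```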

9 Recap: Multidimensional Representations
Combination of several descriptors:
Each descriptor is applied to the whole image.
Corresponding pixel values are combined into one feature vector.
Feature vectors are collected in a multidimensional histogram.
(Example from the slide: filter responses Dx, Dy, Lap giving a feature vector such as (1.22, -0.39, 2.78).)

10 Recap: Bayesian Recognition Algorithm
Build up histograms p(m_k | o_n) for each training object o_n.
Sample the test image to obtain measurements m_k, k = 1, …, K; only a small number of local samples is necessary.
Compute the probabilities for each training object.
Select the object with the highest probability, or reject the test image if no object accumulates sufficient probability.
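A minimal sketch of the probabilistic scoring step, assuming independent local measurements and uniform object priors, so the posterior is proportional to the product of the per-measurement likelihoods (computed in log space for stability); this is my own illustration of the idea, not the exact formulation from the lecture.

```python
import numpy as np

def bayesian_recognition(measurement_bins, likelihoods, reject_logp=-50.0):
    """likelihoods[n][b] = p(m_k = bin b | object n), estimated from the training histograms.
    Returns the most probable object index, or None if its log-probability is too low."""
    log_post = np.array([
        sum(np.log(likelihoods[n][b] + 1e-10) for b in measurement_bins)
        for n in range(len(likelihoods))
    ])
    best = int(np.argmax(log_post))
    return best if log_post[best] > reject_logp else None
```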

11 Recap: Colored Derivatives
Generalization: derivatives along the
Y axis → intensity differences,
C1 axis → red-green differences,
C2 axis → blue-yellow differences.
Application: brand identification in video. [Hall & Crowley, 2000]

12 You’re Now Ready for First Applications…
Histogram-based recognition, line detection, circle detection, binary segmentation, moment descriptors, skin color detection.

13 Topics of This Lecture
Subspace Methods for Recognition: Motivation
Principal Component Analysis (PCA): Derivation, Object recognition with PCA, Eigenimages/Eigenfaces, Limitations
Fisher's Linear Discriminant Analysis (FLD/LDA): Fisherfaces for recognition

14 Representations for Recognition
Global object representations: we have seen histograms as one example. What other representations could be suitable?
More generally, we want representations that are well-suited for recognizing a certain class of objects and for identifying individuals from that class (identification).
How can we arrive at such a representation? Approach 1: come up with a brilliant idea and tweak it until it works. Can we do this more systematically?

15 Example: The Space of All Face Images
When viewed as vectors of pixel values, face images are extremely high-dimensional: a 100x100 image has 10,000 dimensions. However, relatively few 10,000-dimensional vectors correspond to valid face images. We want to effectively model the subspace of face images.
Slide credit: Svetlana Lazebnik

16 The Space of All Face Images
We want to construct a low-dimensional linear subspace that best explains the variation in the set of face images.
Slide credit: Svetlana Lazebnik

17 Subspace Methods
Images are represented as points in a high-dimensional vector space.
Valid images populate only a small fraction of the space.
Characterize the subspace spanned by the images: x_i ≈ Σ_j a_ij u_j, where the x_i form the image set, the u_j are basis images, and the a_ij are the representation coefficients.
Slide adapted from Ales Leonardis

18 Subspace Methods
Today's topics: PCA, FLD.
Subspace methods split into two families:
Reconstructive methods (PCA, ICA, NMF), used for representation.
Discriminative methods (FLD, SVM, CCA), used for classification and regression.
Slide credit: Ales Leonardis

19 Topics of This Lecture
Subspace Methods for Recognition: Motivation
Principal Component Analysis (PCA): Derivation, Object recognition with PCA, Eigenimages/Eigenfaces, Limitations
Fisher's Linear Discriminant Analysis (FLD/LDA): Fisherfaces for recognition

20 Principal Component Analysis
Given: N data points x1, …, xN in R^d.
We want to find a new set of features that are linear combinations of the original ones: u^T(x_i – µ), where µ is the mean of the data points.
What unit vector u in R^d captures the most variance of the data?
Slide credit: Svetlana Lazebnik

21 Principal Component Analysis
Direction that maximizes the variance of the projected data:
var(u) = (1/N) Σ_i (u^T(x_i – µ))^2 = u^T Σ u,
where u^T(x_i – µ) is the projection of data point x_i and Σ = (1/N) Σ_i (x_i – µ)(x_i – µ)^T is the covariance matrix of the data.
The direction that maximizes the variance is the eigenvector associated with the largest eigenvalue of Σ.
Slide credit: Svetlana Lazebnik
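A minimal NumPy sketch of this derivation (form the covariance matrix and take its top eigenvector), meant only to make the math concrete; a later slide explains why, for images, you should compute this via SVD instead of forming the covariance matrix explicitly.

```python
import numpy as np

def first_principal_direction(X):
    """X: (N, d) data matrix. Returns the unit vector u with maximal projected variance."""
    mu = X.mean(axis=0)
    cov = (X - mu).T @ (X - mu) / X.shape[0]          # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigh: symmetric matrix, ascending eigenvalues
    return eigvecs[:, -1]                             # eigenvector of the largest eigenvalue
```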

22 Remember: Fitting a Gaussian
Mean and covariance matrix of data define a Gaussian model

23 Interpretation of PCA
Compute the eigenvectors of the covariance matrix Σ.
Eigenvectors: main directions. Eigenvalues: variances along the eigenvectors.
Result: a coordinate transform that best represents the variance of the data.

24 Interpretation of PCA
Now, suppose we want to represent the data using just a single dimension, i.e., project it onto a single axis. What would be the best choice for this axis?

25 Interpretation of PCA
Now, suppose we want to represent the data using just a single dimension, i.e., project it onto a single axis. What would be the best choice for this axis?
The first eigenvector gives us the best reconstruction: it is the direction that retains most of the variance of the data.

26 Properties of PCA
It can be shown that the mean square error between x_i and its reconstruction using only the first m principal eigenvectors is the sum of the discarded eigenvalues, Σ_{j>m} λ_j, where the λ_j are the eigenvalues of the covariance matrix.
Interpretation: PCA minimizes the reconstruction error, PCA maximizes the variance of the projection, and it finds a more "natural" coordinate system for the sample data.
(Figure: cumulative influence of eigenvectors; the first k eigenvectors capture, e.g., 90% of the variance.)
Slide credit: Ales Leonardis

27 Projection and Reconstruction
An n-pixel image x ∈ R^n can be projected to a low-dimensional feature space y ∈ R^m by y = Ux, where the rows of U are the m principal eigenvectors.
From y ∈ R^m, the reconstruction of the point is U^T y.
The error of the reconstruction is ||x – U^T Ux||.
Slide credit: Peter Belhumeur
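A minimal NumPy sketch of projection, reconstruction, and reconstruction error for a (mean-centered) image x and a matrix U whose rows are the top m eigenvectors; the function names are mine.

```python
import numpy as np

def project(x, U):
    """Project an n-pixel image x onto the m-dimensional eigenspace (rows of U = eigenvectors)."""
    return U @ x                       # y in R^m

def reconstruct(y, U):
    """Map the coefficients back into image space."""
    return U.T @ y                     # approximation of x in R^n

def reconstruction_error(x, U):
    """Distance between x and its reconstruction from the subspace."""
    return np.linalg.norm(x - reconstruct(project(x, U), U))
```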

28 Example: Object Representation
Slide credit: Ales Leonardis

29 Principal Component Analysis
Get a compact representation by keeping only the first k eigenvectors! (Figure: an image reconstructed as the mean image plus w1, w2, w3 times the first three eigenimages.)
Slide credit: Ales Leonardis

30 Object Detection by Distance TO Eigenspace
Scan a window over the image and classify the window as object or non-object as follows:
Project the window to the subspace and reconstruct it as before.
Compute the distance between the window and its reconstruction (reprojection error).
Local minima of the distance over all image locations → object locations.
Repeat at different scales.
Possibly normalize the window intensity to unit norm.
Slide credit: Peter Belhumeur
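A minimal single-scale sketch of this detection-by-reprojection-error idea, reusing reconstruction_error from the previous sketch; window size and step are illustrative.

```python
import numpy as np

def detection_map(image, U, mu, win=32, step=4):
    """Slide a window over the image and record the reprojection error at each position.
    Low values indicate likely object locations. U: (m, win*win) eigenvectors, mu: mean window."""
    H, W = image.shape
    errors = np.full(((H - win) // step + 1, (W - win) // step + 1), np.inf)
    for i in range(0, H - win + 1, step):
        for j in range(0, W - win + 1, step):
            x = image[i:i+win, j:j+win].astype(float).ravel() - mu
            errors[i // step, j // step] = reconstruction_error(x, U)
    return errors
```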

31 Eigenfaces: Key Idea
Assume that most face images lie on a low-dimensional subspace determined by the first k (k < d) directions of maximum variance.
Use PCA to determine the vectors u1, …, uk that span that subspace: x ≈ µ + w1u1 + w2u2 + … + wkuk.
Represent each face using its "face space" coordinates (w1, …, wk).
Perform nearest-neighbor recognition in "face space".
M. Turk and A. Pentland, Face Recognition using Eigenfaces, CVPR 1991
Slide credit: Svetlana Lazebnik

32 Eigenfaces Example Training images x1,…,xN
Slide credit: Svetlana Lazebnik

33 Eigenfaces Example Top eigenvectors: u1,…uk Mean: μ
Slide credit: Svetlana Lazebnik

34 Eigenface Example 2 (Better Alignment)
Slide credit: Peter Belhumeur

35 Eigenfaces Example
Face x in "face space" coordinates.
Slide credit: Svetlana Lazebnik

36 Eigenfaces Example
Face x in "face space" coordinates.
Reconstruction: x ≈ µ + w1u1 + w2u2 + w3u3 + w4u4 + …
Slide credit: Svetlana Lazebnik

37 Recognition with Eigenspaces
Process labeled training images:
Find the mean µ and covariance matrix Σ.
Find the k principal components (eigenvectors of Σ) u1, …, uk.
Project each training image xi onto the subspace spanned by the principal components: (wi1, …, wik) = (u1^T(xi – µ), …, uk^T(xi – µ)).
Given a novel image x:
Project onto the subspace: (w1, …, wk) = (u1^T(x – µ), …, uk^T(x – µ)).
Optional: check the reconstruction error ||x – x̂|| to determine whether the image is really a face.
Classify as the closest training face in the k-dimensional subspace.
Slide credit: Svetlana Lazebnik
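A minimal NumPy sketch of this recognition pipeline, assuming µ and the eigenvector matrix U (rows = principal components) have already been computed by PCA as described elsewhere in the lecture; function and variable names are my own.

```python
import numpy as np

def project_faces(X, mu, U):
    """Project flattened faces (rows of X) into the face space spanned by the rows of U."""
    return (X - mu) @ U.T

def classify_face(x, mu, U, W_train, labels, max_recon_error=None):
    """Nearest-neighbor classification in face space; optionally reject non-faces via reconstruction error."""
    w = U @ (x - mu)
    if max_recon_error is not None:
        x_hat = mu + U.T @ w                           # reconstruction from the subspace
        if np.linalg.norm(x - x_hat) > max_recon_error:
            return None                                # probably not a face
    return labels[int(np.argmin(np.linalg.norm(W_train - w, axis=1)))]
```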

38 Obj. Identification by Distance IN Eigenspace
Objects are represented as coordinates in an n-dim. eigenspace.
Example: a 3D space with points representing individual objects, or a manifold representing a parametric eigenspace (e.g., orientation, pose, illumination).
Estimate parameters by finding the nearest neighbor in the eigenspace.
Slide credit: Ales Leonardis

39 Parametric Eigenspace
Object identification / pose estimation: find the nearest neighbor in the eigenspace. [Murase & Nayar, IJCV'95]
Slide adapted from Ales Leonardis

40 Applications: Recognition, Pose Estimation
H. Murase and S. Nayar, Visual learning and recognition of 3-d objects from appearance, IJCV 1995

41 Applications: Visual Inspection
S. K. Nayar, S. A. Nene, and H. Murase, Subspace Methods for Robot Vision, IEEE Transactions on Robotics and Automation, 1996.

42 Important Footnote
Don't really implement PCA this way!
Why? How big is Σ? It is n×n, where n is the number of pixels in an image!
However, we only have m training examples, typically m << n, so Σ will at most have rank m.
You only need the first k eigenvectors.
Slide credit: Peter Belhumeur

43 Singular Value Decomposition (SVD)
Any mn matrix A may be factored such that U: mm, orthogonal matrix Columns of U are the eigenvectors of AAT V: nn, orthogonal matrix Columns are the eigenvectors of ATA : mn, diagonal with non-negative entries (1, 2,…, s) with s=min(m,n) are called the singular values. Singular values are the square roots of the eigenvalues of both AAT and ATA. Columns of U are corresponding eigenvectors! Result of SVD algorithm: 12…s Slide credit: Peter Belhumeur

44 SVD Properties
In Matlab: [u s v] = svd(A), where A = u*s*v'.
r = rank(A) is the number of non-zero singular values.
U, V give us orthonormal bases for the subspaces of A:
first r columns of U: column space of A
last m-r columns of U: left nullspace of A
first r columns of V: row space of A
last n-r columns of V: nullspace of A
For d ≤ r, the first d columns of U provide the best d-dimensional basis for the columns of A in the least-squares sense.
Slide credit: Peter Belhumeur

45 Performing PCA with SVD
The singular values of A are the square roots of the eigenvalues of both AA^T and A^T A, and the columns of U are the corresponding eigenvectors.
Moreover, AA^T = Σ_i a_i a_i^T for the columns a_i of A, while the covariance matrix is Σ = (1/n) Σ_i (x_i – µ)(x_i – µ)^T.
So, ignoring the factor 1/n: subtract the mean image µ from each input image, create the data matrix with the centered images as columns, and perform (thin) SVD on the data matrix.
Slide credit: Peter Belhumeur

46 Thin SVD
Any m by n matrix A may be factored as A = U Σ V^T.
If m > n, then one can view Σ as [Σ'; 0], where Σ' = diag(σ1, σ2, …, σs) with s = min(m,n), and the lower block is an (m−n) by n matrix of zeros.
Alternatively, you can write A = U' Σ' V^T, where U' consists of the first n columns of U (the "thin" SVD).
In Matlab, the thin SVD is: [U S V] = svds(A,k). This is what you should use!
Slide credit: Peter Belhumeur
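A minimal NumPy sketch of this recipe; numpy's svd with full_matrices=False plays the role of the thin SVD here (for only the top-k components one could instead use a truncated solver such as scipy.sparse.linalg.svds, the analogue of Matlab's svds).

```python
import numpy as np

def pca_via_svd(X, k):
    """X: (m, n) data matrix with one flattened image per row.
    Returns the mean, the top-k principal directions (as rows), and the leading eigenvalues."""
    mu = X.mean(axis=0)
    A = X - mu                                         # centered data
    _, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD: A = U * diag(s) * Vt
    eigvals = s**2 / X.shape[0]                        # eigenvalues of the covariance matrix
    return mu, Vt[:k], eigvals[:k]
```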

47 Limitations
Global appearance method: not robust to misalignment and background variation.
Easy fix (with considerable manual overhead): align the training examples.
Slide credit: Svetlana Lazebnik

48 Limitations
PCA assumes that the data has a Gaussian distribution (mean µ, covariance matrix Σ).
The shape of this dataset is not well described by its principal components.
Slide credit: Svetlana Lazebnik

49 Limitations
The direction of maximum variance is not always good for classification.
Slide credit: Svetlana Lazebnik

50 Topics of This Lecture
Subspace Methods for Recognition: Motivation
Principal Component Analysis (PCA): Derivation, Object recognition with PCA, Eigenimages/Eigenfaces, Limitations
Fisher's Linear Discriminant Analysis (FLD/LDA): Fisherfaces for recognition

51 Restrictions of PCA
PCA minimizes the projection error.
PCA is "unsupervised": no information on classes is used, so discriminating information might be lost.
(Figure: the best discriminating projection vs. the PCA projection.)
Slide credit: Ales Leonardis

52 Fischer’s Linear Discriminant Analysis (FLD)
FLD is an enhancement to PCA Constructs a discriminant subspace that minimizes the scatter between images of the same class and maximizes the scatter between different class images Also sometimes called LDA… Slide adapted from Peter Belhumeur B. Leibe

53 Mean Images
Let X1, X2, …, Xk be the classes in the database and let each class Xi, i = 1, 2, …, k have Ni images xj, j = 1, 2, …, Ni.
We compute the mean image µi of each class Xi as µi = (1/Ni) Σ_{xj ∈ Xi} xj.
Now, the mean image µ of all the classes in the database can be calculated as µ = (1/k) Σ_{i=1..k} µi.
Slide credit: Peter Belhumeur

54 Scatter Matrices
We calculate the within-class scatter matrix as S_W = Σ_{i=1..k} Σ_{xj ∈ Xi} (xj – µi)(xj – µi)^T.
We calculate the between-class scatter matrix as S_B = Σ_{i=1..k} Ni (µi – µ)(µi – µ)^T.
Slide credit: Peter Belhumeur
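A minimal NumPy sketch of these two scatter matrices, assuming the data is given as one array per class and using the mean of the class means as the overall mean, as on the previous slide; names are mine.

```python
import numpy as np

def scatter_matrices(classes):
    """classes: list of (Ni, d) arrays, one per class. Returns (S_W, S_B)."""
    d = classes[0].shape[1]
    class_means = [X.mean(axis=0) for X in classes]
    mu = np.mean(class_means, axis=0)                          # overall mean (mean of class means)
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for X, mu_i in zip(classes, class_means):
        S_W += (X - mu_i).T @ (X - mu_i)                       # within-class scatter
        S_B += len(X) * np.outer(mu_i - mu, mu_i - mu)         # between-class scatter
    return S_W, S_B
```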

55 Visualization
(Figure: good separation between the classes.)
Slide credit: Ales Leonardis

56 Fisher’s Linear Discriminant Analysis (FLD)
Maximize the distance between classes, minimize the distance within a class.
Criterion: J(w) = (w^T S_B w) / (w^T S_W w), where S_B is the between-class scatter matrix and S_W the within-class scatter matrix.
In the two-class case, the optimal solution for w can be obtained as w ∝ S_W^{-1}(µ1 – µ2).
Classification function: project a point onto w and compare against a threshold.
(Figure: two classes projected onto the direction w.)
Slide credit: Ales Leonardis

57 Multiple Discriminant Analysis
Generalization to K classes: find the projection W that maximizes J(W) = |W^T S_B W| / |W^T S_W W|, where S_B and S_W are the between-class and within-class scatter matrices defined above.
Does this look familiar to anybody?

58 Multiple Discriminant Analysis
Generalization to K classes: find the projection W that maximizes J(W) = |W^T S_B W| / |W^T S_W W|, where S_B and S_W are the between-class and within-class scatter matrices.
Does this look familiar to anybody? It is again a Rayleigh quotient…

59 Maximizing J(W)
Solution from the generalized eigenvalue problem S_B w_i = λ_i S_W w_i.
The columns of the optimal W are the eigenvectors corresponding to the largest eigenvalues of this problem.
Defining v = S_W w, we get S_B S_W^{-1} v = λ v, which is a regular eigenvalue problem. Solve it to get the eigenvectors v, and from those obtain w = S_W^{-1} v.
For the K-class case we obtain (at most) K-1 projections (i.e., eigenvectors corresponding to non-zero eigenvalues).
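A minimal NumPy sketch of this step, solving S_W^{-1} S_B w = λ w directly with np.linalg.solve and np.linalg.eig and keeping the K-1 leading eigenvectors; the small regularizer on S_W is my own safeguard against singularity (the Fisherfaces slides below handle the same issue by applying PCA first).

```python
import numpy as np

def fld_projections(S_W, S_B, num_classes, reg=1e-6):
    """Return the (K-1) FLD projection directions as rows of W."""
    S_W_reg = S_W + reg * np.eye(S_W.shape[0])                 # guard against a singular S_W
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W_reg, S_B))
    order = np.argsort(-eigvals.real)                          # largest eigenvalues first
    W = eigvecs[:, order[:num_classes - 1]].real
    return W.T
```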

60 Face Recognition Difficulty: Lighting
The same person with the same facial expression, and seen from the same viewpoint, can appear dramatically different when light sources illuminate the face from different directions.

61 Application: Fisherfaces
Idea: use Fisher's linear discriminant to find class-specific linear projections that compensate for lighting and facial expression.
Singularity problem: the within-class scatter matrix is always singular for face recognition, since #training images << #pixels.
This problem is overcome by applying PCA first.
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]
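A minimal sketch of this two-stage recipe (PCA down to a subspace where S_W is non-singular, then FLD there), reusing pca_via_svd, scatter_matrices, and fld_projections from the earlier sketches; reducing to N-K dimensions follows the Fisherfaces paper, the rest is my own illustrative glue code.

```python
import numpy as np

def fisherfaces(classes, num_pca=None):
    """classes: list of (Ni, n) arrays of flattened face images, one per class.
    Returns (mu, W) such that a face x is mapped to W @ (x - mu)."""
    X = np.vstack(classes)
    K = len(classes)
    if num_pca is None:
        num_pca = X.shape[0] - K                       # N - K dimensions, as in the Fisherfaces paper
    mu, U_pca, _ = pca_via_svd(X, num_pca)             # step 1: PCA to avoid a singular S_W
    reduced = [(C - mu) @ U_pca.T for C in classes]
    S_W, S_B = scatter_matrices(reduced)               # step 2: FLD in the reduced space
    W_fld = fld_projections(S_W, S_B, K)
    return mu, W_fld @ U_pca                           # combined projection, shape (K-1, n)
```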

62 Fisherfaces: Experiments
Variation in lighting.
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

63 Fisherfaces: Experiments
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

64 Fisherfaces: Experimental Results
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

65 Fisherfaces: Experiments
Variation in facial expression, eye wear, lighting.
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

66 Fisherfaces: Experimental Results
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

67 Example Application: Fisherfaces
Visual discrimination task. Training data: C1: subjects with glasses, C2: subjects without glasses.
Test: take each image as a vector of pixel values and apply FLD → glasses?
Image source: Yale Face Database

68 Fisherfaces: Interpretability
Example Fisherface for the recognition task "Glasses/NoGlasses".
Slide credit: Peter Belhumeur [Belhumeur et al. 1997]

69 References and Further Reading
Background information on PCA/FLD can be found in Chapter 22.3 of D. Forsyth, J. Ponce: Computer Vision – A Modern Approach. Prentice Hall, 2003.
Important papers (available on the webpage):
M. Turk, A. Pentland: Eigenfaces for Recognition. J. Cognitive Neuroscience, Vol. 3(1), 1991.
P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman: Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. PAMI, Vol. 19(7), 1997.

