Dimensionality Reduction: Principal Components Analysis
Optional Reading: Smith, A Tutorial on Principal Components Analysis (linked to class webpage)

Data [figure: scatter plot of two-dimensional data on axes x1 and x2]

First principal component: gives the direction of largest variation of the data [figure: the data with the first principal component axis drawn]

Second principal component: gives the direction of the second-largest variation, orthogonal to the first principal component [figure: the data with both principal component axes drawn]

Rotation of Axes [figure: the same data re-plotted in the coordinate system defined by the principal components]

Dimensionality reduction [figure: the data represented using only the first principal component]

Classification (in the reduced-dimensionality space) [figure: labeled examples (+ and −) plotted in the reduced space]. Note: PCA can be used for labeled or unlabeled data.

Principal Components Analysis (PCA) Summary: PCA finds new orthogonal axes in the directions of largest variation in the data. PCA is used to create high-level features in order to improve classification and to reduce the dimensionality of the data without much loss of information. It is used in machine learning, signal processing, and image compression (among other things).

Background for PCA: Suppose the attributes are A1 and A2, and we have n training examples. The x's denote the values of A1 and the y's denote the values of A2 over the training examples. Variance of an attribute:

var(A1) = [ Σ_{i=1..n} (x_i − x̄)² ] / (n − 1)

Covariance of two attributes:

cov(A1, A2) = [ Σ_{i=1..n} (x_i − x̄)(y_i − ȳ) ] / (n − 1)

If the covariance is positive, the two dimensions tend to increase together. If it is negative, as one increases the other tends to decrease. If it is zero, the two attributes are (linearly) uncorrelated.

Covariance matrix
– Suppose we have n attributes, A1, ..., An.
– The covariance matrix is the n × n matrix C whose entry (i, j) is C_ij = cov(Ai, Aj); the diagonal entries C_ii = cov(Ai, Ai) are the variances.

Covariance matrix
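To make the definition concrete, here is a minimal NumPy sketch (the data array X is a made-up example, not from the slides) that computes a variance, a covariance, and the full covariance matrix:

```python
import numpy as np

# Made-up data: 5 examples with 2 attributes (A1, A2), one example per row.
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

var_A1 = np.var(X[:, 0], ddof=1)            # variance of A1, with n - 1 in the denominator
cov_A1_A2 = np.cov(X[:, 0], X[:, 1])[0, 1]  # covariance of A1 and A2

# Full covariance matrix: entry (i, j) is cov(Ai, Aj); the diagonal holds the variances.
C = np.cov(X, rowvar=False)
print(var_A1, cov_A1_A2)
print(C)
```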

Review of Matrix Algebra
Eigenvectors:
– Let M be an n × n matrix. A nonzero vector v is an eigenvector of M if M v = λ v for some scalar λ; λ is called the eigenvalue associated with v.
– For any eigenvector v of M and any nonzero scalar a, a v is also an eigenvector of M with the same eigenvalue, since M (a v) = λ (a v).
– Thus you can always choose eigenvectors of length 1: ||v|| = 1.
– If M is symmetric with real entries, it has n eigenvectors, and they can be chosen to be orthogonal to one another.
– Thus the eigenvectors can be used as a new basis for an n-dimensional vector space.
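A small NumPy check of these facts for a made-up symmetric matrix (an illustration only):

```python
import numpy as np

# A small symmetric matrix with real entries (made-up example).
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh is specialized for symmetric matrices: it returns real eigenvalues (in
# ascending order) and orthonormal eigenvectors as the columns of V.
eigenvalues, V = np.linalg.eigh(M)

v, lam = V[:, 0], eigenvalues[0]
print(np.allclose(M @ v, lam * v))          # M v = lambda v
print(np.isclose(np.linalg.norm(v), 1.0))   # eigenvectors have length 1
print(np.allclose(V.T @ V, np.eye(2)))      # and are orthogonal to one another
```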

Principal Components Analysis (PCA)
1. Given the original data set S = {x_1, ..., x_k}, produce a new data set by subtracting the mean of each attribute A_i from every example (the mean-adjusted data). After this step, each attribute has mean 0.

2. Calculate the covariance matrix of the mean-adjusted data. (For two attributes x and y this is the 2 × 2 matrix with entries cov(x, x), cov(x, y), cov(y, x), cov(y, y).)
3. Calculate the (unit) eigenvectors and eigenvalues of the covariance matrix.

The eigenvector with the largest eigenvalue traces the main linear pattern in the data.

4. Order the eigenvectors by eigenvalue, from highest to lowest. In general, you get n components. To reduce the dimensionality to p, ignore the n − p components at the bottom of the list.

Construct the new “feature vector” (assuming each v_i is a column vector):

FeatureVector = (v_1, v_2, ..., v_p)

5. Derive the new data set:

TransformedData = RowFeatureVector × RowDataAdjust

where RowFeatureVector is the matrix whose rows are the chosen eigenvectors (the feature vector transposed) and RowDataAdjust is the transpose of the mean-adjusted data (one example per column). This gives the original data in terms of the chosen components (eigenvectors), that is, expressed along these axes.
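Putting steps 1–5 together, a minimal NumPy sketch, assuming the data are stored one example per row in an array X (all names here are illustrative, not from the slides). Because it keeps examples as rows, its result is the transpose of RowFeatureVector × RowDataAdjust, which carries the same information:

```python
import numpy as np

def pca_transform(X, p):
    """Project the rows of X onto the top-p principal components.

    X: (k, n) array with one example per row and n attributes.
    Returns (transformed data, kept eigenvectors, attribute means).
    """
    mean = X.mean(axis=0)
    B = X - mean                          # step 1: subtract the mean of each attribute
    C = np.cov(B, rowvar=False)           # step 2: n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # step 3: unit eigenvectors and eigenvalues
    order = np.argsort(eigvals)[::-1]     # step 4: sort by eigenvalue, highest to lowest
    V = eigvecs[:, order[:p]]             # keep the top p eigenvectors (as columns)
    Z = B @ V                             # step 5: express the data along the new axes
    return Z, V, mean

# Example: five 2-D points reduced to 1 dimension.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
Z, V, mean = pca_transform(X, p=1)
```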

Intuition: We projected the data onto new axes that capture the strongest linear trends in the data set. Each transformed data point tells us how far it lies above or below those trend lines.

Reconstructing the original data
We did:

TransformedData = RowFeatureVector × RowDataAdjust

so we can do

RowDataAdjust = RowFeatureVector^(-1) × TransformedData = RowFeatureVector^T × TransformedData

(the inverse equals the transpose because the rows of RowFeatureVector are orthonormal eigenvectors), and then

RowDataOriginal = RowDataAdjust + OriginalMean
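Continuing the sketch above (same illustrative names), the corresponding reconstruction step:

```python
# Approximate reconstruction from the reduced representation:
# exact when p = n, lossy (a projection onto the kept axes) when p < n.
X_reconstructed = Z @ V.T + mean
```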

Textbook’s notation: We have original data X and mean-subtracted data B, and the covariance matrix C = cov(B), where C is an N × N matrix. We find a matrix V whose columns are the N eigenvectors of C, so that

V^(-1) C V = D, where D = diag(λ_1, ..., λ_N)

and λ_i is the ith eigenvalue of C. Each eigenvalue in D corresponds to an eigenvector in V. The eigenvectors, sorted in order of decreasing eigenvalue, become the “feature vector” for PCA.
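A quick numerical check of this notation, using randomly generated mean-subtracted data B purely for illustration:

```python
import numpy as np

B = np.random.randn(100, 3)   # stand-in for mean-subtracted data, one example per row
B -= B.mean(axis=0)
C = np.cov(B, rowvar=False)   # N x N covariance matrix (here N = 3)

eigvals, V = np.linalg.eigh(C)                    # columns of V are the eigenvectors of C
D = np.diag(eigvals)                              # diagonal matrix of the eigenvalues
print(np.allclose(np.linalg.inv(V) @ C @ V, D))   # V^(-1) C V = D
```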

With new data, compute

TransformedData = RowFeatureVector × RowDataAdjust

where RowDataAdjust = transpose of the mean-adjusted data.

What you need to remember
General idea of what PCA does:
– Finds a new, rotated set of orthogonal axes that capture the directions of largest variation.
– Allows some axes to be dropped, so the data can be represented in a lower-dimensional space.
– This can improve classification performance and avoid overfitting due to a large number of dimensions.

Example: Linear discrimination using PCA for face recognition (“Eigenfaces”)
1. Preprocessing: “normalize” the faces
– Make the images the same size
– Line them up with respect to the eyes
– Normalize the intensities

2. The raw features are the pixel intensity values (2061 features).
3. Each image is encoded as a vector Γ_i of these features.
4. Compute the “mean” face of the training set: Ψ = (1/M) Σ_{i=1..M} Γ_i, the average of the M training face vectors.

– Subtract the mean face from each face vector.
– Compute the covariance matrix C.
– Compute the (unit) eigenvectors v_i of C.
– Keep only the first K principal components (eigenvectors).

[Eigenface images from W. Zhao et al., Discriminant analysis of principal components for face recognition]
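A minimal NumPy sketch of this procedure, assuming the normalized training images have already been flattened into the rows of an array faces (here a random stand-in; all variable names are illustrative):

```python
import numpy as np

faces = np.random.rand(40, 2061)   # stand-in for M = 40 flattened, normalized face images

mean_face = faces.mean(axis=0)     # the "mean" face
A = faces - mean_face              # subtract the mean face from each face vector

# Covariance matrix of the mean-adjusted faces. With ~2061 pixels this matrix is
# large; practical eigenface implementations use the smaller M x M trick or an SVD.
C = np.cov(A, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)

K = 20
order = np.argsort(eigvals)[::-1][:K]
eigenfaces = eigvecs[:, order]     # the first K principal components (eigenfaces)

# Represent each training face by its K coefficients in the eigenface basis.
weights = A @ eigenfaces
```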

Interpreting and Using Eigenfaces
The eigenfaces encode the principal sources of variation in the dataset (e.g., absence/presence of facial hair, skin tone, glasses, etc.). We can represent any face as a linear combination of these “basis” faces. Use this representation for:
– Face recognition (e.g., Euclidean distance from known faces)
– Linear discrimination (e.g., “glasses” versus “no glasses”, or “male” versus “female”)
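Continuing that sketch, recognition by Euclidean distance in the eigenface coefficient space (new_face is a hypothetical probe image, flattened and normalized the same way as the training images):

```python
# Project the probe face into the eigenface space and find the nearest training face.
new_face = np.random.rand(2061)                      # stand-in for a real probe image
w_new = (new_face - mean_face) @ eigenfaces
distances = np.linalg.norm(weights - w_new, axis=1)
best_match = int(np.argmin(distances))               # index of the closest known face
```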

Eigenfaces Demo

Kernel PCA
– PCA assumes the directions of variation are all straight lines.
– Kernel PCA maps the data to a higher-dimensional space (implicitly, via a kernel function) and performs PCA there, so it can capture nonlinear directions of variation.

[Figures from Wikipedia: the original data, and the data after kernel PCA]

Kernel PCA: Use the mapping Φ(x) and the kernel matrix K_ij = Φ(x_i) · Φ(x_j) to compute the PCA transform. (Optional: see the treatment in the textbook, though it might be a bit confusing. Also see “Kernel Principal Components Analysis” by Schölkopf et al., linked to the class website.)
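A sketch of kernel PCA using an RBF kernel to stand in for Φ(x_i) · Φ(x_j); the function and parameter names are illustrative, not from the slides or the textbook:

```python
import numpy as np

def rbf_kernel_pca(X, gamma, p):
    """Kernel PCA sketch with an RBF kernel.

    X: (k, d) data array, one example per row; gamma: RBF width parameter;
    p: number of components to keep. Returns the projected data, shape (k, p).
    """
    # Kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2), playing the role of
    # Phi(x_i) . Phi(x_j) in an implicit high-dimensional feature space.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix (the analogue of mean-adjusting Phi(x)).
    k = K.shape[0]
    one = np.full((k, k), 1.0 / k)
    K_centered = K - one @ K - K @ one + one @ K @ one

    # Eigendecomposition of the centered kernel matrix; keep the top p eigenvectors,
    # scaled by 1/sqrt(eigenvalue) so the feature-space directions have unit length.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:p]
    alphas = eigvecs[:, order] / np.sqrt(eigvals[order])

    # Projection of each training point onto the kernel principal components.
    return K_centered @ alphas
```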

Kernel Eigenfaces (Yang et al., Face Recognition Using Kernel Eigenfaces, 2000)
– Training data: ~400 images, 40 subjects
– Original features: 644 pixel gray-scale values
– Transform the data using kernel PCA; reduce the dimensionality to the number of components giving the lowest error.
– Test: a new photo of one of the subjects
– Recognition is done using nearest-neighbor classification.