Principal Component Analysis (PCA)


Principal Component Analysis (PCA)
J.-S. Roger Jang (張智星)
jang@mirlab.org, http://mirlab.org/jang
MIR Lab, CSIE Dept., National Taiwan University

Introduction to PCA
PCA (Principal Component Analysis): an effective method for reducing dataset dimensionality while keeping the spatial characteristics of the data as much as possible.
Characteristics:
- For unlabeled data
- A linear transform with a solid mathematical foundation
Applications:
- Line/plane fitting
- Face recognition

Problem Definition
Input: a dataset of n d-dimensional points x1, x2, …, xn that are zero-mean (the sample mean has been subtracted).
Output: a unit vector u such that the sum of squared projections of the dataset onto u is maximized.

Projection
Angle between two vectors x and u: cos θ = (x · u) / (||x|| ||u||).
Projection of x onto u: (x · u) / ||u||, which is simply x · u when u is a unit vector.
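Both quantities can be checked numerically. A minimal NumPy sketch, using a hypothetical 2-D point x and direction u that are not from the original slides:

```python
import numpy as np

x = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])

# Angle between x and u, via the dot product
cos_theta = x @ u / (np.linalg.norm(x) * np.linalg.norm(u))

# Scalar projection of x onto u (reduces to x . u when u is a unit vector)
proj_len = x @ u / np.linalg.norm(u)

# Vector projection of x onto the line spanned by u
proj_vec = proj_len * u / np.linalg.norm(u)

print(cos_theta, proj_len, proj_vec)   # 0.6  3.0  [3. 0.]
```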

Mathematical Formulation
Dataset representation: X is d-by-n (one zero-mean data point per column), with n > d.
Projection of each column of X onto u: the entries of the n-vector X^T u.
Square sum: J(u) = ||X^T u||^2 = u^T X X^T u.
Objective function with a constraint on u: maximize J(u) subject to ||u|| = 1.
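Spelled out in LaTeX (a reconstruction; the transcript does not preserve the slide's original formulas):

```latex
% Objective for the first principal direction:
% X is d-by-n with zero-mean columns, u is the direction being sought.
J(\mathbf{u}) \;=\; \lVert X^{\top}\mathbf{u} \rVert^{2}
            \;=\; \mathbf{u}^{\top} X X^{\top} \mathbf{u},
\qquad \text{subject to } \lVert \mathbf{u} \rVert = 1 .
```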

Optimization of the Objective Function
Set the gradient (of the Lagrangian) to zero: X X^T u = λu, i.e., u is an eigenvector of X X^T and λ is the corresponding eigenvalue.
When u is such an eigenvector: J(u) = u^T X X^T u = λ.
If we arrange the eigenvalues such that λ1 ≥ λ2 ≥ … ≥ λd:
- The max of J(u) is λ1, which occurs at u = u1.
- The min of J(u) is λd, which occurs at u = ud.
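The eigenvector condition follows from the Lagrangian of the constrained objective; this is the standard derivation, sketched here rather than copied from the slides:

```latex
\mathcal{L}(\mathbf{u},\lambda)
  = \mathbf{u}^{\top} X X^{\top}\mathbf{u} - \lambda\,(\mathbf{u}^{\top}\mathbf{u} - 1)
\;\Longrightarrow\;
\nabla_{\mathbf{u}}\mathcal{L} = 2\,X X^{\top}\mathbf{u} - 2\lambda\,\mathbf{u} = 0
\;\Longrightarrow\;
X X^{\top}\mathbf{u} = \lambda\,\mathbf{u}.

% Plugging an eigenvector u_i back into the objective:
J(\mathbf{u}_i) = \mathbf{u}_i^{\top} X X^{\top}\mathbf{u}_i = \lambda_i
\quad\Rightarrow\quad
\max_{\lVert\mathbf{u}\rVert=1} J = \lambda_1 \text{ at } \mathbf{u}_1,
\qquad
\min_{\lVert\mathbf{u}\rVert=1} J = \lambda_d \text{ at } \mathbf{u}_d .
```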

Facts about Symmetric Matrices
A square symmetric matrix has orthogonal eigenvectors corresponding to distinct eigenvalues, so its eigenvectors can be chosen to form an orthonormal basis.
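This property is easy to verify numerically; a small NumPy check on a randomly generated symmetric matrix (illustrative data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
S = A @ A.T                                  # symmetric (positive semi-definite) matrix

# eigh is the eigen-solver for symmetric matrices: it returns real eigenvalues
# in ascending order and an orthonormal set of eigenvectors (as columns).
eigvals, eigvecs = np.linalg.eigh(S)

print(np.allclose(eigvecs.T @ eigvecs, np.eye(5)))   # True: eigenvectors are orthonormal
```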

Conversion
Conversion between orthonormal bases: if the columns of U = [u1 … ud] form an orthonormal basis, any point x can be written exactly in that basis, and keeping only the leading eigenvectors gives the reduced representation.
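For completeness, the change-of-basis relations this slide refers to, reconstructed from the usual PCA setting (the slide's own formulas are not in the transcript):

```latex
% U = [u_1 ... u_d] has orthonormal columns (e.g., the eigenvectors of X X^T).
\mathbf{y} = U^{\top}\mathbf{x},
\qquad
\mathbf{x} = U\mathbf{y} = \sum_{i=1}^{d} (\mathbf{u}_i^{\top}\mathbf{x})\,\mathbf{u}_i,
\qquad
\mathbf{x} \;\approx\; \sum_{i=1}^{k} (\mathbf{u}_i^{\top}\mathbf{x})\,\mathbf{u}_i
\quad (k < d \text{ gives the reduced representation}).
```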

Steps for PCA
1. Find the sample mean: μ = (1/n) Σ_i x_i.
2. Compute the covariance matrix: C = (1/n) Σ_i (x_i − μ)(x_i − μ)^T.
3. Find the eigenvalues of C and arrange them in descending order, λ1 ≥ λ2 ≥ … ≥ λd, together with the corresponding eigenvectors u1, …, ud.
4. The transformation is y = U^T (x − μ), with U = [u1 … uk] containing the leading k eigenvectors.
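These steps map directly onto a few lines of NumPy. A minimal sketch, assuming the data are stored one sample per row (the slides use one sample per column); the function name pca is a placeholder:

```python
import numpy as np

def pca(X, k):
    """PCA via the covariance matrix.
    X: (n, d) array, one sample per row.  k: number of components to keep.
    Returns the projected data (n, k), the basis U (d, k), and all eigenvalues."""
    mu = X.mean(axis=0)                     # step 1: sample mean
    Xc = X - mu                             # zero-mean data
    C = Xc.T @ Xc / X.shape[0]              # step 2: covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # step 3: sort into descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    U = eigvecs[:, :k]                      # leading k eigenvectors as columns
    return Xc @ U, U, eigvals               # step 4: y = U^T (x - mu) for every sample

# Example: project 100 random 5-D points onto their top two principal components
X = np.random.default_rng(1).standard_normal((100, 5))
Y, U, lam = pca(X, 2)
print(Y.shape, U.shape)                     # (100, 2) (5, 2)
```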

PCA for TLS
Problem with ordinary LS (least squares): not robust if the fitted line has a large slope, since LS minimizes vertical errors only.
PCA can be used for TLS (total least squares), which minimizes perpendicular distances to the line.
PCA for TLS of lines in 2D (see the sketch below):
- Zero adjustment: subtract the mean (exercise: prove that the TLS line goes through the mean of the dataset).
- Find u1 and u2; use u2 (the direction of smallest variance) as the normal vector of the line.
This can be extended to surfaces (planes) in 3D.
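A sketch of the 2-D TLS fit described above, using NumPy; the helper name tls_line_2d and the synthetic steep-slope data are illustrative assumptions, not from the slides:

```python
import numpy as np

def tls_line_2d(points):
    """Total-least-squares line fit for 2-D points via PCA.
    Returns the mean (a point on the line), the line direction u1, and the
    unit normal u2, so the line is {p : u2 . (p - mean) = 0}."""
    mu = points.mean(axis=0)
    Xc = points - mu                         # zero adjustment
    C = Xc.T @ Xc / len(points)
    eigvals, eigvecs = np.linalg.eigh(C)     # ascending eigenvalues
    u1 = eigvecs[:, 1]                       # direction of the line (largest variance)
    u2 = eigvecs[:, 0]                       # normal vector (smallest variance)
    return mu, u1, u2

# Noisy points near the steep line y = 10x, where ordinary LS does poorly
rng = np.random.default_rng(2)
t = rng.uniform(-1, 1, 200)
pts = np.column_stack([t, 10 * t]) + 0.1 * rng.standard_normal((200, 2))
mean, u1, u2 = tls_line_2d(pts)
print(mean, u2)        # u2 is close to (10, -1)/sqrt(101), up to sign
```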

Tidbits
PCA is designed for unlabeled data. For classification problems, we usually resort to LDA (linear discriminant analysis) for dimensionality reduction.
If d ≫ n, we need a workaround for computing the eigenvectors (see the sketch below).
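One common workaround when d ≫ n (e.g., eigenface-style problems with a few high-dimensional images) is to eigen-decompose the small n-by-n matrix X^T X instead of the huge d-by-d matrix X X^T: if X^T X v = λv, then X X^T (Xv) = λ(Xv). A NumPy sketch on synthetic data, as an illustration rather than the slides' own recipe:

```python
import numpy as np

d, n = 10_000, 50                            # d >> n
rng = np.random.default_rng(3)
X = rng.standard_normal((d, n))              # columns: already-centered samples

G = X.T @ X                                  # small n x n Gram matrix
eigvals, V = np.linalg.eigh(G)               # ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Map the small eigenvectors back to d-dimensional principal directions:
# if (X^T X) v = lambda v, then (X X^T)(X v) = lambda (X v).
U = X @ V
U /= np.linalg.norm(U, axis=0)               # normalize each column
print(U.shape)                               # (10000, 50)
```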

Example of PCA
Projection of the IRIS dataset onto its principal components.
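The Iris projection can be reproduced along the following lines; this is a hedged reconstruction with scikit-learn and matplotlib, not the code behind the original figure:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
X, y = iris.data, iris.target                # 150 samples, 4 features, 3 classes

Y = PCA(n_components=2).fit_transform(X)     # project onto the first two PCs

for label in range(3):
    plt.scatter(Y[y == label, 0], Y[y == label, 1], label=iris.target_names[label])
plt.xlabel("PC 1"); plt.ylabel("PC 2"); plt.legend(); plt.show()
```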

Weakness of PCA
PCA is not designed for classification problems (i.e., with labeled training data).
- Ideal situation: the direction of largest variance also separates the classes well.
- Adversarial situation: the direction of largest variance mixes the classes, so the discriminative direction is discarded.

Linear Discriminant Analysis
LDA projects data onto directions that can best separate data of different classes.
What is an adversarial situation for PCA can be an ideal situation for LDA, since LDA uses the class labels to choose the projection.
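For a quick side-by-side feel of the two projections on labeled data, a minimal scikit-learn sketch (an assumption for illustration; not part of the original slides):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

Y_pca = PCA(n_components=2).fit_transform(X)                            # ignores labels
Y_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # uses labels

print(Y_pca.shape, Y_lda.shape)   # (150, 2) (150, 2)
```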