Principal Component Analysis (PCA)

Principal Component Analysis (PCA)
Presented by Aycan YALÇIN, 2003700369

Outline of the Presentation
- Introduction
- Objectives of PCA
- Terminology
- Algorithm
- Applications
- Conclusion

Introduction

Introduction
Problem: the analysis of multivariate data plays a key role in data analysis, but a multidimensional hyperspace is often difficult to visualize. We therefore want to represent the data in a manner that facilitates the analysis.

Introduction (cont’d)
Objectives of unsupervised learning methods:
- Reduce dimensionality
- Score all observations
- Cluster similar observations together
Well-known linear transformation methods: PCA, Factor Analysis, Projection Pursuit, etc.

Introduction (cont’d)
Benefits of dimensionality reduction:
- The computational overhead of the subsequent processing stages is reduced
- Noise may be reduced
- A projection into a subspace of very low dimension is useful for visualizing the data

Objectives of PCA

Objectives of PCA
Principal Component Analysis is a technique used to:
- Reduce the dimensionality of the data set
- Identify new, meaningful underlying variables
- Lose minimum information, by finding the directions in which the cloud of data points is stretched most

Objectives of PCA (cont’d)
PCA, also known as the Karhunen-Loève transform, summarizes the variation in a (possibly) correlated multi-attribute data set with a (smaller) set of uncorrelated components (principal components). These uncorrelated variables are linear combinations of the original variables. The objective of PCA is to reduce the dimensionality by extracting the smallest number of components that account for most of the variation in the original multivariate data, thereby summarizing the data with little loss of information.

Terminology

Terminology
- Variance
- Covariance
- Eigenvectors & Eigenvalues
- Principal Components

Terminology (Variance)
- Standard deviation: the average distance from the mean to a data point
- Variance: the standard deviation squared
- Both are one-dimensional measures
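
For reference, the usual sample definitions (not spelled out on the slide) are, for m observations x_1, ..., x_m:

```latex
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i,\qquad
s = \sqrt{\frac{1}{m-1}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2},\qquad
\text{variance} = s^2
```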

Terminology (Covariance)
Covariance measures how two dimensions vary from the mean with respect to each other:
- cov(X,Y) > 0: the dimensions increase together
- cov(X,Y) < 0: as one increases, the other decreases
- cov(X,Y) = 0: the dimensions are uncorrelated (they vary independently of each other in the linear sense)
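
The corresponding sample formula (again added here for reference, not taken from the slide) is:

```latex
\operatorname{cov}(X,Y) = \frac{1}{m-1}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)
```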

Terminology (Covariance Matrix)
The covariance matrix contains the covariance values between all pairs of dimensions. It is always symmetric, and the diagonal entry cov(x,x) is simply the variance of component x. For three dimensions (x, y, z) it is a 3×3 matrix.
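
The 3×3 example the slide refers to appears to have been a figure; a reconstruction of what it presumably showed is:

```latex
C = \begin{pmatrix}
\operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z)\\
\operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z)\\
\operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z)
\end{pmatrix}
```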

Terminology (Eigenvalues & Eigenvectors)
- Eigenvalues measure the amount of variation explained by each PC (largest for the first PC, smaller for each subsequent PC). An eigenvalue > 1 indicates that the PC accounts for more variance than any single one of the original variables (for standardized data); this is commonly used as a cutoff point for deciding which PCs are retained.
- Eigenvectors provide the weights used to compute the uncorrelated PCs; these vectors give the directions in which the data cloud is stretched most.

Terminology (Eigenvalues & Eigenvectors)
Vectors x having the same direction as Ax are called eigenvectors of A (where A is an n × n matrix). In the equation Ax = λx, λ is called an eigenvalue of A: Ax = λx ⇔ (A − λI)x = 0.
How to calculate x and λ:
- Calculate det(A − λI); this yields a polynomial of degree n
- Determine the roots of det(A − λI) = 0; the roots are the eigenvalues λ
- Solve (A − λI)x = 0 for each λ to obtain the eigenvectors x
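
In practice the characteristic polynomial is rarely solved by hand; a minimal NumPy sketch (the matrix A below is an arbitrary example, not from the slides) is:

```python
import numpy as np

# A small symmetric (covariance-like) matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is appropriate for symmetric matrices; it returns the eigenvalues
# in ascending order together with the corresponding eigenvectors
# (one eigenvector per column).
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(eigenvalues)    # [1. 3.]
print(eigenvectors)   # columns are the eigenvectors of A
```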

Terminology (Principal Component)
The extracted uncorrelated components are called principal components (PCs). They are:
- Estimated from the eigenvectors of the covariance or correlation matrix of the original variables
- The projections of the data onto those eigenvectors
- Extracted by linear transformations of the original variables, so that the first few PCs contain most of the variation in the original data set

Algorithm

Algorithm
We look for axes which minimise the projection errors and maximise the variance after projection: n-dimensional vectors are transformed into m-dimensional vectors, with m < n.
Example: transform from 2 to 1 dimension.

Algorithm (cont’d)
Preserve as much of the variance as possible.
[Figure: the coordinate axes are rotated so that one direction carries more information (variance) and the other less; the data is then projected onto the high-variance direction.]

Algorithm (cont’d)
The data is a matrix in which rows are observations (values) and columns are attributes (dimensions).
- First, center the data by subtracting the mean in each dimension: DataAdjust[i][j] = Data[i][j] − mean_j, where i is the observation, j is the dimension, mean_j = (1/m) Σ_i Data[i][j], and m is the total number of observations
- Then calculate the covariance matrix for DataAdjust
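
A minimal NumPy sketch of these two steps (the toy data and the variable name DataAdjust are illustrative, not from the slides):

```python
import numpy as np

# Toy data: rows are observations, columns are attributes (dimensions).
Data = np.array([[2.5, 2.4],
                 [0.5, 0.7],
                 [2.2, 2.9],
                 [1.9, 2.2],
                 [3.1, 3.0]])

# Step 1: center the data by subtracting the mean of each dimension.
DataAdjust = Data - Data.mean(axis=0)

# Step 2: covariance matrix of the centered data (variables are in
# columns, hence rowvar=False).
C = np.cov(DataAdjust, rowvar=False)
print(C)
```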

Algorithm (cont’d)
Calculate the eigenvalues λ and eigenvectors x of the covariance matrix. The eigenvalues λ_j are then used to calculate the percentage of total variance V_j explained by each component j.
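
The formula for V_j was presumably a figure on the original slide; the standard expression is:

```latex
V_j = \frac{\lambda_j}{\sum_{k=1}^{n}\lambda_k}\times 100\%
```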

Algorithm (cont’d)
Choose components and form the feature vector:
- The eigenvalues λ and eigenvectors x are sorted in descending order of λ
- The component with the highest λ is the first principal component
- FeatureVector = (x1, ..., xn), where each xi is a column-oriented eigenvector; it contains the chosen components
Derive the new data set:
- Transpose FeatureVector and DataAdjust
- FinalData = RowFeatureVector × RowDataAdjust
- This is the original data expressed in terms of the chosen components; FinalData has the eigenvectors as coordinate axes
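
A sketch of this step in NumPy, reusing the toy data from the earlier snippet (the choice k = 1 is just an example; observations are kept in rows here, which is equivalent to the row-vector formulation on the slide):

```python
import numpy as np

# Same toy data as the earlier sketch.
Data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
DataAdjust = Data - Data.mean(axis=0)
C = np.cov(DataAdjust, rowvar=False)

# Eigen-decomposition of the covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Sort in descending order of eigenvalue; the eigenvector with the
# largest eigenvalue is the first principal component.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
FeatureVector = eigenvectors[:, order]

# Keep the top k components (k = 1 reduces the 2-D data to 1-D).
k = 1
FeatureVector = FeatureVector[:, :k]

# FinalData: the centered data expressed in the new coordinate system,
# i.e. projected onto the chosen eigenvectors.
FinalData = DataAdjust @ FeatureVector
print(FinalData)
```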

Algorithm (cont’d)
Retrieving the old data (e.g. in data compression):
RetrievedRowData = (RowFeatureVector^T × FinalData) + OriginalMean
This yields the original data expressed using the chosen components (exactly if all components are kept, approximately otherwise).
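
Continuing the previous sketch (FinalData, FeatureVector and Data are the arrays defined there), the reconstruction step might look like:

```python
# Reconstruct the data from the retained components and add back the mean.
RetrievedData = FinalData @ FeatureVector.T + Data.mean(axis=0)

# With k < n components this is only an approximation of the original
# data; with all components kept it reproduces Data exactly (up to rounding).
print(RetrievedData)
```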

Algorithm (cont’d)
Estimating the number of PCs with the scree test: plotting the eigenvalues against the corresponding PC number produces a scree plot that illustrates the rate of change in the magnitude of the eigenvalues. The rate of decline tends to be fast at first and then levels off; the ‘elbow’, the point at which the curve bends, is considered to indicate the maximum number of PCs to extract. One less PC than the number at the elbow might be appropriate if you are concerned about getting an overly defined solution.
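
A hypothetical scree plot with matplotlib (the eigenvalues below are made up purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative eigenvalues (e.g. from np.linalg.eigh on a covariance
# matrix), already sorted in descending order.
eigenvalues = np.array([4.2, 2.1, 0.9, 0.4, 0.2, 0.1])

# Scree plot: eigenvalue magnitude against component number; look for
# the "elbow" where the curve levels off.
plt.plot(np.arange(1, len(eigenvalues) + 1), eigenvalues, "o-")
plt.xlabel("Principal component")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```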

Applications

Applications
Example applications:
- Computer vision: representation, pattern identification, image compression, face recognition
- Gene expression analysis (purpose: determine a core set of conditions for useful gene comparison)
- Handwritten character recognition
- Data compression, etc.

Conclusion

Conclusion
- PCA can be useful when there is a high degree of correlation among the attributes of a multivariate data set
- When a data set consists of several clusters, the principal axes found by PCA usually pick projections with good separation; PCA therefore provides an effective basis for feature extraction in this case
- For data compression, PCA offers a useful self-organized learning procedure

Conclusion (cont’d)
Shortcomings of PCA:
- PCA requires diagonalising the matrix C (of dimension n × n), which is computationally heavy if n is large
- PCA only finds linear subspaces
- It works best if the individual components are Gaussian-distributed (ICA, for example, does not rely on such a distribution)
- PCA does not say how many target dimensions to use

Questions?