Principal Components Analysis Babak Rasolzadeh Tuesday, 5th December 2006

Data Presentation. Example: 53 blood and urine measurements (wet chemistry) from 65 people (33 alcoholics, 32 non-alcoholics). [Figures: the same data shown in matrix format and in spectral format]

Data Presentation. [Figures: univariate, bivariate, and trivariate views of the data]

Data Presentation. Is there a better presentation than the original ordinate axes? Do we need a 53-dimensional space to view the data? How do we find the 'best' low-dimensional space that conveys the maximum useful information? One answer: find the "Principal Components".

Principal Components. All principal components (PCs) start at the origin of the ordinate axes. The first PC is the direction of maximum variance from the origin. Subsequent PCs are orthogonal to the first PC and describe the maximum residual variance. [Figures: data plotted against Wavelength 1 and Wavelength 2, with the PC 1 and PC 2 directions overlaid]

Algebraic Interpretation. Given m points in an n-dimensional space, for large n, how does one project onto a low-dimensional space while preserving broad trends in the data and allowing it to be visualized?

Algebraic Interpretation – 1D. Given m points in an n-dimensional space, for large n, how does one project onto a one-dimensional space? Choose a line that fits the data so that the points are spread out well along the line.

Algebraic Interpretation – 1D. Formally, minimize the sum of squares of the distances to the line. Why the sum of squares? Because it allows fast minimization, assuming the line passes through the origin.

Algebraic Interpretation – 1D. Minimizing the sum of squares of the distances to the line is the same as maximizing the sum of squares of the projections onto that line, thanks to Pythagoras.
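A compressed version of the Pythagoras argument (a sketch added for clarity; the symbols p_i for the data points, x for a unit vector along the line, and d_i for the distance of point i to the line are notation introduced here, not on the slides):

For each point, \|p_i\|^2 = (p_i^{\top} x)^2 + d_i^2, so summing over the m points gives

  \sum_{i=1}^{m} d_i^2 = \sum_{i=1}^{m} \|p_i\|^2 - \sum_{i=1}^{m} (p_i^{\top} x)^2 .

Since \sum_i \|p_i\|^2 is fixed by the data, minimizing the total squared distance over x is the same as maximizing the total squared projection.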

Algebraic Interpretation – 1D. How is the sum of squares of the projection lengths expressed in algebraic terms? Stack points 1 through m as the rows of a matrix B and let x be a unit vector along the line; the projection of point i onto the line has length p_i^{\top} x, so the sum of squared projection lengths is x^{\top} B^{\top} B x.
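A quick numerical check of this identity, written as a hedged NumPy sketch (the names B and x and the random data are illustrative, not from the slides):

import numpy as np

# m points in n-dimensional space, one point per row of B.
rng = np.random.default_rng(0)
B = rng.normal(size=(100, 5))          # m = 100, n = 5

# A unit vector x along a candidate line through the origin.
x = rng.normal(size=5)
x /= np.linalg.norm(x)

# Sum of squared projection lengths, computed point by point...
proj_lengths = B @ x                   # p_i . x for every point
sum_sq_direct = np.sum(proj_lengths ** 2)

# ...equals the quadratic form x^T B^T B x.
sum_sq_quadratic = x @ (B.T @ B) @ x
assert np.isclose(sum_sq_direct, sum_sq_quadratic)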

PCA: General. From k original variables x_1, x_2, ..., x_k, produce k new variables y_1, y_2, ..., y_k:

y_1 = a_{11} x_1 + a_{12} x_2 + ... + a_{1k} x_k
y_2 = a_{21} x_1 + a_{22} x_2 + ... + a_{2k} x_k
...
y_k = a_{k1} x_1 + a_{k2} x_2 + ... + a_{kk} x_k

From k original variables x_1, x_2, ..., x_k, produce k new variables y_1, y_2, ..., y_k:

y_1 = a_{11} x_1 + a_{12} x_2 + ... + a_{1k} x_k
y_2 = a_{21} x_1 + a_{22} x_2 + ... + a_{2k} x_k
...
y_k = a_{k1} x_1 + a_{k2} x_2 + ... + a_{kk} x_k

such that:
– the y_k's are uncorrelated (orthogonal)
– y_1 explains as much as possible of the original variance in the data set
– y_2 explains as much as possible of the remaining variance
– and so on.

[Figure: scatter plot with the 1st principal component, y_1, and the 2nd principal component, y_2, drawn as new axes]

PCA Scores. [Figure: observation i, with original coordinates (x_{i1}, x_{i2}), has scores (y_{i,1}, y_{i,2}), its coordinates along the principal component axes]

PCA Eigenvalues. [Figure: the eigenvalues λ_1 and λ_2 measure the variance along the first and second principal components]

PCA: Another Explanation. From k original variables x_1, x_2, ..., x_k, produce k new variables y_1, y_2, ..., y_k:

y_1 = a_{11} x_1 + a_{12} x_2 + ... + a_{1k} x_k
y_2 = a_{21} x_1 + a_{22} x_2 + ... + a_{2k} x_k
...
y_k = a_{k1} x_1 + a_{k2} x_2 + ... + a_{kk} x_k

such that:
– the y_k's are uncorrelated (orthogonal)
– y_1 explains as much as possible of the original variance in the data set
– y_2 explains as much as possible of the remaining variance
– and so on.
The y_k's are the principal components.

PCA: General. {a_{11}, a_{12}, ..., a_{1k}} is the 1st eigenvector of the correlation/covariance matrix, and gives the coefficients of the 1st principal component. {a_{21}, a_{22}, ..., a_{2k}} is the 2nd eigenvector of the correlation/covariance matrix, and gives the coefficients of the 2nd principal component. ... {a_{k1}, a_{k2}, ..., a_{kk}} is the kth eigenvector of the correlation/covariance matrix, and gives the coefficients of the kth principal component.
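Why eigenvectors? A compressed derivation (a sketch added here for clarity; \Sigma denotes the covariance matrix of the x's, a symbol the slides do not introduce explicitly):

Writing the first component as y_1 = a_1^{\top} x with a_1^{\top} a_1 = 1, its variance is \mathrm{Var}(y_1) = a_1^{\top} \Sigma a_1. Maximizing this subject to the unit-length constraint with a Lagrange multiplier \lambda gives

  \Sigma a_1 = \lambda a_1 ,

so a_1 must be an eigenvector of \Sigma, and \mathrm{Var}(y_1) = \lambda is the corresponding eigenvalue. Repeating the argument with the added orthogonality constraints yields the remaining eigenvectors, ordered by decreasing eigenvalue.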

PCA Summary (so far). PCA rotates the multivariate dataset into a new configuration which is easier to interpret. Purposes:
– simplify the data
– look at relationships between variables
– look at patterns of units

A 2D Numerical Example

PCA Example – STEP 1. Subtract the mean from each of the data dimensions: all the x values have the mean of x subtracted, and all the y values have the mean of y subtracted. This produces a data set whose mean is zero. Subtracting the mean makes the variance and covariance calculations easier by simplifying their equations; the variance and covariance values themselves are not affected by the mean.

PCA Example – STEP 1. [Table: the original DATA (columns x, y) shown alongside the ZERO MEAN DATA (columns x, y) obtained by subtracting the means]

PCA Example – STEP 1. [Figure: plot of the data]
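A minimal NumPy sketch of Step 1 (the data values here are illustrative placeholders; the slide's actual numbers did not survive extraction and are not reproduced):

import numpy as np

# Illustrative (x, y) data: one row per observation, columns are x and y.
data = np.array([[2.0, 1.5],
                 [0.5, 1.0],
                 [1.0, 2.5],
                 [2.5, 3.0]])

# STEP 1: subtract the mean of each dimension to obtain zero-mean data.
mean = data.mean(axis=0)
zero_mean_data = data - mean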

PCA Example – STEP 2. Calculate the covariance matrix: cov = [2x2 matrix computed from the zero-mean data]. Since the off-diagonal elements of this covariance matrix are positive, we should expect that the x and y variables increase together.
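A sketch of Step 2, using the same illustrative data as the Step 1 snippet (re-created here so this snippet runs on its own):

import numpy as np

data = np.array([[2.0, 1.5], [0.5, 1.0], [1.0, 2.5], [2.5, 3.0]])
zero_mean_data = data - data.mean(axis=0)

# STEP 2: covariance matrix of the zero-mean data (rows are observations).
cov = np.cov(zero_mean_data, rowvar=False)

# A positive off-diagonal entry cov[0, 1] indicates that x and y tend to
# increase together.
print(cov)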

PCA Example – STEP 3. Calculate the eigenvectors and eigenvalues of the covariance matrix. [Table: the computed eigenvalues and eigenvectors]
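A sketch of Step 3 with the same illustrative data (np.linalg.eigh is used because the covariance matrix is symmetric):

import numpy as np

data = np.array([[2.0, 1.5], [0.5, 1.0], [1.0, 2.5], [2.5, 3.0]])
cov = np.cov(data - data.mean(axis=0), rowvar=False)

# STEP 3: eigenvalues and eigenvectors of the covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues

# Reorder from largest to smallest eigenvalue; column j of `eigenvectors`
# corresponds to eigenvalues[j].
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]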

PCA Example – STEP 3. The eigenvectors are plotted as diagonal dotted lines on the plot. Note that they are perpendicular to each other, and that one of the eigenvectors goes through the middle of the points, like a line of best fit. The second eigenvector gives us the other, less important, pattern in the data: all the points follow the main line but are offset to the side of it by some amount.

PCA Example – STEP 4. Reduce dimensionality and form a feature vector. The eigenvector with the highest eigenvalue is the principal component of the data set. In our example, the eigenvector with the largest eigenvalue was the one that pointed down the middle of the data. Once the eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance.

PCA Example – STEP 4. Now, if you like, you can decide to ignore the components of lesser significance. You do lose some information, but if the eigenvalues are small, you don't lose much. With n dimensions in your data, you calculate n eigenvectors and eigenvalues, choose only the first p eigenvectors, and the final data set has only p dimensions.

PCA Example – STEP 4. Feature vector: FeatureVector = (eig_1 eig_2 eig_3 ... eig_n), the matrix whose columns are the chosen eigenvectors. We can either form a feature vector with both of the eigenvectors, or we can choose to leave out the smaller, less significant component and keep only a single column.
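A sketch of Step 4 with the same illustrative data: keep the first p eigenvectors as the feature vector and project the zero-mean data onto them.

import numpy as np

data = np.array([[2.0, 1.5], [0.5, 1.0], [1.0, 2.5], [2.5, 3.0]])
zero_mean_data = data - data.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(zero_mean_data, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]

# STEP 4: feature vector = the first p eigenvectors (p = 1 keeps only the
# most significant component).
p = 1
feature_vector = eigenvectors[:, :p]            # shape (n, p)

# Final data: project onto the chosen components; only p dimensions remain.
final_data = zero_mean_data @ feature_vector    # shape (m, p)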

Reconstruction of the Original Data. [Figure: the data reconstructed from the retained component(s), plotted against x]
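A sketch of the reconstruction step under the same illustrative setup: multiply the reduced data by the transpose of the feature vector and add the mean back. With p < n this recovers an approximation of the original data; with p = n the reconstruction is exact.

import numpy as np

data = np.array([[2.0, 1.5], [0.5, 1.0], [1.0, 2.5], [2.5, 3.0]])
mean = data.mean(axis=0)
zero_mean_data = data - mean
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(zero_mean_data, rowvar=False))
feature_vector = eigenvectors[:, np.argsort(eigenvalues)[::-1]][:, :1]   # p = 1

final_data = zero_mean_data @ feature_vector          # reduced data, m x p

# Reconstruction: back-project and re-add the mean.
reconstructed = final_data @ feature_vector.T + mean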