Principal Component Analysis
Zelin Jia, Shengbin Lin
10/20/2015

What is PCA?
An orthogonal transformation
Converts a set of correlated variables into uncorrelated artificial variables (principal components)
The resulting vectors form an orthogonal basis set
A tool in exploratory data analysis

Why use PCA?
Reduce the dimensionality of the data
Compress the data
Prepare the data for further analysis using other techniques
Understand your data better by interpreting the loadings and by graphing the derived variables
(Dr. Peter Westfall)

How PCA works
1. PCA begins with the covariance matrix of the centered data: Cov(X) = XᵀX / (n - 1)
2. Compute the eigenvectors and eigenvalues of the covariance matrix.
3. This yields eigenvector/eigenvalue pairs z_i, λ_i (with the unit-length constraint z_iᵀz_i = 1).
4. Arrange the eigenvectors in decreasing order of their eigenvalues.
5. Pick the leading eigenvectors and multiply the original data matrix X by them to get the matrix of principal component scores.
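The slides run their examples in R; as a rough translation, here is a minimal NumPy sketch of the five steps above. The 25 x 8 toy data is made up (the slide's dataset isn't included):

```python
import numpy as np

# Toy data standing in for the slides' example: 25 observations on 8 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 8))

Xc = X - X.mean(axis=0)                  # center each variable
C = Xc.T @ Xc / (Xc.shape[0] - 1)        # 1. sample covariance matrix
evals, evecs = np.linalg.eigh(C)         # 2-3. eigenvalues and unit eigenvectors

order = np.argsort(evals)[::-1]          # 4. decreasing eigenvalue order
evals, evecs = evals[order], evecs[:, order]

scores = Xc @ evecs                      # 5. principal component scores
```

The columns of `scores` are uncorrelated, and the variance of each column equals the corresponding eigenvalue.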

Example of how PCA works (in R)
A financial sample dataset with 8 variables and 25 observations
Perform PCA on this data to reduce the number of variables from 8 to something more manageable

Simulate PCs on uncorrelated data and on highly correlated data (in R)
PCA works better on more highly correlated data, in that a greater reduction in dimensionality is achievable.
(Provided by Dr. Peter Westfall)

PCA standardization
Why: a variable measured in smaller numbers (even though it may be the more important one) will be overwhelmed by the variables with larger numbers in what it contributes to the covariance. Standardizing each variable (dividing by its standard deviation) puts all variables on a common scale, which amounts to working with the correlation matrix instead.
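A small synthetic demonstration of this point (a NumPy sketch, not from the slides): two independent variables with the same kind of signal but very different units. Without standardization, PC1 is almost entirely the large-unit variable; after standardization both variables load equally.

```python
import numpy as np

rng = np.random.default_rng(1)
small = rng.normal(scale=1.0, size=100)      # small units, possibly the important one
large = rng.normal(scale=1000.0, size=100)   # same kind of signal, big units
X = np.column_stack([small, large])
Xc = X - X.mean(axis=0)

# Unstandardized: the large-unit variable dominates the leading eigenvector.
_, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = vecs[:, -1]                            # eigh sorts ascending; last column = PC1

# Standardized (equivalent to using the correlation matrix): equal contributions.
Z = Xc / Xc.std(axis=0)
_, vecs_z = np.linalg.eigh(np.cov(Z, rowvar=False))
```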

Properties of principal components
The number of principal components is less than or equal to the number of original variables.
The first principal component has the largest possible variance.
Each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components.

What is SVD?
The singular value decomposition factors an n × p matrix X as X = U L^{1/2} Zᵀ, where U and Z have orthonormal columns and L^{1/2} is a diagonal matrix of singular values.
(Applied_Regression_Analysis_A_Research_Tool.pdf)

Relationship between SVD and PCA
From the SVD we have X = U L^{1/2} Zᵀ, so W = XZ = U L^{1/2}.
If X is an n × p matrix of observations on p variables, each column of W is a new variable defined as a linear transformation of the original variables.
(Applied_Regression_Analysis_A_Research_Tool.pdf)
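The identity W = XZ = U L^{1/2} can be checked numerically. In this NumPy sketch (not from the slides), NumPy's `s` plays the role of the diagonal of L^{1/2} and `Vt` is Zᵀ:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(25, 8))
X = X - X.mean(axis=0)                 # center so SVD and PCA coincide

# X = U L^{1/2} Z^T in the slide's notation.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Z = Vt.T                               # columns of Z: the principal axes

W_from_svd = U * s                     # U L^{1/2}  (broadcasting s over columns)
W_from_pca = X @ Z                     # XZ: scores from projecting the data
```

The two routes produce the same score matrix up to floating-point error.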

EFA vs PCA
EFA: EFA provides a model to explain why the data looks the way it does.
PCA: a PC is not a model that explains how the data looks; there is no model at all.
(Provided by Dr. Peter Westfall)

EFA vs PCA

EFA vs PCA
EFA: one postulates that a smaller set of unobserved (latent) variables or constructs underlies the variables actually observed or measured (this is commonly done to assess validity).
PCA: one simply tries to mathematically derive a relatively small number of variables that convey as much of the information in the observed/measured variables as possible.

Application of PCA Data visualization Image compression

Data visualization If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture.dimensional

Using PCA to compress images
The PCA formulation may be used as a digital image compression algorithm with a low level of loss.

princomp vs prcomp
prcomp: the calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy.
princomp: the calculation is done using eigen on the correlation or covariance matrix, as determined by cor. This is done for compatibility with the S-PLUS result. A preferred method of calculation is to use svd on x, as is done in prcomp.
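The two routes the R docs describe agree in exact arithmetic: the eigenvalues of the covariance matrix equal the squared singular values of the centered data divided by n - 1. A NumPy sketch of that equivalence (toy data, not the slides'):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 8))
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# The eigen route (princomp-style): eigenvalues of the covariance matrix.
evals = np.linalg.eigvalsh(Xc.T @ Xc / (n - 1))[::-1]   # descending

# The SVD route (prcomp-style): squared singular values of the centered data.
s = np.linalg.svd(Xc, compute_uv=False)
svd_variances = s**2 / (n - 1)
```

The SVD route is preferred numerically because it avoids forming XᵀX, which squares the condition number of the problem.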

Thanks!