Principal Component Analysis


Principal Component Analysis: Feature extraction and representation

Principal Component Analysis Abstract: In the present big-data era, there is a need to process large amounts of unlabeled data and find patterns in it for further use. Unimportant features must be discarded so that only the representations that are actually needed are kept. High-dimensional data can be converted to low-dimensional data using different techniques; this dimensionality reduction makes tasks such as classification, visualization, communication and storage much easier. The loss of information should be kept small when mapping data from the high-dimensional space to the low-dimensional space.

Principal Components Analysis (PCA): Ideas. Does the data set 'span' the whole of the d-dimensional space? For a matrix of m samples x n genes, create a covariance matrix of size n x n. Transform a large number of variables into a smaller number of uncorrelated variables called principal components (PCs), chosen to capture as much of the variation in the data as possible.
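As a quick illustration of the n x n covariance matrix mentioned above, the NumPy sketch below builds a small random data matrix of m samples x n features (the sample count, feature count and the data itself are made up for illustration) and checks the shape of its covariance matrix.

import numpy as np

# Hypothetical data: m = 6 samples (rows) x n = 4 features/genes (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# np.cov treats rows as variables by default, so set rowvar=False
# to treat each column as one variable (one gene).
C = np.cov(X, rowvar=False)
print(C.shape)  # (4, 4): one covariance entry per pair of features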

Example Applications: Face Recognition, Image Compression, Gene Expression Analysis, Data Reduction, Data Classification, Trend Analysis, Factor Analysis, Noise Reduction.

Principal Component Analysis (figure: data points scattered in the plane with the two principal axes drawn through them). Note: Y1 is the first eigenvector, Y2 is the second; Y2 is ignorable. Key observation: the variance along Y1 is the largest.

Principal Component Analysis: one attribute first. Temperature samples: 42, 40, 24, 30, 15, 18, 35. Question: how much spread is there in the data along this axis (distance to the mean)? Variance = (standard deviation)^2.
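A minimal NumPy sketch of this one-attribute case, using the temperature values from the slide; the slide does not say whether it means the population or the sample variance, so both are shown.

import numpy as np

temperature = np.array([42, 40, 24, 30, 15, 18, 35], dtype=float)

mean = temperature.mean()
population_var = np.mean((temperature - mean) ** 2)  # divide by N
sample_var = np.var(temperature, ddof=1)             # divide by N - 1

print(mean, population_var, sample_var)
# The standard deviation is the square root of the variance.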

Now consider two dimensions: X = Temperature and Y = Humidity (the slide's table of paired readings survives only partially: 40, 90, 30, 15, 70). Covariance measures how X and Y vary together: cov(X,Y) = 0 means X and Y are uncorrelated (not necessarily independent); cov(X,Y) > 0 means they tend to move in the same direction; cov(X,Y) < 0 means they tend to move in opposite directions.
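A small sketch of the sign interpretation above; since the slide's temperature/humidity table is incomplete here, the paired values below are made-up placeholders, not the slide's actual data.

import numpy as np

# Hypothetical paired readings (not the slide's actual table).
temperature = np.array([40.0, 30.0, 15.0, 25.0, 35.0])
humidity = np.array([90.0, 70.0, 50.0, 65.0, 80.0])

# Sample covariance between the two variables.
cov_xy = np.cov(temperature, humidity)[0, 1]
print(cov_xy)  # > 0 here: the two variables tend to move together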

More than two attributes: the covariance matrix contains the covariance values between all possible pairs of dimensions (= attributes). Example for three attributes (x, y, z):
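The matrix itself did not survive extraction; the standard covariance matrix for three attributes, which is presumably what the slide showed, is:

C = \begin{pmatrix}
\operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\
\operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\
\operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z)
\end{pmatrix}

The diagonal entries are simply the variances of x, y and z, and the matrix is symmetric because cov(x,y) = cov(y,x).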

What is Principal Component Analysis? The principal components are the directions along which there is the most variance, the directions in which the data is most spread out.

To find the direction with the most variance, find the straight line along which the data is most spread out when projected onto it. (Figure: the points projected onto a vertical straight line.)

(Figure: the same points projected onto a horizontal straight line.) On this line the data is far more spread out; it has a large variance. In fact, there is no straight line you can draw that has a larger variance than the horizontal one, so the horizontal line is the principal component in this example.
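A minimal sketch of this idea, using made-up 2D points deliberately stretched along the horizontal axis: project them onto a few candidate directions and compare the variance of the projections.

import numpy as np

# Hypothetical 2D points, stretched along the x axis.
rng = np.random.default_rng(1)
points = rng.normal(size=(200, 2)) * np.array([5.0, 1.0])

def projected_variance(points, angle_deg):
    """Variance of the points after projecting them onto a unit
    direction at the given angle from the x axis."""
    theta = np.radians(angle_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])
    return np.var(points @ direction)

for angle in (0, 45, 90):
    print(angle, projected_variance(points, angle))
# The 0-degree (horizontal) direction gives the largest variance here.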

Eigenvalues & Eigenvectors. Vectors x having the same direction as Ax are called eigenvectors of A (where A is an n by n matrix). In the equation Ax = λx, λ is called an eigenvalue of A.
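A quick check of this definition with NumPy, using a small made-up symmetric matrix (symmetric because covariance matrices, the case PCA cares about, are symmetric).

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is suitable here because A is symmetric.
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)  # [1. 3.]

# Each column v of `eigenvectors` satisfies A @ v == lambda * v.
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))  # True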

Principal components: First principal component (PC1) — the eigenvector belonging to the largest eigenvalue; the data have their largest variance along this direction, the direction of greatest variation. Second principal component (PC2) — the direction with the maximum variation left in the data, orthogonal to PC1. In general, only a few directions capture most of the variability in the data.

Steps of PCA: Let μ be the mean vector (taking the mean of all rows). Adjust the original data by the mean: X' = X − μ. Compute the covariance matrix C of the adjusted X. Find the eigenvectors and eigenvalues of C.
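A minimal NumPy sketch of these steps, assuming the data matrix X has one sample per row and one attribute per column (a layout the slide does not state explicitly) and using made-up random data.

import numpy as np

# Hypothetical data: rows are samples, columns are attributes.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))

mu = X.mean(axis=0)   # mean vector: mean of each attribute over all rows
X_adj = X - mu        # mean-adjusted data: X' = X - mu

C = np.cov(X_adj, rowvar=False)                # covariance matrix of adjusted X
eigenvalues, eigenvectors = np.linalg.eigh(C)  # eigen-decomposition of C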

Eigenvalues: Calculate the eigenvalues λ and eigenvectors x of the covariance matrix. The eigenvalues λ_j are used to compute the percentage of total variance (V_j) captured by each component j:
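The formula itself did not survive extraction; the standard expression, which is presumably what the slide showed, is

V_j = \frac{\lambda_j}{\sum_{k=1}^{n} \lambda_k} \times 100\%

where n is the total number of components (eigenvalues).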

Principal components - Variance

Transformed Data: The eigenvalue λ_j corresponds to the variance along component j, so sort the components by λ_j in decreasing order. Take the first p eigenvectors e_i, where p is the number of top eigenvalues kept; these are the directions with the largest variances.
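Continuing the sketch after the Steps of PCA slide (the names X_adj, eigenvalues and eigenvectors come from that sketch; keeping p = 2 components is an arbitrary illustration):

# Sort components by decreasing eigenvalue (np.linalg.eigh returns them ascending).
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

p = 2                              # number of top components to keep
top_vectors = eigenvectors[:, :p]  # the p directions with largest variance

# Project the mean-adjusted data onto those directions.
transformed = X_adj @ top_vectors  # shape: (num_samples, p)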

PCA -> Original Data: retrieving the old data (e.g. in data compression): RetrievedRowData = (RowFeatureVector^T x FinalData) + OriginalMean. This yields the original data exactly when all components are kept, and an approximation when only the chosen top components are used.
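Continuing the same sketch (transformed, top_vectors and mu as defined above), a hedged version of this retrieval step; the slide's RowFeatureVector corresponds to top_vectors and FinalData to transformed, and because the sketch keeps one sample per row the multiplication order is transposed relative to the slide's formula.

# Map the reduced representation back to the original space and re-add the mean.
# With p equal to the number of attributes this recovers X exactly
# (up to floating-point error); with p smaller it is an approximation.
X_retrieved = transformed @ top_vectors.T + mu
print(np.abs(X - X_retrieved).max())  # reconstruction error from keeping only p components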


THANK YOU