Information Management course

Information Management course
Università degli Studi di Milano, Master Degree in Computer Science
Teacher: Alberto Ceselli
Lecture 06: 24/10/2012

Data Mining: Methods and Models — Chapter 1 — Daniel T. Larose, ©2006 John Wiley and Sons

Data (Dimension) Reduction
In large datasets it is unlikely that all attributes are independent: multicollinearity.
Worse mining quality:
- Instability in multiple regression (the overall model is significant, but few individual attributes are)
- Particular attributes are overemphasized (counted multiple times)
- Violates the principle of parsimony (too many unnecessary predictors in a relation with a response variable)
Curse of dimensionality:
- The sample size needed to fit a multivariate function grows exponentially with the number of attributes
- e.g. in 1 dimension about 68% of normally distributed values lie between -1 and 1; in 10 dimensions only about 0.02% of the probability mass lies within the radius-1 hypersphere
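The hypersphere figures can be checked directly: for a standard multivariate normal the squared norm follows a chi-squared distribution with as many degrees of freedom as there are dimensions. A minimal R sketch:

```r
# Mass of a standard multivariate normal inside the unit-radius hypersphere:
# ||X||^2 follows a chi-squared distribution with df = number of dimensions.
pchisq(1, df = 1)   # ~0.683: about 68% of a 1-D standard normal lies in [-1, 1]
pchisq(1, df = 10)  # ~0.00017: roughly 0.02% in 10 dimensions
```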

Principal Component Analysis (PCA)
Try to explain the correlation structure using a small set of linear combinations of the attributes.
Geometrically: look at the attributes as variables forming a coordinate system. The principal components form a new coordinate system, found by rotating the original system along the directions of maximum variability.

PCA – Step 1: standardize data
Notation (review): dataset with $n$ rows and $m$ columns; attributes (columns): $X_i$
Mean of each attribute: $\mu_i = \frac{1}{n}\sum_{j=1}^{n} X_j^i$
Variance of each attribute: $\sigma_{ii}^2 = \frac{1}{n}\sum_{j=1}^{n} \left(X_j^i - \mu_i\right)^2$
Covariance between two attributes: $\sigma_{ij}^2 = \frac{1}{n}\sum_{k=1}^{n} \left(X_k^i - \mu_i\right)\left(X_k^j - \mu_j\right)$
Correlation coefficient: $r_{ij} = \frac{\sigma_{ij}^2}{\sigma_{ii}\,\sigma_{jj}}$
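These quantities map directly onto built-in R functions. A small synthetic running example (reused in the sketches that follow); note that the slides use the 1/n convention while R's cov() uses 1/(n-1):

```r
# Small synthetic dataset used as a running example in the following sketches.
set.seed(1)
X <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)  # n = 100 rows, m = 3 attributes

n     <- nrow(X)
mu    <- colMeans(X)            # mean of each attribute
Sigma <- cov(X) * (n - 1) / n   # covariance matrix with the 1/n convention of the slides
R     <- cor(X)                 # correlation matrix (r_ij); scale-invariant, so convention-free
```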

PCA – Step 1: standardize data
Definitions:
Standard deviation matrix: $V^{1/2} = \begin{pmatrix} \sigma_{11} & 0 & \cdots & 0 \\ 0 & \sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{mm} \end{pmatrix}$
(Symmetric) covariance matrix: $\mathrm{Cov} = \begin{pmatrix} \sigma_{11}^2 & \sigma_{12}^2 & \cdots & \sigma_{1m}^2 \\ \sigma_{21}^2 & \sigma_{22}^2 & \cdots & \sigma_{2m}^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{m1}^2 & \sigma_{m2}^2 & \cdots & \sigma_{mm}^2 \end{pmatrix}$
Correlation matrix: $\rho = (r_{ij})$
Standardization in matrix form: $Z = \left(V^{1/2}\right)^{-1}(X - \mu)$, i.e. element-wise $Z_{ik} = \frac{X_k^i - \mu_i}{\sigma_{ii}}$
N.B. $E(Z)$ = vector of zeros; $\mathrm{Cov}(Z) = \rho$
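Continuing the running example above, the standardization in matrix form and the two checks in the N.B. could be written as:

```r
# Standardization in matrix form (continuing the running example).
V_half_inv <- diag(1 / sqrt(diag(Sigma)))   # (V^{1/2})^{-1}
Z <- t(V_half_inv %*% (t(X) - mu))          # Z = (V^{1/2})^{-1} (X - mu), applied row by row
# Equivalent, up to the 1/n vs 1/(n-1) convention: Z <- scale(X)

round(colMeans(Z), 10)                      # E(Z): numerically a vector of zeros
all.equal(cov(Z) * (n - 1) / n, R,          # Cov(Z) = rho
          check.attributes = FALSE)
```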

PCA – Step 2: compute eigenvalues and eigenvectors
Eigenvalues of $\rho$ are the scalars $\lambda_1, \dots, \lambda_m$ such that $\det(\rho - \lambda I) = 0$
Given the matrix $\rho$ and one of its eigenvalues $\lambda_i$, $e_i$ is a corresponding eigenvector if $\rho\, e_i = \lambda_i e_i$
We are interested in the eigenvalues / eigenvectors of the correlation matrix
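In R, the full eigen-decomposition of the correlation matrix is a single call (continuing the running example):

```r
# Eigen-decomposition of the correlation matrix (continuing the running example).
eig    <- eigen(R)        # eigen() returns eigenvalues sorted in decreasing order
lambda <- eig$values      # lambda_1 >= ... >= lambda_m
E      <- eig$vectors     # column i is the eigenvector e_i
```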

PCA – Step 3: compute principal components
Consider the vectors $Y_i = e_i^T Z$, e.g. $Y_1 = e_{11} Z_1 + e_{12} Z_2 + \dots + e_{1m} Z_m$
Sort the $Y_i$ by variance: $\mathrm{Var}(Y_i) = e_i^T \rho\, e_i$
Then:
1. Start with an empty sequence of principal components
2. Select the eigenvector $e_i$ that maximizes $\mathrm{Var}(Y_i)$ and whose component is uncorrelated with all components selected so far
3. Go to step 2
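Since eigen() already returns the eigenvectors sorted by eigenvalue, the component scores and their variances come out in the required order (continuing the example):

```r
# Principal component scores, one column per component (continuing the example).
Y <- Z %*% E                       # Y[, i] holds the scores of component Y_i
apply(Y, 2, var) * (n - 1) / n     # Var(Y_i) = e_i^T rho e_i = lambda_i, in decreasing order
```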

PCA – Properties
Property 1: the total variability in the standardized data set equals the sum of the variances of the $Z$-vectors, which equals the sum of the variances of the components, which equals the sum of the eigenvalues, which equals the number of variables:
$\sum_{i=1}^{m} \mathrm{Var}(Y_i) = \sum_{i=1}^{m} \mathrm{Var}(Z_i) = \sum_{i=1}^{m} \lambda_i = m$
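A quick numerical check of Property 1 on the running example:

```r
# Property 1 (continuing the example): the eigenvalues sum to m.
sum(lambda)                              # equals m = ncol(X)
sum(apply(Y, 2, var) * (n - 1) / n)      # total component variance: also m
```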

PCA – Properties
Property 2: the partial correlation between a given component and a given variable is a function of an eigenvector and an eigenvalue; in particular, $\mathrm{Corr}(Y_i, Z_j) = e_{ij} \sqrt{\lambda_i}$
Property 3: the proportion of the total variability in $Z$ that is explained by the $i$-th principal component is the ratio of the $i$-th eigenvalue to the number of variables, i.e. $\lambda_i / m$
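Both properties are easy to verify numerically on the running example; the name `loadings` for the component vs. attribute correlation matrix is just a naming choice here:

```r
# Properties 2 and 3 (continuing the example).
loadings <- E %*% diag(sqrt(lambda))     # entry (j, i) is Corr(Y_i, Z_j) = e_ij * sqrt(lambda_i)
all.equal(cor(Y, Z), t(loadings), check.attributes = FALSE)   # Property 2

lambda / length(lambda)                  # Property 3: proportion of variability per component
cumsum(lambda) / length(lambda)          # cumulative proportion explained
```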

PCA – Experiment on real data
- Open R and read “cadata.txt”
- Keep the first attribute (say, column 0) as the response, the remaining ones as predictors
- Know your data: barplot and scatterplot the attributes
- Normalize the data
- Scatterplot the normalized data
- Compute the correlation matrix
- Compute its eigenvalues and eigenvectors
- Compute the component (eigenvector) vs. attribute correlation matrix
- Compute the cumulative variance explained by the principal components (see the sketch below)
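An end-to-end sketch of these steps; the file name “cadata.txt” comes from the slide, but its exact layout (whitespace-separated, no header, response in the first column) is an assumption:

```r
# Sketch of the experiment. The layout of "cadata.txt" (no header, whitespace-
# separated, response in the first column) is an assumption, not confirmed by the slides.
data <- read.table("cadata.txt", header = FALSE)
y    <- data[, 1]                      # response
X    <- as.matrix(data[, -1])          # predictors

boxplot(X)                             # per-attribute distributions (stand-in for the slide's barplot)
pairs(X)                               # scatterplot matrix of the attributes

Z <- scale(X)                          # normalize (center and scale)
pairs(Z)                               # scatterplot of the normalized data

R      <- cor(Z)                       # correlation matrix
eig    <- eigen(R)
lambda <- eig$values
E      <- eig$vectors

Y        <- Z %*% E                    # principal component scores
loadings <- E %*% diag(sqrt(lambda))   # component vs. attribute correlation matrix

cumsum(lambda) / sum(lambda)           # cumulative proportion of variance explained
```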