Multivariate statistical methods


Multivariate methods
A multivariate dataset is a group of n objects described by m variables (as a rule n > m, if possible).
Confirmatory vs. exploratory analysis:
- confirmatory: focuses on parameter estimation and hypothesis testing
- exploratory: focuses on exploring the data and finding patterns and structure

Multivariate statistical methods
Unit classification:
- cluster analysis
- discriminant analysis
Analysis of relations among variables:
- canonical correlation analysis
- factor analysis
- principal component analysis

Methods for analysis of relations among variables

Principal component analysis
One of the oldest and most widely used multivariate statistical methods, introduced by Pearson in 1901 and, independently of Pearson, by Hotelling in 1933.
Principal aims:
- detection of relations among variables
- reduction of the number of variables and finding new, meaningful variables

Principal component analysis
The basis of the method is a linear transformation of the original variables into a smaller number of new, artificial variables called principal components.
Component characteristics:
- the components are mutually uncorrelated
- for m original variables, r ≤ m components are retained; ideally r is much smaller than m, and the r principal components still explain a sufficient share of the variability of the original variables

PCA
Component characteristics:
- the method is based on a full explanation of the total variability (all components together account for all of it)
- principal components are ordered by their share of explained variance
- the first component explains the most variance, the last component the least

PCA procedure
1. initial analysis: exploration of relations among variables (graphs, descriptive statistics)
2. exploration of the correlation matrix (if the original variables are correlated, the number of variables can be reduced)
3. principal component analysis and choice of a suitable number of components (usually 70–90 % of explained variance is enough)
4. interpretation of the principal components
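The rule of thumb from the procedure (keep enough components to explain roughly 70 to 90 % of the variance) can be sketched as follows. This is a minimal illustration in Python with NumPy and is not part of the original slides; the function name `n_components_for` and the example eigenvalues are hypothetical.

```python
import numpy as np

def n_components_for(eigenvalues, threshold=0.8):
    """Smallest number of components whose cumulative share of
    explained variance reaches `threshold` (hypothetical helper)."""
    # Sort component variances from largest to smallest.
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    # Cumulative share of total variance explained by the first k components.
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # A small tolerance guards against floating-point round-off at the threshold.
    return int(np.searchsorted(cumulative, threshold - 1e-12) + 1)

# Made-up component variances: cumulative shares are 0.625, 0.875, 0.95, 1.0.
example = [2.5, 1.0, 0.3, 0.2]
print(n_components_for(example, threshold=0.8))  # 2
print(n_components_for(example, threshold=0.9))  # 3
```

The threshold is a judgment call inside the 70–90 % band; a scree plot is usually consulted alongside it.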

PCA procedure
PCA is based on either:
1. the covariance matrix (variables in the same units and with similar variance), or
2. the correlation matrix (standardized data, or variables in different units)
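To see why the choice matters, the sketch below compares the two matrices on made-up data whose variables have very different scales. It assumes Python with NumPy; the data, the random seed, and all variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 50 objects, 3 independent variables on very different scales.
X = rng.normal(size=(50, 3)) * np.array([1.0, 10.0, 100.0])

S = np.cov(X, rowvar=False)        # covariance matrix of the raw variables
R = np.corrcoef(X, rowvar=False)   # correlation matrix

# Standardizing first and then taking the covariance reproduces R.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
same = np.allclose(np.cov(Z, rowvar=False), R)

# Share of total variance carried by each component, largest first.
share_cov = np.linalg.eigvalsh(S)[::-1] / np.trace(S)
share_corr = np.linalg.eigvalsh(R)[::-1] / np.trace(R)

print(same)
print(share_cov.round(3))   # dominated by the large-scale variable
print(share_corr.round(3))  # much more evenly spread
```

With the covariance matrix the first component mostly reflects the variable with the largest variance, which is why the correlation matrix is preferred when units differ.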

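Model of PCA
$$z_{ij} = \sum_{k=1}^{p} a_{jk} f_{ik}$$
where
- $z_{ij}$ is the standardized original variable
- $a_{jk}$ are the weights of the principal components
- $f_{ik}$ are the principal components in standardized form
- j, k = 1, 2, …, p
- i = 1, 2, …, n, where n is the number of units
- j = 1, 2, …, p, where p is the number of variables (denoted m on the other slides)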

PCA – mathematical model
- original matrix: dataset X (n × m), with n objects and m variables
- Z = [z_ij] is the standardized matrix X, i = 1, …, n, j = 1, …, m
- the aim is to find a transformation matrix Q that converts the m standardized variables (matrix Z) into m mutually uncorrelated components (matrix P):
  P = Z Q
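A minimal numerical sketch of this transformation, assuming Python with NumPy, is given below. The data are randomly generated for illustration. Using the sample standard deviation (`ddof=1`) and taking Q from the eigenvectors of the correlation matrix are assumptions made here; the slides do not specify the normalization.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100, 4
# Hypothetical correlated data: independent columns mixed by a random matrix.
X = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))

# Standardize each variable: zero mean, unit sample variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix of the original variables.
R = (Z.T @ Z) / (n - 1)

# Eigenvectors of R, reordered so the largest eigenvalue comes first.
eigvals, Q = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, Q = eigvals[order], Q[:, order]

# Principal components: P = Z Q.
P = Z @ Q

# Covariance matrix of the components: off-diagonal entries are ~0.
cov_P = (P.T @ P) / (n - 1)
print(np.round(cov_P, 6))
```

The printed matrix is diagonal up to round-off, which is the numerical counterpart of "mutually uncorrelated components".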

PCA – mathematical model
Rearranging P = Z Q gives the matrix
$$\Lambda = \frac{1}{n-1} P^{T} P = Q^{T} \left( \frac{1}{n-1} Z^{T} Z \right) Q = Q^{T} R Q$$

PCA – mathematical model
- matrix Λ is the variance-covariance matrix of the principal components. Because the principal components are uncorrelated, their covariances are 0, so Λ is diagonal with the variances of the principal components on the diagonal
- the sum of the variances of the standardized variables equals m
- the proportions of this total (each component's variance divided by m) indicate how large a share of the total variance of all variables is explained by the first, second, …, last component

PCA – mathematical model
Matrix R is the correlation matrix of the original variables, where
$$R = \frac{1}{n-1} Z^{T} Z$$
The diagonal values of matrix Λ are the eigenvalues of matrix R, and the columns of matrix Q are the eigenvectors belonging to each eigenvalue.
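The relations between R, Q and Λ can be checked on a small example. The sketch below assumes Python with NumPy and uses a made-up 3 × 3 correlation matrix; it illustrates the properties stated on the slides and is not the author's own computation.

```python
import numpy as np

# Hypothetical correlation matrix of m = 3 variables.
R = np.array([[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
m = R.shape[0]

# Eigenvalues and eigenvectors of R, largest eigenvalue first.
eigvals, Q = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, Q = eigvals[order], Q[:, order]

# Lambda = Q^T R Q is diagonal, with the eigenvalues on the diagonal.
Lambda = Q.T @ R @ Q

# The eigenvalues sum to m, so eigenvalue / m is each component's share.
shares = eigvals / m
cumulative = np.cumsum(shares)

print(np.round(Lambda, 6))
print(np.round(shares, 3), np.round(cumulative, 3))
```

Each eigenvalue is the variance of one component, and the cumulative shares are what is compared against the 70–90 % guideline.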

PCA – other notions
- the coordinates of the objects on the non-standardized principal components are called "scores"
- the matrix of all scores for all n objects is called the "score matrix"
- the scores of each object are in the rows; the columns of the matrix are the score vectors

PCA – other notions
The share of the total variability of each original variable X_i, i = 1, 2, …, m, that is explained by r principal components is called the communality of variable X_i. It is computed as the square of the multiple correlation coefficient (r²).
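One common way to compute communalities, sketched here in Python with NumPy, is to sum the squared correlations between a variable and the first r components. This formulation, the made-up correlation matrix, and the names `loadings` and `communalities` are assumptions added for illustration; for PCA on the correlation matrix the result equals the squared multiple correlation of the variable with those r components.

```python
import numpy as np

# Hypothetical correlation matrix of m = 3 variables.
R = np.array([[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.2],
              [0.3, 0.2, 1.0]])

# Eigen-decomposition, largest eigenvalue first.
eigvals, Q = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, Q = eigvals[order], Q[:, order]

# Entry (j, k): correlation between variable j and component k.
loadings = Q * np.sqrt(eigvals)

def communalities(loadings, r):
    """Share of each variable's variance explained by the first r components."""
    return (loadings[:, :r] ** 2).sum(axis=1)

print(np.round(communalities(loadings, 1), 3))
print(np.round(communalities(loadings, 2), 3))
print(np.round(communalities(loadings, 3), 3))  # all m components: every value is 1
```

A variable with a low communality is poorly represented by the retained components, which can be a reason to keep one more component.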

PCA – graphical visualization
Cattell's graph (scree plot): a tool for determining the number of principal components

PCA – graphical visualization
Graph of correlation coefficients (1st and 2nd principal components)

PCA – graphical visualization
Graph of component scores