Lecture 8: Factor analysis (FA)

Lecture 8: Factor analysis (FA)
Topics: rationale and use of FA; the underlying model (what is a factor analysis anyway?); communalities and specificities; factor scores and loadings; the use of and rationale for rotations; orthogonal and oblique rotations; component retention, significance, and reliability.
Bio 8100s Applied Multivariate Biostatistics 2001

What is factor analysis?
From a set of p variables X1, X2, …, Xp, we try to find (“extract”) a set of factors F1, F2, …, Fp that underlie the observed variability in X1, X2, …, Xp. The hope (sometimes faint) is that most of the variability in the original set of p variables will be accounted for by f < p factors.

The uses of factor analysis
Study correlations among a large number of variables by grouping variables into “factors” such that variables within each factor are more highly correlated with each other than with variables in other factors.
Summarize many variables by a few factors.
Interpret each factor according to the meaning of the measured variables.

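The factor model
For a set of p variables X1, X2, …, Xp, the model is
$$X_i = a_{i1}F_1 + a_{i2}F_2 + \dots + a_{if}F_f + e_i, \qquad i = 1, \dots, p,$$
where the Xi are standardized variables, the aij are the factor loadings, the Fj are the common factors, and ei is a factor specific to variable Xi, and the common factors are uncorrelated with unit variance and uncorrelated with the specific factors:
$$\mathrm{Var}(F_j) = 1, \qquad \mathrm{Cov}(F_j, F_k) = 0 \ (j \neq k), \qquad \mathrm{Cov}(F_j, e_i) = 0.$$
Note: in the factor model, observables (X) are linear functions of unobservables (F)!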

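The factor model (cont’d)
Since the Xi are standardized and the common factors are uncorrelated with unit variance, we have
$$\mathrm{Var}(X_i) = 1 = \sum_{j=1}^{f} a_{ij}^2 + \mathrm{Var}(e_i) = \mathrm{Com}(X_i) + \mathrm{Var}(e_i).$$
Com(Xi) is the communality of Xi (the portion of its variance that is “explained” by the common factors), and Var(ei) is the specificity of Xi (the portion unexplained by the common factors).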

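The factor model (cont’d)
Since the common factors are uncorrelated with each other and with the specific factors, and the specific factors are uncorrelated with one another, we have
$$r_{ij} = \mathrm{Corr}(X_i, X_j) = \sum_{k=1}^{f} a_{ik}\,a_{jk}, \qquad i \neq j.$$
So the correlation between Xi and Xj is the sum of the products of their factor loadings. Hence, variables Xi and Xj can be highly correlated only if they have high loadings on the same factors.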
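The relation between loadings and correlations can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up loading matrix `A` for p = 4 variables and f = 2 factors (the values are illustrative, not taken from the lecture). It builds the correlation matrix implied by the model and compares it with the correlations of data simulated from the same model.

```python
import numpy as np

# Hypothetical loadings a_ij: 4 standardized variables (rows), 2 factors (columns).
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Specificity Var(e_i) = 1 - communality, so that Var(X_i) = 1.
specificity = 1.0 - (A ** 2).sum(axis=1)

# Model-implied correlations: r_ij = sum_k a_ik * a_jk off the diagonal,
# and 1 on the diagonal once the specificities are added.
R_model = A @ A.T + np.diag(specificity)

# Simulate X = F A^T + e with uncorrelated, unit-variance common factors.
rng = np.random.default_rng(0)
n = 200_000
F = rng.standard_normal((n, A.shape[1]))
E = rng.standard_normal((n, A.shape[0])) * np.sqrt(specificity)
X = F @ A.T + E

R_sample = np.corrcoef(X, rowvar=False)
max_abs_diff = np.abs(R_sample - R_model).max()
print(np.round(R_model, 3))
print("largest |sample - model| correlation:", max_abs_diff)
```

In this sketch variables 1 and 2 load mainly on the first factor and are strongly correlated with each other, while variables 1 and 3 load on different factors and are only weakly correlated.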

The geometry of factors
Variables (Xi) are linear functions of factors (Fj). Hence, the factors are vectors in the Euclidean space defined by the set of variables {Xi}. Because the Fj are uncorrelated, these vectors meet at right angles.
[Figure: two factors (F1, F2) in the 3-D Euclidean space defined by {X1, X2, X3}]

A comparison of PCA and FA models
PCA model: unobservables (Zj) are a linear function of observables (Xi). Therefore PC scores are determinate (given X and a, each Z is unique), and we can compute PC scores for each observation.
FA model: observables (Xi) are a linear function of unobservables (Fj). Therefore factors are indeterminate (given X and a, no F is unique), and it makes no sense to “compute” factor scores for individual observations.
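The determinacy of PC scores can be illustrated with a short sketch, assuming NumPy and synthetic data (the data and variable names are invented for illustration). Once the coefficients are obtained from the eigenvectors of the correlation matrix, each observation's scores follow directly from its X values. No such direct computation exists for the factor model, where each observation has p observed values but p + f unknowns (the f common factors and p specific factors).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 100 observations on 3 correlated variables (illustrative only).
mixing = np.array([
    [1.0, 0.5, 0.2],
    [0.0, 1.0, 0.3],
    [0.0, 0.0, 1.0],
])
X_raw = rng.standard_normal((100, 3)) @ mixing

# Standardize each variable (mean 0, sample standard deviation 1).
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix gives the PCA coefficients.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# PC scores: each Z_j is a linear function of the observed X_i, so every
# observation gets a unique score on every component.
Z = X @ eigvecs
print(Z[:5])
```

The score variances equal the eigenvalues and the components are mutually uncorrelated, which is a quick sanity check on the computation.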

Estimating factor loadings
Estimation of the loadings for each factor can be accomplished through several different methods (e.g. least-squares estimation, maximum likelihood estimation, iterated principal axis, etc.). The extracted factors may differ depending on the method of estimation.
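As one illustration of these methods, here is a minimal sketch of iterated principal axis factoring, assuming NumPy. The function name `principal_axis_factoring`, the use of squared multiple correlations as starting communalities, and the stopping rule are choices made for this sketch, not details taken from the lecture. Statistical packages differ in such details, so their output may not match exactly.

```python
import numpy as np

def principal_axis_factoring(R, n_factors, max_iter=2000, tol=1e-10):
    """Iterated principal axis factoring of a correlation matrix R (a sketch)."""
    R = np.asarray(R, dtype=float)
    # Starting communalities: squared multiple correlations (one common choice).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        # Replace the unit diagonal with the current communality estimates.
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)
        # Loadings come from the leading eigenpairs of the reduced matrix.
        eigvals, eigvecs = np.linalg.eigh(R_reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[top], 0.0, None)
        A = eigvecs[:, top] * np.sqrt(lam)
        # Update communalities and stop once they no longer change.
        h2_new = (A ** 2).sum(axis=1)
        converged = np.max(np.abs(h2_new - h2)) < tol
        h2 = h2_new
        if converged:
            break
    return A, h2

# Hypothetical "true" loadings used only to build a test correlation matrix.
A_true = np.array([
    [0.8, 0.1], [0.7, 0.2], [0.6, 0.0],
    [0.1, 0.8], [0.2, 0.7], [0.0, 0.6],
])
R = A_true @ A_true.T
np.fill_diagonal(R, 1.0)

A_hat, h2_hat = principal_axis_factoring(R, n_factors=2)
print(np.round(A_hat, 3))
print(np.round(h2_hat, 3))
```

The estimated loadings need not equal `A_true` column for column, because loadings are determined only up to rotation. What should agree are the communalities and the reproduced correlations `A_hat @ A_hat.T` off the diagonal.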

Factor loadings
Factor loadings (aij) are the (theoretical) correlations between each variable Xi and each extracted factor Fj. For each variable, the squared loadings summed over all factors equal the proportion of the variable’s variance explained by all factors, i.e., the communality, and for each factor, the squared loadings summed over all variables equal the variance explained by that factor.
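Both sums can be read directly off a loading matrix. A small sketch, assuming NumPy and an invented loading matrix (rows are variables, columns are factors):

```python
import numpy as np

# Hypothetical loadings: 4 standardized variables (rows), 2 factors (columns).
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Row sums of squared loadings: communality of each variable.
communality = (A ** 2).sum(axis=1)

# Column sums of squared loadings: variance explained by each factor.
variance_explained = (A ** 2).sum(axis=0)

# With standardized variables the total variance is p, the number of variables.
proportion_explained = variance_explained / A.shape[0]

print("communalities:        ", communality)
print("variance per factor:  ", variance_explained)
print("proportion per factor:", proportion_explained)
```

The two sets of sums add up to the same total, since both are sums over all squared loadings.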

More on factor loadings in factor analysis
Sometimes several variables have similar loadings on the factors and so form a “natural” group. To assist in interpretation, we may want to choose another reference frame that emphasizes the differences among such groups. Rotation of factors follows the same procedure and rationale as for PCA.
[Figure: factor plot of loadings; axis label FACTOR(2)]
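One widely used orthogonal rotation is varimax, which seeks a frame in which each factor has a few large loadings and many near-zero ones. The lecture does not say which rotation it uses, so the following is only an illustrative sketch, assuming NumPy, of raw varimax (without Kaiser normalization) using the common SVD-based iteration. The function names and the example loadings are invented for this sketch.

```python
import numpy as np

def varimax(A, max_iter=500, tol=1e-10):
    """Raw varimax rotation of a loading matrix A; returns (rotated, T)."""
    p, k = A.shape
    T = np.eye(k)                      # orthogonal rotation matrix
    d_old = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient-like matrix of the varimax criterion at the current rotation.
        G = A.T @ (L ** 3 - L * ((L ** 2).sum(axis=0) / p))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                     # nearest orthogonal matrix to G
        d_new = s.sum()
        if d_old != 0.0 and d_new < d_old * (1.0 + tol):
            break
        d_old = d_new
    return A @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return (L ** 2).var(axis=0).sum()

# Hypothetical loadings with two clear groups of variables ...
simple = np.array([
    [0.8, 0.0], [0.7, 0.1], [0.6, 0.0],
    [0.0, 0.8], [0.1, 0.7], [0.0, 0.6],
])
# ... expressed in a frame rotated by 30 degrees, which blurs the groups.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A_unrotated = simple @ rot

A_rotated, T = varimax(A_unrotated)
print(np.round(A_unrotated, 2))
print(np.round(A_rotated, 2))
```

Because the rotation is orthogonal, the communalities are unchanged; only the way variance is distributed across factors changes. The rotated columns may come back in a different order or with flipped signs, which does not affect interpretation.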

A final word: PCA versus FA
In practice, PCA solutions to data reduction problems are often very similar to FA solutions to the same problems. “The choice of common factors or components methods often makes virtually no difference to the conclusions of a study.” N. Cliff, 1987. Analyzing multivariate data. New York: Harcourt, Brace Jovanovich.