Exploratory Factor Analysis and Principal Component Analysis: Chapter 17



Terminology
– Measured variables: the real scores from the experiment; squares on a diagram.
– Latent variables: the constructs the measured variables are supposed to represent; not measured directly; circles on a diagram.

Example SEM Diagram

Factors and components
– Factor analysis attempts to achieve parsimony by explaining the maximum amount of common variance in a correlation matrix using the smallest number of explanatory constructs. These ‘explanatory constructs’ are called factors.
– PCA tries to explain the maximum amount of total variance in a correlation matrix. It does this by transforming the original variables into a set of linear components.

EFA vs PCA
– Common variance = overlapping variance between items (systematic variance)
– Unique variance = variance related only to that item (error variance)
– EFA describes the common variance
– PCA describes common variance + unique variance

EFA vs PCA
– Communality: the common variance for the item
– You can think of it as the SMC (squared multiple correlation), created by using all other items to predict that item

(Figure: the variance of variables 1 to 4, illustrating communalities ranging from 1 to 0.)

R-Matrix
In factor analysis and PCA we look to reduce the R-matrix (the correlation matrix of all the items) into a smaller set of dimensions.

EFA vs PCA
– In EFA, the factors are assumed to cause the answers on the questions; you want to generalize to another sample.
– In PCA, the questions cause the components; you just want to describe this sample.
– Therefore, EFA is more common in psychology.

Uses of EFA/PCA
– Understand the structure of a set of variables
– Construct a scale to measure the latent variable
– Reduce the data set to a smaller size that still measures the original information

(Figure: graphical representation.)

Example Data
– RAQ: the R/Statistics Anxiety Questionnaire
– 23 questions covering R and statistics anxiety
– Look at the questions! Be sure to reverse code any items that need it.
– Libraries: car, psych, GPArotation (a setup sketch follows below)
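A minimal setup sketch; the file name raq_data.csv, the dataset name raqdata, and the reverse-coded item shown are hypothetical, so check the real RAQ items before reverse coding:

library(car)
library(psych)
library(GPArotation)

raqdata = read.csv("raq_data.csv") ##hypothetical file holding the 23 RAQ items
##hypothetical reverse coding of a 5-point item:
##raqdata$q3 = 6 - raqdata$q3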

Before you start
Assumptions:
– Accuracy, missing data
– Outliers (use Mahalanobis distance to find outliers across all items; see the sketch below)
– Linearity (!!)
– Normality
– Homogeneity/homoscedasticity
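A sketch of the Mahalanobis screening step, assuming the items are all numeric columns of raqdata from above (the .001 cutoff is one common convention, not the only one):

mahal = mahalanobis(raqdata, colMeans(raqdata), cov(raqdata)) ##distance for each participant
cutoff = qchisq(1 - .001, ncol(raqdata)) ##chi-square cutoff, df = number of items
summary(mahal > cutoff) ##how many potential outliers?
noout = raqdata[mahal < cutoff, ] ##screened data, if you choose to remove the outliers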

Before you start
What about additivity?
– You want the items to be correlated; that’s the point. But not too highly correlated, or the correlation matrix becomes singular and the math breaks down.
– How to check whether the correlations are too small: Bartlett’s test. A non-significant result implies that your items are not correlated enough (bad!).
– Not used very much, because it requires large samples, and large samples usually make even small correlations significant.

Bartlett’s Test
Load the psych library, then run and save a correlation matrix (just as you would for data screening):

correlations = cor(raqdata) ##correlation matrix of the items
cortest.bartlett(correlations, n = nrow(raqdata))

Before you start
Sample size suggestions:
– Rules of thumb are usually stated as participants per item
– N < 100 is not acceptable
– These rules are argued over; there are lots of Monte Carlo studies on the question
– N = 300 is the most agreed-upon best bet

Before you start
Sampling adequacy – do you have a large enough sample?
– Kaiser-Meyer-Olkin (KMO) test
– Compares the ratio of the squared correlations (r²) to the squared partial correlations (pr²)
– Scores closer to 1 are better, closer to 0 are bad: .90+ = yay, .80 = yayish, .70 = ok, .60 = meh, < .50 = eek!

KMO Test
Load the psych library and use the saved correlation matrix from the previous step:

KMO(correlations)

Before you start
Number of items:
– You need at least 3-4 items per F/C (factor/component)
– So with a 10-question scale, you can only have 2 or 3 F/C
– The more items the better!
– Items need to be at least interval-level measurement

Questions to Answer
1. How many factors/components do I have?
2. Can I achieve simple structure?
3. Do I have an adequate solution?

1. # of Factors/Components
Ways to determine the number to extract:
– Theory
– Kaiser criterion
– Scree plots
– Parallel analysis

1. # of Factors/Components
Theory
– Usually you have an idea of the number of latent constructs you expect
– You made the scale that way
– Previous research

1. # of Factors/Components
Kaiser criterion
– Note: this is sometimes still used, but usually not recommended
– Old rule: extract as many factors as there are eigenvalues over 1
– New rule: extract as many factors as there are eigenvalues over .7

1. # of Factors/Components
Eigenvalues:
– A mathematical representation of the variance accounted for by that grouping of items
Confusing part:
– You will see as many eigenvalues as you have items, because they are calculated before extraction
– Only a few should be large

1. # of Factors/Components
Scree plot – a graphical representation of the eigenvalues
– Look for a large drop (the point of inflection)

1. # of Factors/Components
Parallel analysis – a statistical test that tells you how many eigenvalues are greater than chance
– Calculates the eigenvalues for your data
– Randomizes your data and recalculates the eigenvalues
– Then compares the two sets: eigenvalues larger than those from the random data suggest real factors

1. # of Factors/Components
What to do if they disagree?
– Test the competing models to determine which works better (steps 2 and 3)
– Simpler solutions are better (i.e., fewer factors)

1. # of Factors/Components
How to run the analysis to get this information:

nofactors = fa.parallel(raqdata, fm = "ml", fa = "fa") ##save the output

What are fm and fa? (see step 2)
– fm = "ml" requests maximum likelihood estimation
– fa = "fa" requests factor analysis (rather than principal components)

1. # of Factors/Components
Get the eigenvalues:
– nofactors$fa.values (or count how many pass each criterion; see the sketch below)
In this example:
– the old criterion says 1 factor
– the new criterion says 2 factors
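A short sketch of counting eigenvalues against the two criteria, using the fa.parallel output saved above:

nofactors$fa.values ##eigenvalues for the factor solution
sum(nofactors$fa.values > 1) ##old Kaiser criterion
sum(nofactors$fa.values > .7) ##new criterion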

1. # of Factors/Components

The scree plot suggests 1 large factor, or 5 factors using the point of inflection. The parallel analysis suggests 7. So the candidate solutions are 1, 2, 5, or 7 factors.

2. Simple Structure
Fitting estimation – the mathematics used to determine the factor loadings.
– How to pick?
– It depends on your goal for the analysis.

2. Simple Structure

Extraction methods:
EFA: Maximum likelihood, Principal axis factoring, Alpha factoring, Image factoring
PCA: Principal components

2. Simple Structure
Rotation – rotation helps you achieve simple structure. To aid interpretation, it maximizes the loading of an item on one F/C while minimizing its loadings on all other F/C. Two families:
– Orthogonal
– Oblique

(Figure: orthogonal rotation vs. oblique rotation.)

2. Simple Structure
Orthogonal – assumes the F/C are uncorrelated
– Rotates the axes at 90°
– Means no overlap in variance between F/C
– Not suggested for psychology
Types: varimax, quartimax, equamax

2. Simple Structure
Oblique – assumes some correlation between the F/C
– Rotates at any angle
– Allows the F/C to overlap
– If the F/C are truly uncorrelated, you get the same results as orthogonal rotation
Types: direct oblimin, promax
So why ever use orthogonal?

2. Simple Structure
Loadings – the correlation between an item and the F/C. What to look for:
– Items should load over .300
– Remember that r = .3 is a medium effect size, about 10% of variance
– You can use a higher cutoff to cut out low-loading questions, but you really can’t go lower than .300

2. Simple Structure
Loadings
– You want each item to load on one and only one F/C
– A double loading indicates a bad item
– No loading at all indicates a bad item

2. Simple Structure
Loadings
– An F/C with only one or two items loading onto it is considered unique
– You should consider eliminating that F/C
– Remember, three or four items per F/C are suggested

2. Simple Structure
What to do about bad items?
– In this step you might run several rounds: find the bad items, then run the EFA/PCA again without them.
Cross loadings?
– Can be kept if there is a good theoretical reason, but generally not accepted.

2. Simple Structure Finished? – When all items have loaded adequately

2. Simple Structure

finalmodel = fa(raqdata,            ##dataset
                nfactors = 2,       ##number of factors
                rotate = "oblimin", ##rotation type
                fm = "ml")          ##estimation: maximum likelihood

In reality you would check several of the candidate models, but in the interest of time we are fitting two factors.

2. Simple Structure
Start by looking at the loadings.
– ML1 = factor 1 (factors are ordered by variance accounted for; they may switch order in the second round)
– ML2 = factor 2
– You want each item to load > .300 on ONE factor only
– h2 = communality (want high)
– u2 = uniqueness (want low)
– com = complexity (want low)
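One way to see this pattern is to print the loadings with the cutoff applied; this sketch uses the model saved above and the standard loadings print options:

print(finalmodel$loadings, cutoff = .300, sort = TRUE) ##hide loadings under .300, sort items by factor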

2. Simple Structure

Figure out which items to remove:
– Remove item 23 because it doesn’t load on either factor.
Run the factor analysis again without that item (see the sketch below); this time all the items load cleanly.
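A sketch of the second round, assuming the 23 RAQ items are columns 1 to 23 of raqdata:

finalmodel2 = fa(raqdata[ , -23], ##drop item 23
                 nfactors = 2, rotate = "oblimin", fm = "ml")
print(finalmodel2$loadings, cutoff = .300, sort = TRUE)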

3. Adequate solution
So how can I tell if the simple structure solution is any good?
– Fit indices
– Reliability
– Theory

3. Adequate solution
Fit indices – a measure of how well the rotated matrix matches the original matrix. Two types:
– Goodness of fit
– Residual statistics

3. Adequate solution
Goodness-of-fit statistics – want large values; these compare the reproduced correlation matrix to the real correlation matrix.

Fit        Name                                        Good   Acceptable   Poor
NNFI/TLI   Non-normed fit index / Tucker-Lewis index   >.95   >.90         <.90
CFI        Comparative fit index                       >.95   >.90         <.90
NFI        Normed fit index                            >.95   >.90         <.90

GFI, AGFI = don’t use

3. Adequate solution
Residual statistics – want small values; these look at the residual matrix (i.e., reproduced minus real correlation table).

Fit     Name                                       Good   Acceptable   Poor
RMSEA   Root mean square error of approximation    <.06   .06-.10      >.10
RMSR    Root mean square of the residual           <.06   .06-.10      >.10

(The good and acceptable cutoffs shown are common conventions; exact values vary by source.)

3. Adequate solution
RMSR and RMSEA check out OK! The TLI is bad. What about CFI?
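These values can be pulled straight from the saved model; the field names below are from the psych fa output (worth verifying on your version of psych):

finalmodel2$rms   ##root mean square of the residuals
finalmodel2$RMSEA ##RMSEA with its confidence interval
finalmodel2$TLI   ##Tucker-Lewis index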

3. Adequate solution
CFI formula: CFI = 1 - (chi-square model - df model) / (chi-square null - df null)
Use the code! You will have to save the model output first.
– Note: all this information is in the basic output, but it’s not the easiest to read.
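A sketch of that calculation, pulling the model and null-model chi-square values out of the saved psych output (field names assumed from current psych versions):

CFI = 1 - ((finalmodel2$STATISTIC - finalmodel2$dof) /
           (finalmodel2$null.chisq - finalmodel2$null.dof))
CFI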

3. Adequate solution
Reliability – an estimate of how much your items “hang together” and how well the scale might replicate.
– Cronbach’s alpha is the most common; .70 or .80 is acceptable.
– Split-half reliability for big datasets: splits the data in half, runs the reliabilities, and checks how similar they are.

Interpreting Cronbach’s Alpha
Kline (1999) – reliable if α > .7
– Depends on the number of items: more questions = bigger α
– Treat subscales separately
– Remember to reverse score reverse-phrased items! If not, α is reduced and can even be negative.

3. Adequate solution
Cronbach’s alpha:
– alpha(data containing only the columns for that subscale); see the sketch below.
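A sketch with hypothetical item groupings; the real item-to-subscale assignment comes from the final loadings:

alpha(raqdata[ , 1:10])  ##hypothetical: items loading on factor 1
alpha(raqdata[ , 11:22]) ##hypothetical: items loading on factor 2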

3. Adequate solution
Theory
– Do the item loadings make sense?
– Look at which items load together and see if you can come up with a label for each F/C.