Introduction to Multivariate Analysis of Variance, Factor Analysis, and Logistic Regression Rubab G. ARIM, MA University of British Columbia December 2006.


1 Introduction to Multivariate Analysis of Variance, Factor Analysis, and Logistic Regression Rubab G. ARIM, MA University of British Columbia December 2006 rubab@interchange.ubc.ca

2 Topics
Multivariate Analysis of Variance (MANOVA)
Factor Analysis
–Principal Component Analysis
Logistic Regression

3 MANOVA
Extension of ANOVA
More than one dependent variable (DV)
–Conceptual reason
–Statistically related
Compares the groups and tells whether there are group mean differences on the combination of the DVs

4 Why not just conduct a series of ANOVAs?
Risk of an inflated Type 1 error: the more analyses you run, the more likely you are to find a significant result, even if in reality there are no differences between groups.
If you choose to do so:
Bonferroni adjustment: divide your alpha value (.05) by the number of tests that you intend to perform
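The inflation of Type 1 error and the Bonferroni adjustment can be sketched numerically. This is a minimal illustration that assumes the tests are independent; the number of DVs (three) and the function names are hypothetical and not taken from the presentation.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type 1 error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha after a Bonferroni adjustment."""
    return alpha / n_tests


alpha = 0.05
n_tests = 3  # hypothetical: one ANOVA per DV, three DVs

unadjusted_rate = familywise_error_rate(alpha, n_tests)
adjusted_alpha = bonferroni_alpha(alpha, n_tests)
adjusted_rate = familywise_error_rate(adjusted_alpha, n_tests)

print(f"Familywise error rate without adjustment: {unadjusted_rate:.4f}")
print(f"Bonferroni-adjusted alpha per test:       {adjusted_alpha:.4f}")
print(f"Familywise error rate with adjustment:    {adjusted_rate:.4f}")
```

With the adjusted alpha, the familywise rate stays at or below the nominal .05, at the cost of a stricter threshold for each individual test.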

5 MANOVA: Pros and Cons
MANOVA prevents the inflation of Type 1 error
Controls for correlation among a set of DVs by combining them
However,
A complex set of procedures
Additional assumptions required

6 Example
Research Question:
Do adolescent boys and girls differ in their problem behaviors?
What you need
–One categorical IV (i.e., gender)
–Two or more continuous DVs (e.g., depression, aggression, etc.)

7 Example (cont’)
What MANOVA does
–Tests the null hypothesis that the population means on a set of DVs do not vary across different levels of a grouping variable
Assumptions
–Sample size, normality, outliers, linearity, multicollinearity, homogeneity of variance-covariance matrices

8 Interpretation of the output
Descriptive Statistics
–Check N values (more subjects in each cell than the number of DVs)
Box’s Test
–Checks the assumption of homogeneity of variance-covariance matrices
Levene’s Test
–Checks the assumption of equality of variances across groups for each DV

9 Interpretation (cont’)
Multivariate tests
–Wilks’ Lambda (most commonly used)
–Pillai’s Trace (most robust) (see Tabachnick & Fidell, 2007)
Tests of between-subjects effects
–Use a Bonferroni Adjustment
–Check Sig. column
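To make Wilks' Lambda less of a black box, the sketch below computes it by hand for one grouping variable and two DVs as the ratio of determinants |W| / |T|, where W is the within-groups and T the total sums-of-squares-and-cross-products matrix. Values closer to 0 indicate stronger group separation. The scores are invented for illustration, the helper names are hypothetical, and the F approximation that statistical packages use for the significance test is not included.

```python
def mean(values):
    return sum(values) / len(values)


def sscp(rows, centre):
    """2x2 sums-of-squares-and-cross-products matrix of rows about centre."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        dev = [row[0] - centre[0], row[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += dev[i] * dev[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """Wilks' Lambda = |W| / |T| for groups of (dv1, dv2) score pairs."""
    all_rows = [row for group in groups for row in group]
    grand_centre = [mean([row[k] for row in all_rows]) for k in range(2)]
    total = sscp(all_rows, grand_centre)

    within = [[0.0, 0.0], [0.0, 0.0]]
    for group in groups:
        group_centre = [mean([row[k] for row in group]) for k in range(2)]
        s = sscp(group, group_centre)
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    return det2(within) / det2(total)


# Hypothetical (depression, aggression) scores, four adolescents per group
boys = [(2, 3), (3, 5), (4, 4), (5, 6)]
girls = [(6, 5), (7, 8), (8, 6), (9, 9)]

lam = wilks_lambda([boys, girls])
print(f"Wilks' Lambda: {lam:.3f}")
```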

10 Interpretation (cont’)
Effect size
–Partial Eta Squared: the proportion of the variance in the DV that can be explained by the IV (see Cohen, 1988)
Comparing group means
–Estimated marginal means
Follow-up analyses (see Hair et al., 1998; Weinfurt, 1995)
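For a single DV and a single grouping variable, partial eta squared can be computed as SS_effect / (SS_effect + SS_error). The sketch below uses invented depression scores and a hypothetical function name; it is an illustration of the formula, not a reproduction of any package's output.

```python
def partial_eta_squared(groups):
    """SS_effect / (SS_effect + SS_error) for a one-way design."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_effect = 0.0  # between-groups sum of squares
    ss_error = 0.0   # within-groups sum of squares
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_effect += len(group) * (group_mean - grand_mean) ** 2
        ss_error += sum((x - group_mean) ** 2 for x in group)
    return ss_effect / (ss_effect + ss_error)


# Hypothetical depression scores for two groups
boys_depression = [2, 3, 4, 5]
girls_depression = [6, 7, 8, 9]

eta_sq = partial_eta_squared([boys_depression, girls_depression])
print(f"Partial eta squared: {eta_sq:.3f}")
```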

11 Factor Analysis (FA)
Not designed to test hypotheses
Data reduction technique
–Whether the data may be reduced to a smaller set of components or factors
Used in the development and evaluation of tests and scales

12 Two main approaches in FA
Exploratory factor analysis (EFA)
–Explore the interrelationships among a set of variables
Confirmatory factor analysis (CFA)
–Confirm specific hypotheses or theories concerning the structure underlying a set of variables

13 Principal Component Analysis (PCA)
A technique similar to Factor Analysis in the sense that PCA also produces a smaller number of variables that account for most of the variability in the pattern of correlations
However,
Factor Analysis
–Mathematical model: only the shared variance in the variables is analyzed
Principal Component Analysis
–All the variance in the variables is used

14 PCA or FA?
If you are interested in a theoretical solution, use FA
If you want an empirical summary of your data set, use PCA
(see Tabachnick & Fidell, 2001)

15 Steps involved in PCA
Assessment of the suitability of the data
–Sample size (see Stevens, 1996)
–Strength of the relationship among the items: inspect the correlation matrix for coefficients of r > .30
–Bartlett’s test of sphericity (p < .05)
–Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy: this index ranges from 0 to 1, with .6 suggested as the minimum value
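Both suitability checks can be computed directly from a correlation matrix. The sketch below is a minimal pure-Python illustration under stated assumptions: the 3 x 3 correlation matrix and the sample size of 200 are invented, and the helper names are hypothetical. Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations. The p-value lookup against the chi-square distribution is left out.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I]
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        # Partial pivoting: use the largest remaining entry as the pivot
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det
        pivot_value = aug[col][col]
        det *= pivot_value
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def bartlett_sphericity(corr, n_cases):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi_square, p * (p - 1) // 2


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inverse, _ = invert_with_det(corr)
    p = len(corr)
    sum_r2 = sum_partial2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inverse[i][j] / math.sqrt(inverse[i][i] * inverse[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_partial2)


# Hypothetical correlation matrix for three scale items, N = 200
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]

chi_square, df = bartlett_sphericity(corr, n_cases=200)
kmo_value = kmo(corr)
print(f"Bartlett chi-square = {chi_square:.2f} on {df} df")
print(f"KMO = {kmo_value:.3f}")
```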

16 Steps involved in PCA (cont’)
Factor Extraction
–Determine the smallest number of factors that best represent the interrelations among the set of items
–Various techniques (e.g., principal factor analysis, maximum likelihood factoring)
–Determine the number of factors
Kaiser’s criterion (eigenvalue > 1)
Scree test (plot each eigenvalue and find the point where the curve becomes horizontal)
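Kaiser's criterion can be applied to a list of eigenvalues such as the one reported in a "Total Variance Explained" table. The eigenvalues below are invented for a hypothetical six-item scale (in a PCA of a correlation matrix they sum to the number of items), and the function names are assumptions for illustration.

```python
def kaiser_retained(eigenvalues):
    """Eigenvalues that pass Kaiser's criterion (eigenvalue > 1)."""
    return [ev for ev in eigenvalues if ev > 1.0]


def percent_variance(eigenvalues):
    """Percentage of total variance accounted for by each component."""
    total = sum(eigenvalues)
    return [100 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for a six-item scale, largest first
eigenvalues = [2.8, 1.3, 0.7, 0.5, 0.4, 0.3]

retained = kaiser_retained(eigenvalues)
percentages = percent_variance(eigenvalues)
cumulative_retained = sum(percentages[:len(retained)])

print(f"Components retained: {len(retained)}")
print(f"Variance explained by retained components: {cumulative_retained:.1f}%")
```

Kaiser's criterion is only one guide; the scree plot of the same eigenvalues should be inspected as well, as the slide notes.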

17 Steps involved in PCA (cont’)
Factor rotation and interpretation
–Orthogonal (uncorrelated) factor solutions: Varimax is the most common technique
–Oblique (correlated) factor solutions: Direct Oblimin is the most common technique
–Simple structure (Thurstone, 1947): each factor is represented by a number of strongly loading items

18 Example
Research Question:
–What is the underlying factor structure of the Subjective Age Identity (SAI) scale?
What you need
–A set of correlated continuous variables (i.e., items of the SAI scale)
What PCA does
–Attempts to identify a small set of factors that represents the underlying relationships among a group of related variables (i.e., SAI items)

19 Example (cont’)
Assumptions
–Sample size: N of 150 or more and a ratio of at least five cases for each of the items
–Factorability of the correlation matrix: r = .3 or greater; KMO ≥ .6; Bartlett (p < .05)
–Linearity
–Outliers among cases

20 Interpretation of the output
Is PCA appropriate?
–Check Correlation Matrix
–Check KMO and Bartlett’s test
How many factors? Eigenvalue > 1
–Check the Total Variance Explained
–Look at the Scree Plot

21 Interpretation (cont’)
How many components are extracted?
–Component Matrix
–Rotated Component Matrix
Look for the highest loading items on each component; these can be used to identify the nature of the underlying latent variable represented by each component

22 Logistic Regression
Three types of regression
–Bivariate
–Multiple
–Logistic*
Relationships among variables (NOT mean differences)
One DV + 2 or more predictors or explanatory variables
*The DV is dichotomous
*Core concept: Odds Ratio (OR)

23 Logistic Regression
          Program A   Program B
Male         200         100
Female        50         150
For males, the odds of watching Program A are 200/100 (or 2 to 1).
For females, the odds of watching Program A are 50/150 (or 1 to 3).
The odds ratio for gender relative to Program A is: OR = (2/1) / (1/3) = 6
The odds of watching Program A are six times higher for males than for females.
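The odds ratio for the viewing table can be reproduced in a few lines. The sketch also computes the ratio of the two proportions, which is a different quantity from the odds ratio; the variable and function names are chosen for illustration only.

```python
def odds(events: int, non_events: int) -> float:
    """Odds of the event: events divided by non-events."""
    return events / non_events


# Counts from the slide: viewers of Program A and Program B by gender
male_a, male_b = 200, 100
female_a, female_b = 50, 150

male_odds = odds(male_a, male_b)        # 2 to 1
female_odds = odds(female_a, female_b)  # 1 to 3
odds_ratio = male_odds / female_odds

# Proportions watching Program A, for comparison with the odds
male_prop = male_a / (male_a + male_b)
female_prop = female_a / (female_a + female_b)
proportion_ratio = male_prop / female_prop

print(f"Odds ratio (male vs. female):  {odds_ratio:.2f}")
print(f"Ratio of proportions:          {proportion_ratio:.2f}")
```

The two ratios differ, which is why the odds ratio is best read as a statement about odds rather than about probabilities.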

24 Example
Research Question:
Are adolescent girls more likely to have anxiety/depression?
What you need
–One categorical IV (i.e., gender)
–One dichotomous DV (non-depressed = 0 and depressed = 1)

25 Interpretation of the output
Nagelkerke R²
–Is the model significant?
Wald’s Test
–At the parameter level of inference, is the gender variable significant?
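With a single dichotomous predictor, both quantities can be worked out by hand from a 2 x 2 table, because the fitted probabilities equal the observed group proportions. The sketch below assumes invented counts (they are not data from the presentation) and hypothetical variable names. It computes the gender coefficient as the log odds ratio, the Wald statistic as (B / SE)², and Nagelkerke R² as the Cox and Snell R² rescaled by its maximum possible value.

```python
import math


def log_likelihood(events: int, non_events: int) -> float:
    """Binomial log-likelihood evaluated at the observed proportion."""
    n = events + non_events
    p = events / n
    return events * math.log(p) + non_events * math.log(1 - p)


# Hypothetical counts: depressed (1) vs. non-depressed (0) by gender
girls_dep, girls_non = 40, 60
boys_dep, boys_non = 20, 80
n_total = girls_dep + girls_non + boys_dep + boys_non

# Coefficient for gender (girls vs. boys) is the log of the odds ratio
odds_ratio = (girls_dep / girls_non) / (boys_dep / boys_non)
b_gender = math.log(odds_ratio)
se_gender = math.sqrt(1 / girls_dep + 1 / girls_non + 1 / boys_dep + 1 / boys_non)

# Wald statistic, compared with a chi-square distribution on 1 df
wald = (b_gender / se_gender) ** 2

# Log-likelihoods of the model with gender and of the intercept-only model
ll_model = log_likelihood(girls_dep, girls_non) + log_likelihood(boys_dep, boys_non)
ll_null = log_likelihood(girls_dep + boys_dep, girls_non + boys_non)

cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n_total)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n_total))

print(f"B = {b_gender:.3f}, SE = {se_gender:.3f}, Wald = {wald:.2f}")
print(f"Nagelkerke R squared = {nagelkerke:.3f}")
```

A Wald value above 3.84 (the .05 critical value of chi-square with 1 df) would indicate a significant gender effect in this invented table.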

26 Selected References
Pallant, J. (2004). SPSS survival manual: A step by step guide to data analysis using SPSS (2nd ed.). Maidenhead: Open University Press.
Pett, M. A., Lackey, N. R., & Sullivan, J. J. (2003). Making sense of factor analysis: The use of factor analysis for instrument development in health care research. Thousand Oaks, CA: Sage.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn & Bacon.

