Two-Group Discriminant Function Analysis

Overview You wish to predict group membership. There are only two groups. Your predictor variables are continuous. You will create a discriminant function, D, such that the two groups differ as much as possible on D.

The Discriminant Function Weights are chosen such that, were you to compute D for each case and then use ANOVA to compare the groups on D, the ratio of SS between to SS within would be as large as possible. The value of this ratio is the eigenvalue.
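
As a rough check outside SPSS, here is a minimal sketch (assuming the data have been exported to a CSV named Harass90.csv containing only verdict and the 22 rating columns; the file name and column layout are assumptions, not part of these slides). It computes D with scikit-learn's LDA and verifies that SS between divided by SS within on D reproduces the eigenvalue.

import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

df = pd.read_csv("Harass90.csv")            # hypothetical CSV export of Harass90.sav
X = df.drop(columns=["verdict"])            # the 22 rating scales
y = df["verdict"]                           # 1 = Guilty, 2 = Not Guilty

lda = LinearDiscriminantAnalysis()
D = lda.fit(X, y).transform(X).ravel()      # one discriminant score per case

grand_mean = D.mean()
ss_between = sum(len(D[y == g]) * (D[y == g].mean() - grand_mean) ** 2 for g in y.unique())
ss_within = sum(((D[y == g] - D[y == g].mean()) ** 2).sum() for g in y.unique())
print("eigenvalue =", ss_between / ss_within)   # should be close to the 1.187 reported by SPSS

Note that scikit-learn scales D differently than SPSS does, but the SS-between to SS-within ratio is unaffected by that scaling.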

Harass90.sav Data are from Castellow, W. A., Wuensch, K. L., & Moore, C. H. (1990). Effects of physical attractiveness of the plaintiff and defendant in sexual harassment judgments. Journal of Social Behavior and Personality, 5, 547 ‑ 562. Download from my SPSS Data page. Bring into SPSS. Look at the data sheet.

The Variables Verdict is the mock juror’s recommendation: guilty (finding in favor of the plaintiff) or not guilty (finding in favor of the defendant). d_excit through d_happy are ratings of the defendant on various characteristics. p_excit through p_happy are ratings of the plaintiff on the same characteristics.

Do the Analysis Click Analyze, Classify, Discriminant. Put all 22 rating scales (d_excit through p_happy) in the “Independents” box. Put verdict into the “Grouping variable” box.

Now click on “verdict(? ?)” and then on “Define Range.”

Enter “1” (Guilty) and “2” (Not). Click Continue. Back on the initial page, click “Statistics.”

Check all of the Descriptives and Function Coefficients. Click Continue. Back on the initial page, click “Classify.”

Select “Compute from group sizes” for the Priors and “Summary Table” for Display. Click Continue. Back on the initial page, click “Save.”

Check “Discriminant scores” then click “Continue.” Back on the initial page, click “OK.”

Group Means Look at the Group Statistics. –Mean ratings of the defendant are more favorable in the Not Guilty group than in the Guilty group. –Mean ratings of the plaintiff are more favorable in the Guilty group than in the Not Guilty group.

Univariate ANOVAs Look at the “Tests of Equality of Group Means.” –The groups differ significantly on all but four of the variables.

Box’s M Look at “Box’s Test of Equality of Covariance Matrices.” The null is that the variance-covariance matrix for the predictors is, in the populations, the same in both groups. This is assumed when pooling these matrices during the DFA. Box’s M is very powerful.

We generally do not worry unless M is significant at .001. We may elect to use Pillai’s Trace as the test statistic, as it is more robust to violation of this assumption. Classification may be more affected by violation of this assumption than are the tests of significance.

Eigenvalue and Canonical Correlation Look at the “Eigenvalues.” –The eigenvalue, 1.187, is for comparing the groups on the discriminant function. –The canonical correlation, .737, is equivalent to eta from an ANOVA on D, or to the point-biserial r between Groups and D.

Wilks’ Lambda Λ = .457. 1 − Λ = eta-squared = 1 − .457 = .543. Eta-squared = canonical correlation squared = (.737)² = .543. The grouping variable accounts for 54% of the variance in D.
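
For a single discriminant function these quantities are tied together by the standard relations, where λ is the eigenvalue, Λ is Wilks’ lambda, and r_c is the canonical correlation:

\Lambda = \frac{1}{1+\lambda}, \qquad \eta^2 = 1 - \Lambda = \frac{\lambda}{1+\lambda}, \qquad r_c = \sqrt{\eta^2}

With λ = 1.187: Λ = 1/2.187 = .457, eta-squared = .543, and r_c = .737, matching the SPSS output described above.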

Test of Significance The smaller Λ, the more doubt cast on the null that the groups do not differ on D. Chi-square is used to test that null. It is significant. F can also be used. There are other multivariate tests (such as Pillai’s trace) available elsewhere.
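
The chi-square that SPSS reports is typically Bartlett’s approximation; with N cases, p predictors, and k groups (here p = 22 and k = 2), a commonly used form is

\chi^2 = -\left[ N - 1 - \frac{p + k}{2} \right] \ln \Lambda, \qquad df = p(k - 1)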

Canonical Discriminant Function Coefficients SPSS gives us both the standardized weights and the unstandardized weights.

DFC Weights vs Loadings The usual cautions apply to the weights – for example, redundancy among predictors will reduce the values of weights. The loadings in the structure matrix provide a better picture of how the discriminant function relates to the original variables.

The Loadings These are correlations between D and the predictor variables. For our data, high D is associated with the defendant being rated as socially desirable and the plaintiff being rated as socially undesirable.

Standardized Weights vs Loadings The predictors with the largest standardized coefficients were D_sincerity, D_warmth, D_kindness, P_sincerity, P_strength, and P_warmth. The predictors with the highest loadings are D_sincerity and P_warmth. The standardized weight for D_warmth is negative but its loading is positive, indicating suppression.

Group Centroids These are simply the group means on D. For those who found in favor of the defendant, D was high; for those who found in favor of the plaintiff, D was low: −.785.

Bayes Rule For each case, the probability of membership in each group is computed from the priors – p(G_i) – and the likelihoods – p(D|G_i).
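
In symbols, the posterior probability of membership in group i is

p(G_i \mid D) = \frac{p(D \mid G_i)\, p(G_i)}{\sum_j p(D \mid G_j)\, p(G_j)}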

Classification Each case is classified into the group for which the posterior probability – p(G_i|D) – is greater. By default, SPSS uses equal priors – for two groups, that would be .5. We asked that priors be computed from group frequencies, which should result in better classification.

Prior Probabilities Look at “Prior Probabilities for Groups.” As you can see, 65.5% of the subjects voted Guilty, 34.5% Not Guilty. Using this information should produce better classification.

Fisher’s Classification Function Coefficients For each case, two scores are computed from the weights and the subject’s scores on the variables. D_1 is the score for Group 1, D_2 is the score for Group 2. If D_1 > D_2, the case is classified into Group 1; if D_2 > D_1, into Group 2.
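
A bare-bones sketch of that decision rule (the coefficients, constants, and two-predictor example below are placeholders for illustration, not the values SPSS reports for these data):

import numpy as np

def fisher_classify(x, coef1, const1, coef2, const2):
    # Compute both classification function scores and return the group with the larger one.
    d1 = const1 + np.dot(coef1, x)    # score on Group 1's classification function
    d2 = const2 + np.dot(coef2, x)    # score on Group 2's classification function
    return 1 if d1 > d2 else 2

# Hypothetical case with two predictors:
print(fisher_classify(np.array([5.0, 3.0]),
                      coef1=np.array([1.2, 0.4]), const1=-4.0,
                      coef2=np.array([0.9, 0.8]), const2=-3.5))   # prints 2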

Classification Results This table shows how well the classification went. Of the 95 subjects who voted Guilty, 85 were correctly classified – 89.5% success. Of the 50 subjects who voted Not Guilty, 43 were correctly classified – 86%. Overall, 88.3% were correctly classified.

How Well Would We Have Done Just By Guessing? Random Guessing – for each case, flip a fair coin. Heads you predict Group 1, tails Group 2. P(Correct) = .5(.655) + .5(.345) = 50%

Probability Matching In a situation like this, human guessers tend to take base rates into account, guessing one or the other option in proportion to how frequently those guesses are correct. P(Correct) = .655(.655) + .345(.345) ≈ 55% correct.

Probability Maximizing In this situation, goldfish guess the more frequent option every time, and outperform humans. P(Correct) = 65.5%
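
A quick check of all three strategies, using the base rates from the Prior Probabilities table:

p_guilty, p_not = 0.655, 0.345                                     # base rates reported above
print("random guessing       :", 0.5 * p_guilty + 0.5 * p_not)    # 0.50
print("probability matching  :", p_guilty**2 + p_not**2)          # about 0.55
print("probability maximizing:", max(p_guilty, p_not))            # 0.655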

Assumptions Multivariate normality Equality of variance-covariance matrices across groups

ANOVA on D Click Analyze, Compare Means, One-Way ANOVA. Put Verdict in the “Factor” box and the discriminant scores in the “Dependent List.” Click OK.

SS Between ÷ SS Within = 1.187 = the eigenvalue. SS Between ÷ SS Total = .543 = eta-squared = (Can. Corr)² = (.737)² = .543 = 1 − Λ = 1 − .457 = .543.

Correlate D With Groups Analyze, Correlate, Bivariate. Put Verdict and the discriminant scores in the Variables box, then click OK. r = .737 = the canonical correlation.

SAS Download Harass90.dat and DFA2.sas. Edit the program to point to the data file and run the program. The output is essentially the same as that from SPSS, but arranged differently. “Likelihood Ratio” is Wilks’ Lambda.

For loadings and coefficients, use the “Pooled Within” tables. Under “Classification Results for Calibration Data” you will find the posterior probabilities for the misclassified cases.

Multiple Regression to Predict Verdict Look at the Proc Reg statement. This is a simple multiple regression to predict verdict from the ratings variables. The R-square = eta-squared from the DFA. SS Model ÷ SS Error = the eigenvalue from the DFA. SS Error ÷ SS Total = Λ.
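
These identities are easy to check with any regression routine; a minimal sketch, again assuming the hypothetical Harass90.csv export used earlier:

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("Harass90.csv")
X = df.drop(columns=["verdict"])
y = df["verdict"]

fit = LinearRegression().fit(X, y)
resid = y - fit.predict(X)
ss_total = ((y - y.mean()) ** 2).sum()
ss_error = (resid ** 2).sum()
ss_model = ss_total - ss_error

print("R-square            :", ss_model / ss_total)   # = eta-squared, about .543
print("SS Model / SS Error :", ss_model / ss_error)    # = the eigenvalue, about 1.187
print("SS Error / SS Total :", ss_error / ss_total)    # = Wilks' Lambda, about .457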

The parameter estimates (unstandardized slopes) are perfectly correlated with the DFA coefficients. Multiply any of the multiple regression parameter estimates by the same constant and you get the corresponding DFA unstandardized coefficient.