Discriminant analysis


Discriminant analysis (Chapter 11)

2-groups discriminant analysis

Discriminant analysis (DA) is a statistical procedure that classifies cases into the separate categories to which they belong, on the basis of a set of characteristic independent variables called predictors or discriminant variables. The target variable (the one determining allocation into groups) is qualitative (nominal or ordinal), while the characteristics are measured by quantitative variables. Two-group DA looks at the discrimination between two groups; multiple discriminant analysis (MDA) allows for classification into three or more groups.
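The classification idea above can be sketched with scikit-learn. This is a minimal illustration on synthetic data: the group sizes, predictors and class labels below are invented for the sketch, not taken from the Trust data-set.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two hypothetical groups with shifted means on four quantitative predictors
X = np.vstack([rng.normal(0, 1, (60, 4)), rng.normal(1, 1, (40, 4))])
y = np.array([0] * 60 + [1] * 40)   # 0/1 = group membership (categorical target)

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
pred = lda.predict(X)        # predicted group for each case
scores = lda.transform(X)    # discriminant scores: one function for two groups
print(scores.shape)          # (100, 1)
```

With two groups there is a single discriminant function, so `transform` returns one score per case.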

Applications of DA

DA is especially useful for understanding the differences between groups and the factors that lead consumers to make different choices, allowing marketers to develop strategies that take proper account of the role of the predictors. Examples:
- determinants of customer loyalty
- shopper profiling and segmentation
- determinants of purchase and non-purchase

Example on the Trust data-set

Purchasers of chicken at the butcher's shop (recorded in question q8d). Respondents may belong to one of two groups:
- those who purchase chicken at the butcher's shop
- those who do not
Discrimination between these groups is based on a set of consumer characteristics:
- expenditure on chicken in a standard week (q5)
- age of the respondent (q51)
- whether respondents agree (on a seven-point ranking scale) that butchers sell safe chicken (q21d)
- trust (on a seven-point ranking scale) towards supermarkets (q43b)
Does a linear combination of these four characteristics allow one to discriminate between those who buy chicken at the butcher's and those who do not?

Discriminant analysis (DA)

With two groups only, there is a single discriminating value (discriminating score). For each respondent a score is computed using the estimated linear combination of the predictors (the discriminant function). Respondents with a score above the discriminating value are expected to belong to one group, those below to the other. When the discriminant score is standardized to have zero mean and unit variance it is called a Z score. DA also provides information about the discriminating power of each of the original predictors.
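The Z-score standardization and the cut-off rule can be illustrated in a few lines; the raw scores below are hypothetical, and the group labels are stand-ins for butcher's-shop purchasers and non-purchasers.

```python
import numpy as np

scores = np.array([-1.2, -0.8, -0.3, 0.1, 0.6, 1.1])   # hypothetical raw discriminant scores
z = (scores - scores.mean()) / scores.std()            # Z scores: zero mean, unit variance
groups = np.where(z > 0, "butcher", "non-butcher")     # classify by the zero cut-off
print(groups)
```

Cases with a positive Z score fall on one side of the discriminating value, cases with a negative score on the other.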

Multiple discriminant analysis (MDA) (1)

Discriminant analysis may involve more than two groups, in which case it is termed multiple discriminant analysis (MDA). Example from the Trust data-set:
- Dependent variable: type of chicken purchased 'in a typical week', choosing among four categories: value (good value for money), standard, organic and luxury
- Predictors: age (q50), stated relevance of taste (q24a), value for money (q24b) and animal welfare (q24k), plus an indicator of income (q60)

Multiple discriminant analysis (2)

In this case there will be more than one discriminant function. The exact number of discriminant functions is equal either to (g - 1), where g is the number of categories in the classification, or to k, the number of independent variables, whichever is smaller. In the Trust example there are four groups and five explanatory variables, so the number of discriminant functions is three (g - 1 = 3, which is smaller than k = 5).
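The counting rule for the Trust example, written out explicitly:

```python
# Number of discriminant functions: the smaller of (g - 1) and k
g = 4   # groups: value, standard, organic, luxury
k = 5   # predictors: age, taste, value for money, animal welfare, income
n_functions = min(g - 1, k)
print(n_functions)  # 3
```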

The output of MDA

There are similarities with factor (principal component) analysis:
- the first discriminant function is the most relevant for discriminating across groups, the second is the second most relevant, and so on
- the discriminant functions are also independent, which means that the resulting scores are uncorrelated
Once the coefficients of the discriminant functions are estimated and standardized, they are interpreted in a similar fashion to factor loadings: the larger the standardized coefficients (in absolute terms), the more relevant the respective variables to discriminating between groups. There is no single discriminant score in MDA; instead, group means (centroids) are computed for each of the discriminant functions to give a clearer view of the classification rule.

Running discriminant analysis (two groups)

The discriminant function (target variable: purchasers of chicken at the butcher's shop) produces a discriminant score from the predictors:
- weekly expenditure on chicken
- age
- safety of butcher's chicken
- trust in supermarkets
The a coefficients of the discriminant function need to be estimated.

Fisher's linear discriminant analysis

The discriminant function is the starting point. There are two key assumptions behind linear DA:
(a) the predictors are normally distributed;
(b) the covariance matrices for the predictors within each of the groups are equal.
Departure from condition (a) suggests the use of alternative methods (logistic regression, see lecture 16). Departure from condition (b) requires the use of different discriminant techniques (usually quadratic discriminant functions). In most empirical cases, the use of linear DA is appropriate.
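The quadratic alternative for unequal covariances is readily available in scikit-learn; the sketch below deliberately generates groups with very different spreads (all data synthetic) to show both classifiers side by side.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)

rng = np.random.default_rng(1)
# Group 0: tight covariance; group 1: much wider spread (unequal covariances)
X = np.vstack([rng.normal(0, 0.5, (80, 2)), rng.normal(1.5, 2.0, (80, 2))])
y = np.array([0] * 80 + [1] * 80)

lda_acc = LinearDiscriminantAnalysis().fit(X, y).score(X, y)
qda_acc = QuadraticDiscriminantAnalysis().fit(X, y).score(X, y)
print(round(lda_acc, 2), round(qda_acc, 2))
```

QDA estimates a separate covariance matrix per group, which is exactly the relaxation of condition (b) described above.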

Estimation

The first step is the estimation of the a coefficients, also termed discriminant coefficients or weights. Estimation is similar to factor analysis or PCA, in that the coefficients are those which maximize the variability between groups. In MDA the first discriminant function is the one with the highest between-group variability; the second discriminant function is independent of the first and maximizes the remaining between-group variability, and so on.

SPSS – two groups case

1. Choose the target variable
2. Define the range of the dependent variable
3. Select the predictors

Coefficient estimates

Additional statistics and diagnostics are available; Fisher's and standardized estimates of the discriminant function coefficients must be requested explicitly.

Classification options

Decide whether prior probabilities are equal across groups or whether group sizes reflect different allocation probabilities. Diagnostic indicators evaluate how well the discriminant function predicts the groups.

Save classification

Create new variables in the data-set, containing the predicted group membership and/or the discriminant score for each case and each function.

Output – coefficient estimates

The standardized coefficients identify the most important predictor: they do not depend on the measurement unit, while the unstandardized coefficients do. Trust in supermarkets has a negative sign, thus it reduces the discriminant score.

Centroids

These are the means of the discriminant score for each of the two groups; the group of those not purchasing chicken at the butcher's shop has a negative centroid. With two groups, the discriminating (cut-off) score is zero. It can be computed by weighting the centroids with the prior probabilities: -0.307 x 0.66 + 0.594 x 0.34 = 0 (up to rounding).
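Checking the slide's arithmetic, with the centroid and prior values taken from the output:

```python
# Prior-weighted average of the two group centroids: should be about zero
cut = -0.307 * 0.66 + 0.594 * 0.34
print(round(cut, 3))  # -0.001, i.e. zero up to the rounding of the centroids
```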

Output – classification success

Using the discriminant function, it is possible to correctly classify 71.2% of the original cases: (244 no-no + 55 yes-yes) / 420.
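The hit rate quoted above can be reproduced directly from the classification table:

```python
# 244 correctly classified "no" cases plus 55 correctly classified "yes" cases,
# out of 420 cases in total
hit_rate = (244 + 55) / 420
print(round(100 * hit_rate, 1))  # 71.2
```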

Diagnostics (1)

- Box's M test: tests whether covariances are equal across groups.
- Wilks' Lambda (or U statistic): tests discrimination between groups; it is related to analysis of variance.
- Individual Wilks' Lambda for each predictor in a discriminant function: a univariate ANOVA (are there significant differences in the predictor's means between the groups?), with a p-value from the F distribution.
- Wilks' Lambda for the function as a whole: are there significant differences in the group means for the discriminant function? The p-value comes from the Chi-square distribution.
The overall Wilks' Lambda is especially helpful in multiple discriminant analysis, as it allows one to discard those functions which do not contribute to explaining differences between groups.

Diagnostics (2)

DA returns one eigenvalue for the discriminant function (more than one in MDA). These can be interpreted as in principal component analysis. In MDA (more than one discriminant function) the eigenvalues are exploited to compute how much each function contributes to explaining variability. The canonical correlation measures the intensity of the relationship between the groups and the single discriminant function.
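The eigenvalue and the canonical correlation are linked by a standard identity (not stated on the slide): the squared canonical correlation equals eig / (1 + eig). A quick check against the values reported in the Trust diagnostics:

```python
import math

eig = 0.18                                # eigenvalue from the diagnostics table
canonical_r = math.sqrt(eig / (1 + eig))  # share of total variability between groups
print(round(canonical_r, 2))  # 0.39, matching the reported canonical correlation
```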

Trust example: diagnostics

Statistic                               Value   P-value
Box's M statistic                       37.3    0.000
Overall Wilks' Lambda                   0.85
Wilks' Lambda: Expenditure              0.98    0.002
Wilks' Lambda: Age                      0.97    0.001
Wilks' Lambda: Safer from butcher       0.91
Wilks' Lambda: Trust in supermarket
Eigenvalue                              0.18
Canonical correlation                   0.39
% of correct predictions                71.2%

Box's M indicates that the covariance matrices are not equal. The overall discriminating power of the discriminant function is good, and all of the predictors are relevant to discriminating between the two groups. The eigenvalue is the ratio between the variance between groups and the variance within groups (the larger the better); the canonical correlation is the square root of the ratio between the between-group variability and the total variability.

MDA

To run MDA in SPSS, the only difference is that the range of the dependent variable has more than two categories.

Predictors

Only three predictors appear to be relevant in discriminating among preferred types of chicken. The null is rejected at the 95% confidence level, but not at the 99% level.

Discriminant functions

Three discriminant functions (four groups minus one) can be estimated. The first two discriminant functions have significant discriminating power.

Coefficients

Income is very relevant for the first function; value for money is very relevant for the second function.

Structure matrix

The values in the structure matrix are the correlations between the individual predictors and the scores computed on the discriminant functions. For example, the income variable has a strong correlation with the scores of the first function. The structure matrix helps in interpreting the functions: here income characterizes the first function, value and taste the second, and age the third.
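A structure matrix can be computed by hand as the correlations between each predictor and each function's scores. The sketch below uses synthetic three-group data (not the Trust data-set); with three groups and three predictors there are min(g - 1, k) = 2 functions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# Three synthetic groups of 50 cases with shifted means on three predictors
X = np.vstack([rng.normal(m, 1, (50, 3)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 50)

scores = LinearDiscriminantAnalysis().fit(X, y).transform(X)  # shape (150, 2)
# Structure matrix: correlation of each predictor with each function's scores
structure = np.array([[np.corrcoef(X[:, j], scores[:, f])[0, 1]
                       for f in range(scores.shape[1])]
                      for j in range(X.shape[1])])
print(structure.shape)  # (3, 2): predictors x functions
```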

Centroids

The first function discriminates well between value and organic (income matters to organic buyers). The second function allows some discrimination between standard and organic, value and standard, and organic and luxury (taste and value matter).

Plot of two functions

Tick 'separate-groups' to show graphs of the first two functions for each individual group. The 'territorial map' shows the scores for the first two functions considering all groups.

Plots: individual groups

Example: organic chicken. Most cases tend to be relatively high on function 1 (income).

Plots – all groups

Prediction results

The functions do not predict well: most units are allocated to standard chicken, and on average only 56.3% of the cases are allocated correctly.

Stepwise discriminant analysis

As with linear regression, it is possible either to include all predictors in the equation regardless of their role in discriminating (the Enter option), or to choose a subset of predictors on the basis of their contribution to discriminating between groups (the Stepwise method).

The step-wise method

1. A one-way ANOVA test is run on each of the predictors, where the target grouping variable determines the treatment levels. Each ANOVA provides a criterion value and a test statistic (usually the Wilks' Lambda).
2. According to the criterion value, the predictor which is most relevant in discriminating between the groups is identified.
3. The predictor with the lowest Wilks' Lambda (or which meets an alternative optimality criterion) enters the discriminant function, provided its p-value is below the set threshold (for example 5%).
4. An ANCOVA test is run on the remaining predictors, where the covariates are the target grouping variables and the predictors that have already entered the model. The Wilks' Lambda is computed for each of the ANCOVA options. Again, the criterion and the p-value determine which variable (if any) enters the discriminant function (and possibly whether some of the entered variables should leave the model).
5. The procedure goes back to step 3 and continues until none of the excluded variables has a p-value below the threshold and none of the entered variables has a p-value above the threshold (the stopping rule is met).
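The entry step can be sketched in code. This is a deliberately simplified forward step on synthetic data, using the univariate ANOVA F test as the criterion (a full stepwise DA would recompute conditional Wilks' Lambdas at each pass, as described above):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(3)
groups = np.repeat([0, 1], 50)          # synthetic two-group target variable
X = rng.normal(0, 1, (100, 3))          # three candidate predictors
X[groups == 1, 0] += 1.5                # only predictor 0 separates the groups

# First entry step: find the predictor with the strongest univariate separation
best_j, best_p = None, 1.0
for j in range(X.shape[1]):
    _, p = f_oneway(X[groups == 0, j], X[groups == 1, j])
    if p < best_p:
        best_j, best_p = j, p

if best_p < 0.05:                       # enter only if below the threshold
    print("enter predictor", best_j)    # predictor 0 enters first
```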

Alternative criteria

Besides Wilks' Lambda, alternative selection criteria include: unexplained variance, smallest F ratio, Mahalanobis distance, and Rao's V.

In SPSS

The step-wise method allows selection of the relevant predictors.

Output of the step-wise method

Only two predictors are kept in the model.