Multiple Discriminant Analysis and Logistic Regression.

Multiple Discriminant Analysis
- Appropriate when the dependent variable is categorical and the independent variables are metric
- MDA derives the variate that best distinguishes between a priori groups
- MDA sets the variate's weights to maximize between-group variance relative to within-group variance
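As an illustration of how the weights can be chosen, here is a minimal sketch for the simplest case of two groups and two variables. It assumes Fisher's classical solution, in which the weights are proportional to the inverse of the pooled within-group scatter matrix times the difference in group means. The data and function names are made up for illustration and are not taken from the slides.

```python
# Two-group, two-variable sketch of the discriminant variate.
# All observations below are invented for illustration.

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter_matrix(rows, mean):
    # 2x2 sums of squares and cross-products around the group mean
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def discriminant_weights(group_a, group_b):
    m_a, m_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = scatter_matrix(group_a, m_a), scatter_matrix(group_b, m_b)
    # Pooled within-group scatter matrix W
    w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    inv = [[w[1][1] / det, -w[0][1] / det],
           [-w[1][0] / det, w[0][0] / det]]
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    # Fisher's solution: weights proportional to W^-1 (m_a - m_b)
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

group_a = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0), (3.5, 4.5)]
group_b = [(5.0, 2.0), (6.0, 2.5), (5.5, 1.5), (6.5, 3.0)]

weights = discriminant_weights(group_a, group_b)
# Discriminant scores: weighted sum of the two variables
z_a = [weights[0] * x1 + weights[1] * x2 for x1, x2 in group_a]
z_b = [weights[0] * x1 + weights[1] * x2 for x1, x2 in group_b]
print(weights, z_a, z_b)
```

With these invented data the two groups do not overlap on the resulting scores, which is what maximizing between-group relative to within-group variance aims for.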

MDA
- For each observation we can obtain a discriminant Z-score
- The average Z-score for a group gives its centroid
- Classification is done using cutting scores, which are derived from the group centroids
- Statistical significance of the discriminant function is assessed using the distance between group centroids
- LR is similar to two-group discriminant analysis

The MDA Model
Six-stage model building for MDA
Stage 1: Research problem/objectives
  a. Evaluate differences between average scores for a priori groups on a set of variables
  b. Determine which independent variables account for most of the differences between groups
  c. Classify observations into groups

The MDA Model
Stage 2: Research design
  a. Selection of dependent and independent variables
  b. Sample size considerations
  c. Division of the sample into an analysis sample and a holdout sample

The MDA Model
Stage 3: Assumptions of MDA
  a. Multivariate normality of the independent variables
  b. Equal covariance matrices across groups
  c. Independent variables should not be highly correlated
  d. Linearity of the discriminant function
Stage 4: Estimation of MDA and assessing fit
  a. Estimation can be
     i. Simultaneous
     ii. Stepwise

The MDA Model
Stage 4: Estimation and assessing fit (contd)
  b. Statistical significance of the discriminant function
     i. Wilks' lambda, Hotelling's trace, Pillai's criterion, Roy's greatest root
     ii. For the stepwise method, Mahalanobis D²
     iii. Test the statistical significance of the overall discrimination between groups and of each discriminant function
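To make the first criterion concrete: for a single discriminant variate, Wilks' lambda reduces to the ratio of the within-group sum of squares to the total sum of squares of the Z-scores, so values near 0 indicate strong separation and values near 1 indicate none. The sketch below assumes this single-variate case; the general multivariate statistic uses determinants of the within-group and total scatter matrices instead. The scores are invented.

```python
# Wilks' lambda for one discriminant variate: SS_within / SS_total.
# The Z-scores below are invented for illustration.

def wilks_lambda(scores_by_group):
    all_scores = [z for group in scores_by_group for z in group]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((z - grand_mean) ** 2 for z in all_scores)
    ss_within = 0.0
    for group in scores_by_group:
        centroid = sum(group) / len(group)  # group mean of Z
        ss_within += sum((z - centroid) ** 2 for z in group)
    return ss_within / ss_total

z_group_a = [1.0, 2.0, 3.0]
z_group_b = [5.0, 6.0, 7.0]
lam = wilks_lambda([z_group_a, z_group_b])
print(round(lam, 4))
```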

MDA and LR (contd)
Stage 4: Estimation and assessing fit (contd)
  c. Assessing overall fit
     i. Calculate the discriminant Z-score for each observation
     ii. Evaluate group differences on the Z-scores
     iii. Assess group membership prediction accuracy. To do this we need to address the following:
        - rationale for classification matrices

The MDA Model
Stage 4: Estimation and assessing fit (contd)
  c. Assessing overall fit (contd)
     iii. Address the following (contd)
        - cutting score determination
        - costs of misclassification
        - constructing classification matrices
        - assessing classification accuracy
        - casewise diagnostics
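A classification matrix cross-tabulates actual against predicted group membership, and the hit ratio is the share of observations on its diagonal. The sketch below builds both from invented labels. It also computes two chance-based benchmarks that are commonly used for comparison, the maximum chance criterion and the proportional chance criterion. Treating these as the relevant benchmarks is an assumption; the slides do not name them.

```python
# Classification matrix, hit ratio and two chance benchmarks.
# Labels and predictions are invented for illustration.

def classification_matrix(actual, predicted, labels):
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1  # rows: actual group, columns: predicted group
    return matrix

def hit_ratio(matrix):
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return correct / total

actual = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
predicted = ["A", "A", "A", "B", "B", "B", "B", "B", "A", "B"]

matrix = classification_matrix(actual, predicted, ["A", "B"])
hits = hit_ratio(matrix)

# Benchmarks (assumed, not named in the slides):
share_a = actual.count("A") / len(actual)
max_chance = max(share_a, 1 - share_a)                 # always guess the larger group
proportional_chance = share_a ** 2 + (1 - share_a) ** 2  # guess in proportion to group size
print(matrix, hits, max_chance, proportional_chance)
```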

The MDA Model
Stage 5: Interpretation of results
  a. Methods for a single discriminant function
     i. Discriminant weights
     ii. Discriminant loadings
     iii. Partial F-values
  b. Additional methods for more than two functions
     i. Rotation of discriminant functions
     ii. Potency index
     iii. Stretched attribute vectors
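Discriminant loadings can be illustrated as the simple correlation between each independent variable and the discriminant Z-scores. The sketch below assumes that definition; some software reports pooled within-group correlations instead, which can differ. The data and the weights are invented.

```python
import math

# Discriminant loadings as correlations between each variable and Z.
# Data and weights are invented for illustration.

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
weights = (1.0, 0.2)  # hypothetical discriminant weights

# Z-score of each observation: weighted sum of its variable values
z = [weights[0] * a + weights[1] * b for a, b in zip(x1, x2)]
loadings = [pearson(x1, z), pearson(x2, z)]
print([round(v, 4) for v in loadings])
```

Note that the second variable has a sizeable loading despite its small weight, because it is correlated with the first variable. This is one reason loadings and weights can rank variables differently.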

The MDA Model
Stage 6: Validation of results

Logistic Regression
For two groups, LR is preferred to MDA because it:
  1. Is more robust to failure of the MDA assumptions
  2. Is similar to regression, so it is intuitively appealing
  3. Has straightforward statistical tests
  4. Can accommodate non-linearity easily
  5. Can accommodate non-metric independent variables through dummy variable coding

The LR Model
Six-stage model building for LR
Stage 1: Research problem/objectives (same as MDA)
Stage 2: Research design (same as MDA)
Stage 3: Assumptions of LR (same as MDA)
Stage 4: Estimating LR and assessing fit
  a. Estimation uses the likelihood of an event's occurrence

The LR Model
Stage 4: Estimating LR and assessing fit (contd)
  b. Assessing fit
     i. Overall measure of fit is -2LL
     ii. Calculation of R² for logit
     iv. Assess predictive accuracy
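The two fit measures can be sketched as follows. -2LL is minus twice the log-likelihood of the observed outcomes under the model's predicted probabilities, so smaller is better. The R² for logit shown here is assumed to be the proportional reduction in -2LL relative to a null model that predicts the base rate for everyone; other pseudo-R² definitions exist. Outcomes and predicted probabilities are invented.

```python
import math

# -2 log-likelihood and a pseudo R-squared for a logit model.
# Outcomes and predicted probabilities are invented for illustration.

def neg2_log_likelihood(outcomes, probabilities):
    log_likelihood = 0.0
    for y, p in zip(outcomes, probabilities):
        # Probability the model assigned to what actually happened
        log_likelihood += math.log(p) if y == 1 else math.log(1 - p)
    return -2.0 * log_likelihood

y = [1, 1, 1, 0, 0, 1, 0, 0]
p_model = [0.9, 0.8, 0.7, 0.2, 0.3, 0.6, 0.4, 0.1]

base_rate = sum(y) / len(y)
neg2ll_null = neg2_log_likelihood(y, [base_rate] * len(y))
neg2ll_model = neg2_log_likelihood(y, p_model)

# Assumed definition: share of the null -2LL removed by the model
r2_logit = (neg2ll_null - neg2ll_model) / neg2ll_null
print(round(neg2ll_null, 3), round(neg2ll_model, 3), round(r2_logit, 3))
```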

The LR Model
Stage 5: Interpretation of results
  a. Many MDA methods can be used
  b. Test the significance of the coefficients
Stage 6: Validation of results

Example: Discriminant Analysis
- HATCO is a large industrial supplier
- A marketing research firm surveyed 100 HATCO customers
- There were two different types of customers: those using Specification Buying and those using Total Value Analysis
- HATCO management believes that the two types of customers evaluate their suppliers differently

Example: Discriminant Analysis
- In a B2B situation, HATCO wanted to know the perceptions that its customers had about it
- The marketing research firm gathered data on 7 variables
  1. Delivery speed
  2. Price level
  3. Price flexibility
  4. Manufacturer's image
  5. Overall service
  6. Salesforce image
  7. Product quality
- Each variable was measured on a 10 cm graphic rating scale anchored by Poor at one end and Excellent at the other

Example: Discriminant Analysis
Stage 1: Objectives of discriminant analysis
  - Which perceptions of HATCO best distinguish firms using each buying approach?
Stage 2: Research design
  a. The dependent variable is the buying approach of customers. It is categorical. The independent variables are X1 to X7 as listed above
  b. Overall sample size is 100. Each group exceeded the minimum of 20 per group
  c. Analysis sample size was 60 and holdout sample size was 40
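The 60/40 division into analysis and holdout samples can be sketched as a simple random split. The customer identifiers, the seed and the function name are hypothetical, and the slides do not say how the split was actually drawn. A split stratified by buying approach may be preferable when the groups differ in size.

```python
import random

# Random split of 100 observations into analysis and holdout samples.
# Identifiers and seed are placeholders, not the HATCO data.

def split_sample(observations, analysis_size, seed=0):
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(observations)
    rng.shuffle(shuffled)
    return shuffled[:analysis_size], shuffled[analysis_size:]

customers = list(range(100))
analysis, holdout = split_sample(customers, analysis_size=60)
print(len(analysis), len(holdout))
```

The discriminant function is estimated on the analysis sample only, and the holdout sample is kept aside to check how well the classification carries over to new observations.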

Example: Discriminant Analysis
Stage 3: Assumptions of MDA
  - All the assumptions were met
Stage 4: Estimation of MDA and assessing fit
  - Before estimation, we first examine the group means for X1 to X7 and the significance of the differences in means
  a. Estimation is done using the stepwise procedure
     - The independent variable with the largest Mahalanobis D² distance is selected first, and so on, until none of the remaining variables is significant
     - The discriminant function is obtained from the unstandardized coefficients
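The first step of such a stepwise selection can be sketched as follows. For a single variable and two groups, Mahalanobis D² is the squared difference in group means divided by the pooled within-group variance. The sketch computes this for each candidate and picks the largest. It covers only the first entry; later steps would use the multivariate D² given the variables already entered, plus a significance test. Variable names and values are invented.

```python
# First step of stepwise selection: pick the variable with the
# largest univariate Mahalanobis D-squared between two groups.
# Variable names and values are invented for illustration.

def pooled_variance(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    return (ss_a + ss_b) / (len(a) + len(b) - 2)

def mahalanobis_d2(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return (mean_a - mean_b) ** 2 / pooled_variance(a, b)

# Each candidate maps to (values in group A, values in group B)
candidates = {
    "var_1": ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
    "var_2": ([1.0, 2.0, 3.0], [5.0, 6.0, 7.0]),
}

d2 = {name: mahalanobis_d2(a, b) for name, (a, b) in candidates.items()}
first_entered = max(d2, key=d2.get)
print(d2, first_entered)
```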

Example: Discriminant Analysis
Stage 4: Estimation of MDA and assessing fit (contd)
  b. Both the univariate and multivariate tests show significance
  c. The discriminant Z-score for each observation and the group centroids were calculated
     - The cutting score was calculated, where
         n_A = number in Group A (Total Value Analysis)
         n_B = number in Group B (Specification Buying)
         z_A = centroid of Group A
         z_B = centroid of Group B
       Cutting score: z_C = (n_A z_B + n_B z_A) / (n_A + n_B)
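The cutting score formula translates directly into code. Each centroid is weighted by the size of the other group, which moves the cutting score toward the centroid of the smaller group. With equal group sizes it reduces to the midpoint of the two centroids. The group sizes and centroids below are invented, not the HATCO results.

```python
# Cutting score weighted by group sizes:
# z_C = (n_A * z_B + n_B * z_A) / (n_A + n_B)
# The numbers used below are invented for illustration.

def cutting_score(n_a, z_a, n_b, z_b):
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)

def classify(z, cut, high_group, low_group):
    # Assign to the group whose centroid lies on the same side of the cut
    return high_group if z > cut else low_group

z_c = cutting_score(n_a=20, z_a=1.5, n_b=40, z_b=-0.75)
midpoint = cutting_score(n_a=30, z_a=1.0, n_b=30, z_b=-1.0)

label = classify(1.2, z_c, high_group="A", low_group="B")
print(z_c, midpoint, label)
```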

Example: Discriminant Analysis
Stage 4: Estimation of MDA and assessing fit (contd)
  - The cutting score was calculated from the group centroids and group sizes
  - The classification matrix was built by classifying an observation as Specification Buying if its Z-score was less than the cutting score, and as Total Value Analysis if it was greater
  - Classification accuracy was obtained and assessed against certain benchmarks

Example: Discriminant Analysis
Stage 5: Interpretation
  - Since we have a single discriminant function, we look at the discriminant weights, loadings and partial F-values
  - Discriminant loadings are more valid for interpretation. We see that X7 discriminates the most, followed by X1 and then X3
  - Going back to the table of group means, we see that firms employing Specification Buying focus on 'Product quality', whereas firms using Total Value Analysis focus on 'Delivery speed' and 'Price flexibility', in that order

Example: Logistic Regression
- A cataloger wants to predict response to a mailing
- Draws a sample of 20 customers
- Uses three variables
  - RESPONSE (0 = no, 1 = yes), the dependent variable
  - AGE (in years), an independent variable
  - GENDER (0 = male, 1 = female), an independent variable
- Use dummy variables for categorical variables

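Example: Logistic Regression
- Running the logistic regression program gives G = b0 + b1 × AGE + b2 × GENDER, where b0, b1 and b2 denote the estimated coefficients (their numeric values are not preserved in this transcript)
- Here G is the logit of a yes response to the mailing
- Consider a male of age 40. His G, or logit score, is G(0, 40) = b0 + b1 × 40 + b2 × 0 = .37 logits
- A female customer of the same age would have G(1, 40) = b0 + b1 × 40 + b2 × 1 = 2.67 logits
- Logits can be converted to odds, which can be converted to probabilities
- For the 40-year-old male and female, the probabilities are p = .59 and p = .93 respectively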
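The conversion from logits to odds to probabilities can be sketched as below, using the two logit scores quoted on the slide (.37 and 2.67). The odds are exp(G) and the probability is odds / (1 + odds). The function name is an assumption for illustration. Starting from the rounded logit 2.67, the female probability comes out near .935, which is close to the slide's .93; the small gap presumably reflects rounding of the logit.

```python
import math

# Logit -> odds -> probability, using the logit scores from the slide.

def logit_to_probability(g):
    odds = math.exp(g)        # odds of a "yes" response
    return odds / (1 + odds)  # probability of a "yes" response

g_male, g_female = 0.37, 2.67

p_male = logit_to_probability(g_male)
p_female = logit_to_probability(g_female)

# With age held at 40, the two logits differ only by the GENDER term,
# so their difference is the implied GENDER coefficient.
gender_coefficient = g_female - g_male
print(round(p_male, 3), round(p_female, 3), round(gender_coefficient, 2))
```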