 Seeks to determine group membership from predictor variables ◦ Given group membership, how many people can we correctly classify?



2  Seeks to determine group membership from predictor variables ◦ Given group membership, how many people can we correctly classify?

3  Can think of this analysis as MANOVA turned inside out. [Figure: Women/Men by Left/Right groups, with counts 42, 75, 43, 56]

4  Given your mean, can we predict what group you would be in?  IVs = predictors  DVs = groups, grouping variables

5  You have (number of classifications, i.e. groups, – 1) ways to linearly combine IVs to predict group membership ◦ So if you have 3 groups, then there are at most two ways to combine variables to classify
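
The counting rule on this slide can be sketched in a few lines. The function name is hypothetical, and the cap at the number of predictors is a standard result that the slide does not state.

```python
def max_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Upper bound on the number of discriminant functions.

    Groups - 1 comes from the slide; the cap at the number of
    predictors is an added assumption (standard result, not in the deck).
    """
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least 2 groups and 1 predictor")
    return min(n_groups - 1, n_predictors)


print(max_discriminant_functions(3, 4))  # 2
print(max_discriminant_functions(2, 4))  # 1
```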

6  If you have a third group, you may have a second discriminant function.  Trying to understand what IVs are used to discriminate between groups can be kind of limited here

7  Find a function to predict group membership.

8  Significant predictions – can membership be predicted reliably, can we do better than chance?
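
To judge "better than chance", a chance baseline is needed first. A minimal sketch, assuming two common baselines that the slides do not name: the proportional chance criterion (sum of squared group proportions) and the maximum chance criterion (share of the largest group). The function name is hypothetical.

```python
from collections import Counter


def chance_rates(labels):
    """Chance-level hit rates implied by the observed group sizes."""
    counts = Counter(labels)
    n = len(labels)
    proportions = [count / n for count in counts.values()]
    return {
        # Expected hit rate if cases are assigned at random in
        # proportion to group size.
        "proportional": sum(p * p for p in proportions),
        # Hit rate from always guessing the largest group.
        "maximum": max(proportions),
    }


print(chance_rates(["left"] * 30 + ["right"] * 70))
```

A classification hit rate is usually compared against whichever baseline is stricter for the design at hand.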

9  Number of significant discriminant functions ◦ Groups/classifications – 1 is the max number you will get ◦ 1st discriminant function accounts for the most variance, provides best separation between groups ◦ 2nd function is orthogonal to the first, but may still be significant

10  Dimensions of discrimination – what are the IVs that separate the groups? ◦ What is the pattern of relationships of IVs to the discriminant function (similar to EFA)?

11  Classification function – how can we weight scores to discriminate (creates regression equations)

12  Adequacy of classification – how many cases are classified correctly? ◦ When there are mistakes, where do they happen?
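
Both questions on this slide (how many cases are correct, and where the mistakes fall) can be read off a classification table. A minimal sketch with made-up labels; the function name is hypothetical.

```python
from collections import Counter


def classification_table(actual, predicted):
    """Return (table, hit_rate).

    table counts each (actual group, predicted group) pair, so the
    off-diagonal entries show where misclassifications happen.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    table = Counter(zip(actual, predicted))
    hits = sum(n for (a, p), n in table.items() if a == p)
    return table, hits / len(actual)


# Illustrative data only.
actual = ["left", "left", "right", "right", "right"]
predicted = ["left", "right", "right", "right", "left"]
table, hit_rate = classification_table(actual, predicted)
print(hit_rate)  # 0.6
for (a, p), n in sorted(table.items()):
    print(f"actual={a:<5} predicted={p:<5} n={n}")
```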

13  Effect size? ◦ Finds a canonical correlation for each function, so you can see how much of the variance in groups is accounted for by each discriminant function.
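
As a sketch of how this effect size is obtained: the squared canonical correlation for a function follows from that function's eigenvalue as eigenvalue / (1 + eigenvalue). The eigenvalues below are invented for illustration, and the function name is hypothetical.

```python
import math


def canonical_r_squared(eigenvalue: float) -> float:
    """Proportion of between-group variance captured by one function."""
    if eigenvalue < 0:
        raise ValueError("eigenvalues cannot be negative")
    return eigenvalue / (1.0 + eigenvalue)


# Invented eigenvalues for a first and second discriminant function.
for ev in (3.0, 1.0):
    r2 = canonical_r_squared(ev)
    print(f"eigenvalue={ev}: Rc={math.sqrt(r2):.3f}, Rc^2={r2:.2f}")
```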

14  Which predictors are the most important?  If covariates, what IVs are important after the controls?  Estimation of group means (centroids) – which means do the discriminant functions separate?

15  Are the groups naturally occurring or did we randomly assign them?

16  Since it’s mostly about classification, it’s ok if distributions are a bit weird as long as the discriminant function is good.  Whenever MANOVA works best, discriminant analysis works best.

17  Unequal N – not a big deal. ◦ But it does have an influence with very small cells ( ◦ Discriminant function will be biased because the probability of that cell is so small ◦ Sample size of smallest group > number of IVs

18  Missing data – needs to be replaced or eliminated. ◦ Dependent data ◦ Category data

19  Normality ◦ Robust, but we assume linear combinations of the IVs are normal, and there is not a good way to test this. ◦ Start to have problems if there are unequal N and the sample size is small.  Want 20 cases in smallest sample.
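
The sample-size rules of thumb from these slides (smallest group larger than the number of IVs, and about 20 cases in the smallest group) can be checked mechanically. A minimal sketch; the function name, dictionary keys, and group counts are hypothetical.

```python
def check_group_sizes(group_sizes, n_predictors, min_cases=20):
    """Apply the two rules of thumb to a {group: n} mapping."""
    smallest = min(group_sizes.values())
    return {
        "smallest_group": smallest,
        # Smallest group should have more cases than there are IVs.
        "exceeds_predictors": smallest > n_predictors,
        # Smallest group should have at least min_cases cases.
        "meets_min_cases": smallest >= min_cases,
    }


print(check_group_sizes({"women": 118, "men": 98}, n_predictors=4))
```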

20  Outliers – you are trying to predict an individual’s group, so outliers = no good. ◦ Univariate and multivariate outliers need to be eliminated. ◦ BUT run outliers separately for each classification/group.
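
Screening within each group rather than across the pooled sample can be sketched as below. This covers only the univariate case, and the |z| > 3.29 cutoff is a common convention assumed here, not something the slides specify; multivariate screening (e.g. Mahalanobis distance) is not shown.

```python
from statistics import mean, stdev


def flag_outliers_by_group(scores_by_group, cutoff=3.29):
    """Flag scores whose within-group |z| exceeds cutoff.

    scores_by_group maps a group label to that group's scores on one
    predictor. z-scores use each group's own mean and SD.
    """
    flagged = {}
    for group, scores in scores_by_group.items():
        m, s = mean(scores), stdev(scores)
        flagged[group] = [
            x for x in scores if s > 0 and abs(x - m) / s > cutoff
        ]
    return flagged


# Illustrative data: one extreme score in the first group.
data = {"group1": [10] * 19 + [50], "group2": [1, 2, 3, 4, 5]}
print(flag_outliers_by_group(data))
```

Note that with very small groups a single score cannot reach a large |z| at all, so the cutoff only becomes informative once groups are reasonably sized.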

21  Homogeneity – Box’s M is still sensitive ◦ Can also check out scatter plots of scores on 1st and 2nd discriminant functions separated for each group (SPSS plots) ◦ If fails:  Use nonparametric – logistic regression

22  Linearity – making linear combinations of IVs AND we are making regressions. Definitely need. ◦ Violations are less serious errors because they just reduce power.

23  Multicollinearity – since this is regression you do not want 2 predictors that measure the same thing.

24  Same as MANOVA (to a point).  MANOVA – tests if there are differences in combinations of means (DVs) for groups (IVs) ◦ Then tests which DVs are best for group separation.

25  Discriminant function – ◦ D = d1z1 + d2z2 + d3z3 … ◦ D = discriminant score ◦ d = discriminant function coefficient  Get by doing canonical variates  Basically canonical correlation between group membership and the predictors

26  Discriminant function – ◦ D = d1z1 + d2z2 + d3z3 … ◦ d = chosen to maximize group differences  Very similar to beta ◦ z = IVs (predictors) – they are z-scored because that makes it easy to see each d’s weight in the equation and gives D:  SD = 1, Mean = 0  Important for categorical prediction
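
The discriminant score is a weighted sum of z-scored predictors. A minimal sketch of that arithmetic; the coefficients are placeholders chosen for illustration (real ones come from the fitted analysis), and the function names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one predictor to mean 0, SD 1 (sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def discriminant_scores(predictor_columns, coefficients):
    """D = d1*z1 + d2*z2 + ... for every case.

    predictor_columns: one list of raw scores per predictor.
    coefficients: one standardized coefficient d per predictor.
    """
    z_columns = [z_scores(col) for col in predictor_columns]
    return [
        sum(d * z for d, z in zip(coefficients, case))
        for case in zip(*z_columns)
    ]


# Two predictors, three cases; coefficients are made up.
print(discriminant_scores([[1, 2, 3], [2, 4, 6]], [0.5, 0.5]))
# [-1.0, 0.0, 1.0]
```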

27  Separates as so:  0-> 1 = group 1  -1 -> 0 = group 2

28  Classification equation ◦ You are assigned to the group where you have the highest classification score

29  Classification ◦ C = Constant + c1X1 + c2X2 ◦ C = classification score for a group ◦ c = classification coefficient ◦ X = raw score
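
Each group gets its own classification equation, and a case goes to the group with the highest score. A minimal sketch with invented constants and coefficients (in practice they come from the software output); the function name and group labels are hypothetical.

```python
def classify(raw_scores, equations):
    """Assign a case to the group with the highest classification score.

    equations maps group -> (constant, [c1, c2, ...]); each score is
    constant + c1*X1 + c2*X2 + ... using raw (unstandardized) scores.
    """
    scores = {
        group: constant + sum(c * x for c, x in zip(coefs, raw_scores))
        for group, (constant, coefs) in equations.items()
    }
    return max(scores, key=scores.get), scores


# Invented equations for two groups and two predictors.
equations = {
    "group1": (-10.0, [1.0, 0.5]),
    "group2": (-4.0, [0.2, 0.8]),
}
print(classify([12, 4], equations)[0])  # group1
print(classify([8, 6], equations)[0])   # group2
```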

30  Standard (direct) – each predictor enters the equation at the same time and is only assigned its unique variance ◦ Test of means = MANOVA ◦ Test of discriminant functions – canonical correlation

31  Sequential (hierarchical) – you determine the order in which predictors enter the discriminant function ◦ You are testing if a new predictor adds better classification to this equation  Similar to MANCOVA  Good for smaller number of predictors and theory driven arguments  (unfortunately, there’s not a good way to do this in SPSS, instead do it as a hierarchical regression where DV is coded as 0 and 1).

32  Stepwise (statistical) – predictors enter equation based on some cut off you use. ◦ If you have 10 similar predictors, it might be a good way to eliminate overlapping ones. ◦ But! Dependent on sample you select ◦ You can use R², F, change in group centers, etc.

33  Inference ◦ Criteria for overall statistical significance  Use Wilks’ lambda as with MANOVA since it’s the same test.  Stepwise – you get two more options  Mahalanobis D² and Rao’s V based on group centroid differences

34  Inference ◦ Number of discriminant functions  If you have a lot of groups, you’ll get several discriminant functions but they may not be significant, usually around 2 are significant.  Evaluates like canonical correlation – eigenvalues, % of variance, chi-square
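
The three quantities named here are linked through the eigenvalues. A minimal sketch, assuming the usual Bartlett chi-square approximation for Wilks' lambda (the slides do not say which approximation is used); eigenvalues and sample figures are invented, and function names are hypothetical.

```python
import math


def percent_of_variance(eigenvalues):
    """Share of between-group variance carried by each function."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


def wilks_lambda(eigenvalues):
    """Lambda = product of 1 / (1 + eigenvalue) over the functions."""
    return math.prod(1.0 / (1.0 + ev) for ev in eigenvalues)


def bartlett_chi_square(eigenvalues, n_cases, n_predictors, n_groups):
    """Chi-square approximation and df for testing all functions."""
    lam = wilks_lambda(eigenvalues)
    chi2 = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(lam)
    df = n_predictors * (n_groups - 1)
    return chi2, df


eigenvalues = [3.0, 1.0]  # invented: 3 groups give at most 2 functions
print(percent_of_variance(eigenvalues))  # [75.0, 25.0]
print(wilks_lambda(eigenvalues))         # 0.125
print(bartlett_chi_square(eigenvalues, n_cases=50, n_predictors=3, n_groups=3))
```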

35  Interpretation [Figure: plot with axes Discriminant 1 and Discriminant 2] ◦ Dots are the discriminant function centroids (means) of each group

36  Cross validation – see if your discriminant function correctly classifies a new sample ◦ Split half testing ◦ Can do this by jackknifed classification (leave-one-out option in SPSS).
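
The jackknife idea is to classify each case using a rule built from all the other cases. The sketch below uses a simple nearest-centroid rule as a stand-in classifier; it is not a discriminant analysis and not what SPSS computes, but the leave-one-out loop is the same. Data and function names are invented for illustration.

```python
from statistics import mean


def nearest_centroid(train_x, train_y, point):
    """Predict the group whose centroid (mean vector) is closest."""
    centroids = {}
    for group in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == group]
        centroids[group] = [mean(col) for col in zip(*rows)]

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(point, centroid))

    return min(centroids, key=lambda g: squared_distance(centroids[g]))


def leave_one_out_hit_rate(xs, ys):
    """Hold out each case in turn, refit on the rest, and score it."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)


# Invented, well-separated data: two predictors, two groups.
xs = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
ys = ["a", "a", "a", "b", "b", "b"]
print(leave_one_out_hit_rate(xs, ys))  # 1.0
```

Because each case is scored by a rule that never saw it, the resulting hit rate is a less optimistic estimate than classifying the same cases used to build the rule.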

