Slide 1 Incorporating Nonmetric Data with Dummy Variables

Many of the multivariate techniques we will study assume that the independent or dependent variables in the analysis are metric. If we have a nonmetric, or categorical, variable, we can incorporate it into our analysis by converting it to a set of dichotomous, dummy-coded variables.

To dummy-code a variable, we first identify one category or subgroup of the nonmetric variable as the reference or comparison group. The effects which we identify in our analysis will be differences from this reference group. For example, suppose that I were contrasting salaries of men and women in some group of employees and were interested in how women's salaries differed from men's salaries. Assuming there is a gender variable in my data set coded either male or female, I would select male as the reference category on the nonmetric gender variable.

After we have identified the reference category, we create a new variable for each of the remaining categories or subgroups of the nonmetric variable. Thus, a nonmetric variable will be represented in the analysis by a number of new dichotomous variables equal to one less than the number of categories in the original nonmetric variable. In the salary example, with male selected as my reference group, the only remaining group on the gender variable is female, so I create one new variable called woman. (I would usually name it female, but I don't want two entities with the same name in this example.)

Finally, I code the new variable with one of two dichotomous values, usually 1 and 0. The new variable is assigned a 1 if the original variable indicated membership in the category represented by the new variable. If the subject was not a member of that category, the new variable is coded 0 for that subject. In the example above, if a subject was in the female group of the gender variable, her code for the new woman variable is 1. If a subject was not in the female group of the gender variable, his code for the new woman variable is 0.
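The coding rule for the salary example can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used later in these slides; the four sample values in the gender list are made up for the sketch.

```python
# Hypothetical sample data: the gender variable, coded as text.
gender = ["male", "female", "female", "male"]

# "male" is the reference category, so a two-category variable
# needs only one dummy variable (2 categories - 1 = 1).
woman = [1 if g == "female" else 0 for g in gender]

print(woman)  # [0, 1, 1, 0]
```

Subjects in the reference group are identified by a 0 on woman, so no separate variable for men is needed.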

Slide 2 Incorporating Nonmetric Data with Dummy Variables - 2

If the original nonmetric variable had three or more categories, we would create two or more new variables and code them with the same scheme. Suppose, for example, that we have a variable for political identification named partyid, which contains three values: Republican, Democrat, and Independent. I select Independent as my reference category because I am interested in the effect of being a Republican or a Democrat. Dummy-coding requires that I create and code two new variables: one for Republican, which I will name Repub, and one for Democrat, which I will name Demo.

Each subject in the data set will be assigned a value for both of the new variables, Repub and Demo, using the following scheme. If a subject is a Republican on the original partyid variable, they are assigned a value of 1 for Repub and a value of 0 for Demo. If a subject is a Democrat on the original partyid variable, they are assigned a value of 0 for Repub and a value of 1 for Demo. If a subject is an Independent on the original partyid variable, they are assigned a value of 0 for both Repub and Demo, because they are neither Republican nor Democrat.
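The same scheme generalizes to any number of categories. The sketch below assumes a hypothetical helper, dummy_code, and made-up sample values for partyid; it builds one 0/1 list for every category except the reference.

```python
def dummy_code(values, reference):
    """Return {category: [0/1, ...]} for every category except the reference."""
    # dict.fromkeys keeps each category once, in order of first appearance.
    categories = [c for c in dict.fromkeys(values) if c != reference]
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical sample data for the partyid variable.
partyid = ["Republican", "Democrat", "Independent", "Democrat"]

dummies = dummy_code(partyid, reference="Independent")
repub = dummies["Republican"]
demo = dummies["Democrat"]

print(repub)  # [1, 0, 0, 0]
print(demo)   # [0, 1, 0, 1]
```

The Independent subject (third in the list) is 0 on both variables, which is how the reference group is represented.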

Slide 3 Incorporating Nonmetric Data with Dummy Variables - 3

The task gets a little more confusing because the original variable is usually coded with numbers instead of text representing categories, as we will see in the SPSS example. In the World95.SAV data set is a nonmetric variable, Region, that contains six categories:

1 OECD (Organization for Economic Cooperation and Development)
2 East Europe
3 Pacific/Asia
4 Africa
5 Middle East
6 Latin America

Slide 4 Incorporating Nonmetric Data with Dummy Variables - 4

Suppose we wanted to dummy-code the Region variable and use OECD countries (coded number 1) as the reference or comparison group. We would recode this variable in SPSS as a set of five dummy variables (for the five remaining regions) with a series of five Recode commands.

The dummy-coding scheme for the new variables is summarized in the following table. The first column contains the coding for the original variable. The five remaining columns show the 1 and 0 codes that would be assigned to each of the five new variables, listed from first to fifth, for each of the six values of the original variable.

Region (original coding)   easteuro   pacific   africa   mideast   latamer
1 = OECD                      0          0        0         0         0
2 = East Europe               1          0        0         0         0
3 = Pacific/Asia              0          1        0         0         0
4 = Africa                    0          0        1         0         0
5 = Middle East               0          0        0         1         0
6 = Latin America             0          0        0         0         1

We will create the new variables in our SPSS data set.
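As a cross-check on the coding scheme, the rows of the table can be generated from the numeric Region codes. This is a Python sketch of the logic only, not the SPSS Recode commands themselves; the names DUMMY_NAMES and dummy_row are introduced here for illustration.

```python
# Numeric Region codes mapped to the new variable names used in the slides.
# Code 1 (OECD) is the reference group, so it gets no variable of its own.
DUMMY_NAMES = {2: "easteuro", 3: "pacific", 4: "africa", 5: "mideast", 6: "latamer"}


def dummy_row(region_code):
    """Return the five 0/1 dummy values for one value of Region."""
    return {name: int(region_code == code) for code, name in DUMMY_NAMES.items()}


# Print one row of 0/1 codes per original Region value.
for code in range(1, 7):
    print(code, list(dummy_row(code).values()))
```

Each non-reference region has exactly one 1 in its row, and the OECD row is all zeros.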

