
SW388R7 Data Analysis & Computers II Slide 1 Incorporating Nonmetric Data with Dummy Variables The logic of dummy-coding Dummy-coding in SPSS.


SW388R7 Data Analysis & Computers II
Slide 1: Incorporating Nonmetric Data with Dummy Variables
• The logic of dummy-coding
• Dummy-coding in SPSS

Slide 2: Dummy-coding variables
• Many of the multivariate techniques we will study assume that the independent or dependent variables in the analysis are metric variables. If we have a nonmetric, or categorical, variable, we can incorporate it into our analysis by converting it to a set of dichotomous, dummy-coded variables.
• A dichotomous variable arguably satisfies the interval level of measurement. On some construct, one of the categories represents more or less of the construct, so the definition of ordinal data is satisfied. Moreover, since there are only two categories, there is only one interval between them, so the requirement of equal units of measure is met trivially, satisfying the definition of interval data.

Slide 3: Selecting a reference group
• To dummy-code a variable, we first identify one category or subgroup of the nonmetric variable as the reference or comparison group.
• The effects which we identify in our analysis will be differences from the reference or comparison group.
• For example, suppose that I were contrasting salaries of men and women in some group of employees and I was interested in how women's salaries differed from men's salaries. Assuming there is a "gender" variable in my data set coded either "male" or "female," I would select "male" as the reference category on the nonmetric "gender" variable.

Slide 4: A two-category example - 1
• After we have identified the reference category, we create a new variable for each of the remaining categories or subgroups of the nonmetric variable.
• Thus, a nonmetric variable will be represented in the analysis by a number of new dichotomous variables equal to one less than the number of categories in the original nonmetric variable.
• In the salary example given above, with "male" selected as my reference group, the remaining group on the nonmetric gender variable is "female," so I create a new variable called "women." (I usually would name it "female," but I don't want two entities with the same name in this example.)

Slide 5: A two-category example - 2
• Finally, I code the new variable with one of two dichotomous values, usually 1 and 0.
• The new variable is assigned a 1 if the original variable indicated membership in the category represented by the new variable.
• If the subject was not a member of the category designated by the new variable, the new variable is coded 0 for that subject.
• In the example above, if a subject was in the "female" group of the "gender" variable, her code for the new "women" variable is 1. If a subject was not in the "female" group of the "gender" variable, his code for the new "women" variable is 0.
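The two-category rule can be sketched outside SPSS in a few lines of Python. This is only an illustration of the logic, not part of the SPSS procedure; the function name `dummy_code` and the sample `gender` values are hypothetical.

```python
def dummy_code(value, category):
    """Return 1 if the case belongs to `category`, otherwise 0."""
    return 1 if value == category else 0

# Hypothetical cases; "male" is the reference group, so it is coded 0.
gender = ["male", "female", "female", "male"]

# The new "women" variable: 1 for "female", 0 for everyone else.
women = [dummy_code(g, "female") for g in gender]
print(women)  # [0, 1, 1, 0]
```

Because "male" is the reference group, no separate variable is created for it; membership in that group is implied by a 0 on "women."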

Slide 6: A three-category example - 1
• If the original nonmetric variable had three or more categories, we would create two or more new variables and code them with the same scheme.
• Suppose, for example, that we have a variable for political identification, named "partyid," which contains three values: "Republican," "Democrat," and "Independent." I select "Independent" as my reference category because I am interested in the effect of being a Republican or a Democrat.
• Dummy-coding requires that I create and code two new variables, one for "Republican," which I will name "Repub," and one for "Democrat," which I will name "Demo."

Slide 7: A three-category example - 2
• Each subject in the data set will be assigned a value for both the new variables, "Repub" and "Demo," using the following scheme:
• If a subject is a "Republican" on the original "partyid" variable, they are assigned a value of 1 for the new "Repub" variable and a value of 0 for the new "Demo" variable.
• If a subject is a "Democrat" on the original "partyid" variable, they are assigned a value of 0 for the new "Repub" variable and a value of 1 for the new "Demo" variable.
• If a subject is an "Independent" on the original "partyid" variable, they are assigned a value of 0 for the new "Repub" variable and a value of 0 for the new "Demo" variable, because they are not Republican and they are not Democrat.
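As a rough sketch of the three-category scheme, the following Python maps each non-reference category to its dummy variable and codes one subject at a time. The names `DUMMY_NAMES` and `dummy_code_party` are assumptions made for this illustration; only "partyid," "Repub," and "Demo" come from the slides.

```python
# Each non-reference category gets one dummy variable.
# "Independent" is the reference category, so it has no variable of its own.
DUMMY_NAMES = {"Republican": "Repub", "Democrat": "Demo"}

def dummy_code_party(partyid):
    """Return the values of the Repub and Demo variables for one subject."""
    return {name: int(partyid == category)
            for category, name in DUMMY_NAMES.items()}

for party in ["Republican", "Democrat", "Independent"]:
    print(party, dummy_code_party(party))
```

Three categories yield two new variables, one less than the number of categories, and the reference category is the only one coded 0 on both.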

Slide 8: Example in SPSS
• In GSS2000R, the variable marital contains five categories: married, widowed, divorced, separated, and never married.
• Assuming my research question dealt with marital experiences, the never married category is selected as the reference category.
• We will create four new variables to represent the other marital experiences, with each variable representing one experience. The variables will be married, widowed, divorced, and separatd (shortened to fit the 8 characters allowed for SPSS variable names).

Slide 9: Coding scheme for new variables
The coding scheme for the new variables is shown in the table below.

Original Variable Coding    Coding for New Variables
                            married   widowed   divorced   separatd
1 = married                    1         0         0          0
2 = widowed                    0         1         0          0
3 = divorced                   0         0         1          0
4 = separated                  0         0         0          1
5 = never married              0         0         0          0

Slide 10: Using Recoding in SPSS to Create New Variables
Select the Recode | Into Different Variables command from the Transform menu.

Slide 11: Creating the married variable
First, select the variable to be dummy-coded, marital, from the list of variables and move it to the Numeric Variable -> Output Variable list box. Second, type in the name for the new variable and click on the Change button to replace the ? with this new variable name.

Slide 12: Assigning values to new variable
Next, click on the Old and New Values button to assign values to the new variable.

Slide 13: Preserving missing values
First, mark the System- or user-missing option button on the Old Value panel. Second, mark the System-missing option button on the New Value panel. Third, click on the Add button to include this recoding for the variable.
If we forget to explicitly assign missing values, cases with missing data will be recoded with a 0 and become part of the reference group.

Slide 14: Coding the married category
First, to recode the 1 = married category to the dummy variable, mark the Value option button and type a 1 in the text box on the Old Value panel. Second, mark the Value option button and type a 1 in the text box on the New Value panel. Third, click on the Add button to include this recoding for the variable.
This coding says: if they were originally in the married category for marital, they are assigned a value of 1 for the married dummy variable.

Slide 15: Coding the other categories
First, to identify subjects in the categories other than married, mark the All other values option button on the Old Value panel. Second, mark the Value option button and type a 0 in the text box on the New Value panel. Third, click on the Add button to include this recoding for the variable.
This coding says: if they were originally NOT in the married category for marital, they are assigned a value of 0 for the married dummy variable.
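The three recoding rules entered in the dialog (missing stays missing, 1 becomes 1, all other values become 0) can be mirrored in a short Python sketch to show why the missing-value rule matters. This is an illustration of the logic only, not SPSS syntax; the function name `recode_to_dummy`, the use of `None` for system-missing, and the user-missing code 9 are assumptions for this example.

```python
def recode_to_dummy(value, target, missing_codes=()):
    """Recode one original value into a 0/1 dummy, keeping missing as missing."""
    # Rule 1: system-missing (None here) or user-missing codes stay missing.
    if value is None or value in missing_codes:
        return None
    # Rule 2: the target category becomes 1.
    if value == target:
        return 1
    # Rule 3: all other values become 0.
    return 0

# Hypothetical cases: 1 = married ... 5 = never married; 9 is an assumed user-missing code.
marital = [1, 5, None, 3, 9, 1]
married = [recode_to_dummy(v, target=1, missing_codes={9}) for v in marital]
print(married)  # [1, 0, None, 0, None, 1]
```

If the first rule were dropped, the third and fifth cases would fall through to the "all other values" rule and be coded 0, silently joining the reference group.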

Slide 16: Completing the re-coding
When we have completed the coding for the new variable, click on the Continue button.

Slide 17: Completing the married variable
Click on the OK button to create the new variable in the data editor.

Slide 18: Variable and coding for widowed variable
Following the same steps, we create the dummy variable for subjects who were 2 = widowed on the original marital variable. The coding is similar to that for married subjects, except the category that was originally coded 2 = widowed is translated into a 1 on the new variable.

Slide 19: Variable and coding for divorced variable
Following the same steps, we create the dummy variable for subjects who were 3 = divorced on the original marital variable. The coding is similar to that for married subjects, except the category that was originally coded 3 = divorced is translated into a 1 on the new variable.

Slide 20: Variable and coding for separated variable
Following the same steps, we create the dummy variable for subjects who were 4 = separated on the original marital variable. The coding is similar to that for married subjects, except the category that was originally coded 4 = separated is translated into a 1 on the new variable.

Slide 21: Dummy-coded variables for married subjects
Subjects with a code value of 1 = married on the original marital variable now have a 1 for married and a 0 for the other new variables.

Slide 22: Dummy-coded variables for widowed subjects
Subjects with a code value of 2 = widowed on the original marital variable now have a 1 for widowed and a 0 for the other new variables.

Slide 23: Dummy-coded variables for divorced subjects
Subjects with a code value of 3 = divorced on the original marital variable now have a 1 for divorced and a 0 for the other new variables.

Slide 24: Dummy-coded variables for never married subjects
Subjects with a code value of 5 = never married on the original marital variable now have a 0 for all new variables. This was the reference category.
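The visual check of the data editor described in these last slides could also be done programmatically. The Python sketch below is one possible way to state the check, assuming the marital codes 1 to 5 from the coding scheme; the names `NEW_VARS`, `dummy_row`, and `is_consistent` are hypothetical.

```python
# Original marital code -> name of the dummy variable that should equal 1.
NEW_VARS = {1: "married", 2: "widowed", 3: "divorced", 4: "separatd"}
REFERENCE = 5  # never married: all dummy variables should be 0

def dummy_row(code):
    """Build the four dummy values for one original marital code."""
    return {name: int(code == c) for c, name in NEW_VARS.items()}

def is_consistent(code, row):
    """Check one case: exactly one 1 in the right place, or all 0 for the reference."""
    total = sum(row.values())
    if code == REFERENCE:
        return total == 0
    return total == 1 and row[NEW_VARS[code]] == 1

for code in [1, 2, 3, 4, 5]:
    row = dummy_row(code)
    print(code, row, is_consistent(code, row))
```

A case that fails the check, such as an original code of 2 with a 1 on married, would point to a recoding step entered with the wrong old value.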

