USING DUMMY VARIABLES IN REGRESSION MODELS

Qualitative Variables
Qualitative variables can be introduced into regression models using dummy variables.
Dummy variables can take on only two values: 0 or 1.
Suppose you feel a person’s income may be affected by his/her gender.
  – X = 1 (male), X = 0 (female), or vice versa
  – Add a gender column of 1’s and 0’s to the model

Multiple Values
It is felt that the starting salary for a business school graduate is affected by whether the graduate majored in MIS, accounting, finance, or another business discipline (“Other”).
  – Use 3 dummy variables to represent the business disciplines: x1 = 1 for MIS, x2 = 1 for Accounting, x3 = 1 for Finance (each is 0 otherwise)
  – Use k-1 dummy variables if there are k choices for the qualitative variable
If the entries for x1, x2, and x3 are all 0, the graduate’s major is “Other”.
Never have more than one “1” among x1, x2, and x3 for the same graduate.

Example
Suppose Bill is a business graduate who majored in accounting and received a starting salary of $27,000. Ellen is a second business graduate who majored in marketing (“Other”) and received a starting salary of $29,000.
  – The corresponding values for y, x1, x2, and x3 for these graduates would be:
      Bill:   y = 27,000   x1 = 0   x2 = 1   x3 = 0
      Ellen:  y = 29,000   x1 = 0   x2 = 0   x3 = 0
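The k-1 coding for the major variable can be sketched in a few lines of Python. This is only an illustration: the helper name encode_major and the tuple layout (x1, x2, x3) are assumptions made here, not part of the slides.

```python
# Hypothetical helper for illustration; the name and tuple layout are assumptions.
# k = 4 majors, so k - 1 = 3 dummies. "Other" is the baseline (all zeros).
MAJOR_DUMMIES = ("MIS", "Accounting", "Finance")


def encode_major(major):
    """Return (x1, x2, x3); any major not listed above maps to (0, 0, 0)."""
    return tuple(1 if major == name else 0 for name in MAJOR_DUMMIES)


bill = encode_major("Accounting")   # Bill majored in accounting
ellen = encode_major("Marketing")   # Ellen's major falls under "Other"
print("Bill: ", bill)
print("Ellen:", ellen)
```

Because each graduate matches at most one name in MAJOR_DUMMIES, a row can never contain more than one 1.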

Models with both Quantitative and Qualitative Variables
Many models include both quantitative and qualitative variables.
Interpretation of the coefficient of a dummy variable x: how y changes when x goes from 0 to 1, with the other variables held fixed.
There is no “in-between” interpretation for the dummy variable x.
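The 0-to-1 interpretation can be made concrete with a small worked case. As an assumption for illustration, take a model with one quantitative variable x_1 and one dummy variable x_2; the symbols \beta_0, \beta_1, \beta_2 are notation introduced here, not taken from the slides.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 \\
x_2 = 0:\quad y &= \beta_0 + \beta_1 x_1 \\
x_2 = 1:\quad y &= (\beta_0 + \beta_2) + \beta_1 x_1
\end{align*}
```

For any fixed x_1 the two cases differ by exactly \beta_2, so under this model the dummy shifts the intercept and leaves the slope on x_1 unchanged.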

Excel Example
It is conjectured that starting salaries for business school graduates are a function of the major (MIS, Accounting, Finance, Other), gender (Male, Female), and college grade point average. A sample of 20 students is taken.
  – Use 3 dummy variables for the 4 choices of major
  – Use 1 dummy variable for the 2 choices of gender
  – GPA is a quantitative variable
Use Excel’s IF function to translate the qualitative responses into 0’s and 1’s.

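B2: =IF(I2="MIS",1,0)
C2: =IF(I2="Accounting",1,0)
D2: =IF(I2="Finance",1,0)
E2: =IF(J2="Male",1,0)
Drag cells B2:E2 down to B21:E21 to fill in all 20 students.
The X Range is contiguous: the independent-variable columns sit side by side.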

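Regression Equation
Salary = b0 + b1(MIS) + b2(Accounting) + b3(Finance) + b4(Male) + b5(GPA)
where b0 through b5 are the coefficients estimated by the Excel regression; the estimate for Male is negative.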
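A model of this form (an intercept, three major dummies, one gender dummy, and GPA) is fitted by ordinary least squares. The following is a minimal, dependency-free Python sketch of that fit, not the Excel procedure itself. Everything in it is an assumption for illustration: the eight student records are made up (they are not the 20-student sample), the "true" coefficients are arbitrary values used only to generate noise-free salaries, and the function names are hypothetical.

```python
# Made-up records for illustration only: (major, gender, GPA).
students = [
    ("MIS", "Male", 3.2), ("Accounting", "Female", 3.6),
    ("Finance", "Male", 2.9), ("Other", "Female", 3.1),
    ("MIS", "Female", 3.8), ("Accounting", "Male", 2.7),
    ("Finance", "Female", 3.4), ("Other", "Male", 3.0),
]


def design_row(major, gender, gpa):
    """Intercept, three major dummies, one gender dummy, then GPA."""
    return [
        1.0,
        1.0 if major == "MIS" else 0.0,
        1.0 if major == "Accounting" else 0.0,
        1.0 if major == "Finance" else 0.0,
        1.0 if gender == "Male" else 0.0,
        gpa,
    ]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


X = [design_row(*s) for s in students]

# Arbitrary assumed coefficients, used only to generate noise-free salaries.
true_coefs = [20000.0, 3000.0, 2000.0, 1500.0, -500.0, 2500.0]
y = [sum(c * v for c, v in zip(true_coefs, row)) for row in X]

coefs = least_squares(X, y)
names = ["Intercept", "MIS", "Accounting", "Finance", "Male", "GPA"]
for name, b in zip(names, coefs):
    print(f"{name:>10}: {b:10.2f}")
```

Because the salaries are generated without noise, the fit recovers the assumed coefficients up to rounding error; with real data the estimates would carry sampling error. In practice a library routine such as NumPy's numpy.linalg.lstsq would normally replace the hand-written solver.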

Review
Dummy variables are regression variables that can take on only the values 0 or 1.
Multiple dummy variables can be used to represent the different values of a qualitative variable.
Use one fewer dummy variable than the number of possible values of the qualitative variable; all 0’s represent the remaining value.
Dummy variables can be used alongside quantitative variables in regression models.
Excel: use the IF function to create the dummy columns.

