DUMMY VARIABLE CLASSIFICATION WITH TWO CATEGORIES Example: the cost of running a school depends on the number of pupils, but it also depends on whether the school is an occupational school.


1 DUMMY VARIABLE CLASSIFICATION WITH TWO CATEGORIES
Example: the cost of running a school depends on the number of pupils, but it also depends on whether the school is an occupational school. Dummy variables always have two values, 0 or 1. If OCC is equal to 0, the cost function becomes that for regular schools. If OCC is equal to 1, the cost function becomes that for occupational schools.
Combined equation: $COST = \beta_1 + \delta \, OCC + \beta_2 N + u$
Regular school ($OCC = 0$): $COST = \beta_1 + \beta_2 N + u$
Occupational school ($OCC = 1$): $COST = (\beta_1 + \delta) + \beta_2 N + u$
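The two-category model above can be sketched in a few lines of Python. This is only an illustration: the data are synthetic, the "true" coefficient values are made up, and the variable names simply mirror the slide (COST, N, OCC). The coefficient on OCC is the estimated extra overhead cost of an occupational school.

```python
import numpy as np

# Synthetic data only: all numbers below are invented for illustration.
rng = np.random.default_rng(0)
n = 200
N = rng.integers(100, 1000, size=n).astype(float)  # number of pupils
OCC = rng.integers(0, 2, size=n).astype(float)     # 1 = occupational, 0 = regular

beta1, delta, beta2 = 50_000.0, 30_000.0, 300.0    # assumed "true" values
u = rng.normal(0, 5_000, size=n)                   # disturbance term
COST = beta1 + delta * OCC + beta2 * N + u

# Ordinary least squares on [constant, OCC, N].
X = np.column_stack([np.ones(n), OCC, N])
coef, *_ = np.linalg.lstsq(X, COST, rcond=None)
b1_hat, delta_hat, b2_hat = coef

print(f"intercept (regular schools):      {b1_hat:10.1f}")
print(f"extra overhead (occupational):    {delta_hat:10.1f}")
print(f"marginal cost per pupil:          {b2_hat:10.1f}")
```

Setting OCC to 0 or 1 in the fitted equation gives two parallel cost lines that differ only in their intercepts.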

2 DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES
Now the qualitative variable has four categories. The standard procedure is to choose one category as the reference category and to define dummy variables for each of the others. Note: you must leave out the reference category, otherwise your model will be perfectly collinear!
Combined equation: $COST = \beta_1 + \delta_T \, TECH + \delta_W \, WORKER + \delta_V \, VOC + \beta_2 N + u$
General school ($TECH = WORKER = VOC = 0$): $COST = \beta_1 + \beta_2 N + u$
Technical school ($TECH = 1$; $WORKER = VOC = 0$): $COST = (\beta_1 + \delta_T) + \beta_2 N + u$
Skilled workers' school ($WORKER = 1$; $TECH = VOC = 0$): $COST = (\beta_1 + \delta_W) + \beta_2 N + u$
Vocational school ($VOC = 1$; $TECH = WORKER = 0$): $COST = (\beta_1 + \delta_V) + \beta_2 N + u$
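The warning about perfect collinearity can be checked numerically. The sketch below uses a tiny made-up sample and a hypothetical helper `dummy`; it compares the rank of the design matrix with and without a dummy for the reference category (called GEN here, a name not used in the slides).

```python
import numpy as np

# Tiny invented sample: two schools of each type.
school_type = np.array(["general", "technical", "workers", "vocational",
                        "general", "technical", "workers", "vocational"])
N = np.array([300.0, 450.0, 500.0, 620.0, 710.0, 380.0, 540.0, 800.0])

def dummy(category):
    """Return a 0/1 indicator for the given school type."""
    return (school_type == category).astype(float)

const = np.ones(len(N))
GEN, TECH = dummy("general"), dummy("technical")
WORKER, VOC = dummy("workers"), dummy("vocational")

# Reference category (general) omitted: 5 columns, all independent.
X_ok = np.column_stack([const, TECH, WORKER, VOC, N])
# Reference category included: GEN + TECH + WORKER + VOC equals the constant.
X_trap = np.column_stack([const, GEN, TECH, WORKER, VOC, N])

rank_ok = np.linalg.matrix_rank(X_ok)
rank_trap = np.linalg.matrix_rank(X_trap)
print(X_ok.shape[1], rank_ok)      # columns and rank agree
print(X_trap.shape[1], rank_trap)  # one more column, same rank
```

With all four dummies plus the constant, the matrix has six columns but still only rank five, so the coefficients cannot be estimated uniquely.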

3 DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES
[Figure: COST plotted against N, with four parallel lines for general, technical, skilled workers', and vocational schools; the intercepts are $\beta_1$, $\beta_1 + \delta_T$, $\beta_1 + \delta_W$, and $\beta_1 + \delta_V$.]
The diagram illustrates the model graphically. The $\delta$ coefficients are the extra overhead costs of running technical, skilled workers', and vocational schools, relative to the overhead cost of general schools.

4 DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES
We chose general academic schools as the reference (omitted) category and defined dummy variables for the other categories. This means that the coefficients, and their t tests, compare each of the other types of school directly with general schools only, and not with each other.

5 DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES
However, suppose that we were interested in testing whether the overhead costs of skilled workers' schools were different from those of the other types of school. How could we do this? It is simplest to re-run the regression making skilled workers' schools the reference category.
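Re-running the regression with a different reference category can be sketched as follows. The data are synthetic and the helper `fit` is hypothetical; school types are coded 0 to 3 purely for convenience. The point is that changing the reference category only re-expresses the same fit: each new dummy coefficient is a difference of old ones, but it now comes with its own standard error and t test when a regression package is used.

```python
import numpy as np

# Synthetic data; overhead figures and slope are invented.
rng = np.random.default_rng(1)
n = 120
types = rng.integers(0, 4, size=n)  # 0 general, 1 technical, 2 workers, 3 vocational
N = rng.uniform(100, 1000, size=n)
overhead = np.array([40_000.0, 70_000.0, 90_000.0, 60_000.0])
COST = overhead[types] + 250.0 * N + rng.normal(0, 4_000, size=n)

def fit(reference):
    """OLS of COST on a constant, dummies for every type except `reference`, and N."""
    others = [k for k in range(4) if k != reference]
    dummies = [(types == k).astype(float) for k in others]
    X = np.column_stack([np.ones(n)] + dummies + [N])
    coef, *_ = np.linalg.lstsq(X, COST, rcond=None)
    return dict(zip(["const"] + others + ["N"], coef))

ref_general = fit(0)  # general schools as reference
ref_workers = fit(2)  # skilled workers' schools as reference

# Technical vs workers, read directly from the second regression ...
print(ref_workers[1])
# ... equals the difference of two coefficients from the first.
print(ref_general[1] - ref_general[2])
```

The slope on N and the fitted values are identical in both regressions; only the intercept and the dummy coefficients are relabelled.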

6 TWO SETS OF DUMMY VARIABLES
The explanatory variables in a regression model may include multiple sets of dummy variables. Now you need to think about every combination, and the reference category is the one in which all dummy variables are zero.
Combined equation: $COST = \beta_1 + \delta \, OCC + \varepsilon \, RES + \beta_2 N + u$
Regular, nonresidential ($OCC = RES = 0$): $COST = \beta_1 + \beta_2 N + u$
Regular, residential ($OCC = 0$; $RES = 1$): $COST = (\beta_1 + \varepsilon) + \beta_2 N + u$
Occupational, nonresidential ($OCC = 1$; $RES = 0$): $COST = (\beta_1 + \delta) + \beta_2 N + u$
Occupational, residential ($OCC = RES = 1$): $COST = (\beta_1 + \delta + \varepsilon) + \beta_2 N + u$

7 TWO SETS OF DUMMY VARIABLES
Combined equation: $COST = \beta_1 + \delta \, OCC + \varepsilon \, RES + \beta_2 N + u$
In the case of a non-residential occupational school, RES is 0 and OCC is 1, so the overhead cost increases by $\delta$. If the school is both occupational and residential, it increases by $(\delta + \varepsilon)$.

8 TWO SETS OF DUMMY VARIABLES
[Figure: COST plotted against N, with four parallel lines: regular nonresidential (intercept $\beta_1$), regular residential ($\beta_1 + \varepsilon$), occupational nonresidential ($\beta_1 + \delta$), and occupational residential ($\beta_1 + \delta + \varepsilon$).]
The diagram illustrates the model graphically. Note that the effects of the different components of the model are assumed to be separate and additive in this specification. In particular, we are assuming that the extra overhead cost of a residential school is the same for regular and occupational schools: there is no interaction effect.
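The additive structure can be made concrete with a short sketch. The data are synthetic, the coefficient values are invented, and the names `d_occ` and `d_res` are illustrative labels for the coefficients on OCC and RES. The four intercepts are built from the three estimated terms, so the residential premium is forced to be the same for regular and occupational schools.

```python
import numpy as np

# Synthetic data; all numerical values are made up.
rng = np.random.default_rng(2)
n = 300
N = rng.uniform(100, 1000, size=n)
OCC = rng.integers(0, 2, size=n).astype(float)  # 1 = occupational
RES = rng.integers(0, 2, size=n).astype(float)  # 1 = residential
COST = (50_000 + 30_000 * OCC + 20_000 * RES + 300 * N
        + rng.normal(0, 5_000, size=n))

X = np.column_stack([np.ones(n), OCC, RES, N])
coef, *_ = np.linalg.lstsq(X, COST, rcond=None)
b1, d_occ, d_res, b2 = coef

# Intercept of the cost line for each (OCC, RES) combination.
intercepts = {(occ, res): b1 + d_occ * occ + d_res * res
              for occ in (0, 1) for res in (0, 1)}

for (occ, res), value in intercepts.items():
    print(f"OCC={occ}, RES={res}: intercept = {value:10.1f}")
```

Capturing a residential premium that differs between the two kinds of school would require an extra term, the product of OCC and RES.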

9 SLOPE DUMMY VARIABLES The specification of the model incorporates the assumption that the marginal cost per student is the same for occupational and regular schools. Hence the cost functions have the same slope: the same coefficient on N. This is a restriction we have placed on the model.

10 SLOPE DUMMY VARIABLES This is not a realistic assumption. Occupational schools incur expenditure on training materials that is related to the number of students. Also, the staff-student ratio has to be higher in occupational schools.

11 SLOPE DUMMY VARIABLES Looking at the scatter diagram, you can see that the cost function for the occupational schools should be steeper, and that for the regular schools should be flatter. The two lines should have different slopes.

12 SLOPE DUMMY VARIABLES
We will relax the assumption of the same marginal cost by introducing what is known as a slope dummy variable. This is NOCC, defined as the product of N and OCC. For example, in the case of an occupational school, OCC is equal to 1 and NOCC is equal to N. The equation simplifies as shown.
Combined equation: $COST = \beta_1 + \delta \, OCC + \beta_2 N + \lambda \, NOCC + u$
Regular school ($OCC = NOCC = 0$): $COST = \beta_1 + \beta_2 N + u$
Occupational school ($OCC = 1$; $NOCC = N$): $COST = (\beta_1 + \delta) + (\beta_2 + \lambda) N + u$
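A minimal sketch of the slope dummy, again on synthetic data with invented coefficient values. NOCC is constructed as the product of N and OCC, exactly as defined above; the name `lam` stands for the coefficient on NOCC.

```python
import numpy as np

# Synthetic data; all numerical values are made up.
rng = np.random.default_rng(3)
n = 200
N = rng.uniform(100, 1000, size=n)
OCC = rng.integers(0, 2, size=n).astype(float)  # 1 = occupational
NOCC = N * OCC                                  # slope dummy: N for occupational, 0 otherwise
COST = (50_000 + 10_000 * OCC + 200 * N + 150 * NOCC
        + rng.normal(0, 5_000, size=n))

X = np.column_stack([np.ones(n), OCC, N, NOCC])
coef, *_ = np.linalg.lstsq(X, COST, rcond=None)
b1, delta, b2, lam = coef

# Marginal cost per pupil for each type of school.
slope_regular = b2
slope_occupational = b2 + lam

print(f"regular:      intercept {b1:10.1f}, slope {slope_regular:7.2f}")
print(f"occupational: intercept {b1 + delta:10.1f}, slope {slope_occupational:7.2f}")
```

Because the model now has both an intercept dummy and a slope dummy, the two fitted lines are free to differ in both respects.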

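13 SLOPE DUMMY VARIABLES
[Figure: COST plotted against N, with two lines: regular schools (intercept $\beta_1$) and occupational schools (intercept $\beta_1 + \delta$, steeper slope).]
The diagram illustrates the model graphically.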

14 INTERACTING DUMMY VARIABLES
If we interact dummy variables, we get new dummy variables, but we must interpret carefully. The reference category is obtained by setting all dummies equal to zero. Then write down the earnings function for each subgroup separately to make the effects of various coefficients clear. Here F is 1 for females, W is 1 for whites, and FW is their product.
Combined equation: $LGEARN = \beta_1 + \beta_2 S + \delta_F F + \delta_W W + \delta_{FW} FW + u$
Black male ($F = W = 0$): $LGEARN = \beta_1 + \beta_2 S + u$
White male ($F = 0$; $W = 1$): $LGEARN = \beta_1 + \beta_2 S + \delta_W + u$
Black female ($F = 1$; $W = 0$): $LGEARN = \beta_1 + \beta_2 S + \delta_F + u$
White female ($F = W = 1$): $LGEARN = \beta_1 + \beta_2 S + \delta_F + \delta_W + \delta_{FW} + u$
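Writing out the function for each subgroup can also be done mechanically. In the sketch below the data are synthetic and the coefficient values are invented, so they carry no empirical meaning; the names `d_f`, `d_w`, and `d_fw` are illustrative labels for the coefficients on F, W, and FW.

```python
import numpy as np

# Synthetic data; the generating coefficients are made up for illustration.
rng = np.random.default_rng(4)
n = 400
S = rng.integers(8, 21, size=n).astype(float)  # years of schooling
F = rng.integers(0, 2, size=n).astype(float)   # 1 = female
W = rng.integers(0, 2, size=n).astype(float)   # 1 = white
FW = F * W                                     # interaction dummy
LGEARN = (1.0 + 0.10 * S - 0.20 * F + 0.15 * W - 0.10 * FW
          + rng.normal(0, 0.3, size=n))

X = np.column_stack([np.ones(n), S, F, W, FW])
coef, *_ = np.linalg.lstsq(X, LGEARN, rcond=None)
b1, b2, d_f, d_w, d_fw = coef

# Shift in the intercept relative to the reference category (F = W = 0).
shift = {
    "black male": 0.0,
    "white male": d_w,
    "black female": d_f,
    "white female": d_f + d_w + d_fw,
}
for group, value in shift.items():
    print(f"{group:13s} {value:+.3f}")
```

The coefficient on FW is therefore not the effect of being a white female as such: it is how much the effect of W differs between females and males (equivalently, how much the effect of F differs between whites and blacks).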

15 Copyright Christopher Dougherty 2000–2006. This slideshow may be freely copied for personal use. 24.06.06

