Published by Jamison Lenton. Modified about 1 year ago.

1
Christopher Dougherty
EC220 - Introduction to econometrics (chapter 5)
Slideshow: the dummy variable trap
Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 5). [Teaching Resource] © 2012 The Author
This version available at: http://learningresources.lse.ac.uk/131/
Available in LSE Learning Resources Online: May 2012
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/
http://learningresources.lse.ac.uk/

2
THE DUMMY VARIABLE TRAP
Suppose that you have a regression model with Y depending on a set of ordinary variables X_2, ..., X_k and a qualitative variable.

3
Suppose that the qualitative variable has s categories. We choose one of them as the omitted category (without loss of generality, category 1) and define dummy variables D_2, ..., D_s for the rest.

4
What would happen if we did not drop the reference category, but instead defined a dummy variable D_1 for it and included it in the specification?

5
We would fall into the dummy variable trap. It would be impossible to fit the model as specified.

6
We will start with an intuitive explanation. The coefficient of each dummy variable represents the increase in the intercept relative to that for the reference category. But when every category has its own dummy variable, there is no reference category for such a comparison.

7
β_1 represents the fixed component of Y for the reference category. But again, there is no reference category. Thus the model does not have any logical interpretation.

8
Mathematically, we have a special case of exact multicollinearity. If there is no omitted category, there is an exact linear relationship between X_1 and the dummy variables. The table gives an example where there are 4 categories. In every observation, X_1 = D_1 + D_2 + D_3 + D_4.
Observation  Category  X_1  D_1  D_2  D_3  D_4
     1          4       1    0    0    0    1
     2          3       1    0    0    1    0
     3          1       1    1    0    0    0
     4          2       1    0    1    0    0
     5          2       1    0    1    0    0
     6          3       1    0    0    1    0
     7          1       1    1    0    0    0
     8          4       1    0    0    0    1
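The exact linear relationship in the table can be checked numerically. The following is a minimal sketch using only the Python standard library. The helper names `design_rows` and `rank` are illustrative assumptions, not part of the original slides. It builds the columns X_1, D_1, ..., D_4 from the category column of the table and compares the rank of the design matrix with and without the dummy for category 1.

```python
from fractions import Fraction

# Category of each of the 8 observations, copied from the table.
categories = [4, 3, 1, 2, 2, 3, 1, 4]

def design_rows(cats, n_categories=4, include_reference=True):
    """Return one row [X1, D...] per observation (hypothetical helper).

    X1 is always 1. If include_reference is False, the dummy for
    category 1 is omitted and only D2, ..., D4 are generated.
    """
    first = 1 if include_reference else 2
    return [[1] + [1 if c == j else 0 for j in range(first, n_categories + 1)]
            for c in cats]

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in r] for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

full = design_rows(categories, include_reference=True)      # X1, D1, D2, D3, D4
reduced = design_rows(categories, include_reference=False)  # X1, D2, D3, D4

# X1 equals D1 + D2 + D3 + D4 in every observation.
assert all(row[0] == sum(row[1:]) for row in full)

print(len(full[0]), rank(full))        # 5 4
print(len(reduced[0]), rank(reduced))  # 4 4
```

With all four dummies there are five columns but only rank 4, so one column is redundant. Once the dummy for category 1 is dropped, the number of columns equals the rank.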

9
X_1 is the variable whose coefficient is β_1. It is equal to 1 in all observations. Usually we do not write it explicitly because there is no need to do so.

10
If there is an exact linear relationship among a set of the variables, it is impossible in principle to estimate the separate coefficients of those variables. To understand this properly, one needs to use linear algebra.
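Without the full linear algebra, a short worked equation may still help show why the separate coefficients cannot be estimated. The sketch below assumes the notation δ_i for the coefficient of D_i (the symbol is an assumption, since the equations on the original slides are not reproduced in this transcript) and uses the relationship X_1 = D_1 + ... + D_s. For any constant c, shifting the intercept up by c and every dummy coefficient down by c leaves the fitted values unchanged:

```latex
\begin{align*}
(\beta_1 + c)\,X_1 + \sum_{i=1}^{s} (\delta_i - c)\,D_i
  &= \beta_1 X_1 + \sum_{i=1}^{s} \delta_i D_i
     + c\Bigl(X_1 - \sum_{i=1}^{s} D_i\Bigr) \\
  &= \beta_1 X_1 + \sum_{i=1}^{s} \delta_i D_i ,
  \qquad \text{since } X_1 = \sum_{i=1}^{s} D_i .
\end{align*}
```

Infinitely many sets of coefficients therefore fit the data equally well, and no criterion based on fit can choose among them.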

11
If you tried to run the regression anyway, the regression application should detect the problem and do one of two things. It may simply refuse to run the regression.

12
Alternatively, it may run the regression after dropping one of the variables in the linear relationship, in effect choosing the omitted category itself.

13
There is another way of avoiding the dummy variable trap. That is to drop the intercept (and with it X_1) while keeping the dummy variables for all the categories. There is no longer a problem because there is no longer an exact linear relationship linking the variables.

14
The coefficients of the dummy variables are now the intercepts in the relationship for the individual categories. For example, if the observation relates to category 2, D_2 = 1 and all the other dummy variables are equal to 0, and hence the relationship for that observation has the coefficient of D_2 as its intercept.
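A small numerical illustration of this interpretation follows. It is a sketch under simplifying assumptions: the Y values are hypothetical, the ordinary variables X_2, ..., X_k are left out so that each category intercept is just a category mean, and the categories are those of the 8-observation table. Because no observation belongs to two categories, X'X is diagonal and the normal equations can be solved one coefficient at a time.

```python
# Category of each of the 8 observations, copied from the table.
categories = [4, 3, 1, 2, 2, 3, 1, 4]
# Hypothetical values of Y, invented purely for illustration.
y = [10.0, 7.0, 3.0, 5.0, 6.0, 8.0, 4.0, 12.0]

# Full set of dummies D1..D4 and no intercept column.
D = [[1.0 if c == j else 0.0 for j in range(1, 5)] for c in categories]

# Normal equations (X'X) b = X'y for the no-intercept model.
xtx = [[sum(row[a] * row[b] for row in D) for b in range(4)] for a in range(4)]
xty = [sum(row[a] * yi for row, yi in zip(D, y)) for a in range(4)]

# Off-diagonal entries are zero because the dummies never overlap.
assert all(xtx[a][b] == 0 for a in range(4) for b in range(4) if a != b)

# Each coefficient is (sum of Y in the category) / (count in the category).
intercepts = [xty[a] / xtx[a][a] for a in range(4)]
print(intercepts)  # [3.5, 5.5, 7.5, 11.0]

# With an intercept and category 1 omitted, the same fit is expressed as
# an intercept for category 1 plus shifts for categories 2, 3 and 4.
shifts = [b - intercepts[0] for b in intercepts[1:]]
print(intercepts[0], shifts)  # 3.5 [2.0, 4.0, 7.5]
```

The two parameterizations describe the same fitted values. The no-intercept version reports each category's intercept directly, while the version with a reference category reports differences from category 1.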

15
Copyright Christopher Dougherty 2011. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.
The content of this slideshow comes from Section 5.2 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse.
11.07.25
