Christopher Dougherty EC220 - Introduction to econometrics (chapter 7) Slideshow: weighted least squares and logarithmic regressions

1 Christopher Dougherty EC220 - Introduction to econometrics (chapter 7) Slideshow: weighted least squares and logarithmic regressions. Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 7). [Teaching Resource] © 2012 The Author. This version available at: http://learningresources.lse.ac.uk/133/ Available in LSE Learning Resources Online: May 2012. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/ http://learningresources.lse.ac.uk/

2 HETEROSCEDASTICITY: WEIGHTED AND LOGARITHMIC REGRESSIONS This sequence presents two methods for dealing with the problem of heteroscedasticity. We will start with the general case, where the variance of the distribution of the disturbance term in observation i is $\sigma_{u_i}^2$, not constant for all i.

3 If we knew $\sigma_{u_i}$ in each observation, we could derive a homoscedastic model by dividing the equation through by it.

4 The population variance of the disturbance term in the revised model is now equal to 1 in all observations, and so the disturbance term is homoscedastic.

5 In the revised model, we regress $Y'_i = Y_i/\sigma_{u_i}$ on $X'_i = X_i/\sigma_{u_i}$ and $H_i = 1/\sigma_{u_i}$. Note that there is no intercept in the revised model. $\beta_1$ becomes the slope coefficient of the artificial variable $1/\sigma_{u_i}$.
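Assuming the original model is written as $Y_i = \beta_1 + \beta_2 X_i + u_i$ (notation inferred from the references to $\beta_1$ and $\beta_2$ in the text), the transformation can be sketched as follows. The symbol $u'_i$ for the scaled disturbance term is introduced here for illustration.

```latex
\frac{Y_i}{\sigma_{u_i}}
  = \beta_1 \frac{1}{\sigma_{u_i}} + \beta_2 \frac{X_i}{\sigma_{u_i}} + \frac{u_i}{\sigma_{u_i}}
\quad\Longleftrightarrow\quad
Y'_i = \beta_1 H_i + \beta_2 X'_i + u'_i ,
\qquad
\operatorname{var}(u'_i) = \frac{\operatorname{var}(u_i)}{\sigma_{u_i}^2}
                         = \frac{\sigma_{u_i}^2}{\sigma_{u_i}^2} = 1 .
```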

6 The revised model is described as a weighted regression model because we are weighting observation i by a factor $1/\sigma_{u_i}$. Note that we are automatically giving the highest weights to the most reliable observations (those with the lowest values of $\sigma_{u_i}$).
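A minimal sketch of the weighted regression in plain Python, assuming the model Y = beta1 + beta2*X + u and a known standard deviation for each observation. The function name `weighted_fit` and the sample data are hypothetical. It solves the two normal equations of the no-intercept regression of Y/sigma on 1/sigma and X/sigma.

```python
def weighted_fit(x, y, sigma):
    """Estimate beta1 and beta2 in Y = beta1 + beta2*X + u by regressing
    Y/sigma on H = 1/sigma and X/sigma, with no intercept."""
    h = [1.0 / s for s in sigma]                 # artificial variable H
    xp = [xi / s for xi, s in zip(x, sigma)]     # X' = X / sigma
    yp = [yi / s for yi, s in zip(y, sigma)]     # Y' = Y / sigma

    # Sums of squares and cross-products for the normal equations
    shh = sum(a * a for a in h)
    shx = sum(a * b for a, b in zip(h, xp))
    sxx = sum(b * b for b in xp)
    shy = sum(a * c for a, c in zip(h, yp))
    sxy = sum(b * c for b, c in zip(xp, yp))

    det = shh * sxx - shx * shx
    beta1 = (shy * sxx - shx * sxy) / det        # coefficient of H
    beta2 = (shh * sxy - shx * shy) / det        # coefficient of X'
    return beta1, beta2


# Hypothetical data lying exactly on Y = 2 + 3X, so any weights recover it
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
sigma = [1.0, 2.0, 3.0, 4.0]
beta1, beta2 = weighted_fit(x, y, sigma)
```

With noisy data the weights matter: observations with a small sigma pull the fitted line more strongly than those with a large sigma.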

7 Of course, in practice we do not know the value of $\sigma_{u_i}$ in each observation. However, it may be reasonable to suppose that it is proportional to some measurable variable, $Z_i$.

8 If this is the case, we can make the model homoscedastic by dividing through by $Z_i$.

9 The disturbance term in the revised model has constant variance $\lambda^2$, where $\lambda$ is the constant of proportionality in $\sigma_{u_i} = \lambda Z_i$. We do not need to know the value of $\lambda^2$. The crucial point is that, by assumption, it is constant.
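As a sketch of this step, assume the original model is $Y_i = \beta_1 + \beta_2 X_i + u_i$ and write the proportionality assumption as $\sigma_{u_i} = \lambda Z_i$ (the symbol $\lambda$ for the unknown constant is an assumption made here). Dividing through by $Z_i$ then gives:

```latex
\frac{Y_i}{Z_i}
  = \beta_1 \frac{1}{Z_i} + \beta_2 \frac{X_i}{Z_i} + \frac{u_i}{Z_i},
\qquad
\operatorname{var}\!\left(\frac{u_i}{Z_i}\right)
  = \frac{\sigma_{u_i}^2}{Z_i^2}
  = \frac{\lambda^2 Z_i^2}{Z_i^2}
  = \lambda^2 .
```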

10 We will illustrate this procedure with the UNIDO data on manufacturing output and GDP. We will try scaling by population. A regression of manufacturing output per capita on GDP per capita is less likely to be subject to heteroscedasticity.

11 Here is the revised scatter diagram. Does it look homoscedastic? Actually, no. This is still a classic pattern of heteroscedasticity.

12 $RSS_2$ is much larger than $RSS_1$: $RSS_1$ = 5,378,000 and $RSS_2$ = 17,362,000.

13 However, the subsamples are small and high ratios can occur on a pure chance basis. The null hypothesis of homoscedasticity is only just rejected at the 5% level.
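The ratio behind this test can be checked with a short sketch. It assumes the two subsamples have the same number of degrees of freedom, so that the Goldfeld-Quandt statistic reduces to RSS_2 / RSS_1. The function name is hypothetical, and the critical value is not computed because the subsample sizes are not stated here.

```python
def goldfeld_quandt_ratio(rss_1, rss_2):
    """Conventional Goldfeld-Quandt statistic RSS_2 / RSS_1, assuming
    equal degrees of freedom in the two subsamples."""
    return rss_2 / rss_1


# RSS values quoted for the per capita regression
ratio = goldfeld_quandt_ratio(5_378_000, 17_362_000)
print(round(ratio, 2))  # 3.23
```

The statistic would be compared with the critical value of F for the appropriate degrees of freedom, which depend on the subsample sizes and the number of parameters.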

14 Often the X variable itself is a suitable scaling variable. After all, the Goldfeld–Quandt test assumes that the standard deviation of the disturbance term is proportional to it.

15 Note that when we scale through by it, the $\beta_2$ term becomes the intercept in the revised model.

16 It follows that when we interpret the regression results, the slope coefficient is an estimate of $\beta_1$ in the original model and the intercept is an estimate of $\beta_2$.
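Assuming the original model is $Y_i = \beta_1 + \beta_2 X_i + u_i$, scaling through by $X_i$ can be sketched as follows, which shows why the roles of the two parameters are exchanged:

```latex
\frac{Y_i}{X_i}
  = \beta_1 \frac{1}{X_i} + \beta_2 \frac{X_i}{X_i} + \frac{u_i}{X_i}
  = \beta_2 + \beta_1 \frac{1}{X_i} + \frac{u_i}{X_i} .
```

In the regression of $Y_i/X_i$ on $1/X_i$, the intercept therefore estimates $\beta_2$ and the slope coefficient estimates $\beta_1$.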

17 Here is the corresponding scatter diagram. Is there any evidence of heteroscedasticity?

18 No longer. The residual sums of squares for the two subsamples are almost identical, indeed closer than one would usually expect on a pure chance basis under the null hypothesis: $RSS_1$ = 0.065 and $RSS_2$ = 0.070.

19 As a consequence, the F statistic is not significant. The heteroscedasticity has been eliminated.

20 We will now consider an alternative approach to the problem. It is possible that the heteroscedasticity has been caused by an inappropriate mathematical specification. Suppose, in particular, that the true relationship is in fact logarithmic.

21 Here is the corresponding scatter diagram. No sign of heteroscedasticity.

22 We confirm this with the Goldfeld–Quandt test. In this case there is no point in calculating the conventional test statistic. $RSS_2$ (1.037) is smaller than $RSS_1$ (2.140), so it cannot be significantly greater than $RSS_1$.

23 In this situation we should test whether there is evidence that the standard deviation of the disturbance term is inversely proportional to the X variable. For this purpose, the F statistic is the inverse of the conventional one, $RSS_1/RSS_2$.

24 The null hypothesis of homoscedasticity is not rejected.
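The inverse statistic for the logarithmic regression can be sketched the same way, again assuming equal degrees of freedom in the two subsamples. The function name is hypothetical.

```python
def inverse_goldfeld_quandt_ratio(rss_1, rss_2):
    """Inverse Goldfeld-Quandt statistic RSS_1 / RSS_2, used when the
    standard deviation of the disturbance term may fall as X rises."""
    return rss_1 / rss_2


# RSS values quoted for the logarithmic regression
inverse_ratio = inverse_goldfeld_quandt_ratio(2.140, 1.037)
print(round(inverse_ratio, 2))  # 2.06
```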

25 Now an additive disturbance term in the logarithmic model is equivalent to a multiplicative one in the original model.

26 This means that the absolute size of the effect of the disturbance term is large for large values of the X variable and small for small ones, when the scatter diagram is redrawn with the variables in their original form.
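Assuming the logarithmic model is written as $\log Y_i = \beta_1 + \beta_2 \log X_i + u_i$ (notation assumed here), taking exponentials makes the multiplicative form explicit:

```latex
\log Y_i = \beta_1 + \beta_2 \log X_i + u_i
\quad\Longleftrightarrow\quad
Y_i = e^{\beta_1} X_i^{\beta_2} e^{u_i} .
```

A given value of $u_i$ scales $Y_i$ by the factor $e^{u_i}$, so for positive $\beta_2$ the absolute deviation from $e^{\beta_1} X_i^{\beta_2}$ grows with $X_i$.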

27 For example, Singapore and South Korea have relatively large manufacturing sectors, and Greece and Mexico relatively small ones. [Scatter diagram with South Korea, Mexico, Singapore and Greece labelled]

28 The variations for these countries are similar when plotted on the logarithmic scale, but those for South Korea and Mexico are much larger when the variables are plotted in natural units.

29 Here is a summary of the regressions using the four alternative specifications of the model.

30 The first regression suggests that, for every increase of $1 million in GDP, manufacturing output increases by $194,000. Thus, at the margin, manufacturing accounts for 0.19 of GDP. The intercept does not have any plausible meaning.

31 However, this regression was subject to severe heteroscedasticity. Although the estimate of the coefficient of GDP is unbiased, it is likely to be relatively inaccurate. Also, and this is a separate effect of heteroscedasticity, the standard errors, t tests and F test are invalid.

32 In the second regression, the estimate of the slope coefficient was a little lower. However, for this regression too, the null hypothesis of homoscedasticity was rejected, though only at the 5% level.

33 In the third regression the model was scaled through by GDP. As a consequence, the intercept became an estimator of the original slope coefficient, and vice versa.
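The exchange of roles can be illustrated with a small plain-Python sketch, assuming the original model Y = beta1 + beta2*X + u. The function names and the sample data are hypothetical, and the data are not the UNIDO figures.

```python
def ols(x, y):
    """Simple regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_scaled_by_x(x, y):
    """Regress Y/X on 1/X and map the results back to the original model:
    the intercept estimates beta2 and the slope estimates beta1."""
    inv_x = [1.0 / a for a in x]
    y_over_x = [b / a for a, b in zip(x, y)]
    intercept, slope = ols(inv_x, y_over_x)
    return {"beta1": slope, "beta2": intercept}


# Hypothetical data lying exactly on Y = 2 + 3X
estimates = fit_scaled_by_x([1.0, 2.0, 4.0, 5.0], [5.0, 8.0, 14.0, 17.0])
```

Here `estimates["beta2"]` recovers the original slope (3) from the intercept of the scaled regression, and `estimates["beta1"]` recovers the original intercept (2) from its slope.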

34 For this model the null hypothesis of homoscedasticity was not rejected. In principle, therefore, it should yield more accurate estimates of the coefficients than the first two, and we are able to perform tests.

35 For the logarithmic model also the null hypothesis of homoscedasticity was not rejected. So we have two models which survive the Goldfeld–Quandt test. Which do you prefer? Think about it.

36 You probably went for the logarithmic model, attracted by the high $R^2$. However, in this example, there is little to choose between the third and fourth models. Substantively, they have the same interpretation.

37 In the third model, 1/GDP has a low t statistic and appears to be an irrelevant variable. The model is telling us that manufacturing output, as a proportion of GDP, is constant. Because it is constant, $R^2$ is effectively 0.

38 The fourth model is telling us that the elasticity of manufacturing output with respect to GDP is equal to 1. In other words, manufacturing output increases proportionally with GDP and remains a constant proportion of it.

39 Converting the logarithmic equation back into natural units, you obtain the equation shown. Like the third equation, it implies that manufacturing output accounts for a little over 0.18 of GDP, at the margin.
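The conversion can be sketched in general form. Here $b_1$ and $b_2$ stand for the estimated intercept and elasticity of the logarithmic regression; these symbols are assumptions, and the fitted values themselves are not reproduced.

```latex
\log \hat{Y} = b_1 + b_2 \log X
\quad\Longrightarrow\quad
\hat{Y} = e^{b_1} X^{b_2},
\qquad\text{and if } b_2 = 1:\quad
\frac{\hat{Y}}{X} = e^{b_1} .
```

With an elasticity of 1, $e^{b_1}$ is the constant share of manufacturing output in GDP, which the text puts at a little over 0.18.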

40 Copyright Christopher Dougherty 2011. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 7.3 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/. Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse. 11.07.25

