Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: Chow test Original citation: Dougherty, C. (2012) EC220 - Introduction.


1 Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: Chow test Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 5). [Teaching Resource] © 2012 The Author This version available at: http://learningresources.lse.ac.uk/131/ Available in LSE Learning Resources Online: May 2012 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/ http://learningresources.lse.ac.uk/

2 CHOW TEST 1 Sometimes in regression analysis there are two types of observation in the sample data.

3 CHOW TEST 2 If this is the case, it is sensible to investigate whether one regression model applies to both categories or whether you need separate ones for them. To do this, you can perform a Chow test.

4 CHOW TEST 3 We will illustrate it using the data for the 74 secondary schools in Shanghai. The scatter diagram plots the data on annual recurrent expenditure and number of students.

5 CHOW TEST 4 Here is the regression output when COST is regressed on N, making no distinction between the different types of school.

. reg COST N

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  1,    72) =   46.82
   Model |  5.7974e+11     1  5.7974e+11               Prob > F      =  0.0000
Residual |  8.9160e+11    72  1.2383e+10               R-squared     =  0.3940
---------+------------------------------               Adj R-squared =  0.3856
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      = 1.1e+05

------------------------------------------------------------------------------
    COST |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       N |   339.0432   49.55144      6.842   0.000       240.2642    437.8222
   _cons |    23953.3   27167.96      0.882   0.381      -30205.04    78111.65
------------------------------------------------------------------------------

6 CHOW TEST 5 This is the scatter diagram with the regression line.

7 CHOW TEST 6 Now we make a distinction between occupational schools and regular schools and run separate regressions for the two subsamples.

8 CHOW TEST 7 This is the regression output when COST is regressed on N using the subsample of 34 occupational schools.

. reg COST N if OCC==1

  Source |       SS       df       MS                  Number of obs =      34
---------+------------------------------               F(  1,    32) =   55.52
   Model |  6.0538e+11     1  6.0538e+11               Prob > F      =  0.0000
Residual |  3.4895e+11    32  1.0905e+10               R-squared     =  0.6344
---------+------------------------------               Adj R-squared =  0.6229
   Total |  9.5433e+11    33  2.8919e+10               Root MSE      = 1.0e+05

------------------------------------------------------------------------------
    COST |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       N |   436.7769   58.62085      7.451   0.000       317.3701    556.1836
   _cons |   47974.07   33879.03      1.416   0.166      -21035.26    116983.4
------------------------------------------------------------------------------

9 CHOW TEST 8 And this is the regression output when COST is regressed on N for the subsample of 40 regular schools.

. reg COST N if OCC==0

  Source |       SS       df       MS                  Number of obs =      40
---------+------------------------------               F(  1,    38) =   13.53
   Model |  4.3273e+10     1  4.3273e+10               Prob > F      =  0.0007
Residual |  1.2150e+11    38  3.1973e+09               R-squared     =  0.2626
---------+------------------------------               Adj R-squared =  0.2432
   Total |  1.6477e+11    39  4.2249e+09               Root MSE      =   56545

------------------------------------------------------------------------------
    COST |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       N |   152.2982   41.39782      3.679   0.001       68.49275    236.1037
   _cons |   51475.25   21599.14      2.383   0.022       7750.064    95200.43
------------------------------------------------------------------------------

10 CHOW TEST 9 Here are the regression lines for the two subsamples.

11 CHOW TEST 10 The regression line for the pooled sample (entire sample, making no distinction) is shown for comparison.

12 CHOW TEST 11 The diagram shows the residuals in the regression using only the occupational schools.

13 CHOW TEST 12 Now the corresponding residuals for the regression using the pooled sample are shown.

14 CHOW TEST 13 RSS = 3.49 x 10^11 (subsample regression), RSS = 5.55 x 10^11 (pooled regression). The two sets of residuals are isolated for comparison. RSS is smaller for the residuals from the subsample regression. This must be the case. Why? (Try to answer before continuing.)

15 CHOW TEST 14 The regression line for the subsample regression is located so as to minimize the sum of the squares of the residuals for the occupational school observations. This is the principle underlying OLS.

16 CHOW TEST 15 The regression line for the pooled sample is located to give the best overall fit for the sample as a whole, including the regular schools.

17 CHOW TEST 16 Its location is therefore a compromise between the best fit for the occupational school observations and the best fit for the regular school observations. Because it is a compromise, its fit will be inferior to that for the subsample regression.
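The argument on the last few slides can be checked numerically. The following is a minimal sketch in Python, not part of the original slideshow: the data are made up purely for illustration (they are not the Shanghai school data), and the helper names fit_line and rss are hypothetical. It fits an OLS line to one subsample and to the pooled sample, then compares the RSS of the two lines over the same subsample observations.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares of the given line over the observations (x, y)."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Synthetic illustration only: two groups with different cost functions.
n_a = [100, 200, 300, 400, 500]
cost_a = [95, 140, 170, 230, 265]   # steeper relationship (group A)
n_b = [100, 200, 300, 400, 500]
cost_b = [70, 80, 95, 100, 118]     # flatter relationship (group B)

line_a = fit_line(n_a, cost_a)                       # subsample regression
line_pooled = fit_line(n_a + n_b, cost_a + cost_b)   # pooled regression

rss_a_own = rss(n_a, cost_a, *line_a)
rss_a_pooled = rss(n_a, cost_a, *line_pooled)

print("Group A, own line:   ", round(rss_a_own, 2))
print("Group A, pooled line:", round(rss_a_pooled, 2))
```

Whatever data are used, the first figure cannot exceed the second, because the subsample line is by construction the line that minimizes RSS over those observations.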

18 CHOW TEST 17 Next we turn to the regular schools. Here are the residuals for the subsample regression.

19 CHOW TEST 18 And now those for the same observations in the pooled regression.

20 CHOW TEST 19 RSS = 1.22 x 10^11 (subsample regression), RSS = 3.36 x 10^11 (pooled regression). The two sets of residuals are shown for comparison. Again, RSS must be higher for the pooled sample regression.

21 CHOW TEST 20 The table summarizes the RSS data for the two types of school in the separate and pooled regressions.

RESIDUAL SUM OF SQUARES (x10^11)

Regression   Occupational    Regular         Total
Separate     3.49 (RSS_1)    1.22 (RSS_2)    4.71 (RSS_1 + RSS_2)
Pooled       5.55            3.36            8.91 (RSS_P)

22 CHOW TEST 21 The residual sums of squares for the separate regressions for the occupational and regular schools will be denoted RSS_1 and RSS_2, respectively.

23 CHOW TEST 22 Adding them together, we get the total residual sum of squares when separate regressions are run for the two subsamples.

24 CHOW TEST 23 We compare this total with RSS_P, the residual sum of squares from the pooled sample regression.

25 CHOW TEST 24 This is obtained directly from the original pooled regression. There is no need to calculate the occupational and regular components. We are interested only in the total.

26 CHOW TEST 25 We are interested in seeing whether there is a significant reduction in the total when we run separate regressions for the two subsamples.

27 CHOW TEST 26 The test statistic is the F statistic defined as shown.

$$F(k,\ n - 2k) = \frac{\text{overall reduction in RSS when separate regressions are run} \,/\, \text{cost in degrees of freedom}}{\text{total RSS remaining when separate regressions are run} \,/\, \text{degrees of freedom remaining}}$$

28 CHOW TEST 27 The first argument of the F statistic is k, the cost, in terms of degrees of freedom, of running separate regressions.

29 CHOW TEST 28 The cost is k because two sets of k parameters are estimated when separate regressions are run, instead of only one set with the pooled regression.

30 CHOW TEST 29 The second argument of the F statistic is n - 2k, the total number of degrees of freedom remaining when separate regressions are run.

31 CHOW TEST 30 There are n observations and k degrees of freedom are used up by each regression when separate regressions are run.

32 CHOW TEST 31 The numerator of the F statistic consists of the overall improvement in the fit on splitting the sample, divided by the cost in terms of degrees of freedom when separate regressions are run.

33 CHOW TEST 32 The denominator of the F statistic is the total RSS remaining after splitting the sample, divided by the number of degrees of freedom remaining.
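The definition of the statistic translates directly into a few lines of code. The following Python function is a sketch, not part of the original slideshow. The function name chow_f_statistic and its parameter names are hypothetical; the formula is the one described on the preceding slides, with numerator degrees of freedom k and denominator degrees of freedom n - 2k.

```python
def chow_f_statistic(rss_pooled, rss_1, rss_2, n, k):
    """Chow F statistic with (k, n - 2k) degrees of freedom.

    rss_pooled -- RSS from the single regression on the whole sample
    rss_1, rss_2 -- RSS from the two separate subsample regressions
    n -- total number of observations
    k -- number of parameters in the model (including the constant)
    """
    df_remaining = n - 2 * k
    if df_remaining <= 0:
        raise ValueError("need n > 2k to have degrees of freedom remaining")
    rss_split = rss_1 + rss_2
    # Numerator: reduction in RSS per degree of freedom used up by splitting.
    numerator = (rss_pooled - rss_split) / k
    # Denominator: RSS remaining per degree of freedom remaining.
    denominator = rss_split / df_remaining
    return numerator / denominator


# Hypothetical round numbers, chosen only to make the arithmetic easy to follow:
# reduction = 10 - 5 = 5, numerator = 5 / 2 = 2.5, denominator = 5 / 20 = 0.25.
example_f = chow_f_statistic(rss_pooled=10.0, rss_1=3.0, rss_2=2.0, n=24, k=2)
print(example_f)
```

The RSS arguments may be given in any common unit (for example, multiples of 10^11), since the unit cancels in the ratio.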

34 CHOW TEST 33 In the case of the school cost functions, the reduction in the residual sum of squares has already been tabulated: RSS_P = 8.91 and RSS_1 + RSS_2 = 4.71 (both x10^11).

35 CHOW TEST 34 There are only two parameters in the model, the constant and the coefficient of N, so the first argument of the F statistic is 2.

36 CHOW TEST 35 The residual sum of squares remaining after splitting the sample is the sum of RSS_1 and RSS_2.

37 CHOW TEST 36 There are 74 observations and so there are 70 degrees of freedom remaining after estimating two sets of parameters.

38 CHOW TEST 37 The F statistic is thus 31.2. The critical value of F(2,70) is 7.6 at the 0.1% significance level.

$$F(2,\ 70) = \frac{(8.91 - 4.71)/2}{4.71/70} = 31.2$$
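The numbers on this slide can be reproduced with a short calculation. The sketch below is not part of the original slideshow. It uses the rounded RSS values from the table (in units of 10^11) and, as an assumption worth flagging, a closed form for the F distribution that holds only when the numerator has 2 degrees of freedom: P(F > f) = (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. Inverting it gives the critical value without statistical tables.

```python
# RSS values from the slide table, in units of 10^11 (the unit cancels in F).
rss_1, rss_2, rss_p = 3.49, 1.22, 8.91
n, k = 74, 2

rss_split = rss_1 + rss_2
df_remaining = n - 2 * k                      # 74 - 4 = 70
f_stat = ((rss_p - rss_split) / k) / (rss_split / df_remaining)

# Special case k = 2 only: closed-form tail probability and critical value.
alpha = 0.001                                 # 0.1% significance level
f_crit = (df_remaining / 2) * (alpha ** (-2 / df_remaining) - 1)
p_value = (1 + 2 * f_stat / df_remaining) ** (-df_remaining / 2)

print(f"F({k},{df_remaining}) = {f_stat:.1f}")
print(f"critical value at 0.1% = {f_crit:.2f}")
print(f"reject pooled model: {f_stat > f_crit}")

# Same calculation with the unrounded residual sums of squares
# reported in the three regression outputs (also in units of 10^11).
f_unrounded = ((8.9160 - (3.4895 + 1.2150)) / k) / ((3.4895 + 1.2150) / df_remaining)
print(f"F from unrounded RSS = {f_unrounded:.1f}")
```

With the rounded table values the statistic is 31.2, as on the slide; the unrounded regression output gives a marginally larger figure, which makes no difference to the conclusion. For models with more than two parameters the closed form does not apply and F tables or a statistics library would be needed.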

39 CHOW TEST 38 The reduction in the residual sum of squares is therefore significant at the 0.1% level. We conclude that the pooled cost function is an inadequate specification and that we should run separate regressions for the two types of school.

40 Copyright Christopher Dougherty 2011. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 5.4 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/. Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse. 11.07.25

