Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: Chow test Original citation: Dougherty, C. (2012) EC220 - Introduction.

Presentation transcript:

Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: Chow test. Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 5). [Teaching Resource] © 2012 The Author. This version available at: Available in LSE Learning Resources Online: May 2012. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms.

1 CHOW TEST Sometimes in regression analysis there are two types of observation in the sample data.

2 CHOW TEST If this is the case, it is sensible to investigate whether one regression model applies to both categories or whether you need separate ones for them. To do this, you can perform a Chow test.

3 CHOW TEST We will illustrate it using the data for the 74 secondary schools in Shanghai. The scatter diagram plots the data on annual recurrent expenditure and number of students.

4 CHOW TEST Here is the regression output when COST is regressed on N, making no distinction between the different types of school. [Stata output for the command . reg COST N on the pooled sample, with F(1, 72); the numerical results were not preserved in this transcript.]

5 CHOW TEST This is the scatter diagram with the regression line.

6 CHOW TEST Now we make a distinction between occupational schools and regular schools and run separate regressions for the two subsamples.

7 CHOW TEST This is the regression output when COST is regressed on N using the subsample of 34 occupational schools. [Stata output for the command . reg COST N if OCC==1, with F(1, 32); the numerical results were not preserved in this transcript.]

8 CHOW TEST And this is the regression output when COST is regressed on N for the subsample of 40 regular schools. [Stata output for the command . reg COST N if OCC==0, with F(1, 38); the numerical results were not preserved in this transcript.]

9 CHOW TEST Here are the regression lines for the two subsamples.

10 CHOW TEST The regression line for the pooled sample (the entire sample, making no distinction between school types) is shown for comparison.

11 CHOW TEST The diagram shows the residuals in the regression using only the occupational schools.

12 CHOW TEST Now the residuals for the same observations in the regression using the pooled sample are shown.

13 CHOW TEST The two sets of residuals are isolated for comparison: RSS = $3.49 \times 10^{11}$ for the subsample regression and RSS = $5.55 \times 10^{11}$ for the pooled regression. RSS is smaller for the residuals from the subsample regression. This must be the case. Why? (Try to answer before continuing.)

14 CHOW TEST The regression line for the subsample regression is located so as to minimize the sum of the squares of the residuals for the occupational school observations. This is the principle underlying OLS.

15 CHOW TEST The regression line for the pooled sample is located to give the best overall fit for the sample as a whole, including the regular schools.

16 CHOW TEST Its location is therefore a compromise between the best fit for the occupational school observations and the best fit for the regular school observations. Because it is a compromise, its fit to the occupational school observations cannot be better than that of the subsample regression, and in general it will be worse.
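The argument in slides 13 to 16 can be checked numerically. The following is a minimal sketch, not taken from the slides: the two groups and all of their values are made up for illustration (they are not the Shanghai school data), and the helper names ols_line and rss are hypothetical. It fits one line to group A alone and another to the pooled sample, then compares the RSS of the group A observations around each line.

```python
# Illustrative data only: these numbers are invented, not the Shanghai school data.
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares of the points (x, y) around a given line."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Two hypothetical groups with different cost functions.
n_a = [100, 200, 300, 400, 500]
cost_a = [60, 95, 150, 185, 240]
n_b = [150, 250, 350, 450, 550]
cost_b = [40, 52, 71, 80, 98]

# Line fitted to group A only, and line fitted to the pooled sample.
line_a = ols_line(n_a, cost_a)
line_pooled = ols_line(n_a + n_b, cost_a + cost_b)

# RSS of the group A observations around each of the two lines.
rss_a_separate = rss(n_a, cost_a, *line_a)
rss_a_pooled = rss(n_a, cost_a, *line_pooled)

print(f"Group A RSS, own line:    {rss_a_separate:.1f}")
print(f"Group A RSS, pooled line: {rss_a_pooled:.1f}")
```

Whatever data are used, the first figure cannot exceed the second, because the group A line is by construction the one that minimizes RSS for the group A observations.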

17 CHOW TEST Next we turn to the regular schools. Here are the residuals for the subsample regression.

18 CHOW TEST And now those for the same observations in the pooled regression.

19 CHOW TEST The two sets of residuals are shown for comparison: RSS = $12.2 \times 10^{10}$ for the subsample regression and RSS = $33.6 \times 10^{10}$ for the pooled regression. Again, RSS must be higher for the pooled sample regression.

20 CHOW TEST The table summarizes the RSS data for the two types of school in the separate and pooled regressions.

RESIDUAL SUM OF SQUARES (x 10^11)

Regression   Occupational    Regular         Total
Separate     3.49 (RSS_1)    1.22 (RSS_2)    4.71 (RSS_1 + RSS_2)
Pooled       5.55            3.36            8.91 (RSS_P)

21 CHOW TEST The residual sums of squares for the separate regressions for the occupational and regular schools will be denoted $RSS_1$ and $RSS_2$, respectively.

22 CHOW TEST Adding them together, we get the total residual sum of squares when separate regressions are run for the two subsamples.

23 CHOW TEST We compare this total with $RSS_P$, the residual sum of squares from the pooled sample regression.

24 CHOW TEST This is obtained directly from the original pooled regression. There is no need to calculate the occupational and regular components. We are interested only in the total.

25 CHOW TEST We are interested in seeing whether there is a significant reduction in the total when we run separate regressions for the two subsamples.

26 CHOW TEST The test statistic is the F statistic defined as shown.

$$F(k,\, n - 2k) = \frac{\text{overall reduction in RSS when separate regressions are run} \,/\, \text{cost in degrees of freedom}}{\text{total RSS remaining when separate regressions are run} \,/\, \text{degrees of freedom remaining}}$$

27 CHOW TEST The first argument of the F statistic is k, the cost, in terms of degrees of freedom, of running separate regressions.

28 CHOW TEST The cost is k because two sets of k parameters are estimated when separate regressions are run, instead of only one set with the pooled regression.

29 CHOW TEST The second argument of the F statistic is $n - 2k$, the total number of degrees of freedom remaining when separate regressions are run.

30 CHOW TEST There are n observations and k degrees of freedom are used up by each regression when separate regressions are run.

31 CHOW TEST The numerator of the F statistic consists of the overall improvement in the fit on splitting the sample, divided by the cost in terms of degrees of freedom when separate regressions are run.

32 CHOW TEST The denominator of the F statistic is the total RSS remaining after splitting the sample, divided by the number of degrees of freedom remaining.
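The verbal definition of the F statistic can also be written in symbols. This is only a restatement, using the notation of the RSS table ($RSS_P$ for the pooled regression, $RSS_1$ and $RSS_2$ for the two subsample regressions), with k parameters per regression and n observations in total.

```latex
F(k,\, n - 2k) = \frac{\bigl(RSS_P - (RSS_1 + RSS_2)\bigr) / k}{(RSS_1 + RSS_2) / (n - 2k)}
```

The numerator is the reduction in RSS per degree of freedom spent on the second set of parameters, and the denominator is the RSS remaining per degree of freedom left after both regressions are fitted.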

33 CHOW TEST In the case of the school cost functions, the reduction in the residual sum of squares has already been tabulated: $RSS_P = 8.91$ and $RSS_1 + RSS_2 = 4.71$ (both $\times 10^{11}$), a reduction of 4.20.

34 CHOW TEST There are only two parameters in the model, the constant and the coefficient of N, so the first argument of the F statistic is 2.

35 CHOW TEST The residual sum of squares remaining after splitting the sample is the sum of $RSS_1$ and $RSS_2$, which is 4.71.

36 CHOW TEST There are 74 observations and so there are 70 degrees of freedom remaining after estimating two sets of parameters.

37 CHOW TEST The F statistic is thus 31.2. The critical value of F(2,70) is 7.6 at the 0.1% significance level.

$$F(2,\, 70) = \frac{(8.91 - 4.71)/2}{4.71/70} = 31.2$$
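The arithmetic behind the test statistic can be reproduced in a few lines. This is a sketch using only figures quoted in the slides (the two RSS totals, k = 2, n = 74, and the 0.1% critical value of 7.6); the variable names are chosen here for illustration.

```python
# Values taken from the slides (RSS in units of 10^11; the units cancel in the ratio).
rss_separate = 4.71   # RSS_1 + RSS_2, total RSS from the two subsample regressions
rss_pooled = 8.91     # RSS_P, RSS from the pooled regression
k = 2                 # parameters per regression: constant and coefficient of N
n = 74                # number of schools in the sample

# Reduction in RSS per degree of freedom spent on the extra parameters.
numerator = (rss_pooled - rss_separate) / k
# Remaining RSS per remaining degree of freedom.
denominator = rss_separate / (n - 2 * k)
f_stat = numerator / denominator

critical_value = 7.6  # F(2, 70) at the 0.1% level, as quoted in the slides
print(f"F({k}, {n - 2 * k}) = {f_stat:.1f}")
print("Reject the pooled specification:", f_stat > critical_value)
```

The computed statistic is about 31.2, well above the quoted critical value.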

38 CHOW TEST The reduction in the residual sum of squares is therefore significant at the 0.1% level. We conclude that the pooled cost function is an inadequate specification and that we should run separate regressions for the two types of school.

Copyright Christopher Dougherty. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 5.4 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre. Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics or the University of London International Programmes distance learning course 20 Elements of Econometrics.