EC220 - Introduction to econometrics (chapter 3)

Christopher Dougherty. EC220 - Introduction to econometrics (chapter 3). Slideshow: F tests in a multiple regression model. Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 3). [Teaching Resource] © 2012 The Author. This version available at: Available in LSE Learning Resources Online: May 2012. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms.

F TESTS OF GOODNESS OF FIT
This sequence describes two F tests of goodness of fit in a multiple regression model. The first relates to the goodness of fit of the equation as a whole.

We will consider the general case where there are k – 1 explanatory variables. For the F test of goodness of fit of the equation as a whole, the null hypothesis, in words, is that the model has no explanatory power at all.

Of course we hope to reject it and conclude that the model does have some explanatory power.

The model will have no explanatory power if it turns out that Y is unrelated to any of the explanatory variables. Mathematically, therefore, the null hypothesis is that all the coefficients β2, ..., βk are zero.

The alternative hypothesis is that at least one of these β coefficients is different from zero.
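Written out symbolically, the model and the two hypotheses described in words above might be stated as follows. This is a sketch reconstructed from the verbal description, writing the coefficients as β; the intercept label β1 and the disturbance term u are assumed notation.

```latex
\begin{align*}
Y &= \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k + u \\
H_0 &: \beta_2 = \dots = \beta_k = 0 \\
H_1 &: \text{at least one } \beta_j \neq 0, \quad j = 2, \dots, k
\end{align*}
```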

In the multiple regression model there is a difference between the roles of the F and t tests. The F test tests the joint explanatory power of the variables, while the t tests test their explanatory power individually.

In the simple regression model the F test was equivalent to the (two-sided) t test on the slope coefficient because the ‘group’ consisted of just one variable.
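The equivalence in the simple regression case can be checked numerically. The sketch below fits a simple regression by hand and compares the F statistic with the square of the t statistic on the slope. The data and the function name are made up for illustration and are not from the course data set.

```python
import math

def simple_regression_f_and_t(x, y):
    """Return (F statistic, t statistic on the slope) for Y = b1 + b2*X + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx                      # OLS slope
    b1 = y_bar - b2 * x_bar             # OLS intercept
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    ess = tss - rss
    # F test: one explanatory variable, n - 2 degrees of freedom remaining
    f_stat = (ess / 1) / (rss / (n - 2))
    # t test on the slope: estimate divided by its standard error
    se_b2 = math.sqrt((rss / (n - 2)) / sxx)
    t_stat = b2 / se_b2
    return f_stat, t_stat

# Hypothetical illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.3, 2.9, 4.1, 4.8, 6.2, 6.6]
f_stat, t_stat = simple_regression_f_and_t(x, y)
print(f_stat, t_stat ** 2)  # the two values agree up to rounding error
```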

The F statistic for the test was defined in the last sequence in Chapter 2. ESS is the explained sum of squares and RSS is the residual sum of squares.

It can be expressed in terms of R² by dividing the numerator and denominator by TSS, the total sum of squares.

ESS / TSS is the definition of R². RSS / TSS is equal to (1 – R²). (See the last sequence in Chapter 2.)
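The steps just described can be written out as a short derivation. This is the standard form of the statistic for a model with k parameters fitted to n observations, given here as a reconstruction rather than a copy of the original slide.

```latex
\begin{align*}
F(k-1,\, n-k)
  &= \frac{ESS / (k-1)}{RSS / (n-k)} \\
  &= \frac{(ESS / TSS) / (k-1)}{(RSS / TSS) / (n-k)} \\
  &= \frac{R^2 / (k-1)}{(1 - R^2) / (n-k)}
\end{align*}
```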

The educational attainment model will be used as an example. We will suppose that S depends on ASVABC, the ability score, and on SM and SF, the highest grade completed by the mother and father of the respondent, respectively.

The null hypothesis for the F test of goodness of fit is that all three slope coefficients are equal to zero. The alternative hypothesis is that at least one of them is non-zero.
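As a rough illustration of how the statistic for this test would be computed, the sketch below evaluates F from sums of squares and from R². It assumes n = 540 observations and k = 4 parameters, as in the educational attainment example. The R² and TSS values are hypothetical placeholders, not results from the actual regression, and the function names are invented for this example.

```python
def f_from_sums_of_squares(ess, rss, n, k):
    """F(k-1, n-k) for the equation as a whole, from ESS and RSS."""
    return (ess / (k - 1)) / (rss / (n - k))

def f_from_r2(r2, n, k):
    """The same statistic expressed in terms of R-squared."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

n, k = 540, 4                 # observations and parameters (intercept + 3 slopes)
r2_hypothetical = 0.30        # placeholder value, NOT the actual regression result
tss_hypothetical = 1000.0     # placeholder total sum of squares

ess = r2_hypothetical * tss_hypothetical
rss = tss_hypothetical - ess

f_a = f_from_sums_of_squares(ess, rss, n, k)
f_b = f_from_r2(r2_hypothetical, n, k)
print(f_a, f_b)  # both forms give the same F(3, 536) statistic
```

Whatever the value of TSS, the two forms agree, because TSS cancels when the numerator and denominator are both divided by it.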

We now come to the other F test of goodness of fit. This is a test of the joint explanatory power of a group of variables when they are added to a regression model.

For example, in the original specification, Y may be written as a simple function of X2. In the second, we add X3 and X4.

H0: β3 = β4 = 0
H1: β3 ≠ 0 or β4 ≠ 0 or both β3 and β4 ≠ 0

The null hypothesis for the F test is that neither X3 nor X4 belongs in the model. The alternative hypothesis is that at least one of them does, perhaps both.

F(cost in d.f., d.f. remaining) = (reduction in RSS / cost in d.f.) / (RSS remaining / degrees of freedom remaining)

For this F test, and for several others which we will encounter, it is useful to think of the F statistic as having the structure indicated above.

The ‘reduction in RSS’ is the reduction when the change is made, in this case, when the group of new variables is added.

The ‘cost in d.f.’ is the reduction in the number of degrees of freedom remaining after making the change. In the present case it is equal to the number of new variables added, because that number of new parameters are estimated.

(Remember that the number of degrees of freedom in a regression equation is the number of observations, less the number of parameters estimated. In this example, it would fall from n – 2 to n – 4 when X3 and X4 are added.)

The ‘RSS remaining’ is the residual sum of squares after making the change.

The ‘degrees of freedom remaining’ is the number of degrees of freedom remaining after making the change.
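The four components can be combined in a few lines of code. The sketch below is one possible way to organize the calculation. The function name, argument names, and the numbers in the usage example are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def marginal_f(rss_before, rss_after, n, k_before, k_after):
    """F statistic for the joint explanatory power of a group of added variables.

    rss_before / rss_after: residual sums of squares before and after the change
    n: number of observations
    k_before / k_after: number of parameters estimated before and after the change
    """
    cost_df = k_after - k_before          # cost in d.f. = number of new parameters
    df_remaining = n - k_after            # d.f. remaining after the change
    reduction_in_rss = rss_before - rss_after
    f_stat = (reduction_in_rss / cost_df) / (rss_after / df_remaining)
    return f_stat, cost_df, df_remaining

# Hypothetical numbers: adding X3 and X4 to a model that already contains X2
f_stat, cost_df, df_remaining = marginal_f(
    rss_before=100.0, rss_after=80.0, n=20, k_before=2, k_after=4
)
print(f_stat, cost_df, df_remaining)
```

The resulting statistic would be compared with the critical value of F(cost_df, df_remaining) at the chosen significance level.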

The improvement in the fit on adding the parental variables is the reduction in the residual sum of squares.

The cost is 2 degrees of freedom because 2 additional parameters have been estimated.

The RSS remaining is the residual sum of squares after adding SM and SF.

The number of degrees of freedom remaining is n – k, that is, 540 – 4 = 536.
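Putting these pieces together, the statistic for this example can be written symbolically as below. The labels RSS_1 (before adding SM and SF) and RSS_2 (after adding them) are notation assumed here for illustration. The numerical values of the two sums of squares are not given in this transcript, so none are substituted.

```latex
F(2,\, 536) = \frac{(RSS_1 - RSS_2) / 2}{RSS_2 / 536}
```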

The F statistic is obtained by substituting these four quantities into the expression above. It exceeds the critical value of F(2,500) at the 0.1% level. The critical value of F(2,536) must be lower, so we reject H0 and conclude that the parental education variables do have significant joint explanatory power.

This sequence will conclude by showing that t tests are equivalent to marginal F tests when the additional group of variables consists of just one variable.

Suppose that in the original model Y is a function of X2 and X3, and that in the revised model X4 is added.

The null hypothesis for the F test of the explanatory power of the additional ‘group’ is that all the new slope coefficients are equal to zero. There is of course only one new slope coefficient, β4.

The F test has the usual structure. We will illustrate it with an educational attainment model where S depends on ASVABC and SM in the original model and on SF as well in the revised model.

The reduction in the residual sum of squares is the reduction on adding SF.

The cost is just the single degree of freedom lost when estimating β4.

The RSS remaining is the residual sum of squares after adding SF.

The number of degrees of freedom remaining after adding SF is 540 – 4 = 536.

Hence the F statistic is obtained by dividing the reduction in RSS, with a cost of 1 degree of freedom, by the RSS remaining divided by 536. It exceeds the critical value of F at the 0.1% significance level with 500 degrees of freedom. The critical value with 536 degrees of freedom must be lower, so we reject H0 at the 0.1% level.

The null hypothesis we are testing is exactly the same as for a two-sided t test on the coefficient of SF.
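The equivalence can be summarized in symbols. The relationships below are the standard result for a marginal F test on a single added variable, stated here as background. Here t denotes the t statistic on the coefficient of SF in the revised model, and the subscript "crit" (an assumed label) denotes a critical value at a given significance level.

```latex
\begin{align*}
F(1,\, n-k) &= t^2 \\
F_{\text{crit}}(1,\, n-k) &= \left[ t_{\text{crit}}(n-k) \right]^2 \quad \text{(two-sided } t \text{ test)}
\end{align*}
```

Because both the statistic and the critical value are squared in the same way, the two tests must always lead to the same conclusion.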

Copyright Christopher Dougherty 2011.
These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 3.5 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre. Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics or the University of London International Programmes distance learning course 20 Elements of Econometrics.
