© Christopher Dougherty 1999–2006
A.1: The model is linear in parameters and correctly specified.
A.2: There does not exist an exact linear relationship among the regressors in the sample.
A.3: The disturbance term has zero expectation.
A.4: The disturbance term is homoscedastic.
A.5: The values of the disturbance term have independent distributions.
A.6: The disturbance term has a normal distribution.
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS
Moving from the simple to the multiple regression model, we start by restating the regression model assumptions. Only A.2 is different: previously it was stated that there must be some variation in the X variable. We will explain the difference in one of the following lectures. Provided that the regression model assumptions are valid, the OLS estimators in the multiple regression model are unbiased and efficient, as in the simple regression model.
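The model equation on the slide is not reproduced in this transcript. The narration that follows works with the case of two explanatory variables, which in the notation used here is presumably

Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i.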

© Christopher Dougherty 1999–2006 We will not attempt to prove efficiency. We will however outline a proof of unbiasedness. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The first step, as always, is to substitute for Y from the true relationship. The Y ingredients of b2 are actually in the form of Yi minus its mean, so it is convenient to obtain an expression for this. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS
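The expression itself is not preserved in the transcript; from the model stated above it is presumably

Y_i - \bar{Y} = \beta_2 (X_{2i} - \bar{X}_2) + \beta_3 (X_{3i} - \bar{X}_3) + (u_i - \bar{u}).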

© Christopher Dougherty 1999–2006 After simplifying, we find that b2 can be decomposed into the true value β2 plus a weighted linear combination of the values of the disturbance term in the sample. This is what we found in the simple regression model. The difference is that the expression for the weights, which depend on all the values of X2 and X3 in the sample, is considerably more complicated. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS
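The decomposition on the slide is not reproduced; in this notation it has the form

b_2 = \beta_2 + \sum_i a_i^* u_i,

where each weight a_i^* is a function of the deviations of X2 and X3 from their sample means for all the observations (the exact expression is lengthy and is omitted here).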

© Christopher Dougherty 1999–2006 Having reached this point, proving unbiasedness is easy. Taking expectations, β2 is unaffected, being a constant. The expectation of a sum is equal to the sum of expectations. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The a* terms are nonstochastic since they depend only on the values of X2 and X3, and these are assumed to be nonstochastic. Hence the a* terms may be taken out of the expectations as factors. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 By Assumption A.3, E(ui) = 0 for all i. Hence E(b2) is equal to β2 and so b2 is an unbiased estimator. Similarly b3 is an unbiased estimator of β3. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Finally we will show that b1 is an unbiased estimator of β1. This is quite simple, so you should attempt to do this yourself, before looking at the rest of this sequence. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 First substitute for the sample mean of Y. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS
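The algebra on this and the next few slides is not shown in the transcript; a sketch, using the OLS expression for the intercept, is

b_1 = \bar{Y} - b_2 \bar{X}_2 - b_3 \bar{X}_3 = \beta_1 + \beta_2 \bar{X}_2 + \beta_3 \bar{X}_3 + \bar{u} - b_2 \bar{X}_2 - b_3 \bar{X}_3.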

© Christopher Dougherty 1999–2006 Now take expectations. The first three terms are nonstochastic, so they are unaffected by taking expectations. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The expected value of the mean of the disturbance term is zero since E(u) is zero in each observation. We have just shown that E(b2) is equal to β2 and that E(b3) is equal to β3. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Hence b1 is an unbiased estimator of β1. PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS This sequence investigates the variances and standard errors of the slope coefficients in a model with two explanatory variables. The expression for the variance of b2 is given below. The expression for the variance of b3 is the same, with the subscripts 2 and 3 interchanged.
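The slide's own equation is not reproduced in the transcript; the standard statement of the result for the two-regressor model, which is what the narration describes, is

var(b_2) = \frac{\sigma_u^2}{n \, MSD(X_2)} \times \frac{1}{1 - r_{X_2 X_3}^2},

where MSD(X_2) = \frac{1}{n} \sum_i (X_{2i} - \bar{X}_2)^2 is the mean square deviation of X2 and r_{X_2 X_3} is the sample correlation between X2 and X3.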

© Christopher Dougherty 1999–2006 The first factor in the expression is identical to that for the variance of the slope coefficient in a simple regression model. The variance of b2 depends on the variance of the disturbance term, the number of observations, and the mean square deviation of X2, for exactly the same reasons as in a simple regression model. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The difference is that in multiple regression analysis the expression is multiplied by a factor which depends on the correlation between X2 and X3. The higher the correlation between the explanatory variables, positive or negative, the greater will be the variance. This is easy to understand intuitively: the greater the correlation, the harder it is to discriminate between the effects of the explanatory variables on Y, and the less accurate will be the regression estimates. Note that the variance expression above is valid only for a model with two explanatory variables. When there are more than two, the expression becomes much more complex and it is sensible to switch to matrix algebra. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The standard deviation of the distribution of b2 is of course given by the square root of its variance. With the exception of the variance of u, we can calculate the components of the standard deviation from the sample data. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The variance of u has to be estimated. The mean square of the residuals provides a consistent estimator, but in a finite sample it is biased downwards by a factor (n – k) / n, where k is the number of parameters. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Obviously we can obtain an unbiased estimator by dividing the sum of the squares of the residuals by n – k instead of n. We denote this unbiased estimator su². PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Thus the estimate of the standard deviation of the probability distribution of b2, known as the standard error of b2 for short, is given by the expression below. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
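The slide's formula is not in the transcript; combining the pieces above, the unbiased variance estimator and the resulting standard error are presumably

s_u^2 = \frac{1}{n-k} \sum_i e_i^2, \qquad s.e.(b_2) = \sqrt{\frac{s_u^2}{n \, MSD(X_2)} \times \frac{1}{1 - r_{X_2 X_3}^2}},

where the e_i are the residuals.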

© Christopher Dougherty 1999–2006 We will use this expression to analyze why the standard error of S is larger for the union subsample than for the non-union subsample in earnings function regressions using Data Set 21.
. reg EARNINGS S EXP if COLLBARG==1   [regression output for the union subsample; the numerical results are not reproduced in this transcript]
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP if COLLBARG==1   [output not reproduced]
To select a subsample in Stata, you add an ‘if’ statement to a command. The COLLBARG variable is equal to 1 for respondents whose rates of pay are determined by collective bargaining, and it is 0 for the others. Note that in tests for equality, Stata requires the = sign to be duplicated. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP if COLLBARG==1   [output not reproduced]
In the case of the union subsample, the standard error of S is [value not reproduced in this transcript]. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP if COLLBARG==0   [output not reproduced]
In the case of the non-union subsample, the standard error of S is [value not reproduced], less than half as large. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 We will explain the difference by looking at the components of the standard error.
Decomposition of the standard error of S: a table giving the components su, n, MSD(S), and rS,EXP, the resulting standard error, and the corresponding factors in the standard-error expression, for the union and non-union subsamples. [The numerical entries are not reproduced in this transcript.]
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP if COLLBARG==1   [output not reproduced]
We will start with su. Here is RSS for the union subsample, the Residual sum of squares in the output [value not reproduced]. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP if COLLBARG==1   [output not reproduced]
There are 101 observations in the union subsample. k is equal to 3. Thus n – k is equal to 98. PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 RSS / (n – k) is equal to [value not reproduced]. To obtain su, we take the square root. This is [value not reproduced].
. reg EARNINGS S EXP if COLLBARG==1   [output not reproduced]
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 We place this in the table, along with the number of observations. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Similarly, in the case of the non-union subsample, su is the square root of RSS / (n – k) [values not reproduced]. We also note that the number of observations in that subsample is 439.
. reg EARNINGS S EXP if COLLBARG==0   [output not reproduced]
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 We place these in the table. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 We calculate the mean square deviation of S for the two subsamples from the sample data. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006
. cor S EXP if COLLBARG==1   (obs=101; correlation matrix not reproduced)
. cor S EXP if COLLBARG==0   (obs=439; correlation matrix not reproduced)
The correlation coefficients for S and EXP are – and – [values not reproduced] for the union and non-union subsamples, respectively. (Note that "cor" is the Stata command for computing correlations.) PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 These entries complete the top half of the table. We will now look at the impact of each item on the standard error, using the mathematical expression at the top. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The su components need no modification. su is a little larger for the non-union subsample, which has an adverse effect on its standard error. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The number of observations is much larger for the non-union subsample, so the second factor is much smaller than that for the union subsample. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 Perhaps surprisingly, the variance in schooling is a little larger for the union subsample. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 The correlation between schooling and work experience is greater for the union subsample, and this has an adverse effect on its standard error. Note that the sign of the correlation makes no difference since it is squared. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

We see that the reason that the standard error is smaller for the non-union subsample is that it has far more observations than the union subsample. Otherwise the standard errors would have been about the same. The greater correlation between S and EXP has an adverse effect on the union standard error, but this is just about offset by the smaller su and the larger variance of S. [Decomposition table not reproduced.] PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

© Christopher Dougherty 1999–2006 MULTICOLLINEARITY Suppose that Y = 2 + 3X2 + X3 and that X3 = 2X2 – 1. There is no disturbance term in the equation for Y, but that is not important. Suppose that we have the six observations shown. [The data table, with columns for X2, X3, and Y, is not reproduced in this transcript.]

© Christopher Dougherty 1999–2006 The three variables are plotted as line graphs above. Looking at the data, it is impossible to tell whether the changes in Y are caused by changes in X2, by changes in X3, or jointly by changes in both X2 and X3. [Line graphs of Y, X2, and X3 not reproduced.] MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 [Table of the changes in X2, X3, and Y between observations not reproduced.] Numerically, Y increases by 5 in each observation when X2 changes by 1. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Hence the true relationship could have been Y = 1 + 5X2. [Graph with the line Y = 1 + 5X2 superimposed not reproduced.] MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 What would happen if you tried to run a regression when there is an exact linear relationship among the explanatory variables? We will investigate, using the model with two explanatory variables shown above. [Note: A disturbance term has now been included in the true model, but it makes no difference to the analysis.] MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 The expression for the multiple regression coefficient b2 is reproduced below. We will substitute for X3 using its relationship with X2. MULTICOLLINEARITY
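The slide's formula is not preserved in the transcript; the usual deviations-form expression for b2 in the two-regressor model is

b_2 = \frac{\sum (X_{2i} - \bar{X}_2)(Y_i - \bar{Y}) \sum (X_{3i} - \bar{X}_3)^2 - \sum (X_{3i} - \bar{X}_3)(Y_i - \bar{Y}) \sum (X_{2i} - \bar{X}_2)(X_{3i} - \bar{X}_3)}{\sum (X_{2i} - \bar{X}_2)^2 \sum (X_{3i} - \bar{X}_3)^2 - \left( \sum (X_{2i} - \bar{X}_2)(X_{3i} - \bar{X}_3) \right)^2}.

If X3 is an exact linear function of X2, every deviation (X_{3i} - \bar{X}_3) is a constant multiple of (X_{2i} - \bar{X}_2), and substituting this into the expression makes both the numerator and the denominator vanish, which is the conclusion reached on the slides that follow.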

© Christopher Dougherty 1999–2006 First, we will replace the terms highlighted with the expression derived below. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Next, the terms that are highlighted now. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Finally this term. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 After all the replacements, it turns out that the numerator and the denominator are both equal to zero. The regression coefficient is not defined. It is unusual for there to be an exact relationship among the explanatory variables in a regression. When this occurs, it is typically because there is a logical error in the specification. MULTICOLLINEARITY
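As a quick numerical check, here is a minimal sketch (using made-up values that satisfy the slide's relations Y = 2 + 3X2 + X3 and X3 = 2X2 – 1, not the slide's own data) showing that the cross-product matrix is singular, so OLS cannot produce estimates:

import numpy as np

# Illustrative data consistent with the relations on the earlier slide
# (Y = 2 + 3*X2 + X3 and X3 = 2*X2 - 1, so Y is also 1 + 5*X2).
X2 = np.arange(1.0, 7.0)                          # six observations
X3 = 2 * X2 - 1
Y = 2 + 3 * X2 + X3

X = np.column_stack([np.ones_like(X2), X2, X3])   # regressors: constant, X2, X3
print(np.linalg.matrix_rank(X))                   # 2, not 3: exact linear dependence
print(np.linalg.det(X.T @ X))                     # zero (up to rounding), so X'X cannot be inverted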

© Christopher Dougherty 1999–2006 However, it often happens that there is an approximate relationship. For example, when relating earnings to schooling and work experience, it is often reasonable to suppose that the effect of work experience is subject to diminishing returns. A standard way of allowing for this is to include EXPSQ, the square of EXP, in the specification. According to the hypothesis of diminishing returns, β4 should be negative.
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
MULTICOLLINEARITY
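The specification referred to is not written out in the transcript; given that β4 is described as the coefficient of EXPSQ, it is presumably

EARNINGS = \beta_1 + \beta_2 S + \beta_3 EXP + \beta_4 EXPSQ + u, \qquad EXPSQ = EXP^2.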

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
We fit this specification using Data Set 21. The schooling component of the regression results is not much affected by the inclusion of the EXPSQ term. The coefficient of S indicates that an extra year of schooling increases hourly earnings by $2.75. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP   [output not reproduced]
(Looking back at slide 21:) In the specification without EXPSQ it was 2.68, not much different. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
The standard error, 0.23 in the specification without EXPSQ, is also little changed and the coefficient remains highly significant. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
By contrast, the inclusion of the new term has had a dramatic effect on the coefficient of EXP. Now it is negative, which makes little sense, and insignificant! MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Previously it had been positive and highly significant.
. reg EARNINGS S EXP   [output not reproduced]
MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
. cor EXP EXPSQ   (obs=540; correlation matrix not reproduced)
The reason for these problems is that EXPSQ is highly correlated with EXP. This makes it difficult to discriminate between the individual effects of EXP and EXPSQ, and the regression estimates tend to be erratic. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 When high correlations among the explanatory variables lead to erratic point estimates of the coefficients, large standard errors, and unsatisfactorily low t statistics, the regression is said to be suffering from multicollinearity. Multicollinearity may also be caused by an approximate linear relationship among the explanatory variables. When there are only two, an approximate linear relationship means there will be a high correlation, but this is not always the case when there are more than two. MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 ALLEVIATION OF MULTICOLLINEARITY What can you do about multicollinearity if you encounter it? We will discuss some possible measures, looking at the model with two explanatory variables. Before doing this, two important points should be emphasized. First, multicollinearity does not cause the regression coefficients to be biased. Their probability distributions are still centered over the true values, if the regression specification is correct, but they have unsatisfactorily large variances. Second, the standard errors and t tests remain valid. The standard errors are larger than they would have been in the absence of multicollinearity, warning us that the regression estimates are erratic. Since the problem of multicollinearity is caused by the variances of the coefficients being unsatisfactorily large, we will seek ways of reducing them.

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(1) Reduce σu² by including further relevant variables in the model.
We will focus on the slope coefficient and look at the various components of its variance. We might be able to reduce the variance by bringing more variables into the model and reducing σu², the variance of the disturbance term. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
The estimator of the variance of the disturbance term is the residual sum of squares divided by n – k, where n is the number of observations (540) and k is the number of parameters (4). Here it is [value not reproduced]. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [output not reproduced]
We now add two new variables that are often found to be determinants of earnings: MALE, sex of respondent, and ASVABC, the composite score on the cognitive tests in the Armed Services Vocational Aptitude Battery. MALE is a qualitative variable and the treatment of such variables will be explained in Chapter 5. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [output not reproduced]
Both MALE and ASVABC have coefficients significant at the 0.1% level. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [output not reproduced]
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [output not reproduced]
However, they account for only a small proportion of the variance in earnings, and the reduction in the estimate of the variance of the disturbance term is likewise small. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [coefficient table not reproduced]
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [coefficient table not reproduced]
As a consequence, the impact on the standard errors of EXP and EXPSQ is negligible. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ   [coefficient table not reproduced]
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [coefficient table not reproduced]
Note how unstable the coefficients are. This is often a sign of multicollinearity. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(2) Increase the number of observations.
Surveys: increase the budget, or use clustering. Time series: use quarterly instead of annual data.
The next factor to look at is n, the number of observations. If you are working with cross-section data (individuals, households, enterprises, etc.) and you are undertaking a survey, you could increase the size of the sample by negotiating a bigger budget. Alternatively, you could make a given budget go further by using clustering: the survey area is divided into a number of localities, and you select a number of these randomly, perhaps using random sampling to make sure that metropolitan, other urban, and rural areas are properly represented. You then confine the survey to the areas selected. This reduces the travel time and cost of the fieldworkers, allowing them to interview a greater number of respondents. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(2) Increase the number of observations.
Surveys: increase the budget, use clustering. Time series: use quarterly instead of annual data.
If you are working with time series data, you may be able to increase the sample by working with shorter time intervals for the data, for example quarterly or even monthly data instead of annual data. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [output for the full EAEF sample not reproduced]
Here is the result of running the regression with all 2,714 observations in the EAEF data set. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [outputs for the full EAEF sample and for Data Set 21 not reproduced]
Comparing this result with that using Data Set 21, we see that the standard errors are much smaller, as expected. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [outputs for the two samples not reproduced]
As a consequence, the t statistics of the variables are higher. However, the correlation between EXP and EXPSQ is as high as in the smaller sample, and the increase in the sample size has not been large enough to have much impact on the problem of multicollinearity. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(3) Increase MSD(X2).
A third possible way of reducing the problem of multicollinearity might be to increase the variation in the explanatory variables. This is possible only at the design stage of a survey. For example, if you were planning a household survey with the aim of investigating how expenditure patterns vary with income, you should make sure that the sample included relatively rich and relatively poor households as well as middle-income households. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(4) Reduce the correlation between the explanatory variables.
Another possibility might be to reduce the correlation between the explanatory variables. This is possible only at the design stage of a survey, and even then it is not easy. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(5) Combine the correlated variables.
If the correlated variables are similar conceptually, it may be reasonable to combine them into some overall index. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg EARNINGS S EXP EXPSQ MALE ASVABC   [output not reproduced]
That is precisely what has been done with the three cognitive ASVAB variables. ASVABC has been calculated as a weighted average of ASVAB02 (arithmetic reasoning), ASVAB03 (word knowledge), and ASVAB04 (paragraph comprehension). The three components are highly correlated and by combining them as a weighted average, rather than using them individually, one avoids a potential problem of multicollinearity. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(6) Drop some of the correlated variables.
Dropping some of the correlated variables, if they have insignificant coefficients, may alleviate multicollinearity. However, this approach is dangerous. It is possible that some of the variables with insignificant coefficients really do belong in the model and that the only reason their coefficients are insignificant is that there is a problem of multicollinearity. If that is the case, their omission may cause omitted variable bias, to be discussed in Chapter 6. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(7) Empirical restriction.
A further way of dealing with the problem of multicollinearity is to use extraneous information, if available, concerning the coefficient of one of the variables. For example, suppose that Y in the model below is the demand for a category of consumer expenditure, X is aggregate disposable personal income, and P is a price index for the category. To fit a model of this type you would use time series data. If X and P are highly correlated, which is often the case with time series variables, the problem of multicollinearity might be eliminated in the following way. ALLEVIATION OF MULTICOLLINEARITY
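The slide's equation is not reproduced; a form consistent with the narration (with β2 as the income coefficient) is

Y = \beta_1 + \beta_2 X + \beta_3 P + u.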

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(7) Empirical restriction.
Obtain data on income and expenditure on the category from a household survey and regress Y' on X'. (The ' marks are to indicate that the data are household data, not aggregate data.) This is a simple regression because there will be relatively little variation in the price paid by the households. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(7) Empirical restriction.
Now substitute b' for β2 in the time series model. Subtract b'X from both sides, and regress Z = Y – b'X on price. This is a simple regression, so multicollinearity has been eliminated. ALLEVIATION OF MULTICOLLINEARITY
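In equation form (a sketch; b' here is the slope estimate from the household regression of Y' on X'), the new dependent variable is

Z = Y - b'X \approx \beta_1 + \beta_3 P + u,

the approximation arising because b' is only an estimate of β2.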

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(7) Empirical restriction.
There are some problems with this technique. First, the β2 coefficients may be conceptually different in time series and cross-section contexts. Second, since we subtract the estimated income component b'X, not the true income component β2X, from Y when constructing Z, we have introduced an element of measurement error in the dependent variable. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(8) Theoretical restriction.
Last, but by no means least, is the use of a theoretical restriction, which is defined as a hypothetical relationship among the parameters of a regression model. It will be explained using an educational attainment model as an example. Suppose that we hypothesize that highest grade completed, S, depends on ASVABC and on the highest grade completed by the respondent's mother and father, SM and SF, respectively. ALLEVIATION OF MULTICOLLINEARITY
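The model equation is not reproduced in the transcript; from the narration (and the restriction β3 = β4 imposed later), it is presumably

S = \beta_1 + \beta_2 ASVABC + \beta_3 SM + \beta_4 SF + u.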

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
A one-point increase in ASVABC increases S by 0.13 years. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
S increases by 0.05 years for every extra year of schooling of the mother and 0.11 years for every extra year of schooling of the father. Mother's education is generally held to be at least as important as father's education for educational attainment, if not more so, so this outcome is unexpected. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
It is also surprising that the coefficient of SM is not significant, even at the 5% level, using a one-sided test. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
. cor SM SF   (obs=540; correlation matrix not reproduced)
However, assortative mating leads to correlation between SM and SF and the regression appears to be suffering from multicollinearity. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(8) Theoretical restriction.
Suppose that we hypothesize that mother's and father's education are equally important. We can then impose the restriction β3 = β4. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(8) Theoretical restriction.
This allows us to rewrite the equation as shown. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 Possible measures for alleviating multicollinearity
(8) Theoretical restriction.
Defining SP to be the sum of SM and SF, the equation may be rewritten as shown. The problem caused by the correlation between SM and SF has been eliminated. ALLEVIATION OF MULTICOLLINEARITY
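The two rewritten equations referred to on this and the previous slide are not reproduced; imposing β3 = β4 on the model above gives, presumably,

S = \beta_1 + \beta_2 ASVABC + \beta_3 (SM + SF) + u = \beta_1 + \beta_2 ASVABC + \beta_3 SP + u, \qquad SP = SM + SF.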

© Christopher Dougherty 1999–2006
. g SP=SM+SF
. reg S ASVABC SP   [output not reproduced]
The estimate of β3 is now [value not reproduced] and highly significant. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SP   [coefficient table not reproduced]
. reg S ASVABC SM SF   [coefficient table not reproduced]
Not surprisingly, this is a compromise between the coefficients of SM and SF in the previous specification. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SP   [coefficient table not reproduced]
. reg S ASVABC SM SF   [coefficient table not reproduced]
The standard error of SP is much smaller than those of SM and SF. The use of the restriction has led to a large gain in efficiency and the problem of multicollinearity has been eliminated. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006
. reg S ASVABC SP   [coefficient table not reproduced]
. reg S ASVABC SM SF   [coefficient table not reproduced]
The t statistic is very high. Thus it would appear that imposing the restriction has improved the regression results. However, the restriction may not be valid. We should test it. Testing theoretical restrictions is one of the topics in Chapter 6. ALLEVIATION OF MULTICOLLINEARITY

© Christopher Dougherty 1999–2006 F TESTS OF GOODNESS OF FIT This sequence describes two F tests of goodness of fit in a multiple regression model. The first relates to the goodness of fit of the equation as a whole. We will consider the general case where there are k – 1 explanatory variables. For the F test of goodness of fit of the equation as a whole, the null hypothesis, in words, is that the model has no explanatory power at all. The model will have no explanatory power if it turns out that Y is unrelated to any of the explanatory variables. Mathematically, therefore, the null hypothesis is that all the coefficients β2, ..., βk are zero. The alternative hypothesis is that at least one of these β coefficients is different from zero. In the multiple regression model there is a difference between the roles of the F and t tests. The F test tests the joint explanatory power of the variables, while the t tests test their explanatory power individually. In the simple regression model the F test was equivalent to the (two-sided) t test on the slope coefficient because the ‘group’ consisted of just one variable.

© Christopher Dougherty 1999–2006 ESS / TSS is the definition of R². RSS / TSS is equal to (1 – R²). (See the last sequence in Chapter 2.) F TESTS OF GOODNESS OF FIT
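The F statistic itself is not written out in the transcript; for the test of the equation as a whole it is

F(k-1, n-k) = \frac{ESS / (k-1)}{RSS / (n-k)} = \frac{R^2 / (k-1)}{(1 - R^2) / (n-k)},

the second form following from the definitions just given.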

© Christopher Dougherty 1999–2006 The educational attainment model will be used as an example. We will suppose that S depends on ASVABC, the ability score, and on SM and SF, the highest grade completed by the mother and father of the respondent, respectively. The null hypothesis for the F test of goodness of fit is that all three slope coefficients are equal to zero. The alternative hypothesis is that at least one of them is non-zero. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
Here is the regression output using Data Set 21. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
The numerator of the F statistic is the explained sum of squares divided by k – 1. In the Stata output these numbers are given in the Model row. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
The denominator is the residual sum of squares divided by the number of degrees of freedom remaining. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
Hence the F statistic is [value not reproduced]. All serious regression packages compute it for you as part of the diagnostics in the regression output. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
The critical value for F(3,536) is not given in the F tables, but we know it must be lower than F(3,500), which is given. At the 0.1% level, this is [value not reproduced]. Hence we easily reject H0 at the 0.1% level. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 It is unusual for the F statistic not to be significant if some of the t statistics are significant. In principle it could happen though. Suppose that you ran a regression with 40 explanatory variables, none being a true determinant of the dependent variable. Then the F statistic should be low enough for H0 not to be rejected. However, if you are performing t tests on the slope coefficients at the 5% level, with a 5% chance of a Type I error, on average 2 of the 40 variables could be expected to have ‘significant’ coefficients. The opposite can easily happen, though. Suppose you have a multiple regression model which is correctly specified and the R² is high. You would expect to have a highly significant F statistic. However, if the explanatory variables are highly correlated and the model is subject to severe multicollinearity, the standard errors of the slope coefficients could all be so large that none of the t statistics is significant. In this situation you would know that your model is a good one, but you are not in a position to pinpoint the contributions made by the explanatory variables individually.

© Christopher Dougherty 1999–2006 We now come to the other F test of goodness of fit. This is a test of the joint explanatory power of a group of variables when they are added to a regression model. For example, in the original specification, Y may be written as a simple function of X2. In the second, we add X3 and X4. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 The null hypothesis for the F test is that neither X3 nor X4 belongs in the model. The alternative hypothesis is that at least one of them does, perhaps both. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
For this F test, and for several others which we will encounter, it is useful to think of the F statistic as having the structure indicated above. The ‘improvement’ is the reduction in the residual sum of squares when the change is made, in this case, when the group of new variables is added. The ‘cost’ is the reduction in the number of degrees of freedom remaining after making the change. In the present case it is equal to the number of new variables added, because that number of new parameters are estimated. (Remember that the number of degrees of freedom in a regression equation is the number of observations, less the number of parameters estimated. In this example, it would fall from n – 2 to n – 4 when X3 and X4 are added.) The ‘remaining unexplained’ is the residual sum of squares after making the change. The ‘degrees of freedom remaining’ is the number of degrees of freedom remaining after making the change. F TESTS OF GOODNESS OF FIT
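Writing the same structure in terms of residual sums of squares (a sketch; RSS_original and RSS_new denote the residual sum of squares before and after adding the group):

F(\text{cost}, \text{d.f. remaining}) = \frac{(RSS_{original} - RSS_{new}) / \text{cost}}{RSS_{new} / \text{d.f. remaining}}.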

© Christopher Dougherty 1999–2006
. reg S ASVABC   [output not reproduced]
We will illustrate the test with an educational attainment example. Here is S regressed on ASVABC using Data Set 21. We make a note of the residual sum of squares. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
Now we have added the highest grade completed by each parent. Does parental education have a significant impact? Well, we can see that a t test would show that SF has a highly significant coefficient, but we will perform the F test anyway. We make a note of RSS. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 The F statistic is [value not reproduced]. The critical value of F(2,500) at the 0.1% level is [value not reproduced]. The critical value of F(2,536) must be lower, so we reject H0 and conclude that the parental education variables do have significant joint explanatory power. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 This sequence will conclude by showing that t tests are equivalent to marginal F tests when the additional group of variables consists of just one variable. Suppose that in the original model Y is a function of X2 and X3, and that in the revised model X4 is added. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 The null hypothesis for the F test of the explanatory power of the additional ‘group’ is that all the new slope coefficients are equal to zero. There is of course only one new slope coefficient, β4. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 The F test has the usual structure. We will illustrate it with an educational attainment model where S depends on ASVABC and SM in the original model and on SF as well in the revised model.
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM   [output not reproduced]
Here is the regression of S on ASVABC and SM. We make a note of the residual sum of squares. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 Now we add SF and again make a note of the residual sum of squares.
. reg S ASVABC SM SF   [output not reproduced]
F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
The improvement on adding SF is the reduction in the residual sum of squares. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
The cost is just the single degree of freedom lost when estimating β4. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
The remaining unexplained is the residual sum of squares after adding SF. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
The number of degrees of freedom remaining after adding SF is 540 – 4 = 536. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
F(cost, d.f. remaining) = (improvement / cost) / (remaining unexplained / degrees of freedom remaining)
The critical value of F at the 0.1% significance level with 500 degrees of freedom is [value not reproduced]. The critical value with 536 degrees of freedom must be lower, so we reject H0 at the 0.1% level. The null hypothesis we are testing is exactly the same as for a two-sided t test on the coefficient of SF. F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 We will perform the t test. The t statistic is [value not reproduced].
. reg S ASVABC SM SF   [output not reproduced]
F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
The critical value of t at the 0.1% level with 500 degrees of freedom is [value not reproduced]. The critical value with 536 degrees of freedom must be lower. So we reject H0 again. F TESTS OF GOODNESS OF FIT

. reg S ASVABC SM SF   [output not reproduced]
It can be shown that the F statistic for the F test of the explanatory power of a ‘group’ of one variable must be equal to the square of the t statistic for that variable. (The difference in the last digit is due to rounding error.) F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006
. reg S ASVABC SM SF   [output not reproduced]
It can also be shown that the critical value of F must be equal to the square of the critical value of t. (The critical values shown are for 500 degrees of freedom, but this must also be true for 536 degrees of freedom.) F TESTS OF GOODNESS OF FIT

© Christopher Dougherty 1999–2006 Hence the conclusions of the two tests must coincide. This result means that the t test of the coefficient of a variable is a test of its marginal explanatory power, after all the other variables have been included in the equation. If the variable is correlated with one or more of the other variables, its marginal explanatory power may be quite low, even if it genuinely belongs in the model. If all the variables are correlated, it is possible for all of them to have low marginal explanatory power and for none of the t tests to be significant, even though the F test for their joint explanatory power is highly significant. If this is the case, the model is said to be suffering from the problem of multicollinearity discussed earlier. F TESTS OF GOODNESS OF FIT