1 Lecture 4:F-Tests SSSII Gwilym Pryce www.gpryce.com.


1 Lecture 4:F-Tests SSSII Gwilym Pryce

2 Plan: (1) Testing a set of linear restrictions – the general case (2) Testing homogenous Restrictions (3) Testing for a relationship – Special Case of Homogenous Restrictions (4) Testing for Structural Breaks

3 (1) Testing a set of linear Restrictions - The General Procedure
E.g. Does Monetarism Explain Everything about inflation?
– Suppose we want to test whether there are any country-specific effects in the relationship between inflation and the money supply:
INFL = a + b MS + g_1 COUNTRY_1 + … + g_42 COUNTRY_42
I.e. we want to test the following null hypothesis:
H_0: g_1 = g_2 = g_3 = … = g_42 = 0
i.e. the idiosyncrasies of countries (their culture, history, economic structure, level of development etc.) have no effect on inflation: the money supply explains everything about inflation.
– We can think of this test as comparing two regressions, one restricted and one unrestricted:

4 The Unrestricted regression (“qualified monetarism”) is:
INFL = a + b MS + g_1 COUNTRY_1 + … + g_42 COUNTRY_42
The Restricted regression (“pure monetarism”) is:
INFL = a + b MS
We can test whether all the g coefficients (country-specific effects) equal zero using the F-test:

5 The General formula for F:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U}$$
Where:
RSS_U = unrestricted residual sum of squares = RSS under H_1
RSS_R = restricted residual sum of squares = RSS under H_0
r = number of restrictions = difference in the number of parameters between the restricted and unrestricted equations
df_U = df from the unrestricted regression = n - k, where k counts all coefficients including the intercept.
NB RSS is a measure of the total amount of error in a model. RSS_R is never smaller than RSS_U, since imposing a restriction on an equation can never reduce the RSS. The question is whether imposing the restriction produces a large increase in RSS.

6 Using the F-test:
If the null hypothesis is true (i.e. the restrictions are satisfied) then we would expect the restricted and unrestricted regressions to give similar results
– I.e. RSS_R and RSS_U will be similar
– so we do not reject H_0 when the test statistic gives a small value for F.
But if any one of the restrictions does not hold, then the restricted regression will have had an invalid restriction imposed upon it and will be misspecified.
– ⇒ higher residual variation ⇒ higher RSS_R
– so we reject H_0 when the test statistic gives a large value for F.

7 Test Procedure:
(i) Compute RSS_U
– Run the unrestricted form of the regression in SPSS and take a note of the residual sum of squares = RSS_U
(ii) Compute RSS_R
– Run the restricted form of the regression in SPSS and take a note of the residual sum of squares = RSS_R
(iii) Calculate r and df_U
(iv) Substitute RSS_U, RSS_R, r and df_U in the equation for F and find the significance level associated with the value of F you have calculated.
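Step (iv) of the procedure can be sketched in a few lines of Python. This is only an illustration: the function name `f_statistic` and the input values are hypothetical, not taken from the lecture's SPSS output, and the formula used is the general one, F = [(RSS_R - RSS_U)/r] / [RSS_U/df_U].

```python
def f_statistic(rss_r, rss_u, r, df_u):
    """General F-ratio: ((RSS_R - RSS_U) / r) / (RSS_U / df_U)."""
    if r <= 0 or df_u <= 0:
        raise ValueError("r and df_u must be positive")
    if rss_u <= 0:
        raise ValueError("rss_u must be positive")
    if rss_r < rss_u:
        # Imposing restrictions can never reduce the RSS.
        raise ValueError("rss_r cannot be smaller than rss_u")
    return ((rss_r - rss_u) / r) / (rss_u / df_u)


# Hypothetical inputs chosen for easy arithmetic (not the lecture's data).
f_value = f_statistic(rss_r=120.0, rss_u=100.0, r=2, df_u=50)
print(f_value)  # (20 / 2) / (100 / 50) = 5.0
```

The input checks simply encode the point made earlier in the lecture: a restricted RSS below the unrestricted RSS signals that the two regressions have been mixed up.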

8 Example 1: H o : no country effects (R and U regressions have the same dependent variable) Step (i) RSS U =

9 Step (ii) RSS R =

10 Step (iii) r and df_U
r = number of restrictions = difference in the number of parameters between the restricted and unrestricted equations = 3
df_U = df from the unrestricted regression = n_U - k_U = 511, where k_U is the total number of coefficients including the intercept

11 (iv) Substitute RSS_U, RSS_R, r and df_U in the formula for F:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U}$$
with r = 3 and df_U = 511.
df numerator = r = 3; df denominator = df_U = 511
From tables, we know that at P = 0.01 the critical value for F[3,511] would be 3.88 (i.e. Prob(F > 3.88) = 0.01).
The F we have calculated is > 3.88, so we know that P < 0.01 ⇒ Reject H_0


14 Alternatively use Excel calculator: F-Tests.xls First Paste ANOVA tables of U and R models:

15 Second, check cell formulas, & let Excel do the rest:

16 Example 2: H_0: b_2 + b_3 = 1
y = b_1 + b_2 x_2 + b_3 x_3 + u
(R and U regressions have different dependent variables)
(i) Compute RSS_U
– Run the unrestricted form of the regression in SPSS and take a note of the residual sum of squares = RSS_U
(ii) Compute RSS_R
– Run the restricted form of the regression in SPSS by:
  substituting the restrictions into the equation;
  rearranging the equation so that each parameter appears only once;
  creating new variables where necessary and estimating by OLS
– and take a note of the residual sum of squares = RSS_R
(iii) Calculate r and df_U
(iv) Calculate F and find the significance level

17 If the restriction is: b 2 + b 3 = 1 How would you incorporate this information into: y = b 1 + b 2 x 2 + b 3 x 3 + u to derive the restricted model?

18 Unrestricted regression: y = b_1 + b_2 x_2 + b_3 x_3 + u
H_0: b_2 + b_3 = 1
If H_0 is true, then b_3 = 1 - b_2 and:
y = b_1 + b_2 x_2 + (1 - b_2) x_3 + u
  = b_1 + b_2 x_2 + x_3 - b_2 x_3 + u
  = b_1 + b_2 (x_2 - x_3) + x_3 + u
y - x_3 = b_1 + b_2 (x_2 - x_3) + u
Restricted regression: z = b_1 + b_2 v + u, where z = y - x_3 and v = x_2 - x_3
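The substitution can be tried out numerically. The sketch below is only illustrative: it uses simulated data (not the lecture's dataset), and `ols_rss` is a hypothetical pure-Python helper that solves the normal equations; in practice a statistics package would do this step. It builds z = y - x_3 and v = x_2 - x_3, fits both regressions and forms the F-ratio with r = 1.

```python
import random


def ols_rss(y, columns):
    """RSS from an OLS regression of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for j in range(p):
        pivot = max(range(j, p), key=lambda row: abs(A[row][j]))
        A[j], A[pivot] = A[pivot], A[j]
        for row in range(p):
            if row != j:
                factor = A[row][j] / A[j][j]
                A[row] = [u - factor * w for u, w in zip(A[row], A[j])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


# Simulated data (an assumption for illustration): true b_2 + b_3 = 1.
rng = random.Random(0)
n = 60
x2 = [rng.uniform(0, 10) for _ in range(n)]
x3 = [rng.uniform(0, 10) for _ in range(n)]
y = [2.0 + 0.4 * a + 0.6 * b + rng.gauss(0, 1) for a, b in zip(x2, x3)]

# Unrestricted: y on x2 and x3 (k = 3 coefficients).
rss_u = ols_rss(y, [x2, x3])

# Restricted: z = y - x3 on v = x2 - x3.
z = [yi - b for yi, b in zip(y, x3)]
v = [a - b for a, b in zip(x2, x3)]
rss_r = ols_rss(z, [v])

r, df_u = 1, n - 3
f_value = ((rss_r - rss_u) / r) / (rss_u / df_u)
print(rss_u, rss_r, f_value)
```

Although the restricted regression has a different dependent variable, its residuals are those of the original equation with b_3 forced to equal 1 - b_2, which is why the two RSS values can be compared directly.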

19 Example 3: H o : b 2 = b 3 (R and U regressions have the same dependent variable) If the unrestricted regression is: y = b 1 + b 2 x 2 + b 3 x 3 + u How would you derive the restricted regression?

20 Unrestricted regression: y = b_1 + b_2 x_2 + b_3 x_3 + u
H_0: b_2 = b_3; H_1: b_2 ≠ b_3
If H_0 is true, then:
y = b_1 + b_2 x_2 + b_2 x_3 + u
  = b_1 + b_2 (x_2 + x_3) + u
Restricted regression: y = b_1 + b_2 w + u, where w = x_2 + x_3

21 Example 4: H_0: b_3 = b_2 + 1 (R and U regressions have different dependent variables)
If the unrestricted regression is:
Infl = b_1 + b_2 MS_GDP + b_3 MP_GDP + u
How would you derive the restricted regression?

22 Unrestricted regression: Infl = b_1 + b_2 MS_GDP + b_3 MP_GDP + u
H_0: b_3 = b_2 + 1
If H_0 is true, then:
Infl = b_1 + b_2 MS_GDP + (b_2 + 1) MP_GDP + u
     = b_1 + b_2 MS_GDP + b_2 MP_GDP + 1 × MP_GDP + u
     = b_1 + b_2 (MS_GDP + MP_GDP) + MP_GDP + u
Infl - MP_GDP = b_1 + b_2 (MS_GDP + MP_GDP) + u
Restricted regression: z = b_1 + b_2 v + u, where z = Infl - MP_GDP and v = MS_GDP + MP_GDP

23 Step (i) RSS U =

24 Step (ii) RSS R =
SPSS syntax for creating Z and V:
COMPUTE Z = Infl - MP_GDP.
EXECUTE.
COMPUTE V = MS_GDP + MP_GDP.
EXECUTE.
SPSS syntax for Restricted Regression:
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /NOORIGIN
  /DEPENDENT Z
  /METHOD=ENTER V.
SPSS ANOVA Output:

25 Step (iii) r and df_U
r = number of restrictions = difference in the number of parameters between the restricted and unrestricted equations = 1
df_U = df from the unrestricted regression = n_U - k_U = 513, where k_U is the total number of coefficients including the intercept

26 (iv) Substitute RSS_U, RSS_R, r and df_U in the formula for F:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U} = 0.309$$
with r = 1 and df_U = 513.
df numerator = r = 1; df denominator = df_U = 513
From Excel, =FDIST(0.309,1,513) gives Prob(F > 0.309) = 0.58
– I.e. if H_0 is true, an F value at least this large arises 58% of the time, so rejecting H_0 would carry a 58% risk of a Type I error
– ⇒ Do not reject H_0 that b_3 = b_2 + 1
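For readers without Excel, the upper-tail probability returned by FDIST can be approximated with the Python standard library alone. The sketch below is an assumption-laden illustration rather than the lecture's method: the helper names are hypothetical, and it relies on the standard identity Prob(F > f) = I_x(df_den/2, df_num/2) with x = df_den / (df_den + df_num × f), where I_x is the regularized incomplete beta function, evaluated here by a textbook continued fraction.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f_value, df_num, df_den):
    """Prob(F > f_value) for an F distribution with (df_num, df_den) df."""
    if f_value <= 0.0:
        return 1.0
    x = df_den / (df_den + df_num * f_value)
    return reg_inc_beta(df_den / 2.0, df_num / 2.0, x)


p_value = f_upper_tail(0.309, 1, 513)
print(round(p_value, 2))  # 0.58, matching the FDIST result quoted above
```

Where SciPy is available, `scipy.stats.f.sf(0.309, 1, 513)` should return the same probability with far less code.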

27 Using F-Tests.xls:


29 (2) Testing a set of linear Restrictions - When the Restrictions are Homogenous
When linear restrictions are homogenous:
– e.g. H_0: b_2 = b_3 = 0
– e.g. H_0: b_2 = b_3
we do not need to transform the dependent variable of the restricted equation.
For restrictions of this type
– i.e. where the dependent variable is the same in the restricted and unrestricted regressions
we can re-write our F-ratio test statistic in terms of R^2 values:

30 F-ratio test statistic for homogenous restrictions:
$$F = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/df_U}$$
Where:
R_U^2 and R_R^2 = R^2 from the unrestricted and restricted regressions
RSS_U = unrestricted residual sum of squares = RSS under H_1
RSS_R = restricted residual sum of squares = RSS under H_0
r = number of restrictions = difference in the number of parameters between the restricted and unrestricted equations
df_U = df from the unrestricted regression = n - k, where k counts all coefficients including the intercept.

31 Proof of simpler formula for homogenous restrictions:
If the dependent variable is the same in both the restricted and unrestricted equations, then the TSS will be the same.
We can then make use of the fact that RSS = (1 - R^2) TSS, which implies that:
RSS_R = (1 - R_R^2) TSS
RSS_U = (1 - R_U^2) TSS

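32 Proof continued...
Substituting RSS_R = (1 - R_R^2) TSS and RSS_U = (1 - R_U^2) TSS into our original formula for the F-ratio, we find that:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U} = \frac{[(1 - R_R^2) - (1 - R_U^2)]\,TSS/r}{(1 - R_U^2)\,TSS/df_U} = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/df_U}$$
since the TSS cancels from the numerator and the denominator.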
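The equivalence is easy to confirm numerically. The following Python check uses hypothetical values of TSS, RSS_R and RSS_U (chosen for easy arithmetic, not taken from the lecture) and hypothetical function names; it converts the RSS values into R^2 values via RSS = (1 - R^2) TSS and compares the two versions of the F-ratio.

```python
def f_from_rss(rss_r, rss_u, r, df_u):
    """General form: ((RSS_R - RSS_U) / r) / (RSS_U / df_U)."""
    return ((rss_r - rss_u) / r) / (rss_u / df_u)


def f_from_r2(r2_r, r2_u, r, df_u):
    """R-squared form, valid only when both models share the same TSS."""
    return ((r2_u - r2_r) / r) / ((1.0 - r2_u) / df_u)


# Hypothetical values: same dependent variable, so a single TSS.
tss = 200.0
rss_r, rss_u = 150.0, 120.0
r, df_u = 3, 96

r2_r = 1.0 - rss_r / tss  # 0.25
r2_u = 1.0 - rss_u / tss  # 0.40

f_rss = f_from_rss(rss_r, rss_u, r, df_u)
f_r2 = f_from_r2(r2_r, r2_u, r, df_u)
print(f_rss, f_r2)  # both are 8.0, up to floating-point rounding
```

The check would fail if the restricted model had a transformed dependent variable, because the two regressions would then have different TSS values and the cancellation in the proof would not go through.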

33 Example 1: H_0: no country effects (R and U regressions have the same dependent variable)
Our approach to this restriction when we tested it above was to use the RSSs as follows:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U}$$
with r = 3 and df_U = 511, giving Prob(F > F[3,511]) = 1.028E-14 ⇒ Reject H_0
Since it is a homogenous restriction (i.e. the dependent variable is the same in the restricted and unrestricted models), we shall now attempt the same test using the R^2 formulation of the F-ratio formula:

34 Unrestricted model: R_U^2 =
Restricted model: R_R^2 = 0.016

35 $$F = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/df_U}$$
with r = 3 and df_U = 511.


38 (3) Testing a set of linear Restrictions - When the Restrictions say that b_i = 0 ∀ i
A special case of homogenous restrictions is where we test for the existence of a relationship
– I.e. H_0: all slope coefficients are zero:
Unrestricted regression: y = b_1 + b_2 x_2 + b_3 x_3 + u
H_0: b_2 = b_3 = 0
If H_0 is true, then y = b_1 + u
In this case, the restricted regression does no explaining at all, and so R_R^2 = 0

39 And the homogenous restriction F-ratio test statistic reduces to:
$$F = \frac{R_U^2/r}{(1 - R_U^2)/df_U} = \frac{R^2/(k - 1)}{(1 - R^2)/(n - k)}$$
where r = k - 1 and df_U = n - k.
This is the F-test we came across in MII Lecture 2, and is the one automatically calculated in the SPSS ANOVA table.
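As a small illustration, the overall F-test can be computed from R^2, n and k alone. The function name `overall_f` and the numbers below are hypothetical; the formula is the special case with R_R^2 = 0, r = k - 1 and df_U = n - k.

```python
def overall_f(r2, n, k):
    """F-test that all slope coefficients are zero.

    r2: R-squared of the unrestricted regression
    n:  number of observations
    k:  number of coefficients including the intercept
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if k < 2 or n <= k:
        raise ValueError("need at least one slope and n > k")
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Hypothetical example: R^2 = 0.5 from 23 observations and 3 coefficients.
print(overall_f(0.5, 23, 3))  # (0.5 / 2) / (0.5 / 20) = 10.0
```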

40 (4) Testing for Structural Breaks
The F-test also comes into play when we want to test whether the estimated coefficients change significantly if we split the sample in two at a given point.
These tests are sometimes called “Chow Tests” after one of their proponents.
There are actually two versions of the test:
– Chow’s first test
– Chow’s second test

41 (a) Chow’s First Test
Use where n_2 > k
(1) Run the regression on the first set of data (i = 1, 2, 3, …, n_1) & let its RSS be RSS_{n1}
(2) Run the regression on the second set of data (i = n_1 + 1, n_1 + 2, …, end of data) & let its RSS be RSS_{n2}
(3) Run the regression on the two sets of data combined (i = 1, …, end of data) & let its RSS be RSS_{n1+n2}

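42 (4) Compute RSS_U, RSS_R, r and df_U:
– RSS_U = RSS_{n1} + RSS_{n2}
– RSS_R = RSS_{n1+n2}
– r = k = total number of coefficients including the constant
– df_U = n_1 + n_2 - 2k
(5) Use RSS_U, RSS_R, r and df_U to calculate F using the general formula for F and find the significance level:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U} = \frac{[RSS_{n_1+n_2} - (RSS_{n_1} + RSS_{n_2})]/k}{(RSS_{n_1} + RSS_{n_2})/(n_1 + n_2 - 2k)}$$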
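Steps (4) and (5) of Chow's first test translate directly into code. The sketch below assumes the three RSS values have already been obtained from the three regressions; the function name and the example numbers are hypothetical.

```python
def chow_first_test(rss_n1, rss_n2, rss_pooled, n1, n2, k):
    """F-ratio for Chow's first test (both subsamples have more than k obs)."""
    if n1 <= k or n2 <= k:
        raise ValueError("each subsample needs more than k observations")
    rss_u = rss_n1 + rss_n2   # separate regressions = unrestricted model
    rss_r = rss_pooled        # pooled regression = restricted model
    r = k                     # one restriction per coefficient
    df_u = n1 + n2 - 2 * k
    return ((rss_r - rss_u) / r) / (rss_u / df_u)


# Hypothetical RSS values and sample sizes (not the lecture's example).
f_chow = chow_first_test(rss_n1=40.0, rss_n2=50.0, rss_pooled=120.0,
                         n1=30, n2=36, k=3)
print(f_chow)  # (30 / 3) / (90 / 60), approximately 6.667
```

The resulting value is compared with the F distribution on (k, n_1 + n_2 - 2k) degrees of freedom.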

43 (b) Chow’s Second Test
Use where n_2 < k (i.e. when you have insufficient observations in the 2nd subsample to do Chow’s 1st test)
(1) Run the regression on the first set of data (i = 1, 2, 3, …, n_1) & let its RSS be RSS_{n1}
(2) Run the regression on the two sets of data combined (i = 1, …, end of data) & let its RSS be RSS_{n1+n2}

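44 (3) Compute RSS_U, RSS_R, r and df_U:
– RSS_U = RSS_{n1}
– RSS_R = RSS_{n1+n2}
– r = n_2
– df_U = n_1 - k
(4) Use RSS_U, RSS_R, r and df_U to calculate F using the general formula for F and find the significance level:
$$F = \frac{(RSS_R - RSS_U)/r}{RSS_U/df_U} = \frac{(RSS_{n_1+n_2} - RSS_{n_1})/n_2}{RSS_{n_1}/(n_1 - k)}$$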
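Chow's second test can be sketched in the same way. Again the function name and the numbers are hypothetical, and the two RSS values are assumed to come from regressions that have already been run.

```python
def chow_second_test(rss_n1, rss_pooled, n1, n2, k):
    """F-ratio for Chow's second test (second subsample too small to fit)."""
    if n1 <= k:
        raise ValueError("first subsample needs more than k observations")
    if n2 <= 0:
        raise ValueError("n2 must be positive")
    rss_u = rss_n1       # first-subsample regression = unrestricted model
    rss_r = rss_pooled   # pooled regression = restricted model
    r = n2               # one restriction per extra observation
    df_u = n1 - k
    return ((rss_r - rss_u) / r) / (rss_u / df_u)


# Hypothetical values: only 2 observations after the suspected break, k = 4.
f_chow2 = chow_second_test(rss_n1=40.0, rss_pooled=52.0, n1=24, n2=2, k=4)
print(f_chow2)  # (12 / 2) / (40 / 20) = 3.0
```

Here the value is compared with the F distribution on (n_2, n_1 - k) degrees of freedom.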

45 Example of Chow’s 1 st Test: n 1 : before 1986: n 2 : 1986 and after


48 Summary: (1) Testing a set of linear restrictions – the general case (2) Testing homogenous Restrictions (3) Testing for a relationship – Special Case of Homogenous Restrictions (4) Testing for Structural Breaks

49 Reading Kennedy (1998) “A Guide to Econometrics”, Chapters 4 and 6 Maddala, G.S. (1992) “Introduction to Econometrics” p