The Multiple Regression Model.

The Multiple Regression Model

Two Explanatory Variables
yt = β1 + β2xt2 + β3xt3 + εt
The x's affect yt separately: ∂yt/∂xt2 = β2 and ∂yt/∂xt3 = β3. But least squares estimation of β2 now depends upon both xt2 and xt3.
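As a minimal sketch of this two-variable model, the Python code below fits yt = β1 + β2xt2 + β3xt3 + εt by least squares on simulated data; the sample size, variable names, and parameter values are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Illustrative data (not from the text): T = 52 observations.
rng = np.random.default_rng(0)
T = 52
x2 = rng.uniform(4, 7, T)      # e.g. a price-like regressor
x3 = rng.uniform(1, 4, T)      # e.g. an advertising-like regressor
eps = rng.normal(0, 2, T)      # errors assumed N(0, sigma^2)
y = 100.0 - 6.0 * x2 + 3.0 * x3 + eps

# Design matrix with a column of ones for the intercept (x_t1 = 1).
X = np.column_stack([np.ones(T), x2, x3])

# Least squares: b = (X'X)^{-1} X'y, computed via a stable solver.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("b1, b2, b3 =", b)
```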

Correlated Variables
yt = β1 + β2xt2 + β3xt3 + εt, where yt = output, xt2 = capital, xt3 = labor.
Suppose there are always 5 workers per machine. If the number of workers per machine is never varied, it becomes impossible to tell whether the machines or the workers are responsible for changes in output.

The General Model
yt = β1 + β2xt2 + β3xt3 + . . . + βKxtK + εt
The parameter β1 is the intercept (constant) term. The variable attached to β1 is xt1 = 1. Usually, the number of explanatory variables is said to be K − 1 (ignoring xt1 = 1), while the number of parameters is K (namely β1, . . . , βK).

Statistical Properties of εt
1. E(εt) = 0
2. var(εt) = σ²
3. cov(εt, εs) = 0 for t ≠ s
4. εt ~ N(0, σ²)

Statistical Properties of yt
1. E(yt) = β1 + β2xt2 + . . . + βKxtK
2. var(yt) = var(εt) = σ²
3. cov(yt, ys) = cov(εt, εs) = 0 for t ≠ s
4. yt ~ N(β1 + β2xt2 + . . . + βKxtK, σ²)

Assumptions
1. yt = β1 + β2xt2 + . . . + βKxtK + εt
2. E(yt) = β1 + β2xt2 + . . . + βKxtK
3. var(yt) = var(εt) = σ²
4. cov(yt, ys) = cov(εt, εs) = 0 for t ≠ s
5. The values of xtk are not random
6. yt ~ N(β1 + β2xt2 + . . . + βKxtK, σ²)

Least Squares Estimation
yt = β1 + β2xt2 + β3xt3 + εt
Choose β1, β2, β3 to minimize S ≡ S(β1, β2, β3) = Σt=1..T (yt − β1 − β2xt2 − β3xt3)².
Define the deviations from the means: yt* = yt − ȳ, xt2* = xt2 − x̄2, xt3* = xt3 − x̄3.

Least Squares Estimators
b1 = ȳ − b2x̄2 − b3x̄3
b2 = [Σ yt*xt2* Σ xt3*² − Σ yt*xt3* Σ xt2*xt3*] / [Σ xt2*² Σ xt3*² − (Σ xt2*xt3*)²]
b3 = [Σ yt*xt3* Σ xt2*² − Σ yt*xt2* Σ xt2*xt3*] / [Σ xt2*² Σ xt3*² − (Σ xt2*xt3*)²]
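The deviation-from-the-means formulas can be verified numerically. This sketch reuses the simulated y, x2, and x3 from the previous block (illustrative data only) and should reproduce the coefficients returned by the matrix solver.

```python
# Deviations from means (the starred variables on the slide).
ys = y - y.mean()
x2s = x2 - x2.mean()
x3s = x3 - x3.mean()

den = (x2s**2).sum() * (x3s**2).sum() - (x2s * x3s).sum()**2
b2 = ((ys * x2s).sum() * (x3s**2).sum()
      - (ys * x3s).sum() * (x2s * x3s).sum()) / den
b3 = ((ys * x3s).sum() * (x2s**2).sum()
      - (ys * x2s).sum() * (x2s * x3s).sum()) / den
b1 = y.mean() - b2 * x2.mean() - b3 * x3.mean()

# Should match np.linalg.lstsq(X, y) from the previous sketch.
print(b1, b2, b3)
```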

Dangers of Extrapolation Statistical models generally are good only within the relevant range. This means that extending them to extreme data values outside the range of the original data often leads to poor and sometimes ridiculous results. If height is normally distributed and the normal ranges from minus infinity to plus infinity, pity the man minus three feet tall.

Interpretation of Coefficients bj represents an estimate of the mean change in y in response to a one-unit change in xj, holding all other independent variables constant. Hence bj is called a partial coefficient. Note that regression analysis cannot be interpreted as a procedure for establishing a cause-and-effect relationship between variables.

Universal Set
[Venn (Ballentine) diagram: the universal set B represents the total variation in y; separate regions show the variation in x2 net of x3 (x2 | x3), the variation in x3 net of x2 (x3 | x2), and their overlap (x2 ∩ x3).]

Error Variance Estimation
Unbiased estimator of the error variance: σ̂² = Σ ε̂t² / (T − K)
Transform to a chi-square distribution: (T − K) σ̂² / σ² ~ χ²(T−K)
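Continuing the same illustrative example, a short sketch of the error-variance estimator (the names X, y, and b carry over from the earlier blocks):

```python
# sigma^2-hat = sum of squared residuals / (T - K).
T, K = X.shape                  # T observations, K = 3 estimated coefficients
resid = y - X @ b               # least squares residuals (epsilon-hat)
sigma2_hat = (resid**2).sum() / (T - K)
print("sigma^2 hat =", sigma2_hat)
```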

Gauss-Markov Theorem Under the first five assumptions of the multiple regression model, the ordinary least squares estimators have the smallest variance of all linear and unbiased estimators. This means that the least squares estimators are the Best Linear Unbiased Estimators (BLUE).

Variances
yt = β1 + β2xt2 + β3xt3 + εt
var(b2) = σ² / [(1 − r23²) Σ(xt2 − x̄2)²]
var(b3) = σ² / [(1 − r23²) Σ(xt3 − x̄3)²]
where r23 = Σ(xt2 − x̄2)(xt3 − x̄3) / √[Σ(xt2 − x̄2)² Σ(xt3 − x̄3)²]
When r23 = 0 these reduce to the simple regression formulas.

Variance Decomposition
The variance of an estimator is smaller when:
1. The error variance, σ², is smaller: σ² → 0.
2. The sample size, T, is larger: Σt=1..T (xt2 − x̄2)² grows with T.
3. The variable values are more spread out: Σ(xt2 − x̄2)² is larger.
4. The correlation is close to zero: r23² → 0.

Covariances
yt = β1 + β2xt2 + β3xt3 + εt
cov(b2, b3) = −r23 σ² / [(1 − r23²) √Σ(xt2 − x̄2)² √Σ(xt3 − x̄3)²]
where r23 = Σ(xt2 − x̄2)(xt3 − x̄3) / √[Σ(xt2 − x̄2)² Σ(xt3 − x̄3)²]
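To see these formulas at work, the sketch below plugs the deviations x2s, x3s and the estimate sigma2_hat from the earlier blocks (standing in for the unknown σ²) into the variance and covariance expressions:

```python
import numpy as np

# Sample correlation between the two regressors (the r23 on the slide).
r23 = (x2s * x3s).sum() / np.sqrt((x2s**2).sum() * (x3s**2).sum())

var_b2 = sigma2_hat / ((1 - r23**2) * (x2s**2).sum())
var_b3 = sigma2_hat / ((1 - r23**2) * (x3s**2).sum())
cov_b2_b3 = (-r23 * sigma2_hat
             / ((1 - r23**2) * np.sqrt((x2s**2).sum()) * np.sqrt((x3s**2).sum())))
print("r23 =", r23, " var(b2) =", var_b2, " var(b3) =", var_b3, " cov =", cov_b2_b3)
```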

Covariance Decomposition
The covariance between any two estimators is larger in absolute value when:
1. The error variance, σ², is larger.
2. The sample size, T, is smaller.
3. The values of the variables are less spread out.
4. The correlation, r23, is high.

Var-Cov Matrix
yt = β1 + β2xt2 + β3xt3 + εt
The least squares estimators b1, b2, and b3 have the covariance matrix:
cov(b1, b2, b3) = [ var(b1)     cov(b1,b2)  cov(b1,b3)
                    cov(b1,b2)  var(b2)     cov(b2,b3)
                    cov(b1,b3)  cov(b2,b3)  var(b3)  ]
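In matrix form, the entire covariance matrix can be estimated at once as σ̂²(X′X)⁻¹, which is how regression software obtains the standard errors. A sketch, again reusing the illustrative X and sigma2_hat from the blocks above:

```python
import numpy as np

# Estimated covariance matrix of (b1, b2, b3): sigma^2-hat * (X'X)^{-1}.
cov_b = sigma2_hat * np.linalg.inv(X.T @ X)
se_b = np.sqrt(np.diag(cov_b))   # standard errors se(b1), se(b2), se(b3)
print(cov_b)
print("se =", se_b)
```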

Normal
yt = β1 + β2xt2 + β3xt3 + . . . + βKxtK + εt
The assumption εt ~ N(0, σ²) implies, and is implied by, yt ~ N(β1 + β2xt2 + β3xt3 + . . . + βKxtK, σ²).
Since each bk is a linear function of the yt:
bk ~ N(βk, var(bk)), so z = (bk − βk) / √var(bk) ~ N(0, 1) for k = 1, 2, ..., K.

Student-t
Since the population variance of bk, var(bk), is generally unknown, we estimate it by replacing σ² with σ̂² in the variance formula; the square root of this estimate is the standard error se(bk). Then
t = (bk − βk) / se(bk)
has a Student-t distribution with df = (T − K).

Interval Estimation
P(−tc < (bk − βk)/se(bk) < tc) = 1 − α,
where tc is the critical value for (T − K) degrees of freedom such that P(t ≥ tc) = α/2. Rearranging,
P(bk − tc se(bk) < βk < bk + tc se(bk)) = 1 − α,
so the interval endpoints are [bk − tc se(bk), bk + tc se(bk)].
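A brief sketch of the interval computation, reusing b, se_b, T, and K from the blocks above; the 95% level and the choice of b2 are illustrative:

```python
from scipy import stats

alpha = 0.05
tc = stats.t.ppf(1 - alpha / 2, df=T - K)    # critical value with T - K df
k = 1                                        # index of b2 in the vector b
lower = b[k] - tc * se_b[k]
upper = b[k] + tc * se_b[k]
print(f"95% interval estimate for beta_2: [{lower:.3f}, {upper:.3f}]")
```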

Student-t Test
yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + εt
Student-t tests can be used to test any linear combination of the regression coefficients, for example:
H0: β1 = 0
H0: β2 + β3 + β4 = 1
H0: 3β2 − 7β3 = 21
H0: β2 − β3 ≤ 5
Every such t-test has exactly T − K degrees of freedom, where K = the number of coefficients estimated (including the intercept).

One Tail Test
yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + εt
H0: β3 ≤ 0   H1: β3 > 0
t = b3 / se(b3) ~ t(T−K),  df = T − K = T − 4
Reject H0 if t exceeds the right-tail critical value tc.

Two Tail Test
yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + εt
H0: β2 = 0   H1: β2 ≠ 0
t = b2 / se(b2) ~ t(T−K),  df = T − K = T − 4
Reject H0 if |t| exceeds the critical value tc (rejection regions in both tails, below −tc and above tc).
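A sketch of the two-tail test of H0: β2 = 0, with the t-statistic and its p-value computed from the illustrative quantities defined in the earlier blocks:

```python
from scipy import stats

t_stat = b[1] / se_b[1]                           # tests H0: beta_2 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=T - K)   # two-tail p-value
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")     # reject H0 if |t| > tc
```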

Goodness-of-Fit
Coefficient of determination:
R² = SSR/SST = Σt=1..T (ŷt − ȳ)² / Σt=1..T (yt − ȳ)²
0 ≤ R² ≤ 1

Adjusted R-Squared
Adjusted coefficient of determination:
Original:  R² = SSR/SST = 1 − SSE/SST
Adjusted:  R̄² = 1 − [SSE/(T − K)] / [SST/(T − 1)]
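The sketch below computes both measures from the residuals (SSE) and the total variation of y (SST) of the earlier illustrative regression:

```python
# Continuing from the sketches above.
sse = (resid**2).sum()               # sum of squared errors
sst = ((y - y.mean())**2).sum()      # total sum of squares
r2 = 1 - sse / sst
r2_adj = 1 - (sse / (T - K)) / (sst / (T - 1))
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```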

Computer Output
Table 8.2 Summary of Least Squares Results
Variable      Coefficient   Std Error   t-value   p-value
constant        104.79        6.48       16.17     0.000
price           −6.642        3.191      −2.081    0.042
advertising      2.984        0.167      17.868    0.000
For example, t = b2 / se(b2) = −6.642 / 3.191 = −2.081.

Reporting Your Results
Reporting standard errors:
ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
      (6.48)   (3.191)     (0.167)     (s.e.)
Reporting t-statistics:
ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
      (16.17)  (−2.081)    (17.868)    (t)

yt = 1 + 2Xt2 + 3Xt3 + 4Xt4 + εt H0: 2 = 0 H1: 2 = 0 H0: yt = 1 + 3Xt3 + 4Xt4 + εt H1: yt = 1 + 2Xt2 + 3Xt3 + 4Xt4 + εt H0: Restricted Model H1: Unrestricted Model

Single Restriction F-Test
yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + εt
H0: β2 = 0   H1: β2 ≠ 0
Under H0:
F = [(SSER − SSEU)/J] / [SSEU/(T − K)] ~ F(J, T−K)
  = [(1964.758 − 1805.168)/1] / [1805.168/(52 − 3)] = 4.33
dfn = J = 1,  dfd = T − K = 49
For a single restriction, F is by definition the t-statistic squared: t = −2.081, so F = t² = 4.33.
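The same calculation can be reproduced directly from the SSE values quoted on the slide (1964.758 restricted, 1805.168 unrestricted), using scipy for the p-value; this is a numerical check, not part of the original example's code:

```python
from scipy import stats

sse_r, sse_u = 1964.758, 1805.168    # restricted and unrestricted SSE (from the slide)
J, df_denom = 1, 49
F = ((sse_r - sse_u) / J) / (sse_u / df_denom)
p_value = stats.f.sf(F, J, df_denom)
print(f"F = {F:.2f}, p = {p_value:.4f}")   # F is about 4.33 = (-2.081)^2
```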

yt = 1 + 2Xt2 + 3Xt3 + 4Xt4 + εt H0: 2 = 0, 4 = 0 H1: H0 not true H0: yt = 1 + 3Xt3 + εt H1: yt = 1 + 2Xt2 + 3Xt3 + 4Xt4 + εt H0: Restricted Model H1: Unrestricted Model

Multiple Restriction F-Test
yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + εt
H0: β2 = 0, β4 = 0   H1: H0 not true
Under H0:
F = [(SSER − SSEU)/J] / [SSEU/(T − K)] ~ F(J, T−K)
First run the restricted regression (dropping Xt2 and Xt4) to get SSER; then run the unrestricted regression to get SSEU.
dfn = J = 2 (J = the number of restrictions),  dfd = T − K = 49

F-Tests
F = [(SSER − SSEU)/J] / [SSEU/(T − K)] ~ F(J, T−K)
F-tests of this type are always right-tailed, even for left-sided or two-sided hypotheses, because any deviation from the null makes the F value larger (moves it rightward). [Figure: the F(J, T−K) density f(F), with the rejection region of area α to the right of the critical value Fc.]

F-Test of Entire Equation
yt = β1 + β2Xt2 + β3Xt3 + εt
H0: β2 = β3 = 0   H1: H0 not true   (We ignore β1. Why?)
F = [(SSER − SSEU)/J] / [SSEU/(T − K)]
  = [(13581.35 − 1805.168)/2] / [1805.168/(52 − 3)] = 159.828
dfn = J = 2,  dfd = T − K = 49
At α = 0.05 the critical value is F(2, 49, 0.05) = 3.187, so we reject H0.
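A quick check of this arithmetic with the SST and SSE values quoted above, using scipy for the 5% critical value:

```python
from scipy import stats

sse_r, sse_u = 13581.35, 1805.168    # here the restricted SSE equals SST
J, df_denom = 2, 49
F = ((sse_r - sse_u) / J) / (sse_u / df_denom)
Fc = stats.f.ppf(0.95, J, df_denom)  # 5% critical value, about 3.187
print(f"F = {F:.3f} vs Fc = {Fc:.3f} -> reject H0")
```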

ANOVA Table
Table 8.3 Analysis of Variance Table
Source       DF   Sum of Squares   Mean Square   F-Value
Regression    2       11776.18        5888.09     159.828
Error        49        1805.168         36.84
Total        51       13581.35
p-value: 0.0001
R² = SSR/SST = 11776.18/13581.35 = 0.867

Nonsample Information
A certain production process is known to be Cobb-Douglas with constant returns to scale:
ln(yt) = β1 + β2 ln(Xt2) + β3 ln(Xt3) + β4 ln(Xt4) + εt,  where β2 + β3 + β4 = 1.
Substituting β4 = 1 − β2 − β3 gives
ln(yt / Xt4) = β1 + β2 ln(Xt2 / Xt4) + β3 ln(Xt3 / Xt4) + εt,
i.e.  yt* = β1 + β2Xt2* + β3Xt3* + εt.
Run least squares on the transformed model and interpret the coefficients the same way as in the original model.
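A sketch of imposing the constant-returns restriction by this transformation, on made-up Cobb-Douglas data (all names and parameter values here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X2, X3, X4 = rng.uniform(1.0, 10.0, (3, n))           # illustrative inputs
ln_y = (0.5 + 0.3 * np.log(X2) + 0.5 * np.log(X3)
        + 0.2 * np.log(X4) + rng.normal(0, 0.05, n))  # true betas sum to 1

# Transformed (restricted) model: regress ln(y/X4) on ln(X2/X4) and ln(X3/X4).
ystar = ln_y - np.log(X4)
Xstar = np.column_stack([np.ones(n), np.log(X2 / X4), np.log(X3 / X4)])
br, *_ = np.linalg.lstsq(Xstar, ystar, rcond=None)
b4 = 1 - br[1] - br[2]          # recover the fourth coefficient from the restriction
print("b1, b2, b3, b4 =", br[0], br[1], br[2], b4)
```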

Collinear Variables The term "independent variables" means an explanatory variable is independent of the error term, but not necessarily independent of the other explanatory variables. Since economists typically have no control over the implicit experimental design, explanatory variables tend to move together, which often makes sorting out their separate influences rather problematic.

Effects of Collinearity
A high degree of collinearity will produce:
1. no least squares output when collinearity is exact;
2. large standard errors and wide confidence intervals;
3. insignificant t-values even with a high R² and a significant F-value;
4. estimates sensitive to the deletion or addition of a few observations or of insignificant variables.
5. Note, however, that the OLS estimators retain all their desirable properties (BLUE and consistency); the problem is that the inferential procedure may be uninformative.

Identifying Collinearity
Evidence of high collinearity includes:
1. a high pairwise correlation between two explanatory variables (greater than .8 or .9);
2. a high R-squared (call it Rj²) when regressing one explanatory variable, Xj, on the other explanatory variables; the corresponding variance inflation factor VIF(bj) = 1 / (1 − Rj²) is then large (a common rule of thumb is VIF > 10), as computed in the sketch below;
3. a high R² and a statistically significant F-value when the individual t-values are statistically insignificant.
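A minimal sketch of the pairwise correlation and VIF diagnostics, written against the illustrative x2, x3, and X from the first block; the vif helper is a hypothetical convenience function, not from the text:

```python
import numpy as np

def vif(X, j):
    """Variance inflation factor for column j of a design matrix X
    (X is assumed to include a leading column of ones)."""
    others = np.delete(np.arange(X.shape[1]), j)
    coef, *_ = np.linalg.lstsq(X[:, others], X[:, j], rcond=None)
    resid_j = X[:, j] - X[:, others] @ coef
    r2_j = 1 - (resid_j**2).sum() / ((X[:, j] - X[:, j].mean())**2).sum()
    return 1.0 / (1.0 - r2_j)

print("corr(x2, x3) =", np.corrcoef(x2, x3)[0, 1])
print("VIF(b2) =", vif(X, 1), " VIF(b3) =", vif(X, 2))
```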

Mitigating Collinearity
High collinearity is not a violation of any least squares assumption, but rather a lack of adequate information in the sample. Remedies include:
1. Collect more data with better information.
2. Impose economic restrictions as appropriate.
3. Impose statistical restrictions when justified.
4. Delete the variable that is highly collinear with the other explanatory variables.

Prediction
yt = β1 + β2Xt2 + β3Xt3 + εt
Given a set of values for the explanatory variables, (1, X02, X03), the best linear unbiased predictor of y0 is
ŷ0 = b1 + b2X02 + b3X03.
This predictor is unbiased in the sense that the average value of the forecast error is zero.
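Finally, a small sketch of the point prediction ŷ0 = b1 + b2X02 + b3X03, reusing the illustrative coefficient vector b from the first block; the new values X02 and X03 are made up:

```python
import numpy as np

x0 = np.array([1.0, 5.5, 2.0])   # (1, X02, X03): made-up values for the regressors
y0_hat = x0 @ b                  # b1 + b2*X02 + b3*X03, with b from the first sketch
print("predicted y0 =", y0_hat)
```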