Properties of Least Squares Regression Coefficients


Properties of Least Squares Regression Coefficients

In addition to the overall fit of the model, we now need to ask how accurate each individual estimated OLS coefficient is (R² is a summary measure of the model as a whole – it doesn't distinguish between the different variables). We measure the statistical performance of the estimated coefficients using two concepts: bias and efficiency.

To do this we need to make some assumptions about the behaviour of the (true) residual term that underlies our view of the world (the Gauss-Markov assumptions). Unlike the estimated residual, the true residual u is never observed (remember the true state of the world Y = b0 + b1X + u is never observed, only an estimate of it). So all we can ever do is make some assumptions about the behaviour of u.

The 1st assumption is that

1. E(ui) = 0

The expected (average or mean) value of the true residual is assumed to be zero (NOT proved to be equal to zero, unlike the OLS residual). It is sometimes positive, sometimes negative, but there is never any systematic behaviour in this random variable, so that on average its value is zero.

The 2nd assumption about the unknown true residuals is that

2. Var(ui | Xi) = σu² = constant

i.e. the spread of residual values is constant for all X values in the data set (homoskedasticity). Think of a value of the X variable and look at the different values of the residual at this value of X: the distribution of these residual values around this point should be no different from the distribution of residual values at any other value of X. This is a useful assumption, since it implies that no particular value of X carries any more information about the behaviour of Y than any other.

3. Cov(ui, uj) = 0 for all i ≠ j (no autocorrelation)

There is no systematic association between values of the residuals, so knowledge of the value of one residual imparts no information about the value of any other residual.

4. Cov(X, ui) = 0

There is zero covariance (association) between the residual and any value of X, i.e. X and the residual are independent, so no value of X gives any information about the value of any residual. This means that we can distinguish the individual contributions of X and u in explaining Y. (Note that this assumption is automatically satisfied if X is non-stochastic, i.e. non-random, so it can be treated like a constant, measured with certainty.)

Given these 4 assumptions we can proceed to establish the properties of OLS estimates.
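As a quick illustration, here is a minimal Python sketch of a data-generating process that satisfies all four assumptions; the parameter values (b0 = 2, b1 = 0.5, σu = 1.5) are made up for illustration, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters -- illustrative values only
b0, b1 = 2.0, 0.5
N = 200

X = rng.uniform(0, 10, N)    # X drawn independently of u (assumption 4)
u = rng.normal(0, 1.5, N)    # E(u)=0 (1), constant variance (2), independent draws (3)
Y = b0 + b1 * X + u          # the true state of the world; in practice only (X, Y) is observed
```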

The 1st desirable feature of any estimate of any coefficient is that it should, on average, be as accurate an estimate of the true coefficient as possible. Accuracy in this context is measured by the "bias". This means that we would like the expected, or average, value of the estimator to equal the true (unknown) value for the population of interest, i.e. if we continually re-sampled, re-estimated the same model, and plotted the distribution of estimates, then we would expect the mean value of these estimates to equal the true value (which would only be obtained if we sampled everyone in the relevant population). In this case the estimator is said to be "unbiased".

Are OLS estimates biased?

Given a model Y = b0 + b1X + u, we now know the OLS estimate of the slope is calculated as

b̂1 = Cov(X, Y) / Var(X)

Substituting in for Y = b0 + b1X + u and using the rules of covariance (from problem set 0),

Cov(X, Y) = Cov(X, b0 + b1X + u) = Cov(X, b0) + Cov(X, b1X) + Cov(X, u)

Consider the 1st term, Cov(X, b0): since b0 is a constant it does not vary, whatever the value of X, so Cov(X, b0) = 0.

So Cov(X, Y) = 0 + Cov(X, b1X) + Cov(X, u). Consider the 2nd term, Cov(X, b1X): since b1 is a constant we can take it outside the bracket, so Cov(X, b1X) = b1Cov(X, X) (see problem set 0) = b1Var(X) (since Cov(X, X) is just Var(X)). Hence

Cov(X, Y) = b1Var(X) + Cov(X, u)

Substituting this into the OLS slope formula,

b̂1 = Cov(X, Y) / Var(X) = [b1Var(X) + Cov(X, u)] / Var(X)

and using the rules for fractions,

b̂1 = b1 + Cov(X, u) / Var(X)
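A one-off numerical check of this covariance decomposition (a sketch with arbitrary simulated values; the sample-moment identity holds exactly, up to floating point):

```python
import numpy as np

rng = np.random.default_rng(1)
b0, b1 = 2.0, 0.5                      # arbitrary illustrative constants
X = rng.uniform(0, 10, 10_000)
u = rng.normal(0, 1.5, 10_000)
Y = b0 + b1 * X + u

lhs = np.cov(X, Y)[0, 1]                            # Cov(X, Y)
rhs = b1 * np.var(X, ddof=1) + np.cov(X, u)[0, 1]   # b1*Var(X) + Cov(X, u)
print(lhs, rhs)                                     # identical up to rounding
```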

Given that

b̂1 = b1 + Cov(X, u) / Var(X)

we now need expected values to establish the extent of any bias. Taking expectations,

E(b̂1) = E(b1) + E[Cov(X, u) / Var(X)]

Since Cov(X, u) is assumed to be 0 (assumption 4), the second term vanishes, and since the remaining term on the right-hand side, b1, is a constant, E(b1) = b1. It follows that

E(b̂1) = b1

so that, on average, the OLS estimate of the slope will be equal to the true (unknown) value, i.e. OLS estimates are unbiased. So we don't need to sample the entire population, since OLS on a sub-sample will give an unbiased estimate of the truth. (Can show the unbiased property also holds for the OLS estimate of the constant – see problem set 2.)
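A small Monte Carlo sketch of this unbiasedness claim (hypothetical true values b0 = 2, b1 = 0.5): re-sample and re-estimate many times, and the average of the slope estimates should sit on top of the true b1.

```python
import numpy as np

rng = np.random.default_rng(2)
b0, b1, N = 2.0, 0.5, 100          # hypothetical true parameters

estimates = []
for _ in range(5000):              # 5000 independent samples from the same model
    X = rng.uniform(0, 10, N)
    u = rng.normal(0, 1.5, N)
    Y = b0 + b1 * X + u
    estimates.append(np.cov(X, Y)[0, 1] / np.var(X, ddof=1))   # OLS slope

print(np.mean(estimates))          # ~0.5: the estimates average out to the true slope
```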

How to obtain estimates by OLS
Goodness of fit measure, R²
Bias & Efficiency of OLS
Hypothesis testing

Precision of OLS Estimates

So this result means that if OLS were done on 100 different (random) samples, we would not expect to get the same result every time – but the average of those estimates would equal the true value. Given 2 (unbiased) estimates, we will prefer the one whose range of estimates is more concentrated around the true value. We measure the efficiency of any estimate by its dispersion, based on the variance (or, more usually, its square root – the standard error). Can show (see Gujarati Chap. 3 for proof) that the variances of the OLS estimates of the intercept and the slope are

Var(b̂0) = σu² * ΣXi² / (N * Σ(Xi − X̄)²)   (1)

Var(b̂1) = σu² / Σ(Xi − X̄)² = σu² / (N * Var(X))   (2)

(where σu² = Var(u) = variance of the true (not estimated) residuals)

This formula makes intuitive sense, since:

1) the variance of the OLS estimate of the slope is proportional to the variance of the residuals, σu² – the more random, unexplained behaviour there is in the population, the less precise the estimates

2) the larger the sample size, N, the lower (the more efficient) the variance of the OLS estimate – more information means the estimates are likely to be more precise

3) the larger the variance in the X variable, the more precise (efficient) the OLS estimates – the more variation in X, the more likely it is to capture any variation in the Y variable
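These three effects can be seen in a short simulation sketch (all parameter values here are made up for illustration): the Monte Carlo spread of the slope estimate falls with N and with the spread of X, and rises with the residual variance.

```python
import numpy as np

rng = np.random.default_rng(3)

def slope_sd(N, x_range, sigma_u, reps=4000):
    """Monte Carlo standard deviation of the OLS slope estimate."""
    est = []
    for _ in range(reps):
        X = rng.uniform(0, x_range, N)
        Y = 2.0 + 0.5 * X + rng.normal(0, sigma_u, N)
        est.append(np.cov(X, Y)[0, 1] / np.var(X, ddof=1))
    return np.std(est)

print(slope_sd(50, 10, 1.5))    # baseline
print(slope_sd(200, 10, 1.5))   # larger N       -> smaller spread
print(slope_sd(50, 40, 1.5))    # more varied X  -> smaller spread
print(slope_sd(50, 10, 4.0))    # noisier u      -> larger spread
```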

Estimating σu²

In practice we never know the variation of the true residuals used in the formula. Can show, however, that an unbiased estimate of σu² is given by

s² = RSS / (N − 2)

which, since the residual sum of squares RSS = Σûi², means, equivalently,

s² = Σûi² / (N − 2)   (learn this equation)

Substituting this into (1) and (2) gives the formula for the precision of the OLS estimates.

So substituting s² into equations (1) and (2) gives the working formulae needed to calculate the precision of the OLS estimates. At the same time it is usual to work with the square roots, to give the standard errors of the estimates (standard deviation refers to the known variance, standard error refers to the estimated variance):

s.e.(b̂0) = √[ s² * ΣXi² / (N * Σ(Xi − X̄)²) ]

s.e.(b̂1) = √[ s² / Σ(Xi − X̄)² ]   (learn this)
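Putting the pieces together, a self-contained Python sketch (simulated data, illustrative values) that computes the OLS estimates, the unbiased residual-variance estimate s², and the two standard errors from the formulae above:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100
X = rng.uniform(0, 10, N)
Y = 2.0 + 0.5 * X + rng.normal(0, 1.5, N)   # hypothetical sample

# OLS estimates
b1_hat = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)
b0_hat = Y.mean() - b1_hat * X.mean()

# Estimated residuals and the unbiased estimate of the residual variance
u_hat = Y - b0_hat - b1_hat * X
s2 = (u_hat ** 2).sum() / (N - 2)           # s² = RSS / (N - 2)

Sxx = ((X - X.mean()) ** 2).sum()           # Σ(Xi - X̄)²
se_b1 = np.sqrt(s2 / Sxx)
se_b0 = np.sqrt(s2 * (X ** 2).sum() / (N * Sxx))
print(b0_hat, se_b0, b1_hat, se_b1)
```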

Gauss-Markov Theorem

Given that Gauss-Markov assumptions 1–4 hold:

1. E(ui) = 0
2. Var(ui | Xi) = σu² = constant
3. Cov(ui, uj) = 0 for all i ≠ j
4. Cov(X, ui) = 0

then we can prove a very important result: OLS estimates are unbiased and have the smallest variance of all (linear) unbiased estimators. There may be other ways of obtaining unbiased estimates, but the OLS estimates will have the smallest standard errors, regardless of the sample size.
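To see the "best" part of the theorem in action, here is a sketch comparing OLS with one alternative linear unbiased estimator – the Wald grouping estimator, which fits the line through the means of the lower and upper halves of the X values (the setup and values are illustrative, not from the lecture). Both average out to the true slope, but the OLS estimates are visibly less dispersed.

```python
import numpy as np

rng = np.random.default_rng(5)
b0, b1, N = 2.0, 0.5, 100
X = np.sort(rng.uniform(0, 10, N))   # X held fixed across replications
half = N // 2

ols, wald = [], []
for _ in range(5000):
    Y = b0 + b1 * X + rng.normal(0, 1.5, N)
    ols.append(np.cov(X, Y)[0, 1] / np.var(X, ddof=1))
    # Wald estimator: slope between the two half-sample means (linear, unbiased)
    wald.append((Y[half:].mean() - Y[:half].mean()) /
                (X[half:].mean() - X[:half].mean()))

print(np.mean(ols), np.mean(wald))   # both ~0.5: unbiased
print(np.var(ols), np.var(wald))     # OLS variance is the smaller of the two
```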