Topic 4: Ordinary Least Squares

Suppose that X is a non-random variable and Y is a random variable that is affected by X in a linear fashion and by a random variable ε with E(ε) = 0. That is, E(Y) = β₁ + β₂X, or Y = β₁ + β₂X + ε.

[Figure slides: a scatter of observed points in the (X, Y) plane, the actual line Y = β₁ + β₂X, and the fitted line Y = b₁ + b₂X. In the final diagram, points A, B and C mark one observation: BC is an error of estimation and AC is the effect of the random factor.]

The ordinary least squares (OLS) estimates are obtained by minimising the sum of the squares of these errors. They are computed from the values of X and the actual values of Y (denoted Yᴬ) as follows:

Error of estimation: e ≡ Yᴬ − Yᴱ, where Yᴱ is the estimated value of Y.
Σe² = Σ[Yᴬ − Yᴱ]²
Σe² = Σ[Yᴬ − (b₁ + b₂X)]²
∂Σe²/∂b₁ = 2Σ[Yᴬ − (b₁ + b₂X)](−1) = 0
∂Σe²/∂b₂ = 2Σ[Yᴬ − (b₁ + b₂X)](−X) = 0

Σ[Y − (b₁ + b₂X)](−1) = 0
−N·Y_MEAN + N·b₁ + b₂·N·X_MEAN = 0
b₁ = Y_MEAN − b₂·X_MEAN ….. (1)

∂Σe²/∂b₂ = 2Σ[Y − (b₁ + b₂X)](−X) = 0
Σ[Y − (b₁ + b₂X)](−X) = 0
b₁·ΣX + b₂·ΣX² = ΣXY ……….. (2)
b₁ = Y_MEAN − b₂·X_MEAN ….. (1)

Solving (1) and (2) gives the estimates below (with the superscript on Y dropped).
β̂₁ = [(ΣY)(ΣX²) − (ΣX)(ΣXY)] / [N·ΣX² − (ΣX)²]
β̂₂ = [N·ΣXY − (ΣX)(ΣY)] / [N·ΣX² − (ΣX)²]

Alternatively,
β̂₁ = Y_MEAN − β̂₂·X_MEAN
β̂₂ = Covariance(X, Y) / Variance(X)
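To make the formulas concrete, here is a minimal Python/NumPy sketch (my own illustration, not part of the slides; the small data set is invented) that computes β̂₁ and β̂₂ from the closed-form expressions above and checks the slope against the covariance/variance form:

```python
import numpy as np

# Hypothetical sample data, assumed purely for illustration.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
N = len(X)

# Closed-form OLS estimates from the normal equations.
b2 = (N * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (N * np.sum(X**2) - np.sum(X)**2)
b1 = Y.mean() - b2 * X.mean()

# Equivalent form: slope = Covariance(X, Y) / Variance(X).
b2_alt = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)

print(b1, b2, b2_alt)  # b2 and b2_alt agree
```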

Two Important Results
(a) Σeᵢ ≡ Σ(Yᵢ − Yᵢᴱ) = 0, and
(b) ΣX₂ᵢeᵢ ≡ ΣX₂ᵢ(Yᵢ − Yᵢᴱ) = 0,
where Yᵢᴱ is the estimated value of Yᵢ and X₂ᵢ is the same as Xᵢ from before.
Proof of (a):
Σ(Yᵢ − Yᵢᴱ) = Σ(Yᵢ − β̂₁ − β̂₂X₂ᵢ)
= ΣYᵢ − Σβ̂₁ − β̂₂ΣX₂ᵢ
= n·Y_MEAN − n·β̂₁ − n·β̂₂·X_MEAN
= n(Y_MEAN − β̂₁ − β̂₂·X_MEAN) = 0   [since β̂₁ = Y_MEAN − β̂₂·X_MEAN]

See the lecture notes for a proof of part (b).
Total sum of squares: TSS ≡ Σ(Yᵢ − Y_MEAN)²
Residual sum of squares: RSS ≡ Σ(Yᵢ − Yᵢᴱ)²
Explained sum of squares: ESS ≡ Σ(Yᵢᴱ − Y_MEAN)²

To prove that TSS = RSS + ESS:
TSS ≡ Σ(Yᵢ − Y_MEAN)²
= Σ{(Yᵢ − Yᵢᴱ) + (Yᵢᴱ − Y_MEAN)}²
= Σ(Yᵢ − Yᵢᴱ)² + Σ(Yᵢᴱ − Y_MEAN)² + 2Σ(Yᵢ − Yᵢᴱ)(Yᵢᴱ − Y_MEAN)
= RSS + ESS + 2Σ(Yᵢ − Yᵢᴱ)(Yᵢᴱ − Y_MEAN)

Σ(Yᵢ − Yᵢᴱ)(Yᵢᴱ − Y_MEAN)
= Σ(Yᵢ − Yᵢᴱ)Yᵢᴱ − Y_MEAN·Σ(Yᵢ − Yᵢᴱ)
= Σ(Yᵢ − Yᵢᴱ)Yᵢᴱ   [by (a) above]
Σ(Yᵢ − Yᵢᴱ)Yᵢᴱ = Σ(Yᵢ − Yᵢᴱ)(β̂₁ + β̂₂Xᵢ)
= β̂₁Σ(Yᵢ − Yᵢᴱ) + β̂₂ΣXᵢ(Yᵢ − Yᵢᴱ) = 0   [by (a) and (b) above]
Hence the cross term vanishes and TSS = RSS + ESS.

R² ≡ ESS/TSS. Since TSS = RSS + ESS, it follows that 0 ≤ R² ≤ 1.
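A short sketch (again my own illustration on invented data, reusing the closed-form estimates from earlier) that verifies TSS = RSS + ESS numerically and computes R²:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
N = len(X)

b2 = (N * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (N * np.sum(X**2) - np.sum(X)**2)
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X                       # fitted values Y_E

TSS = np.sum((Y - Y.mean())**2)           # total sum of squares
RSS = np.sum((Y - Y_hat)**2)              # residual sum of squares
ESS = np.sum((Y_hat - Y.mean())**2)       # explained sum of squares

assert np.isclose(TSS, RSS + ESS)         # the decomposition proved above
R2 = ESS / TSS
print(R2)                                 # lies between 0 and 1
```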

Topic 5: Properties of Estimators

In the discussion that follows, θ̂ is an estimator of the parameter of interest, θ.
Bias of θ̂ ≡ E(θ̂) − θ.
θ̂ is unbiased if Bias of θ̂ = 0.
θ̂ is negatively biased if Bias of θ̂ < 0.
θ̂ is positively biased if Bias of θ̂ > 0.

The mean squared error (MSE) of estimation for θ̂ is given as
MSE of θ̂ ≡ E[(θ̂ − θ)²]
≡ E[{θ̂ − E(θ̂) + E(θ̂) − θ}²]
≡ E[{θ̂ − E(θ̂)}²] + E[{E(θ̂) − θ}²] + 2E[{θ̂ − E(θ̂)}·{E(θ̂) − θ}]
≡ Var(θ̂) + {E(θ̂) − θ}² + 2E[{θ̂ − E(θ̂)}·{E(θ̂) − θ}]

Now, because E(θ̂) − θ is a constant, it can be taken outside the expectation:
E[{θ̂ − E(θ̂)}·{E(θ̂) − θ}] ≡ {E(θ̂) − E(θ̂)}·{E(θ̂) − θ} ≡ 0·{E(θ̂) − θ} = 0
Therefore,
MSE of θ̂ ≡ Var(θ̂) + {E(θ̂) − θ}²
MSE of θ̂ ≡ Var(θ̂) + (bias)².
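The decomposition can be checked by simulation. The sketch below (an invented example: a deliberately shrunken sample mean, used only because its bias is easy to see) estimates Var(θ̂), bias², and MSE and confirms that MSE ≈ Var + bias²:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 2.0                      # true parameter, assumed for the demo
n, reps = 50, 20000

# A deliberately biased estimator: shrink the sample mean toward zero.
estimates = np.array([0.9 * rng.normal(theta, 1.0, n).mean() for _ in range(reps)])

bias = estimates.mean() - theta
var = estimates.var()
mse = np.mean((estimates - theta)**2)

print(mse, var + bias**2)        # the two numbers should be very close
```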

If θ̂ is unbiased, that is, if E(θ̂) − θ = 0, then we have MSE of θ̂ ≡ Var(θ̂).
An unbiased estimator θ̂ of a parameter θ is efficient if and only if it has the smallest variance of all unbiased estimators. That is, for any other unbiased estimator p of θ, Var(θ̂) ≤ Var(p).

An estimator θ̂ is said to be consistent if it converges in probability to θ. That is,
lim (n→∞) Prob(|θ̂ − θ| > δ) = 0 for every δ > 0.

When the above condition holds, θ is said to be the probability limit of θ̂; that is, plim θ̂ = θ.
Sufficient conditions for consistency: if the mean of θ̂ converges to θ and Var(θ̂) converges to zero (as n approaches ∞), then θ̂ is consistent.

That is, θ̂ₙ is consistent if it can be shown that
lim (n→∞) E(θ̂ₙ) = θ and lim (n→∞) Var(θ̂ₙ) = 0.
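These sufficient conditions can be illustrated by simulation. The sketch below (my own example, with an assumed normal population) shows that for the sample mean the simulated mean stays at θ while the simulated variance shrinks toward zero as n grows, which is exactly the pattern required for consistency:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 5.0                                   # true parameter, assumed

for n in (10, 100, 1000, 10000):
    # 5000 replications of theta_hat_n = sample mean of n draws
    est = rng.normal(theta, 2.0, size=(5000, n)).mean(axis=1)
    print(n, est.mean(), est.var())           # mean -> theta, variance -> 0
```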

The Regression Model with Two Variables
The model: Y = β₁ + β₂X + ε
Y is the DEPENDENT variable; X is the INDEPENDENT variable.
Equivalently, Yᵢ = β₁X₁ᵢ + β₂X₂ᵢ + εᵢ.

The OLS estimates β̂₁ and β̂₂ are sample statistics used to estimate β₁ and β₂, respectively.
Yᵢ = β₁X₁ᵢ + β₂X₂ᵢ + εᵢ
Here X₁ᵢ ≡ 1 for all i, and X₂ is nothing but X.

Assumptions about X₂:
(1a) X₂ is non-random (chosen by the investigator).
(1b) Random sampling is performed from a population of fixed values of X₂.
(1c) lim (n→∞) (1/n)Σx₂ᵢ² = Q > 0, where x₂ᵢ ≡ X₂ᵢ − X₂_MEAN.
(1d) lim (n→∞) (1/n)ΣX₂ᵢ = P > 0.

Assumptions about the disturbance term ε:
(2a) E(ε) = 0.
(2b) Var(εᵢ) = σ² for all i (homoskedasticity).
(2c) Cov(εᵢ, εⱼ) = 0 for i ≠ j (the ε values are uncorrelated across observations).
(2d) The εᵢ all have a normal distribution.
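As a sketch of what assumptions (1a) and (2a)–(2d) mean in practice, the following lines (parameter values and design entirely assumed) generate data from the two-variable model with fixed X values and i.i.d. normal disturbances:

```python
import numpy as np

rng = np.random.default_rng(42)
beta1, beta2, sigma = 1.0, 0.5, 2.0      # assumed true parameter values
n = 100

X2 = np.linspace(0.0, 10.0, n)           # non-random regressor chosen by the investigator (1a)
eps = rng.normal(0.0, sigma, n)          # E(eps) = 0 (2a), constant variance (2b),
                                         # independent draws (2c), normal (2d)
Y = beta1 + beta2 * X2 + eps             # Y_i = beta1*X_1i + beta2*X_2i + eps_i, with X_1i = 1
```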

Result: β̂₂ is linear in the dependent variable Yᵢ.
β̂₂ = Covariance(X, Y) / Variance(X)
β̂₂ = Σ(Yᵢ − Y_MEAN)(Xᵢ − X_MEAN) / Σ(Xᵢ − X_MEAN)²
Proof:

 ^ 2 =  Y i  X i –X MEAN )  X i –X MEAN ) 2 + K   C i Y i  K where the C i and  K are constants

Therefore, β̂₂ is a linear function of Yᵢ. Since Yᵢ = β₁X₁ᵢ + β₂X₂ᵢ + εᵢ, β̂₂ is also a linear function of εᵢ and hence is normally distributed.

Similarly, β̂₁ is a linear function of Yᵢ (and hence of εᵢ) and is normally distributed.
Both β̂₁ and β̂₂ are unbiased estimators of β₁ and β₂, respectively. That is, E(β̂₁) = β₁ and E(β̂₂) = β₂.

Each of β̂₁ and β̂₂ is an efficient estimator of β₁ and β₂, respectively. Thus each of β̂₁ and β̂₂ is a Best (efficient) Linear (in the dependent variable Yᵢ) Unbiased Estimator of β₁ and β₂, respectively. Each of β̂₁ and β̂₂ is also a consistent estimator of β₁ and β₂, respectively. Also,

Var(β̂₁) = σ²(1/n + X₂_MEAN²/Σx₂ᵢ²)
Var(β̂₂) = σ²/Σx₂ᵢ²
Cov(β̂₁, β̂₂) = −σ²·X₂_MEAN/Σx₂ᵢ²
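A sketch evaluating these three expressions for a fixed design and an assumed σ² (in practice σ² is unknown and is replaced by the estimate σ̂² introduced a few slides below):

```python
import numpy as np

X2 = np.linspace(0.0, 10.0, 100)       # fixed regressor values, assumed
sigma2 = 4.0                           # assumed error variance
n = len(X2)

x2 = X2 - X2.mean()                    # deviations x_2i
Sxx = np.sum(x2**2)

var_b1 = sigma2 * (1.0 / n + X2.mean()**2 / Sxx)
var_b2 = sigma2 / Sxx
cov_b1_b2 = -sigma2 * X2.mean() / Sxx

print(var_b1, var_b2, cov_b1_b2)
```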

lim (n→∞) Var(β̂₂)
= lim (n→∞) σ²/Σx₂ᵢ²
= lim (n→∞) (σ²/n)/(Σx₂ᵢ²/n)
= 0/Q   [using assumption (1c)]
= 0

Because β̂₂ is an unbiased estimator of β₂ and lim (n→∞) Var(β̂₂) = 0, β̂₂ is a consistent estimator of β₂.

The variance of the random term, σ², is not known. To perform statistical analysis, we estimate σ² by σ̂² ≡ RSS/(n − 2). The divisor n − 2 is used because it makes σ̂² an unbiased estimator of σ².
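Putting the last few slides together, here is a minimal end-to-end sketch (simulated data and all variable names are my own assumptions) that estimates σ² by RSS/(n − 2) and uses it to form standard errors for β̂₁ and β̂₂:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
X = np.linspace(0.0, 10.0, n)                      # non-random regressor (assumed design)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 2.0, n)        # assumed true model for the demo

x = X - X.mean()
b2 = np.sum((Y - Y.mean()) * x) / np.sum(x**2)     # OLS slope
b1 = Y.mean() - b2 * X.mean()                      # OLS intercept

resid = Y - (b1 + b2 * X)
sigma2_hat = np.sum(resid**2) / (n - 2)            # unbiased estimator of sigma^2: RSS/(n-2)

se_b1 = np.sqrt(sigma2_hat * (1.0 / n + X.mean()**2 / np.sum(x**2)))
se_b2 = np.sqrt(sigma2_hat / np.sum(x**2))

print(b1, se_b1, b2, se_b2)
```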