Applied Econometrics Second edition


Applied Econometrics Second edition Dimitrios Asteriou and Stephen G. Hall

MISSPECIFICATION
1. Omitting Influential or Including Non-Influential Explanatory Variables
2. Various Functional Forms
3. Measurement Errors
4. Tests for Misspecification
5. Approaches in Choosing an Appropriate Model

Learning Objectives
1. Understand the various forms of possible misspecification in the CLRM.
2. Appreciate the importance and learn the consequences of omitting influential variables in the CLRM.
3. Distinguish among the wide range of functional forms and understand the meaning and interpretation of their coefficients.
4. Understand the importance of measurement errors in the data.
5. Perform misspecification tests using econometric software.
6. Understand the meaning of nested and non-nested models.
7. Be familiar with the concept of data mining and choose an appropriate econometric model.

Omitting Influential Variables Omitting influential variables from a regression model causes these variables to become part of the error term, so one or more of the assumptions of the CLRM will be violated. Consider the population regression function: Y = β1 + β2X2 + β3X3 + u where β2 ≠ 0 and β3 ≠ 0, and assume that this is the correct model.

Omitting Influential Variables Suppose, however, that we estimate Y = β1 + β2X2 + u where X3 is wrongly omitted. The error term of this equation is then u = β3X3 + e. It is clear that the assumption that the error term has a zero mean is now violated: E(u) = E(β3X3 + e) = E(β3X3) + E(e) = E(β3X3) ≠ 0

Omitting Influential Variables Furthermore, if the excluded variable X3 happens to be correlated with X2, then the error term is no longer independent of X2. As a result, the OLS estimators of β1 and β2 are biased and inconsistent. This is called omitted variable bias.
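The bias is easy to see in a small simulation (a sketch, not part of the slides; the data and coefficient values are made up). We generate data from the correct model, with X3 correlated with X2, and then estimate both the correct and the misspecified equation by OLS:

```python
import numpy as np

# Hypothetical simulation: the true model is Y = 1 + 2*X2 + 3*X3 + u,
# with X3 correlated with X2. Omitting X3 biases the OLS slope on X2.
rng = np.random.default_rng(0)
n = 5000
x2 = rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)        # X3 correlated with X2
y = 1 + 2 * x2 + 3 * x3 + rng.normal(size=n)

def ols(dep, X):
    """OLS coefficients of dep on X (X includes a constant column)."""
    return np.linalg.lstsq(X, dep, rcond=None)[0]

const = np.ones(n)
b_full = ols(y, np.column_stack([const, x2, x3]))   # correct model
b_short = ols(y, np.column_stack([const, x2]))      # X3 wrongly omitted

print(b_full[1])   # close to the true beta2 = 2
print(b_short[1])  # biased upward: roughly 2 + 3*0.8 = 4.4
```

With X3 omitted, the slope on X2 picks up part of the effect of X3 (the true β3 times the regression slope of X3 on X2), which is exactly the omitted-variable bias.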

Including Non-Influential Variables This is the opposite case. The correct model is: Y = β1 + β2X2 + u but we estimate: Y = β1 + β2X2 + β3X3 + e where X3 is wrongly included in the model.

Including Non-Influential Variables Since X3 does not belong to the correct model, its population coefficient should be equal to zero (i.e. β3 = 0). If β3 = 0, none of the CLRM assumptions is violated, and the OLS estimators are both unbiased and consistent. However, they are no longer efficient: if X2 is correlated with X3, an additional, unnecessary element of multicollinearity is introduced, which inflates the variances of the estimates.
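A small Monte Carlo sketch (again hypothetical, not from the slides) illustrates the efficiency loss: over repeated samples the estimate of β2 stays centred on the true value whether or not the irrelevant X3 is included, but its sampling spread grows sharply when the collinear X3 is in the model.

```python
import numpy as np

# Hypothetical Monte Carlo: the true model contains only X2, but one of the
# two regressions wrongly includes X3, which is highly collinear with X2.
rng = np.random.default_rng(6)

def slope_estimates(include_x3, reps=500, n=100):
    est = []
    for _ in range(reps):
        x2 = rng.normal(size=n)
        x3 = 0.9 * x2 + 0.2 * rng.normal(size=n)  # irrelevant but collinear
        y = 1 + 2 * x2 + rng.normal(size=n)       # true model: beta3 = 0
        cols = [np.ones(n), x2] + ([x3] if include_x3 else [])
        b = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)[0]
        est.append(b[1])
    return np.array(est)

correct = slope_estimates(False)
overfit = slope_estimates(True)
print(correct.mean(), overfit.mean())  # both close to the true value 2
print(overfit.std() > correct.std())   # True: efficiency is lost
```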

Omission and Inclusion at the Same Time In this case the correct model is: Y = β1 + β2X2 + β3X3 + v but we estimate: Y = β1 + β2X2 + β4X4 + w This double mistake combines both problems: X3 is wrongly omitted (biased and inconsistent estimators) and X4 is wrongly included (loss of efficiency).

The Plug-in Solution Sometimes we face omitted variable bias because a key variable that affects Y is simply not available. For example, consider a model in which the monthly salary of an individual is associated with: whether the individual is male or female; and the years he/she has spent in education.

The Plug-in Solution Both of these factors can be quantified and included in the model. However, if we also assume that the salary level is affected by the socio-economic environment in which each person was brought up, this background is hard to measure and therefore hard to include in the model: (salary) = β1 + β2(sex) + β3(educ) + β4(background) + u

The Plug-in Solution Not including the background variable in the model leads to biased estimates of β2 and β3. Our main interest, however, is to get appropriate estimates of those two coefficients (we do not care that much about β4, because we will never get an appropriate coefficient for background anyway). One way to resolve this is to include a proxy variable in place of the omitted variable.

The Plug-in Solution For this example, what we can use is family income. Family income is of course not exactly what we mean by background, but it is a variable that is highly correlated with it.

The Plug-in Solution To illustrate this, consider the model: Y = β1 + β2X2 + β3X3 + β4X*4 + u where X2 and X3 are observed but X*4 is unobserved. Suppose, though, that we have a proxy X4 satisfying X*4 = δ1 + δ2X4 + e. The error term e is included because the two variables are not exactly the same, and the intercept δ1 allows them to be measured on different scales. We need the proxy and the omitted variable to be positively correlated (i.e. δ2 > 0).

The Plug-in Solution So we estimate: Y = β1 + β2X2 + β3X3 + β4(δ1 + δ2X4 + e) + u = a1 + β2X2 + β3X3 + a4X4 + w where a1 = β1 + β4δ1, a4 = β4δ2 and w = u + β4e. By estimating this model we do not get unbiased estimates of β1 and β4, but we do get unbiased estimators of a1, β2, β3 and a4, which is enough, since β2 and β3 are the coefficients of interest.

Various Functional Forms
Linear: Y = β1 + β2X2
Linear-Log: Y = β1 + β2lnX2
Reciprocal: Y = β1 + β2(1/X2)
Quadratic: Y = β1 + β2X2 + β3X2²
Interaction: Y = β1 + β2X2 + β3X2Z
Log-Linear: lnY = β1 + β2X2
Double Log: lnY = β1 + β2lnX2
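Every form in the table is linear in the parameters, so each can be estimated by OLS after transforming the data. A minimal sketch with hypothetical data, fitting the double-log form, whose slope coefficient is interpreted as an elasticity:

```python
import numpy as np

# Hypothetical data generated from a double-log relationship:
# lnY = 0.5 + 1.5*lnX2 + noise. OLS on the transformed data recovers it.
rng = np.random.default_rng(1)
x2 = rng.uniform(1, 10, size=1000)
y = np.exp(0.5 + 1.5 * np.log(x2) + 0.1 * rng.normal(size=1000))

X = np.column_stack([np.ones_like(x2), np.log(x2)])  # regress lnY on lnX2
b = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
print(b)  # roughly [0.5, 1.5]; a 1% rise in X2 raises Y by about 1.5%
```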

The Box-Cox Transformation The choice of functional form plays an important role; thus, we need a formal test for comparing alternative models (functional forms). If the two models have the same dependent variable, things are easy: estimate both models and choose the one with the higher R2. However, if the dependent variables are different (e.g. Y versus lnY), a direct comparison is impossible, because the TSS, and hence the R2, are not comparable.

The Box-Cox Transformation Assume we have these two models: Y = β1 + β2X2 and lnY = β1 + β2lnX2 In such cases we need to rescale the Y variable in such a way that the two models become comparable. The procedure that does this is called the Box-Cox transformation.

The Box-Cox Transformation
Step 1: Obtain the geometric mean of the sample Y values: Y′ = (Y1·Y2·Y3·…·Yn)^(1/n) = exp[(1/n)ΣlnYi].
Step 2: Transform the sample Y values by dividing each of them by the Y′ obtained in step 1: Y* = Yi/Y′.
Step 3: Estimate both models with Y* as the dependent variable. The equation with the lower RSS is preferred.
Step 4: To check whether it is significantly better, calculate (n/2)ln(RSS1/RSS2), where RSS1 is the higher and RSS2 the lower of the two RSS values, and compare the result with the chi-square critical value with one degree of freedom.
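The four steps above can be sketched on simulated data (hypothetical values, not from the slides). The data are generated from the double-log model, so that form should be preferred:

```python
import numpy as np

# Hypothetical data from the double-log model lnY = 0.2 + 2*lnX2 + noise.
rng = np.random.default_rng(2)
n = 500
x2 = rng.uniform(1, 10, size=n)
y = np.exp(0.2 + 2.0 * np.log(x2) + 0.05 * rng.normal(size=n))

# Step 1: geometric mean of the Y values
g = np.exp(np.mean(np.log(y)))
# Step 2: rescale Y
y_star = y / g

def rss(dep, regressor):
    X = np.column_stack([np.ones(n), regressor])
    b = np.linalg.lstsq(X, dep, rcond=None)[0]
    e = dep - X @ b
    return e @ e

# Step 3: estimate both functional forms with the rescaled Y
rss_linear = rss(y_star, x2)                  # Y*  = b1 + b2*X2
rss_loglog = rss(np.log(y_star), np.log(x2))  # lnY* = b1 + b2*lnX2

# Step 4: (n/2)*ln(higher RSS / lower RSS), compared with chi-square(1)
stat = (n / 2) * np.log(max(rss_linear, rss_loglog) / min(rss_linear, rss_loglog))
print(rss_loglog < rss_linear)  # True: the double-log form fits better
print(stat > 3.84)              # True: significantly better at the 5% level
```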

Measurement Errors Sometimes the data are not measured accurately. We can have measurement errors in the dependent variable, in the explanatory variables, or in both. If the error is in the dependent variable, the OLS coefficients have larger variances; this is unavoidable. If it is in the explanatory variables, we get biased and inconsistent estimators, i.e. totally wrong results.

Tests for Misspecification We have the following tests:
Test for normality of the residuals
The Ramsey RESET test
Tests for non-nested models

Normality of Residuals
Step 1: Calculate the Jarque-Bera (JB) statistic (reported by EViews).
Step 2: Find the chi-square critical value (two degrees of freedom) from the corresponding tables.
Step 3: If JB > the chi-square critical value, reject the null hypothesis of normality.
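The JB statistic from step 1 can be computed by hand (EViews reports it automatically): JB = (n/6)·(S² + (K−3)²/4), where S is the skewness and K the kurtosis of the residuals; under normality JB follows a chi-square(2) distribution. A sketch on two hypothetical stand-ins for regression residuals:

```python
import numpy as np

# JB = (n/6)*(S^2 + (K-3)^2/4); the two samples below are hypothetical
# stand-ins for regression residuals.
def jarque_bera(resid):
    n = len(resid)
    e = resid - resid.mean()
    s2 = np.mean(e**2)
    skew = np.mean(e**3) / s2**1.5
    kurt = np.mean(e**4) / s2**2
    return (n / 6) * (skew**2 + (kurt - 3) ** 2 / 4)

rng = np.random.default_rng(3)
jb_normal = jarque_bera(rng.normal(size=2000))       # normal "residuals"
jb_skewed = jarque_bera(rng.exponential(size=2000))  # skewed "residuals"

# Steps 2-3: compare each with the chi-square(2) 5% critical value, 5.99
print(jb_normal, jb_skewed)  # the skewed sample gives a far larger JB
```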

The Ramsey RESET Test
Step 1: Estimate the model that we think is correct and obtain the fitted values of Y; call them Y′.
Step 2: Estimate the model of step 1 again, this time including Y′² and Y′³ as additional explanatory variables.
Step 3: The model in step 1 is the restricted model and the model in step 2 is the unrestricted model. Calculate the F-statistic for these two models.
Step 4: Compare the F-statistic with the F-critical value and conclude (if F-stat > F-crit, we reject the null of correct specification).
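The four steps above, sketched on simulated data (hypothetical values, not from the slides). The true relationship is quadratic, so a RESET test on the linear specification should reject:

```python
import numpy as np

# Hypothetical data: the true model is quadratic, but we first fit a line.
rng = np.random.default_rng(4)
n = 400
x = rng.uniform(0, 5, size=n)
y = 1 + 2 * x + 1.5 * x**2 + rng.normal(size=n)

def fit(dep, X):
    b = np.linalg.lstsq(X, dep, rcond=None)[0]
    fitted = X @ b
    return fitted, np.sum((dep - fitted) ** 2)

# Step 1: estimate the (restricted) linear model, keep the fitted values
X_r = np.column_stack([np.ones(n), x])
fitted, rss_r = fit(y, X_r)

# Step 2: re-estimate, adding fitted^2 and fitted^3 as extra regressors
X_u = np.column_stack([X_r, fitted**2, fitted**3])
_, rss_u = fit(y, X_u)

# Step 3: F-statistic for the q = 2 restrictions
q, k_u = 2, X_u.shape[1]
f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - k_u))

# Step 4: compare with the F critical value (about 3.0 at 5% with these df)
print(f_stat > 3.0)  # True: reject the null of correct specification
```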

Tests for Non-Nested Models If we want to compare models which are not nested, we cannot use the F-statistic approach. Non-nested models are those in which neither equation is a special case of the other; in other words, we do not have restricted and unrestricted models. Suppose, for example, that we have the following: Y = β1 + β2X2 + β3X3 + u (1) Y = β1 + β2lnX2 + β3lnX3 + u (2)

Tests for Non-Nested Models One approach (Mizon and Richard) suggests the estimation of a comprehensive model of the form: Y = δ1 + δ2X2 + δ3X3 + δ4lnX2 + δ5lnX3 + e followed by an F-test for the joint significance of δ4 and δ5, with equation (1) as the restricted model.

Tests for Non-Nested Models A second approach (Davidson and MacKinnon) notes that if model (1) is true, then the fitted values of (2) should be insignificant when added to (1), and vice versa. So they suggest estimating Y = β1 + β2X2 + β3X3 + δY* + e where Y* are the fitted values of model (2). A simple t-test on the coefficient of Y* then settles the matter.
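The Davidson-MacKinnon idea, sketched on simulated data (hypothetical values, not from the slides). The data are generated from the log model (2), so the fitted values Y* of (2) should come out significant when added to model (1):

```python
import numpy as np

# Hypothetical data generated from the log model (2).
rng = np.random.default_rng(5)
n = 500
x2 = rng.uniform(1, 10, size=n)
x3 = rng.uniform(1, 10, size=n)
y = 1 + 2 * np.log(x2) + 3 * np.log(x3) + 0.2 * rng.normal(size=n)

# Fitted values Y* from model (2): regress Y on lnX2 and lnX3
X_log = np.column_stack([np.ones(n), np.log(x2), np.log(x3)])
y_star = X_log @ np.linalg.lstsq(X_log, y, rcond=None)[0]

# Augment model (1) with Y* and t-test its coefficient delta
X_aug = np.column_stack([np.ones(n), x2, x3, y_star])
b = np.linalg.lstsq(X_aug, y, rcond=None)[0]
resid = y - X_aug @ b
sigma2 = resid @ resid / (n - X_aug.shape[1])
cov_b = sigma2 * np.linalg.inv(X_aug.T @ X_aug)
t_star = b[3] / np.sqrt(cov_b[3, 3])
print(abs(t_star) > 1.96)  # True: Y* is significant, evidence against (1)
```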

Choosing the Appropriate Model There are two major approaches:
The traditional view: Average Economic Regressions (AER)
Hendry's general-to-specific approach

Choosing the Appropriate Model The AER approach essentially starts with a simple model and then 'builds up' the model as the situation demands; it is also called the simple-to-specific approach. It has two disadvantages: (a) it suffers from data mining, since only the final model is presented by the researcher; and (b) the alterations to the original model are carried out in an arbitrary manner, based on the beliefs of the researcher.

Choosing the Appropriate Model The Hendry approach starts with a general model that contains, nested within it as special cases, other simpler models, and then uses appropriate tests to narrow the model down to a simpler one. The final model should be:
Data admissible
Consistent with the theory
Based on regressors that are not correlated with the error term
Exhibiting parameter constancy
Exhibiting data coherency
Encompassing, meaning that it includes all possible rival models as special cases