ECON 7710, 2010 10.1 Heteroskedasticity What is heteroskedasticity? What are the consequences? How is heteroskedasticity identified? How is heteroskedasticity.

Presentation transcript:

ECON 7710, Heteroskedasticity
Objectives:
What is heteroskedasticity?
What are the consequences?
How is heteroskedasticity identified?
How is heteroskedasticity corrected?

Main empirical model for Unit 10: $\mathit{foodexp}_i = \beta_0 + \beta_1\,\mathit{income}_i + \varepsilon_i$.
foodexp: family food expenditure
income: family income
Least squares estimates, US data (UE_Tab0301)
Is this the best estimated equation?

The Nature of Heteroskedasticity
In a regression about firms, the same mistake can amount to millions for a small firm and billions for a large one.

Heteroskedasticity is a problem that occurs when the error term does not have a constant variance.
CLRM (classical linear regression model): each error term comes from the same probability distribution.
Assumption CLRM.5 is violated!

Regression model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
Zero mean: $E(\varepsilon_i \mid X_{1i}, X_{2i}) = 0$
Homoskedasticity: $var(\varepsilon_i \mid X_{1i}, X_{2i}) = \sigma^2$
No autocorrelation: $cov(\varepsilon_i, \varepsilon_j \mid X_{1i}, X_{2i}, X_{1j}, X_{2j}) = 0$ for $i \neq j$

[Figure: identical distributions for observations i and j.]

Homoskedasticity
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $var(\varepsilon_i \mid X_i) = \sigma^2$ for all $i$
[Figure: conditional distribution f(Y) of Y at X_1, X_2, X_3, X_4.]

Heteroskedasticity
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $var(\varepsilon_i \mid X_i) = \sigma_i^2$ for all $i$
[Figure: conditional distribution of Y at different values of X.]

Pure heteroskedasticity: different variances of the error term in a correctly specified population regression function (PRF).
Impure heteroskedasticity: different variances of the error term caused by a specification error.

Detecting Heteroskedasticity
2.1 Graphical Method
Plotting foodexp against income (for one regressor)
Example 1: Food expenditure, US data (UE_Tab0301)

Example 1: Food expenditure, US data (UE_Tab0301)
[Figures: residuals $e$ plotted against income; squared residuals $e^2$ plotted against income.]

Example 2: textbook data (Woody3)

Park Test
Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_K X_{Ki} + \varepsilon_i$, $i = 1, \dots, N$ (*)
Suppose it is suspected that $var(\varepsilon_i)$ depends on $Z_i$ in the form
$var(\varepsilon_i) = \sigma_i^2 = \sigma^2 Z_i^{\alpha_1} e^{v_i}$, so that $\ln \sigma_i^2 = \ln \sigma^2 + \alpha_1 \ln Z_i + v_i$.
$H_0$: $\alpha_1 = 0$ (homoskedastic errors); $H_A$: $\alpha_1 \neq 0$ (heteroskedastic errors).

Step 1: Estimate equation (*) with OLS and obtain the residuals $e_i$.
Step 2: Regress the natural log of the squared residuals on the natural log of a possible proportionality factor:
$\ln(e_i^2) = \alpha_0 + \alpha_1 \ln Z_i + v_i$,
where $v_i$ is an error term satisfying all classical assumptions.

Step 3: If the coefficient of $\ln Z$ is significantly different from zero, this suggests a heteroskedastic pattern in the residuals with respect to Z. Otherwise, homoskedastic errors cannot be rejected.
Example 3: Park test, US data (UE_Tab0301). In the regression of $\ln(e^2)$ on $\ln(\mathit{income})$, the slope is marked ** with t = 2.28 and p-value = 0.0284.
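The three Park test steps can be sketched in a few lines of plain Python. This is a minimal illustration on synthetic data, not the UE_Tab0301 results: the helper names `ols_simple` and `park_test`, the sample size, and the data-generating values are all assumptions made for the example, and the model is restricted to one regressor with Z equal to that regressor.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x: returns (intercept, slope, slope_se, residuals)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # estimated error variance
    return intercept, slope, math.sqrt(s2 / sxx), resid


def park_test(x, y, z):
    """Return (alpha1_hat, t_statistic) from the Park regression on ln(z)."""
    # Step 1: estimate the original equation and keep the residuals.
    *_, resid = ols_simple(x, y)
    # Step 2: regress ln(e^2) on ln(Z).
    ln_e2 = [math.log(e * e) for e in resid]
    ln_z = [math.log(zi) for zi in z]
    _, alpha1, se_alpha1, _ = ols_simple(ln_z, ln_e2)
    # Step 3: the t-statistic on ln(Z) is the test statistic.
    return alpha1, alpha1 / se_alpha1


# Synthetic data (assumed): the error standard deviation grows with income.
random.seed(0)
income = [random.uniform(10, 100) for _ in range(500)]
foodexp = [40 + 0.2 * inc + random.gauss(0, 0.05 * inc) for inc in income]

alpha1, t_stat = park_test(income, foodexp, income)
print(f"alpha1 = {alpha1:.3f}, t = {t_stat:.2f}")
```

A large t-statistic on ln(Z) points toward heteroskedasticity related to Z; the conclusion still depends on having chosen a sensible Z.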

Advantages of the Park test:
a. The test is simple.
b. It provides information about the variance structure.
Limitations of the Park test:
a. The distribution of the dependent variable is problematic.
b. It assumes a specific functional form.
c. It does not work when the variance depends on two or more variables.
d. The correct variable with which to order the observations must be identified first.
e. It cannot handle partitioned data.

White's Test
Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$, $i = 1, \dots, N$ (*)
Suppose heteroskedasticity is suspected but its functional form is unknown.
$H_0$: The conditional variance of $\varepsilon_i$ is constant.
$H_A$: The conditional variance of $\varepsilon_i$ is not constant.

Step 1: Estimate equation (*) with OLS and obtain the residuals $e_i$.
Step 2: Regress the squared residuals on all explanatory variables, all cross-product terms, and the square of each explanatory variable:
$e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + v_i$

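Step 3: Test the overall significance of the equation in Step 2.
Statistic: $N R^2 \sim \chi^2_{df}$, where $R^2$ comes from the regression in Step 2 and df = number of regressors in that regression.
Critical value (cv): the $\chi^2_{df}$ critical value at the chosen significance level.
Reject the hypothesis of homoskedasticity if $N R^2 >$ cv.
Example 4: White test, US data (UE_Tab0301). Regression of $e^2$ on income and $\mathit{income}^2$: constant 1924, coefficient on income -7.4, coefficient on $\mathit{income}^2$ marked * ([…]); $R^2$ = […], N = 40, $N R^2$ = […]; cv = $\chi^2(2, 0.01) = 9.21$.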
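A rough sketch of the White test for a model with a single regressor follows, so the Step 2 regression uses X and X squared and df = 2. It uses plain Python on synthetic data; the helpers `solve`, `ols`, and `white_test` and all data-generating values are assumptions for illustration. The 1% critical value 9.21 for df = 2 is the one quoted in Example 4.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """OLS of y on the columns of rows (include 1.0 for the constant).

    Returns (coefficients, residuals, R^2)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    ybar = sum(y) / len(y)
    r2 = 1 - sum(e * e for e in resid) / sum((yi - ybar) ** 2 for yi in y)
    return beta, resid, r2


def white_test(x, y):
    """Return N * R^2 from regressing squared OLS residuals on x and x^2."""
    _, resid, _ = ols([[1.0, xi] for xi in x], y)            # Step 1
    e2 = [e * e for e in resid]
    _, _, r2_aux = ols([[1.0, xi, xi * xi] for xi in x], e2)  # Step 2
    return len(x) * r2_aux                                    # Step 3


# Synthetic data (assumed): error spread proportional to income.
random.seed(1)
income = [random.uniform(1, 10) for _ in range(1000)]
foodexp = [4 + 0.2 * inc + random.gauss(0, 0.1 * inc) for inc in income]

white_stat = white_test(income, foodexp)
CRITICAL_VALUE = 9.21  # chi-square, df = 2, 1% level
print(f"N*R^2 = {white_stat:.1f}, reject homoskedasticity: {white_stat > CRITICAL_VALUE}")
```

With two or more regressors the Step 2 regression would also need the cross-product terms, and df would grow accordingly.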

Advantages of the White test:
a. It does not assume a specific functional form.
b. It is applicable when the variance depends on two or more variables.
Limitations of the White test:
a. It is a large-sample test.
b. It provides no information about the variance structure.
c. It loses many degrees of freedom when there are many regressors.
d. It cannot handle partitioned data.
e. It also captures specification errors.

Consequences of Heteroskedasticity
If heteroskedasticity is present but OLS is used for estimation, how are the OLS estimates affected?
Unaffected: OLS estimators are still linear and unbiased because, on average, overestimates are as likely as underestimates.

OLS estimators are inefficient.
Some fluctuations of the error term are attributed to the variation in the independent variables.
There are other linear and unbiased estimators that have smaller variances than the OLS estimator.
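Both claims (unbiased but inefficient) can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes a simple regression in which the error standard deviation is proportional to X, and it uses a weighted estimator (divide the equation by X, then apply OLS) as the competing linear unbiased estimator. All names and parameter values are made up for the example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx


def weighted_slope(x, y):
    """Estimate b1 in y = b0 + b1*x + e when var(e_i) is proportional to x_i**2.

    Dividing by x gives y/x = b0*(1/x) + b1 + e/x, so b1 is the intercept
    of an OLS regression of y/x on 1/x.
    """
    n = len(x)
    y_star = [yi / xi for xi, yi in zip(x, y)]
    inv_x = [1.0 / xi for xi in x]
    b0 = ols_slope(inv_x, y_star)
    return sum(y_star) / n - b0 * sum(inv_x) / n


random.seed(42)
TRUE_B0, TRUE_B1 = 10.0, 2.0
x = [float(v) for v in range(1, 11)] * 5  # fixed regressor values, N = 50

ols_draws, wls_draws = [], []
for _ in range(500):
    # Heteroskedastic errors: standard deviation equals x_i.
    y = [TRUE_B0 + TRUE_B1 * xi + random.gauss(0, xi) for xi in x]
    ols_draws.append(ols_slope(x, y))
    wls_draws.append(weighted_slope(x, y))

mean_ols = statistics.mean(ols_draws)
var_ols = statistics.variance(ols_draws)
var_wls = statistics.variance(wls_draws)
print(f"mean OLS slope = {mean_ols:.3f} (true value {TRUE_B1})")
print(f"variance of OLS slope = {var_ols:.4f}, weighted estimator = {var_wls:.4f}")
```

The average OLS slope should sit close to the true value, while its sampling variance should exceed that of the weighted estimator, which is the sense in which OLS is inefficient here.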

Unreliable hypothesis testing: the usual OLS standard errors are biased, so test statistics, and therefore testing conclusions, are unreliable.

Remedies
4.1 Heteroskedasticity-Corrected Standard Errors
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
Heteroskedasticity: $var(\varepsilon_i) = \sigma_i^2$
OLS estimators are unbiased.
The standard errors of OLS are biased.

A heteroskedasticity-consistent (HC) standard error is the standard error of an estimated coefficient adjusted for heteroskedasticity.
a. HC standard errors are consistent for any type of heteroskedasticity.
b. Hypothesis tests are valid with HC standard errors in large samples.
c. Typically, HC se > OLS se.

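Example 5: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $var(\varepsilon_i \mid X_i) = \sigma_i^2$.
The conventional OLS variance formula for the slope is incorrect under heteroskedasticity; the correct variance formula depends on the individual variances $\sigma_i^2$.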
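For reference, the two variance expressions for the OLS slope in Example 5 are usually written as below. This is the standard textbook form, given under the assumption that $\hat\beta_1$ denotes the OLS slope and $\bar X$ the sample mean of $X_i$; the notation is not taken from the slide.

```latex
% Conventional formula: valid only when var(eps_i | X_i) = sigma^2 for all i
\operatorname{var}(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{N} (X_i - \bar X)^2}

% Correct formula when var(eps_i | X_i) = sigma_i^2
\operatorname{var}(\hat\beta_1) =
  \frac{\sum_{i=1}^{N} (X_i - \bar X)^2 \, \sigma_i^2}
       {\left[ \sum_{i=1}^{N} (X_i - \bar X)^2 \right]^2}
```

When $\sigma_i^2 = \sigma^2$ for every $i$, the second expression collapses to the first, so the conventional formula is the homoskedastic special case.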

HC estimator of the variance of the slope coefficient in a simple regression model
Example 6: HC standard errors, US data (UE_Tab0301)
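As an illustration, the sketch below computes the conventional and the HC standard error of the slope in a simple regression. It assumes the basic White form of the estimator, which replaces each unknown $\sigma_i^2$ with the squared residual $e_i^2$ and applies no small-sample correction; the slide's own formula is not preserved, so this may differ from it in such a correction. The function name and the synthetic data are assumptions.

```python
import math
import random


def slope_standard_errors(x, y):
    """Return (slope, conventional_se, hc_se) for an OLS regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional formula: s^2 / Sxx with s^2 = SSR / (n - 2).
    se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # HC formula (basic White form): sum(dx_i^2 * e_i^2) / Sxx^2.
    se_hc = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    return slope, se_ols, se_hc


# Synthetic data (assumed): error spread grows with the square of income.
random.seed(7)
income = [random.uniform(1, 10) for _ in range(500)]
foodexp = [4 + 0.2 * inc + random.gauss(0, 0.05 * inc * inc) for inc in income]

b1, se_ols, se_hc = slope_standard_errors(income, foodexp)
print(f"slope = {b1:.3f}, OLS se = {se_ols:.4f}, HC se = {se_hc:.4f}")
```

The coefficient estimate is the same in both cases; only the standard error, and therefore the t-statistic, changes.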

4.2 Weighted Least Squares
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
$E(\varepsilon_i) = 0$, $var(\varepsilon_i) = \sigma_i^2$, $cov(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$
The variance is assumed to be proportional to $Z_i^2$: $\sigma_i^2 = c Z_i^2$

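Step 1: Decide which variable is proportional to the heteroskedasticity.
Step 2: Divide all terms in the original model by that variable (divide by $Z_i$):
$Y_i / Z_i = \beta_0 (1 / Z_i) + \beta_1 (X_{1i} / Z_i) + \beta_2 (X_{2i} / Z_i) + u_i$, where $u_i = \varepsilon_i / Z_i$ and $var(u_i) = c$.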

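Step 3: Run least squares on the transformed model, which has new variables. Note that the transformed model has an intercept only if Z is one of the explanatory variables. For example, if $Z_i = X_{2i}$, then
$Y_i / X_{2i} = \beta_2 + \beta_0 (1 / X_{2i}) + \beta_1 (X_{1i} / X_{2i}) + u_i$,
so the intercept of the transformed model is $\beta_2$.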
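The three WLS steps can be sketched for the simplest case: one regressor X, with Z equal to X. Dividing by X turns the original slope into the intercept of the transformed model and the original intercept into the coefficient on 1/X. The function names and the synthetic data are assumptions made for this illustration.

```python
import random


def ols_simple(x, y):
    """OLS of y on a constant and x: returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def wls_divide_by_x(x, y):
    """WLS for y = b0 + b1*x + e with var(e_i) = c * x_i**2 (Z = x).

    Returns (b0, b1) of the ORIGINAL model."""
    # Step 2: divide every term by Z_i = x_i.
    y_star = [yi / xi for xi, yi in zip(x, y)]
    inv_x = [1.0 / xi for xi in x]
    # Step 3: OLS on the transformed model y/x = b1 + b0*(1/x) + u.
    intercept_star, slope_star = ols_simple(inv_x, y_star)
    # Map back: the transformed slope is b0, the transformed intercept is b1.
    return slope_star, intercept_star


# Synthetic data (assumed): error standard deviation proportional to income.
random.seed(3)
income = [random.uniform(1, 10) for _ in range(200)]
foodexp = [4 + 0.2 * inc + random.gauss(0, 0.1 * inc) for inc in income]

b0_wls, b1_wls = wls_divide_by_x(income, foodexp)
b0_ols, b1_ols = ols_simple(income, foodexp)
print(f"OLS: b0 = {b0_ols:.3f}, b1 = {b1_ols:.3f}")
print(f"WLS: b0 = {b0_wls:.3f}, b1 = {b1_wls:.3f}")
```

Reading the coefficients back in terms of the original model is the step most easily confused, which is why the function returns them already relabelled.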

Example 7: WLS, US data (UE_Tab0301)
What are the values of the estimated coefficients of the original model?
Has the problem of heteroskedasticity been solved?

ECON 7710, 00 11 OLS estimate *** OLS se HC se WLS estimate *** WLS se Comparing different estimates: US data (UE_Tab0301) The WLS estimates have improved upon those of OLS.

Other possibilities
$var(\varepsilon_i) = c Z_i$
$var(\varepsilon_i) = c Z_i^{\delta}$ for some power $\delta$
$var(\varepsilon_i) = c (a_1 X_{1i} + a_2 X_{2i})$

In large samples, HC standard errors are consistent for any type of heteroskedasticity. Confidence intervals and t-tests are valid.

Re-specifying the Regression Model
Use another functional form, e.g., double-log: less variation.
The heteroskedasticity may be impure.
Example 8: US data (UE_Tab0301)
The hypothesis of constant variance can be rejected.

Empirical model: $\mathit{foodexp}_i = \beta_0 + \beta_1\,\mathit{totexp}_i + \varepsilon_i$.
Example 9: India data (Food_India55)
The hypothesis of homoskedasticity can be rejected by the Park and White tests.

[Table: double-log, HC, and WLS estimates.]
Which model is the best?

Other reformulations
E.g., take the average of variables related to the size of the observed units, or add more variables.
Example 10: Data set "Concert", the concert tour of a singer in the US
$\mathit{revenue} = \beta_0 + \beta_1\,\mathit{adv} + \beta_2\,\mathit{stad} + \beta_3\,\mathit{cd} + \beta_4\,\mathit{radio} + \beta_5\,\mathit{weekend} + \varepsilon$.

Remarks:
The variable Z is difficult to identify.
The functional relationship between the error variance and Z is not known.
Use WLS as a last resort.
With correct WLS, we expect the standard errors of the regression coefficients to be smaller than their OLS counterparts.
A log transformation usually reduces the degree of heteroskedasticity.
The hypothesis of homoskedasticity should not be rejected in the new model.

A Complete Example
Sources: Section (pp. 255 – 256), Section 10.5 (pp. 369 – 376)
Empirical regression model: $\mathit{pcon}_i = \beta_0 + \beta_1\,\mathit{reg}_i + \beta_2\,\mathit{tax}_i + \beta_3\,\mathit{uhm}_i + \varepsilon_i$.
$\mathit{pcon}_i$: petroleum consumption in the ith state
$\mathit{reg}_i$: motor vehicle registrations in the ith state ('000)
$\mathit{tax}_i$: the gasoline tax rate in the ith state (cents per gallon)
$\mathit{uhm}_i$: urban highway miles within the ith state

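Equation 1: pcon regressed on reg, tax, and uhm; N = 50, Adj. $R^2$ = […]
constant: marked ***
reg: coefficient -0.061 (se 0.04, VIF 24.3)
tax: negative coefficient, marked *** (se 13.15, VIF 1.1)
uhm: coefficient marked *** (se 10.26, VIF 24.9)
Equation 2: pcon regressed on reg and tax; N = 50, Adj. $R^2$ = […]
constant: marked ***
reg: coefficient marked *** (se 0.012)
tax: negative coefficient, marked *** (se 16.86)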

Graphical investigation

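Park test: regression of $\ln(e^2)$ on $\ln(REG)$; slope marked *** (se 0.3083), $R^2$ = […], N = 50.
White test: regression of $e^2$ on REG, $REG^2$, REG × TAX, TAX, and $TAX^2$; surviving terms: constant 11,098,…, -12.84 on REG × TAX, -237,873 on TAX; $R^2$ = […], N = 50, $N R^2$ = […].
Checking for other specifications: double log, quadratic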

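Equation 2 with HC standard errors: pcon regressed on reg and tax; N = 50, $R^2$ = […]
constant: marked ***
reg: coefficient marked *** (HC se 0.022)
tax: negative coefficient, marked *** (HC se 23.90)
[Equations (4), (5), (6).]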

Selected Exercises
Ch. 10: Q. 1, 3, 4, 5, 8, 10, 12, 14
