Time-series with stationary variables. Dynamic Models, Autocorrelation and Forecasting Adapted from Vera Tabakova’s notes ECON 4551 Econometrics II Memorial University of Newfoundland
9.1 Introduction
9.2 Lags in the Error Term: Autocorrelation
9.3 Estimating an AR(1) Error Model
9.4 Testing for Autocorrelation
9.5 An Introduction to Forecasting: Autoregressive Models
9.6 Finite Distributed Lags
9.7 Autoregressive Distributed Lag Models
The model so far assumes that the observations are not correlated with one another. This is believable if one has drawn a random sample, but less likely if the observations were drawn sequentially in time. Time series observations, which are drawn at regular intervals, usually embody a structure in which time is an important component. Principles of Econometrics, 3rd Edition
If we cannot completely model this structure in the regression function itself, then the remainder spills over into the unobserved component of the statistical model (its error), and this causes the errors to be correlated with one another.
In general, there are many situations in which autocorrelation arises: inertia in a time series; lagged effects of explanatory variables; "massaging" of the original data (interpolation, smoothing when going from monthly to quarterly, etc.); or nonstationarity.
But, as with heteroskedasticity, we should first suspect a specification issue: a wrong functional form or missing variables.
Three ways to view the dynamics:
Figure 9.2(a) Time Series of a Stationary Variable. (We assume stationarity for now.)
Figure 9.2(b) Time Series of a Nonstationary Variable that is ‘Slow Turning’ or ‘Wandering’
Figure 9.2(c) Time Series of a Nonstationary Variable that ‘Trends’
9.2.1 Area Response Model for Sugar Cane (bangla.gdt)
Assume also that the errors follow an AR(1) process, e(t) = ρe(t−1) + v(t) with |ρ| < 1, and that v is well behaved (zero mean, constant variance, uncorrelated), so the error series is stationary.
The mean of e(t) is zero (OK) and its variance is constant (OK), but the covariance between errors at different dates is not zero! Since we stick to stationary series, however, that covariance is still constant over time: it depends only on the lag, not on t.
ρ is also known as the autocorrelation coefficient. Running simple OLS on the sugar cane example, we focus on the lag-one autocorrelation (k = 1).
Figure 9.3 Least Squares Residuals Plotted Against Time. We see "runs" of same-signed residuals, which suggest positive autocorrelation. What if we saw "bouncing" between positive and negative instead? That would suggest negative autocorrelation.
Under the stationary AR(1) model the errors have zero mean and constant variance, var(e(t)) = σv²/(1 − ρ²). Because the errors have zero expectation, the general covariance formula reduces to cov(e(t), e(t−k)) = E[e(t)e(t−k)] = ρ^k σe², so the correlation between errors k periods apart is ρ^k.
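A quick numerical check of these stationary AR(1) moments (a sketch with simulated errors, not from the text): the sample variance should approach σv²/(1 − ρ²), and the sample autocorrelation at lag k should approach ρ^k.

```python
# Simulate AR(1) errors e_t = rho * e_{t-1} + v_t and verify the
# stationary moments: var(e) = sigma_v^2 / (1 - rho^2), corr at lag k = rho^k.
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_v, T = 0.8, 1.0, 200_000

v = rng.normal(0.0, sigma_v, T)
e = np.empty(T)
e[0] = v[0]
for t in range(1, T):
    e[t] = rho * e[t - 1] + v[t]

e = e[1000:]                              # drop burn-in so the series is ~stationary
theory_var = sigma_v**2 / (1 - rho**2)
print(round(e.var(), 2), round(theory_var, 2))   # the two should be close

# sample autocorrelation at lag k vs rho^k
for k in (1, 2, 3):
    r_k = np.corrcoef(e[k:], e[:-k])[0, 1]
    print(k, round(r_k, 2), round(rho**k, 2))
```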
The existence of AR(1) errors implies: The least squares estimator is still a linear and unbiased estimator, but it is no longer best. There is another estimator with a smaller variance. The standard errors usually computed for the least squares estimator are incorrect. Confidence intervals and hypothesis tests that use these standard errors may be misleading.
As with heteroskedastic errors, you can salvage OLS when your data are autocorrelated. You can use an estimator of standard errors, proposed by Newey and West, that is robust to both heteroskedasticity and autocorrelation. This estimator is sometimes called HAC, which stands for heteroskedasticity and autocorrelation consistent.
HAC is not as automatic as the heteroskedasticity-consistent (HC) estimator. With autocorrelation you have to specify how far apart in time the autocorrelation is likely to remain significant. The autocorrelated errors over the chosen time window are averaged in the computation of the HAC standard errors; you must specify how many periods to average over and how much weight to assign each residual in that average. The weighting scheme is called a kernel and the number of errors averaged is called the bandwidth.
Usually, you choose a method of averaging (a Bartlett or Parzen kernel) and a bandwidth (nw1, nw2 or some integer). Often your software (e.g. GRETL) defaults to the Bartlett kernel and a bandwidth computed from the sample size, N. Trade-off: larger bandwidths reduce bias (good) but also precision (bad); smaller bandwidths exclude more of the relevant autocorrelations (hence more bias) but use more observations per average, increasing precision (smaller variance). Choose a bandwidth large enough to contain the largest autocorrelations. The choice ultimately depends on the frequency of observation and on how long your system takes to adjust to shocks.
Once GRETL recognizes that your data are a time series, the robust command automatically applies the HAC estimator of standard errors with the default kernel and bandwidth (or with your customized choices). Remember that there are several choices about how to run these corrections, so different software packages can yield slightly different results. In a way, this correction is just an extension of White's correction for heteroskedasticity: it is based on a formula that has the OLS formula as a special case.
Sugar cane example. The two sets of standard errors, along with the estimated equation, are: The 95% confidence intervals for β₂ are:
However, as in the case of robust estimation under heteroskedasticity, the HAC correction accounts for autocorrelation but does not exploit it. We can get a more precise estimator if we exploit the idea that some observations differ not because of the effect of the regressor values, but because of the effect of the adjacent errors.
Now this transformed nonlinear model has errors that are uncorrelated over time
Solving this nonlinear regression is still based on minimizing the sum of squared errors, but it cannot be done with closed-form formulas now (because of the nonlinearity in the parameters), so we again call it generalised least squares. It can be shown that nonlinear least squares estimation of (9.24) is equivalent to using an iterative generalized least squares estimator called the Cochrane-Orcutt procedure. Details are provided in Appendix 9A.
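The iterative idea behind Cochrane-Orcutt can be sketched by hand (an illustrative implementation on simulated data, not the textbook's code): alternate between estimating ρ from the current OLS residuals and re-running OLS on the quasi-differenced data y(t) − ρy(t−1) = β₁(1 − ρ) + β₂(x(t) − ρx(t−1)) + v(t).

```python
# A minimal Cochrane-Orcutt-style iteration using plain numpy least squares.
import numpy as np

def cochrane_orcutt(y, x, n_iter=20):
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # start from plain OLS
    rho = 0.0
    for _ in range(n_iter):
        e = y - X @ beta                            # current residuals
        rho = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])  # AR(1) coefficient of e
        ys = y[1:] - rho * y[:-1]                   # quasi-differenced y
        xs = x[1:] - rho * x[:-1]                   # quasi-differenced x
        Xs = np.column_stack([np.full(T - 1, 1 - rho), xs])
        beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta, rho

# tiny demo on simulated data with AR(1) errors (true beta = (1, 2), rho = 0.6)
rng = np.random.default_rng(2)
T, rho_true = 500, 0.6
x = rng.normal(size=T)
e = np.empty(T); e[0] = rng.normal()
for t in range(1, T):
    e[t] = rho_true * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e
beta, rho = cochrane_orcutt(y, x)
print(beta, rho)   # roughly [1.0, 2.0] and roughly 0.6
```

The transformed intercept column is (1 − ρ), so the iteration returns β₁ itself rather than β₁(1 − ρ).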
The nonlinear least squares estimator only requires that the errors be stable (not necessarily stationary). Other methods commonly used make stronger demands on the data, namely that the errors be covariance stationary. Furthermore, the nonlinear least squares estimator gives you an unconditional estimate of the autocorrelation parameter and yields a simple t-test of the hypothesis of no serial correlation. Monte Carlo studies show that it performs well in small samples as well.
But nonlinear least squares requires more computational power than linear estimation, …though this is not much of a constraint these days. Nonlinear least squares (and other nonlinear estimators) use numerical methods rather than analytical ones to find the minimum of your sum of squared errors objective function. The routines that do this are iterative.
We should then test the above restrictions (a test of nonlinear restrictions is one possibility).
This restricted model could be estimated with OLS as long as v is well behaved, because there are no nonlinearities. Check that our intuition of dynamics driven by lagged effects of the errors boils down to having lagged effects of the dependent and independent variables!
9.4.1 Residual Correlogram (more practically…)
Figure 9.4 Correlogram for Least Squares Residuals from Sugar Cane Example
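The sample autocorrelations behind such a correlogram are easy to compute by hand; a sketch with simulated residuals standing in for the sugar cane ones, using the approximate 5% significance bounds ±2/√T:

```python
# Residual correlogram: sample autocorrelations r_k with rough 5% bounds.
import numpy as np

def correlogram(e, max_lag=12):
    e = e - e.mean()
    denom = e @ e
    return np.array([(e[k:] @ e[:-k]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(3)
T = 200
ehat = rng.normal(size=T)          # stand-in for regression residuals
r = correlogram(ehat, max_lag=8)
bound = 2 / np.sqrt(T)
for k, rk in enumerate(r, start=1):
    flag = "*" if abs(rk) > bound else " "
    print(f"lag {k}: r = {rk:+.3f} {flag}")   # '*' marks a significant lag
```

With genuinely autocorrelated residuals (as in Figure 9.4), the early lags would fall well outside the ±2/√T bounds.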
For this nonlinear model, then, the residuals should be uncorrelated
Figure 9.5 Correlogram for Nonlinear Least Squares Residuals from Sugar Cane Example with "whitened" errors
Another way to determine whether or not your residuals are autocorrelated is to use an LM (Lagrange multiplier) test. For autocorrelation, this test is based on an auxiliary regression where lagged OLS residuals are added to the original regression equation. If the coefficient on the lagged residual is significant then you conclude that the model is autocorrelated.
We can derive the auxiliary regression for the LM test. The first result from GRETL comes from the rearranged regression format with the residual as the dependent variable, contrary to what the book suggests!
The residuals are centered around zero, so the power of this test comes from the lagged error, which is what we care about… (second result from GRETL).
This was the Breusch-Godfrey LM test for autocorrelation, and its statistic is distributed chi-squared. In STATA:
regress la lp
estat bgodfrey
There are other variants of this test. Note that it is only valid in large samples (theoretically infinitely big ones). GRETL also reports the Ljung-Box Q test, which is useful even if you do not have a regression to work with and just want to analyze a given time series, but it is less powerful than the Breusch-Godfrey test when the null hypothesis of no autocorrelation is false. The number of lags gives you the df for the chi-squared distribution and the smaller df for the F distributions you need. See the appendix for the popular (if infamous!) Durbin-Watson test and its variants.
* LM test
regress la lp
estat bgodfrey

* Note: we set ehat(1) = 0 in order to match the result in the text. In practice this is unnecessary.
predict ehat, residual
gen ehat_1 = L.ehat
replace ehat_1 = 0 in 1
regress ehat lp ehat_1
di e(N)*e(r2)
You should be able to work it out from these notes and by comparing with the STATA code!
Adding lagged values of y can serve to eliminate the error autocorrelation. In general, p should be large enough that v(t) is white noise.
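Estimating an AR(p) model and forecasting from it amounts to OLS on lagged values of y; a sketch with p = 2 on simulated data (the true model is an assumption of the example, not from the text):

```python
# AR(2) by OLS on lags, plus a one-step-ahead forecast
#   y_{T+1} = delta + theta1*y_T + theta2*y_{T-1}.
import numpy as np

rng = np.random.default_rng(6)
T = 500
y = np.empty(T); y[0], y[1] = 0.0, 0.0
for t in range(2, T):                       # true AR(2): theta = (0.5, 0.2)
    y[t] = 1.0 + 0.5 * y[t - 1] + 0.2 * y[t - 2] + rng.normal()

Y = y[2:]                                   # regress y_t on a constant and two lags
X = np.column_stack([np.ones(T - 2), y[1:-1], y[:-2]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
forecast = coef[0] + coef[1] * y[-1] + coef[2] * y[-2]   # one step ahead
print(coef, forecast)
```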
Figure 9.6 Correlogram for Least Squares Residuals from AR(3) Model for Inflation. It helps decide how many lags to include in the model.
This model is just a generalization of the ones previously discussed. In this model you include lags of the dependent variable (autoregressive) and the contemporaneous and lagged values of independent variables as regressors (distributed lags). The acronym is ARDL(p,q), where p is the maximum lag of the dependent variable (the autoregressive part) and q is the maximum lag of the explanatory variable (the distributed lag part).
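Once the lagged columns are constructed, an ARDL model is just OLS; a sketch of ARDL(1,1), y(t) = δ + θ₁y(t−1) + δ₀x(t) + δ₁x(t−1) + v(t), on simulated data (parameter values are an assumption of the example):

```python
# ARDL(1,1) by OLS on hand-built lag columns.
import numpy as np

rng = np.random.default_rng(7)
T = 500
x = rng.normal(size=T)
y = np.empty(T); y[0] = 0.0
for t in range(1, T):   # true parameters: delta=1, theta1=0.5, delta0=2, delta1=-1
    y[t] = 1.0 + 0.5 * y[t - 1] + 2.0 * x[t] - 1.0 * x[t - 1] + rng.normal()

Y = y[1:]
X = np.column_stack([np.ones(T - 1), y[:-1], x[1:], x[:-1]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(coef)   # roughly [1.0, 0.5, 2.0, -1.0]
```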
Figure 9.7 Correlogram for Least Squares Residuals from Finite Distributed Lag Model
Figure 9.8 Correlogram for Least Squares Residuals from Autoregressive Distributed Lag Model
Figure 9.9 Distributed Lag Weights for Autoregressive Distributed Lag Model
Keywords: autocorrelation, autoregressive distributed lag models, autoregressive error, autoregressive model, correlogram, delay multiplier, distributed lag weight, dynamic models, finite distributed lag, forecast error, forecasting, HAC standard errors, impact multiplier, infinite distributed lag, interim multiplier, lag length, lagged dependent variable, LM test, nonlinear least squares, sample autocorrelation function, standard error of forecast error, total multiplier, N×R² form of LM test
Appendix 9A (details of the Cochrane-Orcutt/GLS derivation): equations (9A.1)–(9A.6).
Appendix 9B: equations (9B.1)–(9B.3).
Figure 9A.1
The Durbin-Watson test used to be the standard and is exact (it works in small samples). It comes out in almost every software package's default output and is easy to calculate, but the critical values depend on the values of the regressors, so unless you have a computer to look up the exact distribution you are stuck. Durbin and Watson did provide upper and lower bounding distributions, but then the test may end up being inconclusive.
Figure 9A.2
The Durbin-Watson bounds test can be used when exact critical values are unavailable, but the test may then be inconclusive. The size of the inconclusive region shrinks as the sample size grows.
The Durbin-Watson test used to be the standard, but it cannot handle lagged values of the dependent variable (it is biased in such autoregressive models, so that autocorrelation is underestimated). For large samples one can instead compute the asymptotically normal h-statistic or Durbin's alternative test statistic, but the Breusch-Godfrey test is then much more powerful.
The Durbin-Watson test used to be the standard, but it also assumes normality of the errors and it cannot handle more than one lag (it detects only AR(1) autocorrelation).
Also remember that the Durbin-Watson test assumes that there is an intercept in the model, that the X regressors are nonstochastic, and that there are no missing values in the series.
Appendix 9C: equations (9C.1)–(9C.6).
Appendix 9D: equations (9D.1)–(9D.2).
Figure 9A.3: Exponential Smoothing Forecasts for two alternative values of α