STATIONARY PROCESSES A very simple example: the AR(1) process $X_t = \beta_2 X_{t-1} + \varepsilon_t$, where $|\beta_2| < 1$ and $\varepsilon_t$ is iid (independently and identically distributed) with zero mean and finite variance.
STATIONARY PROCESSES The figure shows 50 realizations of the process. The AR(1) process is said to be stationary: the distribution of $X_t$ across the different realizations does not depend on time.
STATIONARY PROCESSES Conditions for weak stationarity:
1. The population mean of the distribution is independent of time. Intuition: this fails if the process tends to go up or down with time (e.g., if there is a time trend).
2. The population variance of the distribution is independent of time. Intuition: this fails if different realizations of the process would spread out over time (e.g., a random walk).
3. The population covariance between its values at any two time points depends only on the distance between those points, and not on time.
STATIONARY PROCESSES Example: let's check the expectation for an AR(1) process. Substituting repeatedly for the lagged value gives $X_t = \beta_2^t X_0 + \beta_2^{t-1}\varepsilon_1 + \dots + \beta_2 \varepsilon_{t-1} + \varepsilon_t$, so $E(X_t) = \beta_2^t X_0$. $\beta_2^t$ tends to zero as $t$ becomes large since $|\beta_2| < 1$. Hence $\beta_2^t X_0$ will tend to zero and the first condition will still be satisfied, apart from initial effects. Technically, we should say that the process is asymptotically stationary (although you still need to check conditions 2 and 3: the variance and covariance will also tend to a limit). Usually we call asymptotically stationary processes stationary.
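The convergence of the mean and variance can be checked by simulation. The following is a minimal sketch, not part of the slides: the values $\beta_2 = 0.8$, $X_0 = 10$, standard normal innovations, and the helper name `simulate_ar1` are all illustrative assumptions.

```python
import numpy as np

def simulate_ar1(beta2, x0, n_periods, n_paths, rng):
    """Simulate n_paths realizations of X_t = beta2 * X_{t-1} + eps_t."""
    x = np.empty((n_paths, n_periods + 1))
    x[:, 0] = x0
    eps = rng.standard_normal((n_paths, n_periods))
    for t in range(1, n_periods + 1):
        x[:, t] = beta2 * x[:, t - 1] + eps[:, t - 1]
    return x

rng = np.random.default_rng(0)
beta2, x0 = 0.8, 10.0          # assumed values, chosen for illustration
paths = simulate_ar1(beta2, x0, n_periods=100, n_paths=20000, rng=rng)

# Theory: E(X_t) = beta2**t * x0, which tends to 0, and var(X_t) tends to
# sigma^2 / (1 - beta2^2), with sigma^2 = 1 for standard normal innovations.
limit_var = 1.0 / (1.0 - beta2 ** 2)
for t in (1, 5, 100):
    print(t, paths[:, t].mean(), beta2 ** t * x0, paths[:, t].var())
print("limiting variance:", limit_var)
```

The sample mean across realizations should track $\beta_2^t X_0$ and the sample variance should settle near the limiting value once the initial effects have died out.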
NONSTATIONARY PROCESSES What if we take an AR(1) process with $\beta_2 = 1$? This is a random walk, $X_t = X_{t-1} + \varepsilon_t$, and it is no longer stationary. The figure shows an example realization of a random walk for the case where $\varepsilon_t$ has a normal distribution with zero mean and unit variance.
NONSTATIONARY PROCESSES This figure shows the results of a simulation with 50 realizations. The distribution changes as $t$ increases, becoming increasingly spread out, so the process is non-stationary. Its variance depends on time.
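The spreading-out can be reproduced numerically. This is a sketch under assumptions not in the slides: 20,000 realizations are used instead of 50 so that the variance estimates are stable, the innovations are standard normal, and $X_0 = 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_periods = 20000, 100

# Each row is one realization: X_t = X_{t-1} + eps_t with X_0 = 0,
# so X_t is the cumulative sum of the innovations up to time t.
eps = rng.standard_normal((n_paths, n_periods))
walks = np.cumsum(eps, axis=1)

# Variance across realizations at selected dates; column t-1 holds X_t.
cross_var = {t: walks[:, t - 1].var() for t in (25, 50, 100)}
for t, v in cross_var.items():
    print(f"t = {t:3d}: variance across realizations = {v:.1f}")
```

With unit-variance innovations the variance across realizations should be close to $t$ at each date.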
NONSTATIONARY PROCESSES Random walk. The variance of the sum of the innovations is equal to the sum of their individual variances. The covariances are all zero because the innovations are assumed to be generated independently. Hence the population variance of $X_t$ is directly proportional to $t$. The variance of the process depends on time, and does not tend to any finite limit. Note: the shock in each time period is permanently incorporated in the series. There is no tendency for the effects of the shocks to attenuate with time.
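The argument can be written out as a short derivation. The notation $\sigma_\varepsilon^2$ for the innovation variance, and the treatment of $X_0$ as fixed, are assumptions made here for the sketch.

```latex
\begin{align*}
X_t &= X_{t-1} + \varepsilon_t
     = X_0 + \varepsilon_1 + \varepsilon_2 + \dots + \varepsilon_t, \\
\operatorname{var}(X_t)
    &= \sum_{s=1}^{t} \operatorname{var}(\varepsilon_s)
     + 2 \sum_{r < s} \operatorname{cov}(\varepsilon_r, \varepsilon_s)
     = t\,\sigma_\varepsilon^2 + 0
     = t\,\sigma_\varepsilon^2 .
\end{align*}
```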
NONSTATIONARY PROCESSES Random walk with drift: $X_t = \beta_1 + X_{t-1} + \varepsilon_t$. This process is known as a random walk with drift, the drift referring to the systematic change in the expectation from one time period to the next. Now both the expectation and the variance depend on time.
NONSTATIONARY PROCESSES The mean is drifting upwards because the drift parameter $\beta_1$ in $X_t = \beta_1 + X_{t-1} + \varepsilon_t$ has been taken to be positive. If $\beta_1$ were negative, it would be drifting downwards. And, as in the case of the random walk with no drift, the distribution spreads out around its mean.
NONSTATIONARY PROCESSES Deterministic trend. Another nonstationary time series is one possessing a deterministic time trend. It is nonstationary because the expected value of $X_t$ is not independent of $t$.
NONSTATIONARY PROCESSES The figure shows 50 realizations of a variation where the disturbance term is the stationary process $u_t = 0.8 u_{t-1} + \varepsilon_t$. The underlying trend line is shown in white.
NONSTATIONARY PROCESSES Two very common forms of non-stationarity:
Difference-stationarity, I(1): $X_t$ is not stationary, but $\Delta X_t = X_t - X_{t-1}$ is stationary.
Trend-stationarity: $X_t$ is not stationary, but $X_t - \beta_1 - \beta_2 t$ is stationary (for some coefficients).
NONSTATIONARY PROCESSES Difference-stationarity. If a non-stationary process can be transformed into a stationary process by differencing, it is said to be difference-stationary, integrated of order 1, or I(1). A random walk, with or without drift, is an example.
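Differencing a random walk with drift leaves only the drift plus the innovation, which is stationary. A minimal sketch, assuming a drift of 0.5 and standard normal innovations (illustrative values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
beta1, n_periods = 0.5, 5000     # assumed drift and sample length

eps = rng.standard_normal(n_periods)
# Random walk with drift: X_t = beta1 + X_{t-1} + eps_t, starting from X_0 = 0.
x = np.cumsum(beta1 + eps)

# First difference: Delta X_t = X_t - X_{t-1} = beta1 + eps_t.
dx = np.diff(x)

# Compare the two halves of the differenced series: mean and variance
# should be similar in both, unlike for the level series x.
half = len(dx) // 2
print("means:    ", dx[:half].mean(), dx[half:].mean())
print("variances:", dx[:half].var(), dx[half:].var())
```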
NONSTATIONARY PROCESSES Trend-stationarity. A non-stationary time series is described as being trend-stationary if it can be transformed into a stationary process by extracting a time trend. For example, a very simple model with a linear time trend can be detrended by regressing $X_t$ on $t$ and subtracting the fitted trend; the detrended variable is just the residuals from that regression.
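Detrending amounts to one OLS regression on time. The sketch below assumes the trend-stationary model $X_t = \beta_1 + \beta_2 t + \varepsilon_t$ with illustrative parameter values; `np.polyfit` is used only as a convenient way to fit the straight line.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
t = np.arange(1, n + 1)

# Trend-stationary series X_t = beta1 + beta2 * t + eps_t (assumed values).
beta1, beta2 = 2.0, 0.3
x = beta1 + beta2 * t + rng.standard_normal(n)

# Fit the trend by OLS; for degree 1, np.polyfit returns the slope first.
b2, b1 = np.polyfit(t, x, 1)
x_detrended = x - (b1 + b2 * t)   # the residuals from the regression on t

print("fitted slope:", b2)
print("mean of detrended series:", x_detrended.mean())
```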
SPURIOUS REGRESSIONS Spurious regressions with variables possessing deterministic trends. What if you regress $Y_t$ on $X_t$? Intuition: the processes X and Y have nothing to do with one another, except for the fact that both tend to increase (or perhaps decrease) with time. For example, perhaps $X_t$ is the number of calories consumed by an average American in year $t$, and $Y_t$ is the proportion of women who vote. Because both of these have increased steadily over the past few decades, a regression of one on the other might lead one to conclude that giving women the right to vote caused Americans to eat bigger meals. This is spurious: both values are simply increasing with time, which is omitted from the regression.
SPURIOUS REGRESSIONS Spurious regressions with variables possessing deterministic trends. Remedy: regress $Y_t$ on $X_t$ and $t$. This problem is easy to solve. There are two equivalent options. One option is to de-trend the variables, so that you are only looking at fluctuations around the time trend. A second option is to include $t$ in your regression directly, since it is acting as an omitted variable.
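The effect of omitting and then including the time trend can be illustrated with two unrelated trending series. This is a sketch only: the trend coefficients, the sample size of 100, and the helper `ols_tstats` are assumptions made for the illustration.

```python
import numpy as np

def ols_tstats(y, regressors):
    """OLS of y on a constant plus the given regressors; returns (coefs, t-stats)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coefs, coefs / se

rng = np.random.default_rng(4)
n = 100
t = np.arange(1, n + 1, dtype=float)
x = 1.0 + 0.5 * t + rng.standard_normal(n)   # trending series
y = 3.0 + 0.2 * t + rng.standard_normal(n)   # unrelated trending series

_, t_without = ols_tstats(y, [x])      # time trend omitted
_, t_with = ols_tstats(y, [x, t])      # time trend included
print("t-stat on X, trend omitted: ", t_without[1])
print("t-stat on X, trend included:", t_with[1])
```

With the trend omitted, the t statistic on $X_t$ is very large even though the series are unrelated; with $t$ included, it behaves like an ordinary t statistic under a true null.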
SPURIOUS REGRESSIONS Spurious regressions with variables that are random walks. What if you regress $Y_t$ on $X_t$? It turns out that two independent random walks, regressed one on the other, are likely to produce a significant slope coefficient, despite the fact that the series are unrelated. Granger and Newbold generated 100 pairs of random walks, with a sample size of 50 time periods. In 77 of the trials, the slope coefficient was significant at the 5 percent level: a lot of type I errors! Something is wrong with the OLS estimator when the variables are random walks.
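An experiment in the spirit of the one described can be sketched as follows. It is not a replication of the original study: the number of trials (2,000), the seed, and the approximate two-sided 5 percent critical value of 2.01 for 48 degrees of freedom are choices made here.

```python
import numpy as np

def slope_t_stat(y, x):
    """t statistic for the slope in an OLS regression of y on a constant and x."""
    n = len(y)
    xd, yd = x - x.mean(), y - y.mean()
    b2 = (xd @ yd) / (xd @ xd)
    resid = yd - b2 * xd
    se = np.sqrt((resid @ resid) / (n - 2) / (xd @ xd))
    return b2 / se

def rejection_rate(make_series, n_obs, n_trials, rng, crit=2.01):
    """Share of trials in which H0: beta2 = 0 is rejected at roughly the 5% level."""
    rejections = 0
    for _ in range(n_trials):
        x, y = make_series(n_obs, rng), make_series(n_obs, rng)
        rejections += int(abs(slope_t_stat(y, x)) > crit)
    return rejections / n_trials

def iid_noise(n, rng):
    return rng.standard_normal(n)

def random_walk(n, rng):
    return np.cumsum(rng.standard_normal(n))

rng = np.random.default_rng(5)
rate_iid = rejection_rate(iid_noise, 50, 2000, rng)
rate_rw = rejection_rate(random_walk, 50, 2000, rng)
print("iid series:   ", rate_iid)
print("random walks: ", rate_rw)
```

For independent iid series the rejection rate should be close to the nominal 5 percent; for independent random walks it should be many times higher.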
SPURIOUS REGRESSIONS Let's do our own simulation. Our model will be that $Y_t$ is related to $X_t$, and we will regress $Y_t$ on $X_t$, where $Y_t$ and $X_t$ are generated as independent random walks. There is no reason that one should be related to the other. We will examine the distribution of the slope estimate $b_2$. We will see what happens when we test the null hypothesis $H_0: \beta_2 = 0$, knowing that it is true because the series have been generated independently. We will see that the OLS estimator behaves very strangely.
SPURIOUS REGRESSIONS This is what we would expect to see if the dependent variable is independent of the regressor. As the sample size grows, the distribution of the OLS estimator should collapse to a spike at zero, since the estimator is consistent. This is not what we will see in the case of random walks. [Figure: distribution of $b_2$; $Y_t$, $X_t$ both iid N(0,1)]
SPURIOUS REGRESSIONS Look what happens in the random walk regressions. While the distribution of $b_2$ is centred on zero, it refuses to collapse to a spike. The estimator will be unbiased, but inconsistent. Even for very large sample sizes, you are likely to get some estimates that are very far from the true parameter value, which is zero. [Figure: distribution of $b_2$ for T = 25, 50, 100, 200; $Y_t$, $X_t$ both random walks]
SPURIOUS REGRESSIONS This is the first time that we have seen an estimator that is inconsistent because its distribution fails to collapse to a spike. Other estimators we saw were asymptotically biased (the spike was in the wrong place) or inefficient (the spike collapsed slowly).
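The failure to collapse can be seen by tracking the spread of the slope estimate as the sample size grows. This sketch uses the sample sizes T = 25, 50, 100, 200 from the figure; the 2,000 trials per sample size, the seed, and the helper names are assumptions.

```python
import numpy as np

def ols_slope(y, x):
    """Slope from an OLS regression of y on a constant and x."""
    xd = x - x.mean()
    return (xd @ (y - y.mean())) / (xd @ xd)

def slope_spread(make_series, n_obs, n_trials, rng):
    """Standard deviation of the slope estimate across simulated pairs."""
    slopes = [ols_slope(make_series(n_obs, rng), make_series(n_obs, rng))
              for _ in range(n_trials)]
    return float(np.std(slopes))

def iid_noise(n, rng):
    return rng.standard_normal(n)

def random_walk(n, rng):
    return np.cumsum(rng.standard_normal(n))

rng = np.random.default_rng(6)
sizes = (25, 50, 100, 200)
sd_iid = {T: slope_spread(iid_noise, T, 2000, rng) for T in sizes}
sd_rw = {T: slope_spread(random_walk, T, 2000, rng) for T in sizes}

for T in sizes:
    print(f"T = {T:3d}: sd(b2) iid = {sd_iid[T]:.3f}, random walks = {sd_rw[T]:.3f}")
```

For the iid series the spread should shrink roughly in proportion to one over the square root of T, while for the random walks it should stay at about the same level.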
SPURIOUS REGRESSIONS We have misspecified the model. The true DGP for $Y_t$ is the random walk $Y_t = Y_{t-1} + \varepsilon_t$; what we did was regress $Y_t$ on $X_t$ with disturbance term $u_t$, omitting $Y_{t-1}$. By omitting $Y_{t-1}$, we have forced it into the disturbance term. Since $Y_t$ is a random walk, this implies that $u_t$ is also a random walk. This invalidates the OLS procedure for calculating standard errors, which assumes that $u_t$ is generated independently from one observation to the next (no autocorrelation).
SPURIOUS REGRESSIONS In milder cases of autocorrelation, such as AR(1) errors, the standard errors were wrong, and the variance of the estimator was large, but this variance eventually collapsed to zero. In the case of non-stationary disturbances (in this case, a random walk), the problem is much worse. The OLS estimator can even be inconsistent; the variance is large and stays large. Lesson: if the error term is non-stationary, OLS estimates might be inconsistent.
TEST OF NONSTATIONARITY: INTRODUCTION General model and alternatives. The general model is an AR(1) process with a possible time trend. This process is stationary if $-1 < \beta_2 < 1$ and the coefficient of the time trend is zero. Otherwise it is non-stationary. We will exclude $\beta_2 > 1$ because explosive processes are seldom encountered in economics. We will also exclude $\beta_2 < -1$ because it does not make much sense. This will be the motivation for a test for non-stationarity, which is often called a unit root test. We will be more general: in our test we will actually include two lags in the model, and the process is stationary if $-1 < \beta_2 + \beta_3 < 1$; the upper bound can be written as $\beta_2 + \beta_3 - 1 < 0$.
TEST OF NONSTATIONARITY Augmented Dickey-Fuller test. The ADF test involves a one-sided t-test on the coefficient of $Y_{t-1}$. The null hypothesis is non-stationarity; the main condition for stationarity is that this coefficient is negative. Because the regressor is non-stationary under the null, the t-statistic will not have its usual distribution. You need to use Dickey-Fuller critical values. This test has low power: it will be very hard to reject the null if the process is stationary but highly autocorrelated.
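The mechanics of the test can be sketched with plain OLS. The version below is an illustration under assumptions, not the slides' exact specification: the test regression has a constant and one lagged difference but no time trend, and the 5 percent critical value of about -2.86 is the commonly tabulated large-sample Dickey-Fuller value for that case.

```python
import numpy as np

def adf_t_stat(y):
    """t statistic on Y_{t-1} in  dY_t = a + g*Y_{t-1} + c*dY_{t-1} + e_t."""
    dy = np.diff(y)
    lhs = dy[1:]                                   # dY_t
    X = np.column_stack([np.ones(len(lhs)),        # constant
                         y[1:-1],                  # Y_{t-1}
                         dy[:-1]])                 # dY_{t-1}
    coefs, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    resid = lhs - X @ coefs
    sigma2 = resid @ resid / (len(lhs) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coefs[1] / se

rng = np.random.default_rng(7)
n = 500
eps = rng.standard_normal(n)

random_walk = np.cumsum(eps)          # unit root: the null is true
stationary = np.zeros(n)              # AR(1) with coefficient 0.5
for t in range(1, n):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]

CRIT_5PCT = -2.86   # approximate large-sample value (constant, no trend)
t_rw = adf_t_stat(random_walk)
t_stat_ar = adf_t_stat(stationary)
print(f"random walk: t = {t_rw:.2f}, reject unit root: {t_rw < CRIT_5PCT}")
print(f"AR(1), 0.5:  t = {t_stat_ar:.2f}, reject unit root: {t_stat_ar < CRIT_5PCT}")
```

The one-sided comparison rejects non-stationarity only when the t statistic is more negative than the critical value. In practice a tested library routine would normally be preferred to a hand-rolled regression.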
COINTEGRATION In general, a linear combination of two or more time series will be nonstationary if one or more of them is nonstationary. However, if there is a long-run relationship between the time series, the outcome may be different. Perhaps, for example, $X_t$ is a random walk, and $Y_t$ just doubles $X_t$: $Y_t = 2X_t$. Then $Y_t$ is also a random walk, but there is a linear combination of $X_t$ and $Y_t$ that is stationary: $Y_t - 2X_t$. This process is identically zero, which is stationary.
COINTEGRATION We have seen that regressing one non-stationary process on another can lead to problems, i.e., spurious regressions. However, if the two processes are co-integrated, the regression has meaning. How can you tell if you have a meaningful regression? If the processes are co-integrated, the error term $\varepsilon_t$ in the regression will be stationary: although the two processes depend on time (they are non-stationary), they move together in such a way that the appropriate linear combination of them does not.
COINTEGRATION How do we test for co-integration? If the relationship is known (from theory, for example), you just use a standard unit root test on that relationship. If the cointegrating relationship has to be estimated, the test is an indirect one because it must be performed on the residuals from the regression, rather than on the disturbance term. OLS coefficients minimize the sum of the squares of the residuals, and the mean of the residuals is automatically zero, so the residuals will tend to appear more stationary than the underlying errors actually are. To allow for this, we use critical values that are larger in absolute value (more negative) than for the standard unit root test, making it harder to reject the null of non-stationarity.
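The residual-based procedure can be sketched in two steps: fit the candidate relationship by OLS, then apply a Dickey-Fuller regression to the residuals. Everything specific here is an assumption for illustration: the simulated data, the absence of lagged differences in the test regression, and the approximate 5 percent critical value of -3.34 for a two-variable relationship with a constant.

```python
import numpy as np

def df_t_stat(e):
    """t statistic on e_{t-1} in  de_t = g * e_{t-1} + v_t  (no constant)."""
    de, lag = np.diff(e), e[:-1]
    g = (lag @ de) / (lag @ lag)
    resid = de - g * lag
    se = np.sqrt(resid @ resid / (len(de) - 1) / (lag @ lag))
    return g / se

def ols_residuals(y, x):
    """Residuals from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coefs

rng = np.random.default_rng(9)
n = 500
x = np.cumsum(rng.standard_normal(n))                 # random walk
y_coint = 1.0 + 2.0 * x + rng.standard_normal(n)      # co-integrated with x
y_indep = np.cumsum(rng.standard_normal(n))           # unrelated random walk

CRIT_5PCT = -3.34   # approximate; stricter than the ordinary unit root value
t_coint = df_t_stat(ols_residuals(y_coint, x))
t_indep = df_t_stat(ols_residuals(y_indep, x))
print(f"co-integrated pair: t = {t_coint:.2f}, reject: {t_coint < CRIT_5PCT}")
print(f"independent walks:  t = {t_indep:.2f}, reject: {t_indep < CRIT_5PCT}")
```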
COINTEGRATION Example: permanent consumption and permanent income are co-integrated. Neither is stationary, but their difference is.
COINTEGRATION To test for cointegration, it is necessary to evaluate whether the disturbance term is a stationary process. In the example of consumer expenditure and income, it is sufficient to perform a standard ADF unit root test on the difference between the two series, because the relationship is known.
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS In the partial adjustment model, there is some long-run "target" relationship between X and Y, but in the short term, the dependent variable adapts smoothly. So, in the short run, $Y_t$ is a weighted average of the target value, $Y^*_t$, and last period's value, $Y_{t-1}$. Intuition: for example, if I lose my job, I should sell my house and move into a studio, but in reality I move into a smaller house, then a few years later an apartment, and eventually a studio, because I don't want to change my lifestyle so suddenly. I will eventually reach the long-run target.
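In symbols, the model can be sketched as follows. The notation is an assumption made here (the slide equations did not survive): $\lambda$ is the adjustment weight with $0 < \lambda \le 1$, and $\beta_1$, $\beta_2$, $u_t$ belong to the target relationship.

```latex
\begin{align*}
Y^*_t &= \beta_1 + \beta_2 X_t + u_t
      && \text{(long-run target relationship)} \\
Y_t &= \lambda Y^*_t + (1 - \lambda) Y_{t-1}
      && \text{(weighted average of target and last period's value)} \\
    &= \beta_1 \lambda + \beta_2 \lambda X_t + (1 - \lambda) Y_{t-1} + \lambda u_t
      && \text{(equation that can be fitted)}
\end{align*}
```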
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS There are a few problems with fitting this equation.
1. Including a lagged dependent variable introduces finite sample bias.
2. If the disturbance term in the target relationship is autocorrelated, it will be autocorrelated in the fitted relationship. Combined with the lagged dependent variable, this means OLS would yield inconsistent estimates, and you should use an AR(1) estimation method instead.
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS In the case of the adaptive expectations model, Y depends on what we expect X to be next year. For example, the number of employees a firm hires will depend on its expected profits for the coming year. An intelligent firm will adapt its expectation as it obtains new information. The expectation for next year will be a weighted average of this year's actual value and the value that had been expected. We do not observe the firm's expected values. We can substitute repeatedly for the expected value of X next period, resulting in an equation with lagged values of X, enough lags being taken to render negligible the coefficient of the unobservable variable $X^e_{t-s+1}$.
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS A second option is to express Y as a function of X and lagged Y. The disturbance term is a compound of $u_t$ and $u_{t-1}$. Thus if the disturbance term in the original model satisfies the regression model assumptions (is not autocorrelated), the disturbance term in the regression model will be subject to MA(1) autocorrelation (first-order moving average autocorrelation). If you compare the composite disturbance terms for observations $t$ and $t-1$, you will see that they have a component $u_{t-1}$ in common.
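The second option can be derived in a few lines. The notation is assumed for this sketch: $X^e_{t+1}$ is the value of X expected for next period, and $\lambda$ is the weight given to the current actual value.

```latex
\begin{align*}
Y_t &= \beta_1 + \beta_2 X^e_{t+1} + u_t, \qquad
X^e_{t+1} = \lambda X_t + (1 - \lambda) X^e_t \\
Y_t &= \beta_1 + \beta_2 \lambda X_t + (1 - \lambda)\,\beta_2 X^e_t + u_t \\
\beta_2 X^e_t &= Y_{t-1} - \beta_1 - u_{t-1}
      \quad \text{(the original model lagged one period)} \\
Y_t &= \beta_1 \lambda + \beta_2 \lambda X_t + (1 - \lambda) Y_{t-1}
      + u_t - (1 - \lambda) u_{t-1}
\end{align*}
```

The composite disturbance term $u_t - (1-\lambda)u_{t-1}$ shares the component $u_{t-1}$ with its own lagged value, which is the source of the MA(1) autocorrelation.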
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS The combination of (moving-average) autocorrelation and the presence of the lagged dependent variable in the regression model causes a violation of Assumption C.7: the regressor and the disturbance term are correlated, because $u_{t-1}$ is a component of both $Y_{t-1}$ and the composite disturbance term. OLS estimates will be biased and inconsistent. Under these conditions, the other regression model should be used instead (the one without lagged Y).
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS However, look what happens if $u_t$ is AR(1) (instead of i.i.d.). The composite disturbance term at time $t$ then combines a fresh innovation with a multiple of $u_{t-1}$. Under reasonable assumptions, both the AR(1) coefficient and the adaptive expectations weight should lie between 0 and 1. Hence it is possible that the coefficient of $u_{t-1}$ may be small enough for the autocorrelation to be negligible. If that is the case, OLS could be used to fit the regression model after all. Just perform an h test to check that there is no (significant) autocorrelation.
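The composite disturbance term in the AR(1) case can be written out as follows. The symbols are assumptions for this sketch: $\rho$ is the AR(1) coefficient, $\lambda$ the adaptive expectations weight, and $\varepsilon_t$ the innovation in $u_t$.

```latex
\begin{align*}
u_t &= \rho u_{t-1} + \varepsilon_t \\
u_t - (1 - \lambda) u_{t-1}
    &= \rho u_{t-1} + \varepsilon_t - (1 - \lambda) u_{t-1}
     = \varepsilon_t - (1 - \lambda - \rho)\, u_{t-1}
\end{align*}
```

If $\rho$ is close to $1 - \lambda$, the coefficient of $u_{t-1}$ is close to zero and the composite disturbance term is approximately the innovation $\varepsilon_t$ alone.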
COMMON FACTOR TEST If the disturbance term $u$ is an AR(1) process, the model can be rewritten with $Y_t$ depending on $X_t$, $Y_{t-1}$, $X_{t-1}$, and a disturbance term $\varepsilon_t$ that is not subject to autocorrelation. This model is nonlinear in parameters since there is a restriction. It can be thought of as a special case of a more general model involving the same variables.
COMMON FACTOR TEST The restricted version imposes an interpretation on the coefficient of $Y_{t-1}$ that may not be valid. [Slide equations: restricted model; unrestricted model; restriction embodied in the AR(1) process]
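For a model with one explanatory variable, the two specifications can be sketched as below. The notation is assumed here: $\rho$ is the AR(1) coefficient of $u_t$, and the $\lambda_j$ are the free coefficients of the unrestricted model.

```latex
\begin{align*}
\text{Original model:} \quad
  & Y_t = \beta_1 + \beta_2 X_t + u_t, \qquad u_t = \rho u_{t-1} + \varepsilon_t \\
\text{Restricted model:} \quad
  & Y_t = \beta_1 (1 - \rho) + \rho Y_{t-1} + \beta_2 X_t
          - \beta_2 \rho X_{t-1} + \varepsilon_t \\
\text{Unrestricted model:} \quad
  & Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_t
          + \lambda_3 X_{t-1} + \varepsilon_t \\
\text{Restriction:} \quad
  & \lambda_3 = -\lambda_1 \lambda_2
\end{align*}
```

The restricted model follows from subtracting $\rho$ times the lagged original model from the original model, so the coefficient of $X_{t-1}$ is forced to equal minus the product of the coefficients of $Y_{t-1}$ and $X_t$.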
COMMON FACTOR TEST If the original specification has two explanatory variables, the AR(1) special case is again a restricted version of a more general model. In this case, however, the restricted version incorporates two restrictions.
COMMON FACTOR TEST One can, and one should, test the validity of the restrictions. The test is known as the common factor test.
COMMON FACTOR TEST The test involves a comparison of $RSS_R$ and $RSS_U$, the residual sums of squares in the restricted and unrestricted specifications.
COMMON FACTOR TEST $RSS_R$ can never be smaller than $RSS_U$ and it will in practice be greater, because imposing a restriction in general leads to some loss of goodness of fit. The question is whether the loss of goodness of fit is significant.
COMMON FACTOR TEST Because the restrictions are nonlinear, the F test is inappropriate. Instead, we construct a test statistic from $RSS_R$ and $RSS_U$.
COMMON FACTOR TEST Under the null hypothesis that the restrictions are valid, the test statistic has a $\chi^2$ (chi-squared) distribution with degrees of freedom equal to the number of restrictions. It is in principle a large-sample test.
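A numerical sketch of the test follows. Several things are assumptions rather than content from the slides: the statistic is taken to have the usual form $n \log(RSS_R / RSS_U)$, the restricted model is fitted by a simple grid search over the AR(1) coefficient, the data are simulated so that the restriction holds, and 3.84 is the 5 percent critical value of $\chi^2$ with one degree of freedom.

```python
import numpy as np

def ols_rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return resid @ resid

rng = np.random.default_rng(8)
n, rho_true = 300, 0.6
x = rng.standard_normal(n)
eps = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho_true * u[t - 1] + eps[t]      # AR(1) disturbance term
y = 1.0 + 2.0 * x + u

y_t, y_lag, x_t, x_lag = y[1:], y[:-1], x[1:], x[:-1]
ones = np.ones(n - 1)

# Unrestricted model: Y_t on a constant, Y_{t-1}, X_t, X_{t-1}.
rss_u = ols_rss(y_t, np.column_stack([ones, y_lag, x_t, x_lag]))

# Restricted model: for each trial rho, regress (Y_t - rho*Y_{t-1}) on a
# constant and (X_t - rho*X_{t-1}); keep the smallest RSS over the grid.
rss_r = min(
    ols_rss(y_t - rho * y_lag, np.column_stack([ones, x_t - rho * x_lag]))
    for rho in np.linspace(-0.99, 0.99, 199)
)

stat = (n - 1) * np.log(rss_r / rss_u)
CRIT_5PCT = 3.84    # chi-squared, 1 degree of freedom (one restriction)
print(f"RSS_R = {rss_r:.2f}, RSS_U = {rss_u:.2f}, statistic = {stat:.2f}")
print("reject the restriction:", stat > CRIT_5PCT)
```

Because every point on the grid is a special case of the unrestricted model, the restricted residual sum of squares cannot fall below the unrestricted one, so the statistic is never negative.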
COMMON FACTOR TEST Note that by fitting these models we claim to rid our specification of autocorrelation. The h test should confirm this.
Copyright Christopher Dougherty 2000–2011. This slideshow may be freely copied for personal use. 20.02.11