
1 Chapter 6 Autocorrelation

2 What is in this Chapter? How do we detect the problem of autocorrelation?
What are the consequences? What are the solutions?

3 What is in this Chapter? Regarding the problem of detection, we start with the Durbin-Watson (DW) statistic and discuss its several limitations and extensions. We discuss Durbin's h-test for models with lagged dependent variables and tests for higher-order serial correlation. We discuss (in Section 6.5) the consequences of serially correlated errors for OLS estimators.

4 What is in this Chapter? The solutions to the problem of serial correlation are discussed in Section 6.3 (estimation in levels versus first differences), Section 6.9 (strategies when the DW test statistic is significant), and Section 6.10 (trends and random walks). This chapter is very important, and its several ideas have to be understood thoroughly.

5 6.1 Introduction The order of autocorrelation
In the following sections we discuss how to: 1. Test for the presence of serial correlation. 2. Estimate the regression equation when the errors are serially correlated.
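(The slide's formulas were images that did not survive transcription; as a reference point, the errors are said to be autocorrelated of order p when they follow an AR(p) process,)

u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_p u_{t-p} + e_t, \qquad e_t \sim \text{iid}(0, \sigma_e^2)

With p = 1 this is u_t = \rho u_{t-1} + e_t, the first-order case on which most of this chapter concentrates.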

6 6.2 Durbin-Watson Test
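(The equations on the DW slides were images that did not survive transcription; the test statistic, in its standard form for the OLS residuals û_t, is)

d = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2} \approx 2(1 - \hat{\rho})

so d is near 2 when there is no first-order autocorrelation, falls toward 0 under positive autocorrelation, and rises toward 4 under negative autocorrelation; the computed d is compared with the tabulated bounds dL and dU.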

7 6.2 Durbin-Watson Test

8 6.2 Durbin-Watson Test

9 6.2 Durbin-Watson Test

10 6.2 Durbin-Watson Test

11 6.3 Estimation in Levels Versus First Differences
Simple solutions to the serial correlation problem: first differencing. If the DW test rejects the hypothesis of zero serial correlation, what is the next step? In such cases one estimates a regression by transforming all the variables by ρ-differencing (quasi-first differencing) or first differencing, as sketched below.
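(A sketch of the two transformations the slide names, written for the model y_t = α + βx_t + u_t with u_t = ρu_{t-1} + e_t:)

\text{ρ-differencing:}\quad y_t - \rho y_{t-1} = \alpha(1-\rho) + \beta(x_t - \rho x_{t-1}) + e_t

\text{first differencing } (\rho = 1):\quad y_t - y_{t-1} = \beta(x_t - x_{t-1}) + e_t

Note that the intercept drops out when ρ = 1, and the transformed error e_t is serially uncorrelated, so OLS on the transformed data is appropriate.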

12 6.3 Estimation in Levels Versus First Differences

13 6.3 Estimation in Levels Versus First Differences

14 6.3 Estimation in Levels Versus First Differences
When comparing equations in levels and first differences, one cannot compare the R2 values because the explained variables are different. One can compare the residual sums of squares, but only after making a rough adjustment (please refer to p. 231).

15 6.3 Estimation in Levels Versus First Differences

16 6.3 Estimation in Levels Versus First Differences

17 6.3 Estimation in Levels Versus First Differences
Since we have comparable residual sums of squares (RSS), we can get comparable R2 values as well, using the relationship RSS = Syy(1 − R2)

18 6.3 Estimation in Levels Versus First Differences

19 6.3 Estimation in Levels Versus First Differences
Illustrative Examples

20 6.3 Estimation in Levels Versus First Differences

21 6.3 Estimation in Levels Versus First Differences

22 6.3 Estimation in Levels Versus First Differences

23 6.3 Estimation in Levels Versus First Differences

24 6.3 Estimation in Levels Versus First Differences
Usually, with time-series data, one gets high R2 values if the regressions are estimated with the levels yt and xt, but one gets low R2 values if the regressions are estimated in first differences (yt − yt-1) and (xt − xt-1). Since a high R2 is usually considered as proof of a strong relationship between the variables under investigation, there is a strong tendency to estimate the equations in levels rather than in first differences. This is sometimes called the "R2 syndrome."

25 6.3 Estimation in Levels Versus First Differences
However, if the DW statistic is very low, it often implies a misspecified equation, no matter what the value of the R2 is. In such cases one should estimate the regression equation in first differences; if the R2 is then low, this merely indicates that the variables y and x are not related to each other.

26 6.3 Estimation in Levels Versus First Differences
Granger and Newbold present some examples with artificially generated data where y, x, and the error u are each generated independently, so that there is no relationship between y and x. But the correlations between yt and yt-1, xt and xt-1, and ut and ut-1 are very high. Although there is no relationship between y and x, the regression of y on x gives a high R2 but a low DW statistic.

27 6.3 Estimation in Levels Versus First Differences
When the regression is run in first differences, the R2 is close to zero and the DW statistic is close to 2, demonstrating that there is indeed no relationship between y and x and that the R2 obtained earlier is spurious. Thus regressions in first differences might often reveal the true nature of the relationship between y and x. Further discussion of this problem is in Sections 6.10 and 14.7.

28 Homework Find the data
Y is the Taiwan stock index; X is the U.S. stock index. Run two equations: the equation in levels (log-based prices) and the equation in first differences. Compare the two equations on the beta estimate and its significance, the R square, and the value of the DW statistic. Q: adopt the equation in levels or in first differences?

29 6.3 Estimation in Levels Versus First Differences
For instance, suppose that we have quarterly data; then it is possible that the errors in any quarter this year are most highly correlated with the errors in the corresponding quarter last year rather than with the errors in the preceding quarter. That is, ut could be uncorrelated with ut-1 but highly correlated with ut-4. If this is the case, the DW statistic will fail to detect it. What we should be using is a modified statistic, defined as
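(the slide's equation was an image; the standard fourth-order analogue of the DW statistic is)

d_4 = \frac{\sum_{t=5}^{T} (\hat{u}_t - \hat{u}_{t-4})^2}{\sum_{t=1}^{T} \hat{u}_t^2}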

30 6.3 Estimation in Levels Versus First Differences

31 6.4 Estimation Procedures with Autocorrelated Errors

32 6.4 Estimation Procedures with Autocorrelated Errors

33 6.4 Estimation Procedures with Autocorrelated Errors

34 6.4 Estimation Procedures with Autocorrelated Errors

35 6.4 Estimation Procedures with Autocorrelated Errors
GLS (Generalized least squares)
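(The derivation on these slides was an image; a sketch of the GLS transformation for AR(1) errors with known ρ, which is what the Prais-Winsten estimator implements:)

y_1^* = \sqrt{1-\rho^2}\, y_1, \qquad y_t^* = y_t - \rho y_{t-1} \quad (t = 2, \ldots, T)

with the same transformation applied to the regressors (including the constant); OLS on the starred data is GLS and yields efficient estimates.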

36 6.4 Estimation Procedures with Autocorrelated Errors

37 6.4 Estimation Procedures with Autocorrelated Errors
In actual practice ρ is not known. There are two types of procedures for estimating ρ: 1. Iterative procedures. 2. Grid-search procedures. A sketch of each follows.
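A minimal GAUSS sketch of the iterative (Cochrane-Orcutt) procedure, in the style of the chapter's other GAUSS programs; the procedure name, the convergence tolerance, and the assumption that x already includes a constant column are mine, not from the slides:

@ Cochrane-Orcutt: alternate between estimating rho from the OLS
  residuals and re-estimating beta from quasi-differenced data @
proc (2) = co_est(y, x);
    local T, b, e, rho, rho_old, ys, xs;
    T = rows(y);
    b = olsqr(y, x);                          @ step 0: plain OLS @
    rho = 0; rho_old = 1;
    do until abs(rho - rho_old) < 1e-6;
        rho_old = rho;
        e = y - x*b;                          @ current residuals @
        rho = sumc(e[2:T,1].*e[1:T-1,1])/sumc(e[1:T-1,1]^2);
        ys = y[2:T,.] - rho*y[1:T-1,.];       @ quasi-difference @
        xs = x[2:T,.] - rho*x[1:T-1,.];
        b = olsqr(ys, xs);
    endo;
    retp(b, rho);
endp;

Usage: { b, rho } = co_est(y, ones(rows(y),1)~x);. The grid-search (Hildreth-Lu) alternative instead runs the quasi-differenced regression for ρ = -0.99, -0.98, ..., 0.99 and keeps the value with the smallest residual sum of squares, which avoids the possibility that the iterations settle on a local minimum.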

38 6.4 Estimation Procedures with Autocorrelated Errors

39 6.4 Estimation Procedures with Autocorrelated Errors

40 6.4 Estimation Procedures with Autocorrelated Errors

41 6.4 Estimation Procedures with Autocorrelated Errors

42 6.4 Estimation Procedures with Autocorrelated Errors

43 6.4 Estimation Procedures with Autocorrelated Errors

44 Homework Redo the example (see Table 3.11 for the data) in the textbook
OLS; the C-O procedure; the H-L procedure with an interval of 0.01. Compare the R2 values (note: please calculate the comparable R2 from the levels equation).

45 6.5 Effect of AR(1) Errors on OLS Estimates
In Section 6.4 we described different procedures for the estimation of regression models with AR(1) errors We will now answer two questions that might arise with the use of these procedures: 1. What do we gain from using these procedures? 2. When should we not use these procedures?

46 6.5 Effect of AR(1) Errors on OLS Estimates
First, in the case we are considering (i.e., the case where the explanatory variable xt is independent of the error ut), the OLS estimates are unbiased. However, they will not be efficient. Further, the tests of significance we apply, which will be based on the wrong covariance matrix, will be wrong.
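(To quantify the efficiency loss, the standard textbook approximation for the case where x_t is itself autocorrelated: if x_t is AR(1) with parameter r and u_t is AR(1) with parameter ρ, the large-sample variance of the OLS slope is approximately)

\operatorname{var}(\hat{\beta}_{OLS}) \approx \frac{\sigma_u^2}{\sum x_t^2} \cdot \frac{1+\rho r}{1-\rho r}

For ρ = r = 0.8, the usual OLS formula σu²/Σxt² understates the true variance by a factor of roughly 4.6, which is why the reported t-ratios are exaggerated.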

47 6.5 Effect of AR(1) Errors on OLS Estimates
In the case where the explanatory variables include lagged dependent variables, we will have some further problems, which we discuss in Section 6.7. For the present, let us consider the simple regression model

48 6.5 Effect of AR(1) Errors on OLS Estimates

49 6.5 Effect of AR(1) Errors on OLS Estimates

50 6.5 Effect of AR(1) Errors on OLS Estimates

51 6.5 Effect of AR(1) Errors on OLS Estimates

52 6.5 Effect of AR(1) Errors on OLS Estimates

53 6.5 Effect of AR(1) Errors on OLS Estimates

54 6.5 Effect of AR(1) Errors on OLS Estimates

55 6.5 Effect of AR(1) Errors on OLS Estimates

56 An Alternative Method to Prove the Above Characteristics?
Use the "simulation method" shown in Chapter 5: write your program in GAUSS, taking the Chapter 5 program and making some modifications to it.
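A minimal sketch of such a modification; the sample size, ρ, and replication count are illustrative choices of mine, not from the slides:

new; format /m1 /rd 9,3;
@ Monte Carlo: true beta = 0, but both x and u are AR(1);
  count how often the naive OLS t-test rejects at the 5% level @
T = 100; nrep = 1000; rho = 0.9;
rej = 0; j = 1;
do until j > nrep;
    u = zeros(T,1); x = zeros(T,1);
    e = rndn(T,1); ex = rndn(T,1);
    i = 2;
    do until i > T;
        u[i,1] = rho*u[i-1,1] + e[i,1];
        x[i,1] = rho*x[i-1,1] + ex[i,1];
        i = i + 1;
    endo;
    y = u;                                   @ y is unrelated to x @
    b = olsqr(y, x);
    res = y - x*b;
    se = sqrt((sumc(res^2)/(T-1))/sumc(x^2));
    if abs(b/se) > 1.96; rej = rej + 1; endif;
    j = j + 1;
endo;
print " rejection rate (nominal 0.05) "; rej/nrep;

A rejection rate far above 0.05 illustrates point 2 on the next slide: the OLS sampling variances are understated, so t-statistics are exaggerated.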

57 6.5 Effect of AR(1) Errors on OLS Estimates
Thus the consequences of autocorrelated errors are: 1. The least squares estimators are unbiased but are not efficient; sometimes they are considerably less efficient than the procedures that take account of the autocorrelation. 2. The sampling variances are biased and sometimes likely to be seriously understated; thus R2 as well as t and F statistics tend to be exaggerated.

58 6.5 Effect of AR(1) Errors on OLS Estimates

59 6.5 Effect of AR(1) Errors on OLS Estimates
2. The discussion above assumes that the true errors are first-order autoregressive. If they have a more complicated structure (e.g., second-order autoregressive), it might be thought that it would still be better to proceed on the assumption that the errors are first-order autoregressive rather than ignore the problem completely and use the OLS method. Engle shows that this is not necessarily true (i.e., sometimes one can be worse off making the assumption of first-order autocorrelation than ignoring the problem completely).

60 6.5 Effect of AR(1) Errors on OLS Estimates

61 6.5 Effect of AR(1) Errors on OLS Estimates

62 6.5 Effect of AR(1) Errors on OLS Estimates

63 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables
In previous sections we considered explanatory variables that were uncorrelated with the error term. This will not be the case if we have lagged dependent variables among the explanatory variables together with serially correlated errors. There are several situations under which we would be considering lagged dependent variables as explanatory variables; these could arise through expectations, adjustment lags, and so on.

64 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables
The various situations and models are explained in Chapter 10. For the present we will not be concerned with how the models arise; we will merely study the problem of testing for autocorrelation in these models. Let us consider a simple model
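(The model on the slide was an image; in standard notation a simple version, and the h-statistic Durbin proposed for testing ρ = 0 in it, are)

y_t = \alpha + \beta y_{t-1} + \gamma x_t + u_t, \qquad u_t = \rho u_{t-1} + e_t

h = \hat{\rho} \sqrt{\frac{n}{1 - n\,\widehat{\operatorname{var}}(\hat{\beta})}} \;\sim\; N(0,1) \text{ under } H_0: \rho = 0

where \hat{\rho} is estimated from the OLS residuals and \widehat{var}(\hat{\beta}) is the estimated variance of the coefficient on y_{t-1}. The h-test is not computable when n·var̂(β̂) ≥ 1; Durbin's second test (regress û_t on û_{t-1} and all the regressors, and test the coefficient on û_{t-1}) is the fallback.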

65 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

66 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

67 new; format /m1 /rd 9,3;
beta=2;
T=100;                        @ sample size: value assumed; lost on the slide @
u=rndn(T,1);
x=rndn(T,1)+0*u;              @ x is independent of the error u @
y=beta*x+u;
Beta_OLS=olsqr(y,x);          @ OLS is unbiased here: estimate near 2 @
print " OLS beta estimate "; Beta_OLS;

69 new; format /m1 /rd 9,3;
beta=2;
T=100;                        @ sample size: value assumed; lost on the slide @
u=rndn(T,1);
x=rndn(T,1)+0*u;              @ a repeat run: x still independent of u @
y=beta*x+u;
Beta_OLS=olsqr(y,x);
print " OLS beta estimate "; Beta_OLS;

71 new; format /m1 /rd 9,3;
beta=2;
T=100;                        @ sample size: value assumed; lost on the slide @
u=rndn(T,1);
x=rndn(T,1)+0.5*u;            @ x is now correlated with the error u @
y=beta*x+u;
Beta_OLS=olsqr(y,x);          @ OLS is biased upward here @
print " OLS beta estimate "; Beta_OLS;

73 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

74 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

75 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

76 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

77 6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables

78 6.8 A General Test for Higher-Order Serial Correlation: The LM Test
The h-test we have discussed is, like the Durbin-Watson test, a test for first-order autoregression. Breusch and Godfrey discuss some general tests that are easy to apply and are valid for very general hypotheses about the serial correlation in the errors. These tests are derived from a general principle called the Lagrange multiplier (LM) principle. A discussion of this principle is beyond the scope of this book; for the present we will explain what the test is. The test is similar to Durbin's second test that we have discussed.
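(The test's formulas on the following slides were images; as usually implemented, to test for serial correlation up to order p one regresses the OLS residuals û_t on all the original regressors plus û_{t-1}, ..., û_{t-p} and computes)

LM = nR^2 \;\sim\; \chi^2_p \quad \text{under } H_0: \text{no serial correlation}

where n and R² refer to this auxiliary regression; the same statistic is valid whether the alternative is AR(p) or MA(p) errors.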

79 6.8 A General Test for Higher-Order Serial Correlation: The LM Test

80 6.8 A General Test for Higher-Order Serial Correlation: The LM Test

81 6.8 A General Test for Higher-Order Serial Correlation: The LM Test

82 6.8 A General Test for Higher-Order Serial Correlation: The LM Test

83 6.8 A General Test for Higher-Order Serial Correlation: The LM Test

84 6.9 Strategies When the DW Test Statistic is Significant
The DW test is designed as a test for the hypothesis ρ = 0 if the errors follow a first-order autoregressive process However, the test has been found to be robust against other alternatives such as AR(2), MA(1), ARMA(1, 1), and so on. Further, and more disturbingly, it catches specification errors like omitted variables that are themselves autocorrelated, and misspecified dynamics (a term that we will explain). Thus the strategy to adopt, if the DW test statistic is significant, is not clear. We discuss three different strategies:

85 6.9 Strategies When the DW Test Statistic is Significant
1. Assume that the significant DW statistic is an indication of serial correlation but may not be due to AR(1) errors. 2. Test whether the serial correlation is due to omitted variables. 3. Test whether the serial correlation is due to misspecified dynamics.

86 6.9 Strategies When the DW Test Statistic is Significant

87 6.9 Strategies When the DW Test Statistic is Significant

88 6.9 Strategies When the DW Test Statistic is Significant

89 6.9 Strategies When the DW Test Statistic is Significant

90 6.9 Strategies When the DW Test Statistic is Significant

91 6.9 Strategies When the DW Test Statistic is Significant
Serial correlation due to misspecified dynamics

92 6.9 Strategies When the DW Test Statistic is Significant

93 6.9 Strategies When the DW Test Statistic is Significant

94 6.9 Strategies When the DW Test Statistic is Significant

95 6.9 Strategies When the DW Test Statistic is Significant

96 6.10 Trends and Random Walks

97 6.10 Trends and Random Walks

98 6.10 Trends and Random Walks

99 6.10 Trends and Random Walks

100 6.10 Trends and Random Walks
Both models exhibit a linear trend, but the appropriate method of eliminating the trend differs. To test the hypothesis that a time series belongs to the TSP class against the alternative that it belongs to the DSP class, Nelson and Plosser use a test developed by Dickey and Fuller:
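(The two models were shown as images on the preceding slides; in the usual notation, with e_t stationary:)

\text{TSP (trend-stationary): } y_t = \alpha + \beta t + e_t \quad \text{(detrend by regressing on time)}

\text{DSP (difference-stationary): } y_t = \alpha + y_{t-1} + e_t \quad \text{(detrend by first differencing)}

The Dickey-Fuller regression nests both: estimate y_t = α + βt + ρy_{t-1} + e_t and test H0: ρ = 1 (DSP) against ρ < 1 (TSP), using the Dickey-Fuller critical values rather than the standard t tables.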

101 6.10 Trends and Random Walks

102 6.10 Trends and Random Walks

103 Three Types of RW
RW without drift: Yt = 1*Yt-1 + ut
RW with drift: Yt = alpha + 1*Yt-1 + ut
RW with drift and time trend: Yt = alpha + beta*t + 1*Yt-1 + ut
where ut ~ iid(0, sigma^2)

108 RW or Unit Root Tests in EViews
Additional slides: augmented D-F tests
Yt = a1*Yt-1 + ut
Yt − Yt-1 = (a1 − 1)*Yt-1 + ut
ΔYt = (a1 − 1)*Yt-1 + ut
ΔYt = λ*Yt-1 + ut, where λ = a1 − 1
H0: a1 = 1 ≡ H0: λ = 0
Augmented D-F (add lagged differences to whiten the error): ΔYt = λ*Yt-1 + Σi γi*ΔYt-i + ut

109 6.10 Trends and Random Walks

110 6.10 Trends and Random Walks
As an illustration, consider the example given by Dickey and Fuller. For the logarithm of the quarterly Federal Reserve Board Production Index, they assume that the time series is adequately represented by the model

111 6.10 Trends and Random Walks

112 6.10 Trends and Random Walks

113 6.10 Trends and Random Walks
6. Regression of one random walk on another, with time included for trend, is strongly subject to the spurious regression phenomenon. That is, the conventional t-test will tend to indicate a relationship between the variables when none is present.

114 6.10 Trends and Random Walks
The main conclusion is that using a regression on time has serious consequences when, in fact, the time series is of the DSP type and, hence, differencing is the appropriate procedure for trend elimination. Plosser and Schwert also argue that with most economic time series it is always best to work with differenced data rather than data in levels. The reason is that if indeed the data series are of the DSP type, the errors in the levels equation will have variances increasing over time.

115 6.10 Trends and Random Walks
Under these circumstances many of the properties of least squares estimators, as well as tests of significance, are invalid. On the other hand, suppose that the levels equation is correctly specified. Then all differencing will do is produce a moving-average error, and at worst ignoring it will give inefficient estimates. For instance, suppose that we have the model

116 6.10 Trends and Random Walks

117 6.10 Trends and Random Walks
Differencing and Long-Run Effects: The Concept of Cointegration
One drawback of the procedure of differencing is that it results in a loss of valuable "long-run information" in the data. Recently, the concept of cointegrated series has been suggested as one solution to this problem. First, we need to define the term "cointegration." Although we do not need the assumption of normality and independence, we will define the terms under this assumption.

118 6.10 Trends and Random Walks

119 6.10 Trends and Random Walks
Yt ~ I(1): Yt is a random walk, and ΔYt is white noise (iid). No one could predict the future price change; the market is efficient. The impact of a previous shock on the price will remain and does not approach zero.

120 6.10 Trends and Random Walks

121 6.10 Trends and Random Walks

122 6.10 Trends and Random Walks

123 Cointegration

124 Cointegration

125 Cointegration Run the VECM (vector error correction model) in EViews
Additional slides

126 Cointegration

127 Lead-lag relation obtained with the VECM
If beta_A is significant and beta_U is insignificant, the price adjustment mainly depends on the ADR market: ADR prices converge to UND prices, UND prices lead ADR prices in the price discovery process, and UND prices provide an information advantage.
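(The slides use beta_A and beta_U without showing the model; a common two-equation VECM that matches this notation, offered here as my reconstruction, with A_t the ADR price, U_t the underlying (UND) price, and z_{t-1} the lagged cointegration residual, is)

\Delta A_t = c_A + \beta_A z_{t-1} + \text{lagged } \Delta A, \Delta U \text{ terms} + \varepsilon_{A,t}

\Delta U_t = c_U + \beta_U z_{t-1} + \text{lagged } \Delta A, \Delta U \text{ terms} + \varepsilon_{U,t}, \qquad z_t = A_t - \gamma_0 - \gamma_1 U_t

so beta_A and beta_U are the speed-of-adjustment (error-correction) coefficients that the following slides interpret.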

128 If beta_U is significant and beta_A is insignificant,
the price adjustment mainly depends on the UND market: UND prices converge to ADR prices, ADR prices lead UND prices in the price discovery process, and ADR prices provide an information advantage.

129 If both beta_U and beta_A are significant,
there is bidirectional error correction: the equilibrium price lies between the ADR and UND prices, and both ADR and UND prices converge to it.

130 If both beta_U and beta_A are significant, but beta_U is greater than beta_A in absolute value, the finding denotes that it is the UND price that makes the greater adjustment in order to reestablish the equilibrium. That is, most of the price discovery takes place in the ADR market.

131 Homework Find the spot and futures prices
Daily data covering at least five years. Run the cointegration test. Run the VECM. Examine the lead-lag relationship.

132 6.11 ARCH Models and Serial Correlation
We saw in Section 6.9 that a significant DW statistic can arise through a number of misspecifications. We will now discuss one other source. This is the ARCH model suggested by Engle which has, in recent years, been found useful in the analysis of speculative prices. ARCH stands for "autoregressive conditional heteroskedasticity."

133 6.11 ARCH Models and Serial Correlation
GARCH (p,q) Model:
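(The equation was an image; the standard GARCH(p, q) conditional-variance specification for an error ε_t is)

\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j}

For GARCH(1,1), h_t = ω + α₁ε²_{t-1} + β₁h_{t-1}; the "sum of the two GARCH parameter estimates" discussed on the next slide is α₁ + β₁, which measures volatility persistence.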

134 6.11 ARCH Models and Serial Correlation
The high level of persistence in GARCH models: the sum of the two GARCH parameter estimates approximates unity in most cases. Li and Lin (2003): this finding provides some support for the notion that GARCH models are handicapped by an inability to account for structural changes during the estimation period and thus suffer from a high-persistence problem in variance settings.

135 6.11 ARCH Models and Serial Correlation
Find the stock returns (daily data covering at least five years). Run the GARCH(1,1) model. Check the sum of the two GARCH parameter estimates. Report the parameter estimates. Graph the time-varying variance estimates.

136 Could we identify a RW? Low test power of the DF test
The power of the test: H0 is not true, but we accept H0; that is, the data series is I(0), but we conclude it is I(1).

137 Several Key Problems for Unit Root Tests
Low test power; the structural change problem; size distortion.
RW, non-stationary, or I(1): Yt = 1*Yt-1 + ut
Stationary process, or I(0), yet nearly indistinguishable in samples of these sizes:
Yt = 0.99*Yt-1 + ut, T = 1,000
Yt = 0.98*Yt-1 + ut, T = 50 or 1,000

141 Spurious Regression
RW 1: Yt = 0.05 + 1*Yt-1 + ut
RW 2: Xt = 0.03 + 1*Xt-1 + vt

142 Spurious Regression new; format /m1 /rd 9,3;
@ data generation: two independent random walks with drift @
Y=zeros(1000,1); u=2*Rndn(1000,1);
X=zeros(1000,1); v=1*Rndn(1000,1);
i=2;
do until i>1000;
    Y[i,1]=0.05+1*Y[i-1,1]+u[i,1];
    X[i,1]=0.03+1*X[i-1,1]+v[i,1];
    i=i+1;
endo;
output file=d:\Courses\Enclass\Unit\YX_Spur.out reset;
Y~X;                          @ write the two series side by side @
output off;
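A short continuation, not on the slides, that actually runs the spurious regression on the series generated above and prints the slope, R², and DW statistic; all names follow the program on slide 142:

T=1000;
b=olsqr(Y, ones(T,1)~X);                      @ regress Y on a constant and X @
e=Y-(ones(T,1)~X)*b;                          @ residuals @
r2=1-sumc(e^2)/sumc((Y-meanc(Y))^2);          @ R-squared @
dw=sumc((e[2:T,1]-e[1:T-1,1])^2)/sumc(e^2);   @ Durbin-Watson @
print " slope, R2, DW "; b[2,1]~r2~dw;

A typical run shows a large R² and a DW statistic near zero even though Y and X are independent; rerun the regression on the first differences of Y and X and the R² collapses toward zero while DW returns to about 2, as Granger and Newbold found.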
