Nonlinear Structure in Regression Residuals

Michael McCullough (a), Thomas L. Marsh (b), and Ron C. Mittelhammer (c)
(a) Research Associate (contact author), (b) Associate Professor, and (c) Director, School of Economic Sciences, Washington State University; PO Box , Pullman, WA 99164, United States.

Overview
Motivation
Previous research
Primer on phase space reconstruction
Nonlinear structure in regression residuals
Simulation
Application: S&P 500
Final observations
1. Simulations from a contaminated deterministic system are performed, illustrating the ability to discern nonlinearities from noise.
2. An application to the S&P 500 bridges phase space reconstruction's original use and regression diagnostics.

Motivation
This paper investigates phase space reconstruction as a diagnostic tool for determining the structure of nonlinear processes in regression residuals. Outcomes will be used to create phase portraits for qualitative analysis. In effect, this approach is analogous to simple scatter plots in linear models.

Previous Research
Maasoumi and Racine, J Econometrics 2002: uses an entropy measure of distance to examine the predictability of stock market returns.
Granger, Maasoumi, and Racine, JTSA 2004: further reviews the performance of the metric entropy measure under different circumstances.
Previous phase space reconstruction research focused primarily on the detection of chaos.
Information and entropy economics suggest methods to detect dependence in regression residuals.

Phase Space Reconstruction
For any given event, n outcomes are observed and denoted by the time series vector
$X_t = [x(t), x(t-1), \ldots, x(t-n)]'$
with associated $\tau$th lag vector
$X_{t-\tau} = [x(t-\tau), x(t-1-\tau), \ldots, x(t-n-\tau)]'$.
Takens' Theorem (1981) embeds a single time series onto a phase space that reproduces the entire structure of the system. The embedding is a diffeomorphism which creates a matrix of time-delayed vectors
$Y = [X_t, X_{t-\tau}, \ldots, X_{t-(d-1)\tau}]$
of dimension $[(n-(d-1)\tau) \times d]$, where $\tau$ is the time lag and $d$ is the embedding dimension.
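The construction of the delay matrix can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `delay_embed` and the argument names `tau` (time lag) and `d` (embedding dimension) are assumptions made here, and rows without a full set of lags are simply dropped.

```python
def delay_embed(series, tau, d):
    """Return rows [x(t), x(t - tau), ..., x(t - (d - 1) * tau)]."""
    if tau < 1 or d < 1:
        raise ValueError("tau and d must be positive integers")
    start = (d - 1) * tau  # first index with a full set of lags available
    if start >= len(series):
        raise ValueError("series too short for this tau and d")
    return [
        [series[t - k * tau] for k in range(d)]
        for t in range(start, len(series))
    ]


x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Y = delay_embed(x, tau=2, d=3)
print(len(Y), len(Y[0]))  # 6 3
print(Y[0])               # [4, 2, 0]
```

With 10 observations, a lag of 2, and 3 columns, the first (3 - 1) * 2 = 4 observations have no complete lag history, which leaves 6 rows.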

Phase Space Reconstruction
The Method of Delays requires two parameter estimates:
1. An optimal time lag, $\tau$. The goal is to find the $\tau$ which first minimizes redundancy between the time delay vectors $X_t$ and $X_{t-\tau}$.
2. A minimum embedding dimension, $d$. This statistic estimates the minimum dimension at which the entire system dynamics may be appropriately represented.
This technique is essentially nonparametric in nature.

The Method of Delays
Estimating the optimal time lag, $\tau$.
$\tau$ is estimated using an entropy measure of dependence called the mutual information function (Fraser and Swinney, PR:A 1986).
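One way to see how a lag could be chosen from the mutual information function is the sketch below. It uses a simple equal-width histogram estimator, which is an assumption made for illustration and not necessarily the estimator used by Fraser and Swinney or by the authors; the function names and the default of 16 bins are likewise hypothetical.

```python
import math


def mutual_information(series, lag, bins=16):
    """Histogram estimate (in nats) of the dependence between x(t) and x(t - lag)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series
    # Map each observation to a bin index in [0, bins - 1].
    idx = [min(int((v - lo) / width), bins - 1) for v in series]
    pairs = list(zip(idx[lag:], idx[:-lag]))
    n = len(pairs)
    joint, left, right = {}, {}, {}
    for a, b in pairs:
        joint[(a, b)] = joint.get((a, b), 0) + 1
        left[a] = left.get(a, 0) + 1
        right[b] = right.get(b, 0) + 1
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def first_minimum_lag(series, max_lag=20, bins=16):
    """Smallest lag at which the estimated mutual information stops falling."""
    mi = [mutual_information(series, lag, bins) for lag in range(1, max_lag + 1)]
    for i in range(len(mi) - 1):
        if mi[i] < mi[i + 1]:
            return i + 1  # lags are 1-based
    return max_lag
```

Histogram estimates are sensitive to the number of bins and to sample size, so the selected lag is best read alongside a plot or table of the full function rather than taken on its own.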

The Method of Delays
Estimating the minimum embedding dimension, $d$.
Kennel and Brown (PR:A 1992) developed the False Nearest Neighbors technique, which uses Euclidean distances to determine whether the vectors of $Y$ are still "close" as the dimension of the phase space is increased.
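The idea behind false nearest neighbors can be sketched as follows. This is a simplified illustration rather than the published algorithm in full: it applies only a distance-ratio test, the threshold `ratio_tol=10.0` is an assumed value, and the brute-force neighbor search is meant for short series.

```python
import math


def false_neighbor_fraction(series, tau, d, ratio_tol=10.0):
    """Share of nearest neighbours in dimension d that separate in dimension d + 1."""
    times = range(d * tau, len(series))
    points = [[series[t - k * tau] for k in range(d)] for t in times]
    extra = [series[t - d * tau] for t in times]  # coordinate added in dimension d + 1
    false_count = usable = 0
    for i, p in enumerate(points):
        best_j, best_dist = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            dist = math.dist(p, q)  # Euclidean distance in dimension d
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is None or best_dist == 0.0:
            continue  # skip exact duplicates to avoid dividing by zero
        usable += 1
        # A neighbour is "false" if the new coordinate pushes the pair far apart.
        if abs(extra[i] - extra[best_j]) / best_dist > ratio_tol:
            false_count += 1
    return false_count / usable if usable else 0.0


wave = [math.sin(0.3 * t) for t in range(400)]
for dim in (1, 2, 3):
    print(dim, false_neighbor_fraction(wave, tau=5, d=dim))
```

In this reading, the minimum embedding dimension is the smallest dimension at which the reported fraction falls to roughly zero.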

The Method of Delays
Estimating the minimum embedding dimension, $d$.


Simulation
The simulation follows that in Chon et al. (1997).
The Ikeda Map is numerically generated and the deterministic variable $X_t$ is isolated.
Figure: Ikeda Map
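For readers who want to generate a comparable deterministic series, a minimal sketch of iterating the Ikeda map is given here. The commonly cited real-valued form with parameter u = 0.9 is assumed; the slides do not state the parameter values, starting point, or burn-in used, so `u`, `x0`, `y0`, and `burn_in` are illustrative choices.

```python
import math


def ikeda_series(n, u=0.9, x0=0.1, y0=0.1, burn_in=1000):
    """Iterate the Ikeda map and return the x coordinate after a burn-in."""
    x, y = x0, y0
    out = []
    for step in range(burn_in + n):
        t = 0.4 - 6.0 / (1.0 + x * x + y * y)
        # Both coordinates are updated from the previous (x, y) pair.
        x, y = (
            1.0 + u * (x * math.cos(t) - y * math.sin(t)),
            u * (x * math.sin(t) + y * math.cos(t)),
        )
        if step >= burn_in:
            out.append(x)
    return out


x_series = ikeda_series(2000)
print(len(x_series), min(x_series), max(x_series))
```

Only the x coordinate is kept, mirroring the step of isolating a single deterministic variable from the map.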

Simulation
Six variables are constructed by adding contamination, $\varepsilon_{i,t}$, to the deterministic component, $X_t$, after numerical iteration. Z1 is entirely deterministic and Z6 is entirely random.
The simulation follows Chon et al. (1997), except that we add noise after numerical iteration of the map and focus on the structural analysis of dependence, whereas Chon et al. (1997) included a noise term within the numerical iteration of the Ikeda map and focused on the detection of chaos. This allows us to separate the two components more distinctly; in particular, $\varepsilon$ is independent of $X$.
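Adding the contamination after the deterministic series has been generated can be sketched as below. The helper names are hypothetical, the second argument of N(0, ·) is assumed to be a variance, and a sine wave stands in for the iterated Ikeda series purely to keep the example short.

```python
import math
import random
import statistics


def contaminate(signal, variance, rng):
    """Add independent N(0, variance) draws to an already generated series."""
    sd = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [v + rng.gauss(0.0, sd) for v in signal]


def noise_share(signal, variance):
    """Contamination variance relative to the variance of the clean series."""
    return variance / statistics.pvariance(signal)


rng = random.Random(0)
clean = [math.sin(0.3 * t) for t in range(2000)]  # stand-in, not the Ikeda series
z_clean = contaminate(clean, 0.0, rng)    # zero variance leaves the series unchanged
z_noisy = contaminate(clean, 0.125, rng)  # noise drawn independently of the signal
print(round(noise_share(clean, 0.125), 2))
```

Because the noise is drawn only after the series exists, it never feeds back into the dynamics, which is what keeps the two components separable.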

Simulation: Zi(t)
Traditional tests of dependence confirm the presence of a process in Z1-Z4.

Simulation
Which one is deterministically generated, Z1, and which one is randomly generated, Z6?

Simulation: Z1(t)
$Z_1(t) = X_n + \varepsilon_{n,1}$, $\varepsilon_{n,1} \sim N(0, 0)$
The structure of the process is well defined and appears to follow a stable trajectory.

Simulation: Z3(t)
$Z_3(t) = X_n + \varepsilon_{n,3}$, $\varepsilon_{n,3} \sim N(0, 0.125)$
The contamination has a variance roughly equal to 25% of that of the deterministic process. While the general direction of the trajectory paths remains, the tightness, or clarity, of the paths has greatly diminished.

Simulation: Z5(t)
$Z_5(t) = X_n + \varepsilon_{n,5}$, $\varepsilon_{n,5} \sim N(0, 0.5)$
The phase portrait is close to that of Gaussian randomly generated data. Traditional methods of determining independence can no longer detect suspect behavior. The volatility of the contamination has overcome the deterministic component.

Simulation: Z6(t)
$Z_6(t) = \varepsilon_{n,6}$, $\varepsilon_{n,6} \sim N(0, 0.5)$
The “ball” of a normal distribution is the benchmark for comparison and detection of nonlinear processes in regression residuals.

Simulation
Which one is deterministically generated, Z1, and which one is randomly generated, Z6?
Figure panels: Z6, randomly generated; Z1, deterministically generated.

ARMA Fitted Z1(t)
Residual structure such as that of Z1 appears only in the presence of a nonlinear process and cannot be controlled for using linear techniques.
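A small numerical illustration of why a linear fit cannot absorb nonlinear structure is sketched here. Two substitutions are made for brevity and should be read as assumptions: an AR(1) model fit by ordinary least squares replaces the ARMA fit, and the logistic map at r = 4 replaces the Ikeda series.

```python
import math


def ar1_residuals(series):
    """OLS fit of x(t) = a + b * x(t - 1) + e(t); returns (a, b, residuals)."""
    y, x = series[1:], series[:-1]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    a = my - b * mx
    return a, b, [v - a - b * u for u, v in zip(x, y)]


def correlation(p, q):
    """Sample Pearson correlation between two equal-length sequences."""
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    cov = sum((u - mp) * (v - mq) for u, v in zip(p, q))
    return cov / math.sqrt(sum((u - mp) ** 2 for u in p) * sum((v - mq) ** 2 for v in q))


# Stand-in nonlinear series: logistic map at r = 4 (not the Ikeda map).
x = [0.3]
for _ in range(1999):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))

a, b, e = ar1_residuals(x)
linear = correlation(e[1:], e[:-1])                    # residual autocorrelation
squared = correlation(e[1:], [v * v for v in e[:-1]])  # dependence on the squared lag
print(b, linear, squared)
```

The residuals look uncorrelated to a linear diagnostic, yet they remain strongly related to the square of their own lag, which is the kind of structure a phase portrait is meant to expose.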

Fitted Residual Reconstruction

Insights from Simulations
If the first four time series of the simulation happened to be the residuals of an econometric model, model misspecification might have been concluded.
Having that qualitative representation of the residual structure in the phase space reconstruction may enhance additional modeling.
After the variance of the additive contamination increased above 25% of the variance of the deterministic component, only general structure could be inferred.
After the variance of the contamination increased past 50% of the variance of the deterministic component, the ability to distinguish structure from noise decreased rapidly.

Application: The S&P 500
Following Granger, Maasoumi, and Racine (JTSA 2004), regression residuals from a nonlinear time series model of the S&P 500 from Jan. 3, 1996 to Dec. 22, 1997 are analyzed.
A simple ARIMA model is fit first. After inspection, however, the linear model does not fully capture all variability.

Application: Reconstructed Residuals

Application: The S&P 500
An integrated autoregressive model with ARCH- and GARCH-structured residuals is fit to account for the discovered dependence.

Application: Reconstructed Residuals

Application: Normal Generated Data

Final Observations
Given the similarity of the reconstructed residuals to random noise, we can find no strong visual evidence of nonlinear structure.
In cases when it is not sufficient to just detect dependence, phase space reconstruction can add valuable insight into the nature of the dependence. Indeed, phase space reconstruction can be thought of as a nonlinear scatter plot.
The phase portrait may be used as a foundation for future model enhancement.
This research differs from work that uses phase space reconstruction to detect chaos, or that provides evidence of dependence using entropy metrics, in that it illuminates structure.

Questions?

Table 3: The average mutual information function for S&P 500 residual series.

Table 4: The percentage of false nearest neighbors for the S&P 500 residual series.
