ARIMA-models for non-stationary time series

Consider again the data from Exercise 8.8 in the textbook (weekly sales figures of thermostats). This series is obviously non-stationary, as it possesses a trend.

SAC and SPAC: the first impression is that they point towards an AR(2)-model. What will happen if we try such a model?

We may ask for forecasts for weeks (53, 54, 55,) 56 and 57, as was the task in Exercise 8.8.
Note that we have to manually enter the columns where we wish the forecasts and the prediction limits to be stored (columns are not generated automatically like for other modules).

ARIMA Model: y
[Minitab session output; the numeric values were lost in extraction.]
Estimates at each iteration (Iteration, SSE, Parameters), ending with: Relative change in each estimate less than [value lost]
* WARNING * Back forecasts not dying out rapidly
Back forecasts (after differencing) and back forecast residuals, listed by lag
Final Estimates of Parameters (Type, Coef, SE Coef, T, P): two AR terms, Constant, Mean
Number of observations: 52
Residuals: SS (backforecasts excluded), MS, DF = 49
Modified Box-Pierce (Ljung-Box) Chi-Square statistic (Lag, Chi-Square, DF, P-Value)
Forecasts from period 52 with 95% limits (Period, Forecast, Lower, Upper, Actual)

Residuals after fitting look nice and the Ljung-Box statistics are in order,
but the forecasts do not seem to be consistent with the development of the sales figures, and we have indications of problems in the fitting (back-forecasts are not dying out rapidly, which they should). We do not go any deeper into the subject of back-forecasting, but a signal from the software should be taken seriously. As we have clearly seen a trend, we can force a model which takes this into account: calculate first-order differences.

Calculate SAC and SPAC for the differenced series!
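The differencing and the sample (partial) autocorrelations can also be computed outside Minitab. The following is a minimal sketch in Python with NumPy, not the routine Minitab uses: the function names `sample_acf` and `sample_pacf` are hypothetical, the short series `y` is made-up placeholder data (the thermostat figures are not reproduced in this transcript), and the SPAC is obtained with the Durbin-Levinson recursion.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (SAC)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    c0 = np.sum(d * d)
    return np.array([np.sum(d[k:] * d[:len(d) - k]) / c0
                     for k in range(max_lag + 1)])

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations (SPAC) via Durbin-Levinson.

    Index 0 is set to 1 by convention; index k holds the lag-k value.
    """
    r = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = np.array([phi_kk])
        else:
            num = r[k] - np.sum(phi_prev * r[k - 1:0:-1])
            den = 1.0 - np.sum(phi_prev * r[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
        phi_prev = phi
    return pacf

# Hypothetical trending series, used only to show the calls.
y = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0, 21.0])
z = np.diff(y)  # first-order differences z_t = y_t - y_{t-1}
print(sample_acf(z, 3))
print(sample_pacf(z, 3))
```

The differenced series has one observation fewer than the original, which is why Minitab later reports 52 original observations and 51 after differencing.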

One significant spike in SAC, one significant spike in SPAC.
Both are negative, which is consistent. The most plausible models for the differenced data are AR(1), MA(1) or ARMA(1,1). When fitting such models to differenced data, the constant term should be excluded, as the differences are expected to vary around 0.

[Minitab output for the three candidate models; the numeric values were lost in extraction. Each table lists Type, Coef, SE Coef, T, P, followed by MS, DF and the Modified Box-Pierce (Ljung-Box) Chi-Square statistic with Lag, Chi-Square, P-Value.]
AR(1): DF = 50
MA(1): DF = 50. Seems best!
ARMA(1,1): DF = 49

Fitting the model directly on the original observations.
After first-order differencing, this time series seems to follow an MA(1)-model. The time series is then said to follow an ARIMA(0,1,1)-model. For non-seasonal time series the notation is ARIMA(p,d,q), where p is the order of the AR-part in the differenced series, d is the order of the differencing and q is the order of the MA-part in the differenced series.

ARIMA(0,1,1): the constant term is relevant again, as the original time series may have an “intercept”.
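To make the role of the constant concrete, here is a minimal sketch of how point forecasts follow from an ARIMA(0,1,1)-model with a constant. It assumes the Box-Jenkins sign convention z_t = delta + a_t - theta * a_(t-1) for the differences z_t = y_t - y_(t-1), sets the pre-sample shock to zero instead of back-forecasting, and takes `theta` and `delta` as given. The function name `arima011_forecast` and all numbers are hypothetical; the estimates from the slides are not recoverable from this transcript.

```python
import numpy as np

def arima011_forecast(y, theta, delta, steps):
    """Point forecasts from an ARIMA(0,1,1) model with constant delta.

    Assumed model for the differences: z_t = delta + a_t - theta * a_{t-1}.
    """
    y = np.asarray(y, dtype=float)
    z = np.diff(y)
    a_prev = 0.0  # pre-sample shock set to zero (no back-forecasting)
    for z_t in z:
        # Solve the model equation for the current shock a_t.
        a_prev = z_t - delta + theta * a_prev
    forecasts = []
    level = y[-1]
    for h in range(1, steps + 1):
        # The MA term only affects the one-step-ahead forecast.
        ma_term = -theta * a_prev if h == 1 else 0.0
        level = level + delta + ma_term
        forecasts.append(level)
    return np.array(forecasts)

# Hypothetical data and parameter values, for illustration only.
y = [200.0, 205.0, 203.0, 210.0, 214.0]
print(arima011_forecast(y, theta=0.4, delta=3.0, steps=5))
```

Beyond the first step the forecasts increase by `delta` per period, so a non-zero constant lets the forecasts follow the trend of the series.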

No longer any problems with back-forecasts!
ARIMA Model: y
[Minitab output: Estimates at each iteration (Iteration, SSE, Parameters); the numeric values were lost in extraction.]

Note that information is given about the order of the differencing.
[Minitab output; the numeric values were lost in extraction.]
Final Estimates of Parameters (Type, Coef, SE Coef, T, P): MA, Constant
Differencing: 1 regular difference
Number of observations: Original series 52, after differencing 51
Residuals: SS (backforecasts excluded), MS, DF = 49
MS is the smallest so far (due to the inclusion of the constant term).

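Modified Box-Pierce (Ljung-Box) Chi-Square statistic
[Minitab output; the numeric values were lost in extraction: Ljung-Box table (Lag, Chi-Square, DF, P-Value) and forecasts from period 52 with 95% limits (Period, Forecast, Lower, Upper, Actual).]
The Ljung-Box statistics are in order.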
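The Ljung-Box statistic itself is simple to compute from the residuals: Q = n(n+2) * sum over k = 1..K of r_k^2 / (n - k), where r_k is the lag-k sample autocorrelation of the residuals. The sketch below is a plain NumPy version under that textbook definition; the function name `ljung_box_q` and the simulated residuals are hypothetical. Q is compared with a chi-square distribution whose degrees of freedom are K reduced by the number of estimated parameters.

```python
import numpy as np

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic using residual autocorrelations up to max_lag."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    d = e - e.mean()
    c0 = np.sum(d * d)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = np.sum(d[k:] * d[:n - k]) / c0  # lag-k autocorrelation
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

# Hypothetical residuals (51 values, as after one regular difference of 52).
rng = np.random.default_rng(1)
resid = rng.standard_normal(51)
for lag in (12, 24):
    print(lag, ljung_box_q(resid, lag))
```

A large Q relative to the chi-square reference (a small P-value) indicates remaining autocorrelation in the residuals.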

Forecasts are now more consistent with the development of the sales figures.
SAC and SPAC of residuals are still satisfactory.

Sometimes the non-stationarity can be identified directly from the SAC and SPAC plots.
Note! Monthly data, but of a kind that usually does not contain seasonal variation within a year. In such cases SAC and SPAC usually indicate an AR(1)-model with slowly decreasing autocorrelations and with a first value very close to 1.

Seasonal ARIMA-models
(Weak) stationarity is often (wrongly) connected with a series that seems to vary non-systematically around a constant mean. Stationary? Non-stationary?

Are the spikes outside the red border evidence of non-stationarity?

We can always try to difference the series: $z_t = y_t - y_{t-1}$
No improvement!!

Could it have something to do with seasonal variation?
[Figure: Autocorrelation Function (with 5% significance limits for the autocorrelations); x-axis: Lag, 1 to 60; y-axis: Autocorrelation, -1.0 to 1.0]
Note that the spikes (besides the first ones) lie around the lags 12, 24, 36, 48 and 60.

Seasonal AR-models: the model contains AR terms at the nonseasonal lags 1, …, p and at the seasonal lags L, 2L, …, PL, where L is the number of seasons (during a year).
Such a model takes care of both short-memory and long-memory relations within the series $y_t$. More correct terms are nonseasonal and seasonal variation. The series can still be stationary. We distinguish between stationarity at the nonseasonal level and stationarity at the seasonal level. We do not consider the model as an AR(PL)-model!

In a stationary seasonal AR-process (SAR(p,P)):
ACF spikes at the nonseasonal level (scale), i.e. between 1 and L, die down in an exponential fashion (possibly oscillating).
PACF spikes at the nonseasonal level (scale) cut off after lag p.
ACF spikes at the seasonal level (scale), i.e. at lags L, 2L, 3L, 4L, …, die down in an exponential fashion (possibly oscillating).
PACF spikes at the seasonal level (scale) cut off after lag PL.
Moderate ACF and PACF spikes usually exist around L, 2L, 3L, 4L, …

A more correct formulation of the model uses the backshift operator $B$,
where $B y_t = y_{t-1}$, $B^2 y_t = y_{t-2}$, …, $B^L y_t = y_{t-L}$, … In the special case of p = 1 and P = 1 (with L = 12), multiplying the nonseasonal and the seasonal AR factors gives terms at lags 1, 12 and 13, i.e. we should model a dependency at lags 1, 12 and 13 to take into account the “double” autoregressive structure.
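The multiplicative structure can be written out as a short derivation. This is a sketch under assumed notation: the symbols $\phi_i$ (nonseasonal AR coefficients), $\Phi_i$ (seasonal AR coefficients) and $\delta$ (constant) are not taken from the slides, whose own equations are missing from this transcript; $a_t$ is the error term.

```latex
\begin{align*}
(1 - \phi_1 B - \dots - \phi_p B^p)\,(1 - \Phi_1 B^L - \dots - \Phi_P B^{PL})\, y_t &= \delta + a_t \\[4pt]
\text{For } p = 1,\ P = 1,\ L = 12:\quad (1 - \phi_1 B)(1 - \Phi_1 B^{12})\, y_t &= \delta + a_t \\
(1 - \phi_1 B - \Phi_1 B^{12} + \phi_1 \Phi_1 B^{13})\, y_t &= \delta + a_t \\
y_t &= \delta + \phi_1 y_{t-1} + \Phi_1 y_{t-12} - \phi_1 \Phi_1 y_{t-13} + a_t
\end{align*}
```

The cross term $\phi_1 \Phi_1 B^{13}$ is what produces the dependency at lag 13 in addition to lags 1 and 12.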

Seasonal MA-models (SMA(q,Q))
ACF spikes at the nonseasonal level cut off after lag q.
PACF spikes at the nonseasonal level, i.e. between 1 and L, die down in an exponential fashion (possibly oscillating).
ACF spikes at the seasonal level cut off after lag QL.
PACF spikes at the seasonal level, i.e. at lags L, 2L, 3L, 4L, …, die down in an exponential fashion (possibly oscillating).
Moderate ACF and PACF spikes usually exist around L, 2L, 3L, 4L, …
The model can be written with the backshift operator $B$ analogously with SAR-models.

Seasonal ARMA-models (SARMA(p,P,q,Q))
The expression becomes more condensed with the backshift operator. Note that the expressions within parentheses are polynomials either in $B$ or in $B^L$. A more common formulation is therefore to give each of these polynomials a symbol of its own.
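One common way of writing this is sketched below. The symbols $\phi_p(B)$, $\Phi_P(B^L)$, $\theta_q(B)$, $\Theta_Q(B^L)$ and $\delta$, and the Box-Jenkins sign convention for the MA polynomials, are assumptions; the slide's own notation is not recoverable from this transcript.

```latex
\begin{align*}
\phi_p(B)   &= 1 - \phi_1 B - \dots - \phi_p B^p,
&\Phi_P(B^L) &= 1 - \Phi_1 B^L - \dots - \Phi_P B^{PL}, \\
\theta_q(B) &= 1 - \theta_1 B - \dots - \theta_q B^q,
&\Theta_Q(B^L) &= 1 - \Theta_1 B^L - \dots - \Theta_Q B^{QL}, \\[4pt]
\phi_p(B)\,\Phi_P(B^L)\, y_t &= \delta + \theta_q(B)\,\Theta_Q(B^L)\, a_t
\end{align*}
```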

SARMA-models have similar patterns at nonseasonal scale and at seasonal scale as those of ARMA-models, i.e. a mix of sinusoidal and exponentially decreasing spikes.

Non-stationary series?
$y_t \sim \mathrm{ARIMA}(p,d,q,P,D,Q)_L$ means taking $d$th-order differences at the nonseasonal level, $z_t = (1 - B)^d y_t$ (so-called regular differences), and $D$th-order differences at the seasonal level, $w_t = (1 - B^L)^D z_t$, so that $w_t = (1 - B^L)^D (1 - B)^d y_t$. Then model the differenced series with SARMA(p,P,q,Q).
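As a worked case, take one regular and one seasonal difference ($d = 1$, $D = 1$) with $L = 12$. Expanding the product of the two difference operators shows which observations enter each value of the differenced series:

```latex
\begin{align*}
w_t &= (1 - B^{12})(1 - B)\, y_t \\
    &= (1 - B - B^{12} + B^{13})\, y_t \\
    &= y_t - y_{t-1} - y_{t-12} + y_{t-13}
\end{align*}
```

Each $w_t$ needs 13 earlier observations, so 13 values are lost from the start of the series.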

Have another look at the SAC and SPAC of the series with obvious seasonal variation:
SAC spikes at exact seasonal lags die down.
SAC and SPAC spikes close to exact seasonal lags are pronounced.
SPAC spikes at exact seasonal lags cut off at lag 1.
SAC nonseasonal spikes die down.
SPAC nonseasonal spikes might cut off at lag 1.
$\mathrm{ARIMA}(1,0,0,1,0,0)_{12}$ ??

Minitab: Stat > Time Series > ARIMA…

[Minitab output; the numeric values were lost in extraction.]
Final Estimates of Parameters (Type, Coef, SE Coef, T, P): SAR, Constant, Mean
Number of observations: 300
Residuals: SS (backforecasts excluded), MS, DF = 297
Modified Box-Pierce (Ljung-Box) Chi-Square statistic (Lag, Chi-Square, DF, P-Value)
OK! Not OK!

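The time series in question has actually been generated with a model that is AR(1) at both the nonseasonal and the seasonal level,
with $a_t$ i.i.d. $N(0,1)$. This model is stationary, as the conditions for stationarity in AR(1)-models are fulfilled at both nonseasonal and seasonal level.
[Minitab output; the numeric values were lost in extraction: Final Estimates of Parameters (Type, Coef, SE Coef, T, P) for AR, SAR, Constant, Mean.]
Still, there might be problems with the Ljung-Box statistics!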
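A series of this kind is easy to generate for experimentation. The sketch below simulates a multiplicative seasonal AR process with one nonseasonal and one seasonal AR term and standard normal shocks. The coefficient values 0.3 and 0.8, the zero constant, the burn-in length and the names `simulate_sar` and `acf_at` are all hypothetical choices; the coefficients used for the slides are not recoverable from this transcript.

```python
import numpy as np

def simulate_sar(n, phi, Phi, L=12, burn_in=500, seed=0):
    """Simulate (1 - phi*B)(1 - Phi*B^L) y_t = a_t with a_t i.i.d. N(0, 1).

    Stationary when |phi| < 1 and |Phi| < 1. No constant term is included.
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    a = rng.standard_normal(total)
    y = np.zeros(total)
    for t in range(total):
        y[t] = a[t]
        if t >= 1:
            y[t] += phi * y[t - 1]            # nonseasonal AR term
        if t >= L:
            y[t] += Phi * y[t - L]            # seasonal AR term
        if t >= L + 1:
            y[t] -= phi * Phi * y[t - L - 1]  # cross term at lag L + 1
    return y[burn_in:]  # drop the start-up values

def acf_at(x, k):
    """Sample autocorrelation of x at lag k."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.sum(d[k:] * d[:len(d) - k]) / np.sum(d * d)

# 300 observations, matching the number reported in the Minitab output.
y = simulate_sar(300, phi=0.3, Phi=0.8)
for lag in (1, 6, 12, 24):
    print(lag, acf_at(y, lag))
```

With a seasonal coefficient this large, the sample autocorrelations at lags 12, 24, … stay pronounced even though the process is stationary, which is the pattern discussed above.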

An example with real data:
Monthly registered men at work (labour statistics) in pulp- and paper-related industry from January 1987 to March 2005. The series possesses a downward trend and a seasonal pattern.

Obvious signs of non-stationarity.
Try 1 regular difference, $(1 - B)y_t$, and additionally 1 seasonal difference, $(1 - B^{12})(1 - B)y_t$:
MTB > diff c5 c6
MTB > diff 12 c6 c7
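The same two differencing steps can be sketched in Python with NumPy. The helper `difference` is a hypothetical name, and the series `y` is placeholder data of the same length as the labour series (219 monthly values), not the actual observations.

```python
import numpy as np

def difference(x, lag=1):
    """Return x_t - x_{t-lag}, dropping the first `lag` values."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

# Placeholder series with 219 values (January 1987 to March 2005).
rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(219))

z = difference(y, 1)    # regular difference, (1 - B) y_t
w = difference(z, 12)   # seasonal difference, (1 - B^12)(1 - B) y_t

print(len(y), len(z), len(w))
```

One regular and one seasonal difference of order 12 remove 1 + 12 = 13 values, which agrees with the 219 original and 206 differenced observations reported by Minitab.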

AR(2) at nonseasonal level?
MA(1) at seasonal level?

Final Estimates of Parameters
[Minitab output; the numeric values were lost in extraction.]
Columns: Type, Coef, SE Coef, T, P. Rows: two AR terms, SMA, Constant
Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 219, after differencing 206
Residuals: SS (backforecasts excluded), MS, DF = 202
Modified Box-Pierce (Ljung-Box) Chi-Square statistic (Lag, Chi-Square, DF, P-Value)
