The Box-Jenkins (ARIMA) Methodology


1 The Box-Jenkins (ARIMA) Methodology

2 ARIMA Models (time series modeling)
AutoRegressive Integrated Moving Average (ARIMA) models can model stationary as well as nonstationary data. They do not involve independent variables in their construction; instead they rely heavily on the autocorrelation (and partial autocorrelation) patterns in the data. The approach is also known as the "Box-Jenkins methodology". It does not assume any particular pattern in the data. It uses an iterative approach of identifying a candidate model from a general class of models; the chosen model is then checked against the historical data to see whether it accurately describes the series, and if it does not, a new model is tried. A partial autocorrelation is the autocorrelation between $Y_t$ and $Y_{t-k}$ after adjusting for the intervening values.

3 ARIMA Models
ARIMA models are designated by their orders of autoregression, integration (differencing), and moving average. The methodology does not assume any pattern; it uses an iterative approach to identifying a model. The model "fits" if the residuals are generally small, randomly distributed, and contain no useful information.

4 ARIMA Notation
ARIMA(p, d, q), where
p = order of autoregression (rarely > 2)
d = order of integration (differencing)
q = order of moving average (rarely > 2)

5 ARIMA
Flowchart: postulate a general class of models; identify the model to be considered; estimate parameters; check whether the model is adequate. If yes, use the model for forecasting. If no, return to the identification step.

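6 The general (AR) model… for pth-order autoregressive model
$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t$
$Y_t$ = dependent variable at time t. $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ = responses in previous time periods, which play the role of independent variables. $\phi_0, \phi_1, \ldots, \phi_p$ = coefficients to be estimated; $\phi_0$ is related to the constant level of the series. $\varepsilon_t$ = error term. Appropriate for stationary time series. This is a regression model with lagged values of the dependent variable as the independent variables, hence the name autoregressive.
AR(1): $Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t$
AR(2): $Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \varepsilon_t$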
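The remark that the constant is related to the level of the series can be made concrete for a stationary AR(1), whose long-run mean is $\phi_0 / (1 - \phi_1)$. The sketch below simulates such a series and compares the sample mean with that value. The coefficients, the normal error distribution, and the function name `simulate_ar1` are illustrative assumptions, not taken from the slides.

```python
import random

def simulate_ar1(phi0, phi1, sigma, n, seed=0):
    """Simulate n values of Y_t = phi0 + phi1 * Y_{t-1} + e_t with normal errors."""
    if abs(phi1) >= 1:
        raise ValueError("a stationary AR(1) needs |phi1| < 1")
    rng = random.Random(seed)
    y = phi0 / (1.0 - phi1)  # start the series at its long-run level
    series = []
    for _ in range(n):
        y = phi0 + phi1 * y + rng.gauss(0.0, sigma)
        series.append(y)
    return series

# Hypothetical coefficients, chosen only for illustration.
series = simulate_ar1(phi0=10.0, phi1=0.5, sigma=1.0, n=20000)
level = 10.0 / (1.0 - 0.5)
sample_mean = sum(series) / len(series)
print(f"implied level = {level:.2f}, sample mean = {sample_mean:.2f}")
```

With a long simulated series the sample mean should sit close to the implied level, which is why the constant term is usually kept when the series mean is not near zero.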

7 AR(p)
For autoregressive models, forecasts depend on observed values in previous time periods. For AR(2), the forecast of the next value depends on the observations from the previous 2 periods; for AR(3), on the observations from the previous 3 periods; and so on.

8 ARIMA (1, 0, 0) = AR(1)
Just the first-order autoregressive model from before.

9 ARIMA (2, 0, 0) = AR(2)
The second-order autoregressive model also includes the second lag, and so on for higher orders.

10 MA - Moving Average
Autoregressive (AR) models forecast $Y_t$ as a linear combination of a finite set of past values of $Y_t$. Moving average models provide forecasts of $Y_t$ based on a linear combination of a finite number of past errors.

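11 MA - Moving Average
Notes for MA: the name "moving average" is historical and should not be confused with the moving average procedures from before. Here the deviation of the response from its mean is a linear combination of current and past errors, and as time moves forward the errors involved in this combination move forward as well.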

12 The general MA model… for qth-order moving average model
$Y_t = \mu + \varepsilon_t - \omega_1 \varepsilon_{t-1} - \omega_2 \varepsilon_{t-2} - \cdots - \omega_q \varepsilon_{t-q}$
$Y_t$ = dependent variable at time t. $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$ = errors in previous time periods. $\mu$ = constant mean of the process. $\omega_1, \ldots, \omega_q$ = coefficients to be estimated.

13 The general MA model…
MA(1): $Y_t = \mu + \varepsilon_t - \omega_1 \varepsilon_{t-1}$, so ARIMA (0, 0, 1) = MA(1)
MA(2): $Y_t = \mu + \varepsilon_t - \omega_1 \varepsilon_{t-1} - \omega_2 \varepsilon_{t-2}$, so ARIMA (0, 0, 2) = MA(2)

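14 Autoregressive Moving Average Models
ARMA (p, q): $Y_t = \phi_0 + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \varepsilon_t - \omega_1 \varepsilon_{t-1} - \cdots - \omega_q \varepsilon_{t-q}$
ARMA (1, 1): $Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t - \omega_1 \varepsilon_{t-1}$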

15 ARIMA model building steps
Model identification: plot the series and check the autocorrelations over several lags to see whether the series is stationary and what patterns it shows; if it is not stationary, difference it (up to twice). Parameter estimation: the model coefficients are estimated by least squares. Model diagnostics. Forecast verification and reasonableness.

16 ARIMA - Step 1 Model Identification (A)
Examine stationarity: plot the series and check for stationarity. With a nonstationary time series the sample autocorrelations do not die out rapidly. If the series is nonstationary, we can difference it to make it stationary. Note: plot the series for its general character, then match the autocorrelations with the pattern associated with a particular model. The match will not be exact; if the model does not fit, try again.

17 ARIMA - Step 1 Model Identification (A) differencing
Differencing is done until the series is stationary. The number of differences needed to make the series stationary is denoted by d. Models for nonstationary series are called autoregressive integrated moving average models and are denoted by ARIMA (p, d, q).

18 ARIMA - Step 1 Model Identification (A) differencing
For example, suppose that the original series is increasing over time, but the first differences $\Delta Y_t = Y_t - Y_{t-1}$ are stationary. If we assume for a second that we want to use an ARMA (1, 1) model, we may want to apply it to the stationary differences $\Delta Y_t$ rather than to $Y_t$ itself, which makes our model for the original series ARIMA (1, 1, 1).
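Differencing itself is a one-line operation. The following minimal sketch applies first differencing d times and also rebuilds the original levels from the differences; the series and the helper names `difference` and `undifference` are hypothetical.

```python
def difference(y, d=1):
    """Apply first differencing d times: each pass replaces Y_t with Y_t - Y_{t-1}."""
    out = list(y)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out

def undifference(first_value, diffs):
    """Rebuild the original levels from a starting value and the first differences."""
    levels = [first_value]
    for step in diffs:
        levels.append(levels[-1] + step)
    return levels

y = [3, 5, 8, 12, 17]                      # hypothetical increasing series
print(difference(y, d=1))                  # first differences
print(difference(y, d=2))                  # second differences
print(undifference(y[0], difference(y)))   # back to the original levels
```

Each pass shortens the series by one observation, so a model with d = 2 is estimated on two fewer points than the original series contains.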

19 ARIMA - Step 1 Model Identification (B)
Partial autocorrelations, a side note on "partials": the partial autocorrelation at time lag k is the correlation between $Y_t$ and $Y_{t-k}$ after adjusting for the effects of the intervening values (that is, when all other time lags are held constant). Each ARIMA model has a unique set of autocorrelations, so we try to match the sample patterns to one of the theoretical patterns.
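One common way to obtain partial autocorrelations from ordinary autocorrelations is the Durbin-Levinson recursion. The slides do not say which method is used, so the sketch below is an assumption about the computation rather than the presenter's procedure, and `pacf_from_acf` is a hypothetical helper name.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations at lags 1..K from autocorrelations rho[k-1] = rho_k."""
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} from the previous step
    for k in range(1, len(rho) + 1):
        if k == 1:
            phi_kk = rho[0]
        else:
            num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Theoretical autocorrelations of an AR(1) with a hypothetical phi_1 = 0.6.
ar1_rho = [0.6 ** k for k in range(1, 5)]
print([round(v, 6) for v in pacf_from_acf(ar1_rho)])
```

For an AR(1) input the recursion returns the lag-1 value followed by values that are zero up to rounding, which is the "cut off" behaviour used to recognise autoregressive models.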

20 ARIMA - Step 1 Model Identification (B)
Get the autocorrelations (and partial autocorrelations) for several lags and compare them with the patterns of the candidate models. If the sample autocorrelations die out (gradually approach zero) and the partial autocorrelations cut off (rapidly go to zero), an autoregressive term of order p is indicated. If the sample autocorrelations cut off and the partial autocorrelations die out, a moving average term of order q is indicated. If both die out, both p and q are needed. The order is determined by the number of significant sample autocorrelations or partial autocorrelations (compare with $\pm 2/\sqrt{n}$).
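The comparison against the significance limits can be sketched as follows. The $\pm 2/\sqrt{n}$ limits are the usual approximate rule, and the series and function names are hypothetical.

```python
import math

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(y, max_lag):
    """Lags whose sample autocorrelation lies outside +/- 2 / sqrt(n)."""
    limit = 2.0 / math.sqrt(len(y))
    return [k for k, r in enumerate(sample_acf(y, max_lag), start=1)
            if abs(r) > limit]

trend = list(range(1, 21))  # hypothetical nonstationary (steadily trending) series
print([round(r, 3) for r in sample_acf(trend, 5)])
print(significant_lags(trend, 5))
```

On a trending series like this one the autocorrelations shrink only slowly with the lag, which is the signature of nonstationarity described in Step 1 (A).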

21 Autoregressive Moving Average Models (Model Patterns) pp. 348-350
Model        Autocorrelations                            Partial Autocorrelations
MA(q)        Cut off after the order q of the process    Die out
AR(p)        Die out                                     Cut off after the order p of the process
ARMA(p, q)   Die out                                     Die out

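22 Theoretical ACs and PACs for an AR(1)
[Figure] Autocorrelations: "die out". Partial autocorrelations: "cut off".

23 Theoretical ACs and PACs for an AR(2)
[Figure] Autocorrelations: "die out". Partial autocorrelations: "cut off".

24 Theoretical ACs and PACs for a MA(1)
[Figure] Autocorrelations: "cut off". Partial autocorrelations: "die out".

25 Theoretical ACs and PACs for a MA(2)
[Figure] Autocorrelations: "cut off". Partial autocorrelations: "die out".

26 Theoretical ACs and PACs for an ARMA(1,1)
[Figure] Autocorrelations: "die out". Partial autocorrelations: "die out".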
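The "die out" and "cut off" shapes follow directly from the closed-form autocorrelations of the two simplest models. The sketch below assumes the sign convention $Y_t = \mu + \varepsilon_t - \omega_1 \varepsilon_{t-1}$ for the MA(1); the coefficient values are hypothetical.

```python
def ar1_acf(phi1, max_lag):
    """Theoretical autocorrelations of a stationary AR(1): rho_k = phi1 ** k."""
    return [phi1 ** k for k in range(1, max_lag + 1)]

def ma1_acf(omega1, max_lag):
    """Theoretical autocorrelations of Y_t = mu + e_t - omega1 * e_{t-1}."""
    rho1 = -omega1 / (1.0 + omega1 ** 2)
    return [rho1] + [0.0] * (max_lag - 1)  # exactly zero beyond lag 1

print(ar1_acf(0.5, 4))   # geometric decay: "dies out"
print(ma1_acf(0.5, 4))   # one nonzero value, then zeros: "cuts off"
```

A negative AR(1) coefficient gives autocorrelations that alternate in sign while shrinking, which still counts as dying out.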

27 ARIMA - Step 2…. Model Estimation
Once a tentative model has been chosen, estimate the parameters by minimizing the sum of squared errors (SSE) and check for significant coefficients (with MINITAB). Additionally, the residual mean squared error (an estimate of the variance of the error term) is computed; it is useful for assessing fit, comparing models, and calculating prediction limits.

28 ARIMA - Step 2 …. Model Estimation
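For example, assume that an ARIMA (1, 0, 1) model has been fit to a series of n = 100 observations, and that the fitted equation has standard errors (7.02), (.17), and (.21) shown in parentheses under its three estimated coefficients. The $Y_{t-1}$ term (coefficient .25, standard error .17) is NOT significant (t = .25 / .17 = 1.47). Maybe we would want to go back and fit an ARIMA (0, 0, 1) model.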
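The significance check in this example is simple arithmetic. The sketch below reproduces the t ratio from the slide; the cutoff of 2 in absolute value is a common rule of thumb and is an assumption here, since the slide does not state the critical value it uses.

```python
def t_ratio(coefficient, standard_error):
    """Ratio of an estimated coefficient to its standard error."""
    return coefficient / standard_error

t_ar = t_ratio(0.25, 0.17)            # the Y_{t-1} coefficient from the slide
is_significant = abs(t_ar) > 2.0      # assumed rule-of-thumb cutoff
print(round(t_ar, 2), is_significant)
```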

29 ARIMA - Step 3….Model Checking
We want to check for the randomness of the residuals. Use a normal probability plot or a histogram to examine their distribution. The individual residual autocorrelations $r_k(e)$ should be small and generally within $\pm 2/\sqrt{n}$ of zero. The $r_k(e)$ as a group should be consistent with those produced by random errors.

30 ARIMA - Step 3….Model Checking
An overall check of model adequacy is provided by a chi-square ($\chi^2$) test based on the Ljung-Box Q statistic. If the p-value is small (p < 0.05), the model is considered inadequate. Judgment is important, however.
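The Ljung-Box statistic has the standard form $Q = n(n+2)\sum_{k=1}^{m} r_k^2(e)/(n-k)$, compared against a chi-square distribution with m minus the number of estimated ARMA parameters as degrees of freedom. The following sketch computes Q from residual autocorrelations that are invented for illustration; the example assumes m = 3 lags and one estimated parameter, hence 2 degrees of freedom.

```python
def ljung_box_q(residual_acf, n):
    """Q = n (n + 2) * sum_{k=1..m} r_k^2 / (n - k), with residual_acf[k-1] = r_k."""
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(residual_acf, start=1))

# Hypothetical residual autocorrelations at lags 1..3 from n = 100 observations.
q = ljung_box_q([0.10, -0.10, 0.05], n=100)

# 5.991 is the 5% critical value of a chi-square distribution with 2 degrees
# of freedom; Q below it means the residuals give no evidence of inadequacy.
looks_adequate = q < 5.991
print(round(q, 3), looks_adequate)
```

Statistical packages report the p-value directly, so in practice the comparison is "p-value greater than 0.05" rather than a table lookup.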

31 ARIMA - Step 4….Forecasting
Once we have an adequate model, forecasts can be made. Understandably, the longer the lead time, the wider the prediction interval. If the nature of the series seems to be changing over time, new data may be used to re-estimate the model parameters (or if necessary, to develop a new model)
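Why the interval widens with lead time can be seen for an AR(1), where the h-step forecast error variance is $\sigma^2 \sum_{j=0}^{h-1} \phi_1^{2j}$. The sketch below treats the parameters as known and uses forecast plus or minus 2 standard errors as an approximate 95% interval; all numbers are hypothetical.

```python
import math

def ar1_forecast_path(y_last, phi0, phi1, sigma, horizon):
    """Return (lead, forecast, lower, upper) for leads 1..horizon of an AR(1)."""
    path = []
    forecast = y_last
    variance = 0.0
    for lead in range(1, horizon + 1):
        forecast = phi0 + phi1 * forecast              # iterate the AR(1) equation
        variance += sigma ** 2 * phi1 ** (2 * (lead - 1))
        half_width = 2.0 * math.sqrt(variance)         # approximate 95% limits
        path.append((lead, forecast, forecast - half_width, forecast + half_width))
    return path

for lead, forecast, lower, upper in ar1_forecast_path(30.0, 10.0, 0.5, 1.0, 4):
    print(lead, round(forecast, 3), round(lower, 3), round(upper, 3))
```

The forecasts move toward the long-run level while the limits spread out, approaching a fixed width for a stationary model.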

32 Sample …. AR(2) Table 9.4
$\hat{Y}_{76} = 115.2 - .535\,Y_{75} + .0055\,Y_{74}$
Period   Time   Values (Yt)   Forecast   Residual
t-5      71     90            76.4       13.6
t-4      72     78            67.5       10.5
t-3      73     87            74.0       13.0
t-2      74     99            69.1       29.9
t-1      75     72            62.7       9.3
t        76                   77.2
$\hat{Y}_{76} = 115.2 - .535(72) + .0055(99) = 77.2$
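The forecasts in this table can be reproduced from the fitted equation. In the sketch below the coefficients and the values for periods 71 to 74 come from the slide; the value 72 for period 75 is taken from the worked substitution on the slide, and the variable names are illustrative.

```python
values = {71: 90, 72: 78, 73: 87, 74: 99, 75: 72}  # period -> observed Y

def ar2_forecast(y_lag1, y_lag2):
    """Fitted AR(2) from the slide: 115.2 - .535 * Y_{t-1} + .0055 * Y_{t-2}."""
    return 115.2 - 0.535 * y_lag1 + 0.0055 * y_lag2

# Periods 73..76 are the ones whose two preceding values appear in the table.
forecasts = {t: ar2_forecast(values[t - 1], values[t - 2]) for t in range(73, 77)}
for period, forecast in forecasts.items():
    print(period, round(forecast, 1))
```

Periods 71 and 72 cannot be recomputed this way because their forecasts depend on observations that precede the rows shown.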

33 Sample …… MA(2) Table 9.4
Period   Time   Values (Yt)   Forecast   Residual
t-5      71     90            76.1       13.9
t-4      72     78            69.1       8.9
t-3      73     87            75.3       11.7
t-2      74     99            72.0       27.0
t-1      75     72            64.3       7.7
t        76                   80.6

34 Problem #2 Forecast Y5, Y6, Y7
Period   Values (Yt)   Forecast   Residual
Y1       32.5          35
Y2       36.6
Y3       33.3
Y4       31.9
Y5
Y6
Y7

35-39 Problem #2 Forecast Y5, Y6, Y7 (the table is filled in one entry at a time)
Period   Values (Yt)   Forecast   Residual
Y1       32.5          35         -2.5
Y2       36.6          34.375     2.225
Y3       33.3          36.306     -3.006
Y4       31.9          33.581     -1.681
Y5                     35.482
Y6                     35.504
Y7
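The model behind Problem #2 does not survive in this transcript, but the tabulated forecasts are consistent with an MA(2) recursion with constant 35 and weights +0.25 and -0.30 on the two most recent residuals. Those three numbers are inferred from the table and are an assumption, as are the zero starting residuals and the rule that unknown future errors are replaced by zero.

```python
MU, W1, W2 = 35.0, 0.25, -0.30   # inferred from the table, not stated on the slides

observed = [32.5, 36.6, 33.3, 31.9]   # Y1..Y4
residuals = [0.0, 0.0]                # assumed zero errors before the first period
forecasts = []

for y in observed:
    forecast = MU + W1 * residuals[-1] + W2 * residuals[-2]
    forecasts.append(forecast)
    residuals.append(y - forecast)    # residual = actual minus forecast

for _ in range(3):                    # Y5, Y6, Y7: future errors set to zero
    forecast = MU + W1 * residuals[-1] + W2 * residuals[-2]
    forecasts.append(forecast)
    residuals.append(0.0)

for period, forecast in enumerate(forecasts, start=1):
    print(f"Y{period}", round(forecast, 3))
```

Under this reading, forecasts more than two periods ahead fall back to the constant, because an MA(2) has no memory beyond two past errors.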

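40 Be careful with the forecast when you "difference" … for example, the ARIMA (1, 1, 0) model is:
$Y_t - Y_{t-1} = \phi_0 + \phi_1 (Y_{t-1} - Y_{t-2}) + \varepsilon_t$, that is, $Y_t = Y_{t-1} + \phi_0 + \phi_1 (Y_{t-1} - Y_{t-2}) + \varepsilon_t$
So with the data from Table 9.3 …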
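The point of the warning is that the model forecasts the change, which must then be added back to the last level. A minimal sketch, assuming an ARIMA (1, 1, 0) written as an AR(1) with constant on the first differences; the data and coefficients are hypothetical and are not the Table 9.3 values, which are not reproduced in this transcript.

```python
def arima110_forecast(y, phi0, phi1, horizon):
    """Forecast levels when the first differences follow an AR(1) with constant."""
    history = list(y)
    forecasts = []
    for _ in range(horizon):
        last_change = history[-1] - history[-2]
        change_forecast = phi0 + phi1 * last_change      # forecast of Y_t - Y_{t-1}
        level_forecast = history[-1] + change_forecast   # add back the last level
        forecasts.append(level_forecast)
        history.append(level_forecast)                   # reuse forecasts for longer leads
    return forecasts

print(arima110_forecast([100.0, 104.0], phi0=1.0, phi1=0.5, horizon=2))
```

Reporting the change forecast on its own, without adding the last observed level, is the mistake the slide cautions against.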

41 ARIMA – Comments
It is NOT a good idea to try to cover all possibilities by adding many AR and MA parameters from the start. Aim for the basics of what you need; if you need more, the residual autocorrelations will tell you so. On the other hand, if parameters in a fitted ARIMA model are not significant, delete one parameter at a time, as needed. There is obviously some subjectivity in selecting a model and, as we saw, 2 or more models can adequately represent the data. If the models have the same number of parameters, select the one with the smallest mean squared error, $s^2$ (MINITAB: "MS"). If the models have a different number of parameters, select the simpler model (the principle of parsimony).

42 The process …
Check for the level of integration: plot the series and check for stationarity. Double check this by examining the autocorrelations. If necessary, difference the series and see whether this makes it stationary. Examine the autocorrelations and "partials" for a pattern. After the pattern is established, go to MINITAB: Stat > Time Series > ARIMA. Enter the series and your model orders (autoregressive, difference, moving average). If you are using an AR model and the level (mean) of the series is different from 0, leave the "constant" box checked; if it is close to zero, uncheck it. Under "Graphs" check #1, #3, and #4. If you are forecasting, click "Forecasts" and enter the number of forecast periods that you want in "Lead". Then perform your checks: Are the coefficients significant? Are the p-values for the Ljung-Box statistics greater than alpha (0.05)? Do the residuals show no autocorrelation? Are the residuals normally distributed?

43 ARIMA – Seasonal Models
The model-building strategy is the same as with nonseasonal models. If the series is nonstationary, we may want to difference the data at the seasonal lag S. Quarterly data, S = 4: $Y_t - Y_{t-4}$. Monthly data, S = 12: $Y_t - Y_{t-12}$.
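Seasonal differencing subtracts the value from the same season one cycle earlier. A minimal sketch with a hypothetical quarterly series and a hypothetical helper name:

```python
def seasonal_difference(y, season_length):
    """Return Y_t - Y_{t-S} for every t that has a value one season earlier."""
    if len(y) <= season_length:
        raise ValueError("need more than one full season of data")
    return [y[t] - y[t - season_length] for t in range(season_length, len(y))]

quarterly = [10, 20, 30, 40, 12, 23, 34, 45]   # two hypothetical years, S = 4
print(seasonal_difference(quarterly, 4))
```

The differenced series is S observations shorter than the original, so a year of monthly data is used up by a single seasonal difference.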

44 Seasonal AC and PAC patterns
The autocorrelation patterns associated with purely seasonal models are analogous to those for nonseasonal models, with the only difference that the nonzero autocorrelations that form the pattern occur at lags that are multiples of the number of periods per season

