# Stationarity Issues in Time Series Modeling David A. Dickey North Carolina State University.



Stationarity: what is it?

- Example: stocks of silver in the NY Commodities Exchange
- Two forecasts:
  - Nonstationary (yellow): no mean reversion, unbounded error bands
  - Stationary (green): reverts to the mean, bounded error bands

Silver Series

Stationarity: what is it?

- Constant mean $\mu$
- Covariance between $Y_t$ and $Y_{t+h}$ is a function of $h$ only: $\gamma(h)$
- Autocorrelation: $\rho(h) = \gamma(h) / \gamma(0)$
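The ratio of autocovariances above can be computed directly from data. The following is a minimal Python sketch, assuming the usual sample estimate of the lag-$h$ autocovariance with divisor $n$; the function name `sample_acf` and the toy series are illustrative and not from the slides.

```python
def sample_acf(y, max_lag):
    """Return sample autocorrelations r(0), ..., r(max_lag) of the sequence y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    # Sample autocovariance at lag h, using divisor n (an assumption of this sketch).
    gamma = [sum(dev[t] * dev[t + h] for t in range(n - h)) / n
             for h in range(max_lag + 1)]
    # Autocorrelation is autocovariance at lag h divided by autocovariance at lag 0.
    return [g / gamma[0] for g in gamma]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = sample_acf(series, 2)
print(acf)
```

For a stationary series these values should depend only on the lag, not on where in the series they are computed.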

One Lag Model

$Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$, with shocks $e_t \sim N(0, \sigma^2)$

Stationary: $|\rho| < 1$

- $Y_t = \mu(1 - \rho) + \rho Y_{t-1} + e_t$
- Regress $Y_t$ on 1, $Y_{t-1}$
  - Estimators are approximately normally distributed in large samples
  - Use the t test for $H_0: \rho = 0$
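The regression of $Y_t$ on 1 and $Y_{t-1}$ can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's code: the simulation starts the series at its mean, and the helper names `simulate_ar1` and `ols_lag1` are hypothetical.

```python
import random


def simulate_ar1(n, mu, rho, sigma, seed=0):
    """Simulate a one-lag series: Y_t - mu = rho * (Y_{t-1} - mu) + e_t."""
    rng = random.Random(seed)
    y = [mu]  # assumption: start the series at its mean
    for _ in range(n - 1):
        y.append(mu + rho * (y[-1] - mu) + rng.gauss(0.0, sigma))
    return y


def ols_lag1(y):
    """Regress Y_t on 1, Y_{t-1}; return (intercept, slope, slope standard error)."""
    x, z = y[:-1], y[1:]
    m = len(x)
    xbar, zbar = sum(x) / m, sum(z) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - zbar) for a, b in zip(x, z)) / sxx
    intercept = zbar - slope * xbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, z))
    slope_se = (sse / (m - 2) / sxx) ** 0.5
    return intercept, slope, slope_se


y = simulate_ar1(2000, mu=100.0, rho=0.8, sigma=1.0, seed=1)
intercept, slope, slope_se = ols_lag1(y)
print(intercept, slope, slope_se)
```

In the stationary case the fitted slope should land near the lag coefficient and the fitted intercept near the mean times one minus that coefficient.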

One Lag Model with $\rho = 1$

$Y_t - \mu = 1 \cdot (Y_{t-1} - \mu) + e_t$, with shocks $e_t \sim N(0, \sigma^2)$

- $Y_t = Y_{t-1} + e_t$
- Best forecast of $Y_t$ is $Y_{t-1}$

Nonstationary: $\rho = 1$

- Regress $Y_t$ on 1, $Y_{t-1}$
- Estimators are NOT normally distributed, even in large samples
- CANNOT use t tables to test $H_0: \rho - 1 = 0$
- The t test statistic does NOT have a t distribution!!!
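A small Monte Carlo experiment illustrates why ordinary t tables fail here. The sketch below simulates random walks, computes the usual regression t ratio for a slope of 1, and counts how often it falls below the one-sided 5% normal cutoff of -1.645. The replication count, series length, and function names are arbitrary choices for illustration.

```python
import random


def tau_statistic(y):
    """t ratio for (slope - 1) when regressing Y_t on 1, Y_{t-1}."""
    x, z = y[:-1], y[1:]
    m = len(x)
    xbar, zbar = sum(x) / m, sum(z) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - zbar) for a, b in zip(x, z)) / sxx
    intercept = zbar - slope * xbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, z))
    se = (sse / (m - 2) / sxx) ** 0.5
    return (slope - 1.0) / se


def rejection_rate(n_series=300, length=100, cutoff=-1.645, seed=0):
    """Share of simulated random walks whose t ratio falls below the cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_series):
        y = [0.0]
        for _step in range(length):
            y.append(y[-1] + rng.gauss(0.0, 1.0))  # unit root: Y_t = Y_{t-1} + e_t
        if tau_statistic(y) < cutoff:
            hits += 1
    return hits / n_series


rate = rejection_rate()
print(rate)
```

If the statistic really followed a normal or t distribution, the printed share would be close to 0.05; under a unit root it comes out far larger, which is why separate tables are needed.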

Hypothesis Test

Model: $Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$

- Test
  - $H_0: \rho = 1$ (nonstationary, unit root)
  - $H_1: |\rho| < 1$ (stationary, mean reverting)
- Compare the calculated t to a new distribution

Two Tests

Model: $Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$

- $Y_t - Y_{t-1} = (\rho - 1)(Y_{t-1} - \mu) + e_t$
- $Y_t - Y_{t-1} = \mu(1 - \rho) + (\rho - 1) Y_{t-1} + e_t$
- Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$
- Tests:
  - Rho: $n \times$ (coefficient of $Y_{t-1}$)
  - Tau: the calculated t statistic
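Both statistics come out of the same regression of the first difference on an intercept and the lagged level. The sketch below is one way to compute them in Python; it assumes $n$ is the number of observations used in the regression, which may differ slightly from the convention in any particular software package, and the function name is hypothetical.

```python
import random


def dickey_fuller_stats(y):
    """Regress (Y_t - Y_{t-1}) on 1, Y_{t-1}; return (rho_stat, tau_stat)."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    m = len(x)  # assumption: n is the number of rows in the regression
    xbar, dbar = sum(x) / m, sum(d) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    coef = sum((a - xbar) * (b - dbar) for a, b in zip(x, d)) / sxx
    intercept = dbar - coef * xbar
    sse = sum((b - intercept - coef * a) ** 2 for a, b in zip(x, d))
    se = (sse / (m - 2) / sxx) ** 0.5
    return m * coef, coef / se


# Illustration on a simulated stationary series with lag coefficient 0.5.
rng = random.Random(7)
stationary = [0.0]
for _ in range(500):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0.0, 1.0))
rho_stat, tau_stat = dickey_fuller_stats(stationary)
print(rho_stat, tau_stat)
```

Both values must be compared with Dickey-Fuller tables rather than normal or t tables; strongly negative values point toward stationarity.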

Some math Above diagonal ->

More math

$W(t)$ is a Wiener process on $[0, 1]$.
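The formulas on these two slides did not survive in the transcript. For orientation, the standard textbook limits for the simplest case (no intercept, $Y_0 = 0$, true coefficient $\rho = 1$) are sketched below; this is a reconstruction from well-known results, not necessarily what the slides displayed. The first line uses the fact that $Y_{t-1} e_t$ is a sum of products $e_j e_t$ with $j < t$, that is, the terms above the diagonal of the matrix of all products.

```latex
\begin{align*}
\sum_{t=1}^{n} Y_{t-1} e_t
  &= \tfrac{1}{2}\Bigl( Y_n^2 - \sum_{t=1}^{n} e_t^2 \Bigr), \\[4pt]
n(\hat\rho - 1)
  &= \frac{n^{-1} \sum_{t=1}^{n} Y_{t-1} e_t}{n^{-2} \sum_{t=1}^{n} Y_{t-1}^2}
  \;\Rightarrow\;
  \frac{\tfrac{1}{2}\bigl( W(1)^2 - 1 \bigr)}{\int_0^1 W(t)^2 \, dt}, \\[4pt]
\tau
  &\;\Rightarrow\;
  \frac{\tfrac{1}{2}\bigl( W(1)^2 - 1 \bigr)}{\bigl( \int_0^1 W(t)^2 \, dt \bigr)^{1/2}} .
\end{align*}
```

Neither limit is normal, which is why the Rho and Tau statistics need their own tables.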

Two Series

SAS software: PROC ARIMA

    proc gplot;
      plot (Y Z)*t / overlay;
    proc arima;
      i var=Y nlag=10 stationarity=(adf);
      i var=Z nlag=10 stationarity=(adf);

Symptoms of Nonstationarity

- ACF dies down slowly
  - ACF is Corr($Y_t$, $Y_{t-j}$), plotted vs. $j$
- Nonconstant level when plotted
- Saw the plot; ACFs coming up

Y series ACF (The ARIMA Procedure)

Name of Variable = Y; Mean of Working Series = 110.9728; Standard Deviation = 5.286108; Number of Observations = 250

| Lag | Correlation | Std Error |
|---|---|---|
| 0 | 1.00000 | 0 |
| 1 | 0.97219 | 0.063246 |
| 2 | 0.94506 | 0.107523 |
| 3 | 0.91741 | 0.136771 |
| 4 | 0.89025 | 0.159498 |
| 5 | 0.86479 | 0.178269 |
| 6 | 0.84145 | 0.194326 |
| 7 | 0.81771 | 0.208391 |
| 8 | 0.79836 | 0.220853 |
| 9 | 0.77912 | 0.232110 |
| 10 | 0.75671 | 0.242346 |

Z series ACF (The ARIMA Procedure)

Name of Variable = Z; Mean of Working Series = 100.5022; Standard Deviation = 2.402392; Number of Observations = 250

| Lag | Correlation |
|---|---|
| 0 | 1.00000 |
| 1 | 0.90796 |
| 2 | 0.81755 |
| 3 | 0.72228 |
| 4 | 0.63703 |
| 5 | 0.56707 |
| 6 | 0.51964 |
| 7 | 0.47865 |
| 8 | 0.46026 |
| 9 | 0.44466 |
| 10 | 0.42313 |

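Tests on Y

The ARIMA Procedure: Augmented Dickey-Fuller Unit Root Tests

| Type | Lags | Rho | Pr < Rho | Tau | Pr < Tau | F | Pr > F |
|---|---|---|---|---|---|---|---|
| Zero Mean | 0 | 0.1014 | 0.7059 | 0.71 | 0.8675 | | |
| | 1 | 0.0880 | 0.7027 | 0.59 | 0.8422 | | |
| | 2 | 0.0719 | 0.6989 | 0.45 | 0.8101 | | |
| Single Mean | 0 | -6.8507 | 0.2817 | -2.30 | 0.1724 | 2.99 | 0.3095 |
| | 1 | -6.8539 | 0.2815 | -2.16 | 0.2211 | 2.57 | 0.4147 |
| | 2 | -7.1478 | 0.2624 | -2.07 | 0.2564 | 2.29 | 0.4861 |
| Trend | 0 | -7.3468 | 0.6313 | -2.46 | 0.3502 | 3.64 | 0.4500 |
| | 1 | -7.3273 | 0.6328 | -2.30 | 0.4295 | 3.07 | 0.5636 |
| | 2 | -7.5909 | 0.6114 | -2.19 | 0.4905 | 2.65 | 0.6489 |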

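Tests on Z

The ARIMA Procedure: Augmented Dickey-Fuller Unit Root Tests

| Type | Lags | Rho | Pr < Rho | Tau | Pr < Tau | F | Pr > F |
|---|---|---|---|---|---|---|---|
| Zero Mean | 0 | -0.0087 | 0.6803 | -0.05 | 0.6647 | | |
| | 1 | -0.0237 | 0.6769 | -0.15 | 0.632 | | |
| | 2 | -0.0393 | 0.6733 | -0.24 | 0.5997 | | |
| Single Mean | 0 | -22.8511 | 0.0051 | -3.45 | 0.0104 | 5.96 | 0.0136 |
| | 1 | -24.5443 | 0.0034 | -3.48 | 0.0095 | 6.06 | 0.0114 |
| | 2 | -28.8542 | 0.0015 | -3.69 | 0.0050 | 6.80 | 0.0010 |
| Trend | 0 | -24.6119 | 0.0236 | -3.61 | 0.0312 | 6.53 | 0.0449 |
| | 1 | -26.2971 | 0.0161 | -3.60 | 0.0319 | 6.48 | 0.0461 |
| | 2 | -30.7682 | 0.0057 | -3.77 | 0.0196 | 7.13 | 0.0283 |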

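Higher Order Processes

$$Y_t - \mu = \alpha_1 (Y_{t-1} - \mu) + \alpha_2 (Y_{t-2} - \mu) + \alpha_3 (Y_{t-3} - \mu) + e_t$$

$$\nabla Y_t = Y_t - Y_{t-1} = -(1 - \alpha_1 - \alpha_2 - \alpha_3)(Y_{t-1} - \mu) - (\alpha_2 + \alpha_3) \nabla Y_{t-1} - \alpha_3 \nabla Y_{t-2} + e_t$$

- The bracketed factor $-(1 - \alpha_1 - \alpha_2 - \alpha_3)$ is the coefficient of the lagged level; the lagged differences $\nabla Y_{t-1}$, $\nabla Y_{t-2}$ are the augmenting lags.
- ADF stands for Augmented Dickey-Fuller.
- Testing for no mean reversion: $H_0: 1 - \alpha_1 - \alpha_2 - \alpha_3 = 0$
- Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$, $Y_{t-1} - Y_{t-2}$, $Y_{t-2} - Y_{t-3}$
- Limit distributions: nonstandard for the coefficient of $Y_{t-1}$; N(__, __) for the coefficients of the lagged differences.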
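The augmented regression (first difference on an intercept, the lagged level, and lagged differences) can be sketched with a small least squares routine. Everything below is a hedged illustration using only the Python standard library; `invert`, `ols`, and `adf_regression` are hypothetical helper names, and the demo series is simulated rather than taken from the talk.

```python
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(a)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(k):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[k:] for row in m]


def ols(rows, target):
    """Least squares fit; return (coefficients, standard errors)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(rows, target)]
    s2 = sum(e * e for e in resid) / (len(rows) - k)
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se


def adf_regression(y, lags):
    """Regress dY_t on 1, Y_{t-1}, dY_{t-1}, ..., dY_{t-lags}; return (beta, se, tau)."""
    d = [y[t] - y[t - 1] for t in range(1, len(y))]  # d[i] = Y_{i+1} - Y_i
    rows, target = [], []
    for i in range(lags, len(d)):
        rows.append([1.0, y[i]] + [d[i - j] for j in range(1, lags + 1)])
        target.append(d[i])
    beta, se = ols(rows, target)
    return beta, se, beta[1] / se[1]  # tau is the t ratio on the lagged level


rng = random.Random(3)
demo = [0.0, 0.0]
for _ in range(3000):
    demo.append(0.5 * demo[-1] + 0.3 * demo[-2] + rng.gauss(0.0, 1.0))
beta, se, tau = adf_regression(demo, lags=1)
print(beta, se, tau)
```

For the simulated two-lag series with coefficients 0.5 and 0.3, the lagged-level coefficient should be near $-(1 - 0.5 - 0.3) = -0.2$ and the lagged-difference coefficient near $-0.3$, matching the algebra of the factored form.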

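Higher Order Processes

Q1: How many lags???

- Regress $\nabla Y_t$ on 1, $Y_{t-1}$, $\nabla Y_{t-1}$, $\nabla Y_{t-2}$
- The coefficients of the lagged differences are N(__, __), so... just use the usual t tests and p-values!!!

Q2: Why "Unit Root" tests??

- Backshift operator: $B(Y_t) = Y_{t-1}$
- $(1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3)(Y_t - \mu) = e_t$
- A root of $1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3$ at $B = 1$ means $1 - \alpha_1 - \alpha_2 - \alpha_3 = 0$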

Check Silver Series for Augmenting Lags

    PROC REG;
      MODEL DEL = LSILVER DEL1 DEL2 DEL3 DEL4;
      TEST DEL2=0, DEL3=0, DEL4=0;

| Source | DF | Mean Square | F Value | Pr > F |
|---|---|---|---|---|
| Numerator | 3 | 4589.63459 | 1.31 | 0.2753 |
| Denominator | 133 | 3515.48242 | | |
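The F value in this output is simply the ratio of the two mean squares. The short Python check below reproduces it from the printed numbers; the variable names are illustrative.

```python
# Mean squares copied from the PROC REG TEST output above.
numerator_ms = 4589.63459    # 3 numerator df (DEL2, DEL3, DEL4 tested jointly)
denominator_ms = 3515.48242  # 133 denominator df (error mean square)

f_value = numerator_ms / denominator_ms
print(round(f_value, 2))  # 1.31
```

With a p-value of 0.2753 for this F, the three extra lagged differences are not needed, which is why the next regression keeps only DEL1.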

Unit Root test in PROC REG

    PROC REG;
      MODEL DEL = LSILVER DEL1;

| Variable | DF | Parameter Estimate | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | 1 | 75.58073 | 2.76 | 0.0082 |
| LSILVER | 1 | -0.11703 | -2.78 | 0.0079 |
| DEL1 | 1 | 0.67115 | 6.21 | <.0001 |

Unit Root test in PROC ARIMA

    PROC ARIMA DATA=SILVER;
      I VAR=SILVER STATIONARITY=(ADF=(1));

Augmented Dickey-Fuller Unit Root Tests

| Type | Lags | Tau | Pr < Tau |
|---|---|---|---|
| Zero Mean | 1 | -0.28 | 0.5800 |
| Single Mean | 1 | -2.78 | 0.0689 |
| Trend | 1 | -2.63 | 0.2697 |

And now...the rest of the story

| Type | Lags | Tau | Pr < Tau |
|---|---|---|---|
| Zero Mean | ????? (A) | | |
| Single Mean | 1 | -2.78 | 0.0689 |
| Trend | ????? (B) | | |

- (A) Assumes the mean is 0 (or known and subtracted off). Has a different (pair of) distributions!!
- (B) Allows for TREND under $H_1$. Has a third (pair of) distributions!!!!

Silver: Need 2nd Difference?

$D_t = \nabla Y_t = Y_t - Y_{t-1}$

Q: Does $D$ (also) have a unit root?

- Regress $\nabla D_t$ on $D_{t-1}$ using /NOINT (why?)
- No augmenting lags (why?)

    I VAR=Y(1) STATIONARITY =...

| Type | Lags | Tau | Pr < Tau |
|---|---|---|---|
| Zero Mean | 0 | -3.42 | 0.0010 |
| Single Mean | 0 | -3.39 | 0.0158 |
| Trend | 0 | -3.62 | 0.0383 |
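The test on the differenced series can be sketched as a no-intercept regression with no augmenting lags. The Python below is an illustration only: it assumes the regression is of the change in $D$ on its lagged value, and the helper names and simulated data are hypothetical.

```python
import random


def difference(y):
    """First difference: D_t = Y_t - Y_{t-1}."""
    return [b - a for a, b in zip(y[:-1], y[1:])]


def tau_no_intercept(d):
    """t ratio from regressing (D_t - D_{t-1}) on D_{t-1} with no intercept."""
    x = d[:-1]
    dd = difference(d)
    sxx = sum(a * a for a in x)
    coef = sum(a * b for a, b in zip(x, dd)) / sxx
    sse = sum((b - coef * a) ** 2 for a, b in zip(x, dd))
    se = (sse / (len(x) - 1) / sxx) ** 0.5
    return coef / se


# A simulated random walk: its first difference is white noise, so no second
# difference should be needed and tau should be strongly negative.
rng = random.Random(11)
walk = [0.0]
for _ in range(500):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
tau_d = tau_no_intercept(difference(walk))
print(tau_d)
```

As with the level series, the resulting value is compared with the zero-mean Dickey-Fuller distribution, not with a t table.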

Autocorrelations

| Lag | Covariance | Correlation |
|---|---|---|
| 0 | 7612550 | 1.00000 |
| 1 | 7604217 | 0.99891 |
| 2 | 7595529 | 0.99776 |
| 3 | 7586855 | 0.99662 |
| 4 | 7578152 | 0.99548 |
| 5 | 7569481 | 0.99434 |
| 6 | 7560553 | 0.99317 |
| 7 | 7551925 | 0.99204 |
| 8 | 7543869 | 0.99098 |
| 9 | 7535957 | 0.98994 |
| 10 | 7528240 | 0.98892 |
| 11 | 7519890 | 0.98783 |
| 12 | 7511672 | 0.98675 |

Output from SAS PROC ARIMA

Augmented Dickey-Fuller Unit Root Tests

| Type | Lags | Rho | Pr < Rho |
|---|---|---|---|
| Zero Mean | 0 | 1.3567 | 0.9565 |
| | 1 | 1.3481 | 0.9557 |
| Single Mean | 0 | 0.4065 | 0.9744 |
| | 1 | 0.3500 | 0.9725 |
| Trend | 0 | -6.3073 | 0.7203 |
| | 1 | -6.5833 | 0.6981 |

Differences

Autocorrelations

| Lag | Covariance | Correlation |
|---|---|---|
| 0 | 4003.285 | 1.00000 |
| 1 | 102.471 | 0.02560 |
| 2 | -117.368 | -0.02932 |
| 3 | -235.578 | -0.05885 |
| 4 | -26.946567 | -0.00673 |
| 5 | -46.750761 | -0.01168 |
| 6 | -77.100469 | -0.01926 |
| 7 | -224.055 | -0.05597 |
| 8 | -27.874814 | -0.00696 |
| 9 | 132.415 | 0.03308 |
| 10 | 316.534 | 0.07907 |
| 11 | -254.117 | -0.06348 |
| 12 | 200.979 | 0.05020 |

Inverse Autocorrelation

- Ming Chang thesis
- Dual model: $(1 - \theta B) Y_t = e_t$ has dual $Y_t = (1 - \theta B) e_t$, so the dual of an AR(1) is an MA(1)
- Chang shows that the IACF dies off slowly if you overdifference.
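The dual-model idea can be made concrete for one-lag models: the inverse autocorrelation of a model is the ordinary autocorrelation of its dual. The Python sketch below writes the coefficient as `theta` (a symbol chosen here, since the original did not survive) and assumes its absolute value is below 1; the function names are hypothetical.

```python
def iacf_ar1(theta, max_lag):
    """IACF of (1 - theta*B) Y_t = e_t: the ACF of the MA(1) dual."""
    out = [1.0]
    for h in range(1, max_lag + 1):
        # An MA(1) has one nonzero autocorrelation, at lag 1.
        out.append(-theta / (1.0 + theta ** 2) if h == 1 else 0.0)
    return out


def iacf_ma1(theta, max_lag):
    """IACF of Y_t = (1 - theta*B) e_t: the ACF of the AR(1) dual, theta**h."""
    return [theta ** h for h in range(max_lag + 1)]


print(iacf_ar1(0.5, 3))
print(iacf_ma1(0.9, 3))
```

Differencing white noise produces a moving average whose coefficient is at the boundary value 1, and the second function shows the consequence: as `theta` approaches 1 the values `theta**h` decay ever more slowly, which is the slow IACF decay associated with overdifferencing.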

Differenced DJIA IACF

Inverse Autocorrelations

| Lag | Correlation |
|---|---|
| 1 | -0.51119 |
| 2 | 0.01380 |
| 3 | -0.00533 |
| 4 | 0.01061 |
| 5 | -0.02324 |
| 6 | 0.00722 |
| 7 | 0.02122 |
| 8 | -0.01617 |
| 9 | 0.02831 |
| 10 | -0.04860 |
| 11 | 0.02759 |
| 12 | -0.00422 |

2nd Differenced DJIA IACF

Just for illustration, here is the inverse autocorrelation you would get if you differenced these differences once more, that is, if you took the second difference of the original series. Note the roughly triangular appearance, suggesting that you should have stopped after the first difference.

Inverse Autocorrelations

| Lag | Correlation |
|---|---|
| 1 | 0.89720 |
| 2 | 0.80302 |
| 3 | 0.70785 |
| 4 | 0.60466 |
| 5 | 0.50498 |
| 6 | 0.41173 |
| 7 | 0.32523 |
| 8 | 0.23836 |
| 9 | 0.15871 |
| 10 | 0.09447 |
| 11 | 0.05758 |
| 12 | 0.01735 |

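Rho and F

Model: $Y_t$ written in terms of $Y_{t-1}$, $Y_{t-2}$, and $e_t$. Factored form: $\nabla Y_t$ written in terms of $Y_{t-1}$, $\nabla Y_{t-1}$, and $e_t$.

Rho:

- (1) Estimate the coefficient (under $H_0$) by regression
- (2) Divide n[estimate] by (estimate - 1)

F:

- Regress $\nabla Y_t$ on 1, t, $Y_{t-1}$, $\nabla Y_{t-1}$
- Test the underlined items with F (3 numerator df)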

Trend is not Unit Root

$Y_t = a + bt + Z_t$ with $Z_t$ stationary

$Y_{t-1} = a + b(t-1) + Z_{t-1}$

$\nabla Y_t = b + \nabla Z_t$, with $\nabla Z_t$ an overdifferenced series!!

Example:
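A quick simulation shows what differencing does to a trend-stationary series. The sketch below assumes the simplest case, where the stationary part is white noise; the parameter values and function names are arbitrary illustrations.

```python
import random


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(v * v for v in dev)


def differenced_trend(n, a, b, sigma, seed=0):
    """Simulate Y_t = a + b*t + Z_t (Z_t white noise) and return its first difference."""
    rng = random.Random(seed)
    y = [a + b * t + rng.gauss(0.0, sigma) for t in range(n)]
    return [y[t] - y[t - 1] for t in range(1, n)]


d = differenced_trend(5000, a=10.0, b=0.5, sigma=1.0)
mean_d = sum(d) / len(d)
r1 = lag1_autocorrelation(d)
print(mean_d, r1)
```

The differences average close to the slope $b$, but their lag-1 autocorrelation sits near $-0.5$, the signature of a moving average with a unit root, that is, of an overdifferenced series.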

Amazon.com Example (volume)

    PROC REG;
      MODEL DV = DATE LAGV DV1-DV4;
      TEST DV3=0, DV4=0;

| Variable | DF | Parameter Estimate | t Value | Pr > \|t\| | Type I SS |
|---|---|---|---|---|---|
| Intercept | 1 | -17.49220 | -5.26 | <.0001 | 0.00848 |
| date | 1 | 0.00147 | 5.41 | <.0001 | 0.01395 |
| LAGV | 1 | -0.21914 | -5.80 | <.0001 | 26.67803 |
| DV1 | 1 | -0.15446 | -3.08 | 0.0022 | 0.94211 |
| DV2 | 1 | -0.18447 | -3.72 | 0.0002 | 3.52898 |
| DV3 | 1 | -0.04433 | -0.94 | 0.3477 | 0.07997 |
| DV4 | 1 | -0.05774 | -1.31 | 0.1923 | 0.48763 |

Test 1 Results for Dependent Variable DV

| Source | DF | Mean Square | F Value | Pr > F |
|---|---|---|---|---|
| Numerator | 2 | 0.28380 | 0.99 | 0.3715 |
| Denominator | 497 | 0.28602 | | |

ACF, Levels

| Lag | Covariance | Correlation |
|---|---|---|
| 0 | 2.503910 | 1.00000 |
| 1 | 2.327538 | 0.92956 |
| 2 | 2.225324 | 0.88874 |
| 3 | 2.193509 | 0.87603 |
| 4 | 2.155492 | 0.86085 |
| 5 | 2.127643 | 0.84973 |
| 6 | 2.099292 | 0.83841 |
| 7 | 2.069929 | 0.82668 |
| 8 | 2.062194 | 0.82359 |
| 9 | 2.051450 | 0.81930 |
| 10 | 2.011864 | 0.80349 |
| 11 | 2.006564 | 0.80137 |
| 12 | 1.996735 | 0.79745 |
| 13 | 1.960231 | 0.78287 |
| 14 | 1.951272 | 0.77929 |
| 15 | 1.940939 | 0.77516 |
| 16 | 1.919167 | 0.76647 |
| 17 | 1.906896 | 0.76157 |
| 18 | 1.905406 | 0.76097 |
| 19 | 1.892168 | 0.75569 |
| 20 | 1.857199 | 0.74172 |
| 21 | 1.846038 | 0.73726 |
| 22 | 1.826167 | 0.72933 |
| 23 | 1.816151 | 0.72533 |
| 24 | 1.821228 | 0.72735 |

IACF, Differences

| Lag | Correlation |
|---|---|
| 1 | 0.48216 |
| 2 | 0.44816 |
| 3 | 0.34266 |
| 4 | 0.30682 |
| 5 | 0.25213 |
| 6 | 0.24854 |
| 7 | 0.23624 |
| 8 | 0.18675 |
| 9 | 0.14088 |
| 10 | 0.20330 |
| 11 | 0.13295 |
| 12 | 0.11437 |
| 13 | 0.15524 |
| 14 | 0.11829 |
| 15 | 0.09978 |
| 16 | 0.10919 |
| 17 | 0.09049 |
| 18 | 0.06653 |
| 19 | 0.02886 |
| 20 | 0.09515 |
| 21 | 0.05504 |
| 22 | 0.07104 |
| 23 | 0.06065 |
| 24 | 0.02284 |

The ARIMA Procedure: Augmented Dickey-Fuller Unit Root Tests

| Type | Lags | Rho | Pr < Rho | Tau | Pr < Tau | F | Pr > F |
|---|---|---|---|---|---|---|---|
| Zero Mean | 2 | 0.0144 | 0.6861 | 0.02 | 0.6909 | | |
| Single Mean | 2 | -14.2100 | 0.0474 | -2.60 | 0.0944 | 3.42 | 0.1920 |
| Trend | 2 | -85.7758 | 0.0007 | -6.35 | <.0001 | 20.18 | 0.0010 |

Do the test: fit AR(3) plus trend. Diagnostics:

Autocorrelation Check of Residuals

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations |
|---|---|---|---|---|
| 6 | 1.59 | 3 | 0.6615 | -0.015 ... -0.000 |
| 12 | 10.89 | 9 | 0.2835 | -0.025 ... 0.072 |
| 18 | 12.43 | 15 | 0.6460 | -0.036 ... 0.031 |
| 24 | 18.97 | 21 | 0.5872 | |
| 30 | 23.75 | 27 | 0.6439 | |
| 36 | 30.32 | 33 | 0.6014 | |
| 42 | 37.56 | 39 | 0.5358 | |
| 48 | 39.37 | 45 | 0.7087 | |

Extensions

- S. E. Said shows that models with lagged $e_t$ terms can still be tested by ADF tests.
- Nobel Prize cointegration idea: two or more unit root processes have a stationary linear combination.
- Compute, e.g., $Y_t = \ln(S_t / L_t)$ and test for stationarity.
- http://www4.stat.ncsu.edu/~dickey
  - Click: SAS Code from Presentations
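The log-ratio suggestion can be illustrated with simulated data. In the sketch below two hypothetical price series share one random walk in their logarithms, so each has a unit root while their log ratio contains only the stationary parts. The series, parameter values, and function names are all assumptions made for illustration, not data from the talk.

```python
import math
import random


def simulate_cointegrated_pair(n, seed=0):
    """Two series whose logs share a random walk; also return the stationary gap."""
    rng = random.Random(seed)
    w = 0.0
    s, l, gap = [], [], []
    for _ in range(n):
        w += rng.gauss(0.0, 0.1)   # shared unit root component (in logs)
        u = rng.gauss(0.0, 0.05)   # stationary part specific to S
        v = rng.gauss(0.0, 0.05)   # stationary part specific to L
        s.append(math.exp(w + u))
        l.append(math.exp(w + v))
        gap.append(u - v)
    return s, l, gap


def log_ratio(s, l):
    """Y_t = ln(S_t / L_t), the candidate stationary combination."""
    return [math.log(a / b) for a, b in zip(s, l)]


s, l, gap = simulate_cointegrated_pair(300, seed=2)
y = log_ratio(s, l)
print(max(abs(v) for v in y))
```

Because the shared random walk cancels in the ratio, the resulting series is the one to which a stationarity test would be applied.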

Thanks ! Questions ?
