
1 Ka-fu Wong University of Hong Kong Modeling Cycles: MA, AR and ARMA Models

2 Unobserved components model of time series According to the unobserved components model of a time series, the series y_t has three components: y_t = T_t + S_t + C_t, where T_t is the time trend, S_t the seasonal component, and C_t the cyclical component.

3 The Starting Point Let y_t denote the cyclical component of the time series. We will assume, unless noted otherwise, that y_t is a zero-mean covariance stationary process. Recall that part of this assumption is that the time series originated infinitely far back in the past and will continue infinitely far into the future, with the same mean, variance, and autocovariance structure. The starting point for introducing the various kinds of econometric models that are available to describe stationary processes is the Wold Representation Theorem (or, simply, Wold's theorem).

4 Wold's theorem According to Wold's theorem, if y_t is a zero-mean covariance stationary process then it can be written in the form y_t = Σ_{i=0}^∞ b_i ε_{t-i}, where the ε's are (i) WN(0, σ²), (ii) b_0 = 1, and (iii) Σ_{i=0}^∞ b_i² < ∞. In other words, each y_t can be expressed as a single linear function of current and (possibly an infinite number of) past drawings of the white noise process ε_t. If y_t depends on an infinite number of past ε's, the weights on these ε's, i.e., the b_i's, must go to zero as i gets large (and they must go to zero at a fast enough rate for the sum of squared b_i's to converge).

5 Innovations ε_t is called the innovation in y_t because ε_t is that part of y_t not predictable from the past history of y_t, i.e., E(ε_t│y_{t-1}, y_{t-2}, …) = 0. Hence, the forecast (conditional expectation) is E(y_t│y_{t-1}, y_{t-2}, …) = E(y_t│ε_{t-1}, ε_{t-2}, …) = E(ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + …│ε_{t-1}, ε_{t-2}, …) = E(ε_t│ε_{t-1}, ε_{t-2}, …) + E(b_1 ε_{t-1} + b_2 ε_{t-2} + …│ε_{t-1}, ε_{t-2}, …) = 0 + (b_1 ε_{t-1} + b_2 ε_{t-2} + …) = b_1 ε_{t-1} + b_2 ε_{t-2} + …. And the one-step-ahead forecast error is y_t − E(y_t│y_{t-1}, y_{t-2}, …) = (ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + …) − (b_1 ε_{t-1} + b_2 ε_{t-2} + …) = ε_t.

6 Mapping Wold to a variety of models The one-step-ahead forecast error is y_t − E(y_t│y_{t-1}, y_{t-2}, …) = (ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + …) − (b_1 ε_{t-1} + b_2 ε_{t-2} + …) = ε_t. Thus, according to the Wold theorem, each y_t can be expressed as a weighted average of current and past innovations (or one-step-ahead forecast errors). It turns out that the Wold representation can usually be well approximated by a variety of models that can be expressed in terms of a very small number of parameters: the moving-average (MA) models, the autoregressive (AR) models, and the autoregressive moving-average (ARMA) models.

7 Mapping Wold to a variety of models For example, suppose that the Wold representation has the form y_t = ε_t + bε_{t-1} + b²ε_{t-2} + b³ε_{t-3} + … for some b with 0 < b < 1 (i.e., b_i = b^i). Then it can be shown that y_t = by_{t-1} + ε_t, which is an AR(1) model.
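As a quick numerical check of this equivalence, the following sketch (an illustration added here, not from the slides; the value b = 0.5, the sample size, and the truncation length are arbitrary choices) builds y_t once from the truncated Wold sum with weights b^i and once from the AR(1) recursion, and confirms that the two constructions coincide up to truncation error.

```python
import numpy as np

rng = np.random.default_rng(0)
b, n, k = 0.5, 300, 60                   # coefficient, sample size, truncation length
eps = rng.normal(size=n)

# (1) AR(1) recursion: y_t = b*y_{t-1} + eps_t
y_ar = np.zeros(n)
y_ar[0] = eps[0]
for t in range(1, n):
    y_ar[t] = b * y_ar[t - 1] + eps[t]

# (2) Truncated Wold sum with geometric weights b**i
y_wold = np.zeros(n)
for t in range(n):
    for i in range(min(t + 1, k)):
        y_wold[t] += b**i * eps[t - i]

# After discarding the first k observations the two constructions agree
print(np.max(np.abs(y_ar[k:] - y_wold[k:])))   # ~0 (truncation error of order b**k)
```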

8 Mapping Wold to a variety of models The procedure we will follow is to describe each of these three types of models and, especially, the shapes of the autocorrelation and partial autocorrelation functions that they imply. Then, the game will be to use the sample autocorrelation/partial autocorrelation functions of the data to "guess" which kind of model generated the data. We estimate that model and see whether it provides a good fit to the data. If yes, we proceed to the forecasting step using this estimated model of the cyclical component. If not, we guess again …

9 Digression – The Lag Operator The lag operator, L, is a simple but powerful device that is routinely used in applied and theoretical time series analysis, including forecasting. The lag operator is defined as follows: Ly_t = y_{t-1}. That is, the operation L applied to y_t returns y_{t-1}, which is y_t "lagged" one period. Similarly, L²y_t = y_{t-2}, i.e., the operation L applied twice to y_t returns y_{t-2}, y_t lagged two periods. More generally, L^s y_t = y_{t-s}, for any integer s.

10 Digression – The Lag Operator Consider the application of the following polynomial in the lag operator to y_t: (b_0 + b_1 L + b_2 L² + … + b_s L^s)y_t = b_0 y_t + b_1 y_{t-1} + b_2 y_{t-2} + … + b_s y_{t-s}, where b_0, b_1, …, b_s are real numbers. We sometimes shorthand this as B(L)y_t, where B(L) = b_0 + b_1 L + b_2 L² + … + b_s L^s. Thus, we can write the Wold representation of y_t as y_t = B(L)ε_t, where B(L) is the infinite-order polynomial in L: B(L) = 1 + b_1 L + b_2 L² + …. Similarly, if y_t = by_{t-1} + ε_t, we can write B(L)y_t = ε_t with B(L) = 1 − bL.
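Since a lag polynomial applied to a series is just a weighted sum of the series and its own lags, the algebra above is easy to verify numerically. The sketch below (illustrative only; the coefficient b = 0.5 and the simulated series are assumptions, not from the slides) applies B(L) = 1 − bL to an AR(1) series and recovers the underlying white noise.

```python
import numpy as np

rng = np.random.default_rng(1)
b, n = 0.5, 200
eps = rng.normal(size=n)

# An AR(1) series: y_t = b*y_{t-1} + eps_t
y = np.zeros(n)
y[0] = eps[0]
for t in range(1, n):
    y[t] = b * y[t - 1] + eps[t]

# Apply the lag polynomial B(L) = 1 - b*L: (1 - b*L)y_t = y_t - b*y_{t-1}
By = y[1:] - b * y[:-1]
print(np.max(np.abs(By - eps[1:])))      # ~0: B(L)y_t recovers the white-noise innovations
```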

11 Moving Average (MA) Models If y_t is a (zero-mean) covariance stationary process, then Wold's theorem tells us that y_t can be expressed as a linear combination of current and past values of a white noise process ε_t. That is, y_t = ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + …, where the ε's are (i) WN(0, σ²), (ii) b_0 = 1, and (iii) Σ b_i² < ∞. Suppose that for some positive integer q, it turns out that b_{q+1}, b_{q+2}, … are all equal to zero. That is, suppose that y_t depends on the current and only a finite number of past values of ε: y_t = ε_t + b_1 ε_{t-1} + … + b_q ε_{t-q}. This is called a q-th order moving average process, MA(q).

12 Realization of two MA(1) processes: y_t = ε_t + θε_{t-1}

13 MA(1): y_t = ε_t + θε_{t-1} [= (1+θL)ε_t] 1. E(y_t) = E(ε_t + θε_{t-1}) = E(ε_t) + θE(ε_{t-1}) = 0. 2. Var(y_t) = E[(y_t − E(y_t))²] = E(y_t²) = E[(ε_t + θε_{t-1})²] = E(ε_t²) + θ²E(ε_{t-1}²) + 2θE(ε_t ε_{t-1}) = σ² + θ²σ² + 0 (since E(ε_t ε_{t-1}) = E[E(ε_t|ε_{t-1})ε_{t-1}] = 0) = (1+θ²)σ².

14 MA(1): y_t = ε_t + θε_{t-1} [= (1+θL)ε_t] 3. γ(1) = Cov(y_t, y_{t-1}) = E[(y_t − E(y_t))(y_{t-1} − E(y_{t-1}))] = E(y_t y_{t-1}) = E[(ε_t + θε_{t-1})(ε_{t-1} + θε_{t-2})] = E[ε_t ε_{t-1} + θε_{t-1}² + θε_t ε_{t-2} + θ²ε_{t-1}ε_{t-2}] = E(ε_t ε_{t-1}) + θE(ε_{t-1}²) + θE(ε_t ε_{t-2}) + θ²E(ε_{t-1}ε_{t-2}) = 0 + θσ² + 0 + 0 = θσ². 4. ρ(1) = Corr(y_t, y_{t-1}) = γ(1)/γ(0) = θσ²/[(1+θ²)σ²] = θ/(1+θ²). ρ(1) > 0 if θ > 0 and ρ(1) < 0 if θ < 0.

15 MA(1): y_t = ε_t + θε_{t-1} [= (1+θL)ε_t] 5. γ(2) = Cov(y_t, y_{t-2}) = E[(y_t − E(y_t))(y_{t-2} − E(y_{t-2}))] = E(y_t y_{t-2}) = E[(ε_t + θε_{t-1})(ε_{t-2} + θε_{t-3})] = E(ε_t ε_{t-2}) + θE(ε_{t-1}ε_{t-2}) + θE(ε_t ε_{t-3}) + θ²E(ε_{t-1}ε_{t-3}) = 0 + 0 + 0 + 0 = 0. More generally, γ(τ) = 0 for all τ > 1. 6. ρ(2) = Corr(y_t, y_{t-2}) = γ(2)/γ(0) = 0, and ρ(τ) = 0 for all τ > 1.

16 Population autocorrelation, y_t = ε_t + 0.4ε_{t-1}: ρ(0) = γ(0)/γ(0) = 1; ρ(1) = γ(1)/γ(0) = 0.4/(1+0.4²) = 0.345; ρ(2) = γ(2)/γ(0) = 0/(1+0.4²) = 0; ρ(τ) = γ(τ)/γ(0) = 0 for all τ > 1.
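These population values are easy to confirm by simulation. The sketch below (an added illustration; the sample size and random seed are arbitrary) simulates a long MA(1) series with θ = 0.4 and compares the sample autocorrelations at lags 1 and 2 with the theoretical values θ/(1+θ²) ≈ 0.345 and 0.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n = 0.4, 100_000
eps = rng.normal(size=n + 1)
y = eps[1:] + theta * eps[:-1]           # MA(1): y_t = eps_t + theta*eps_{t-1}

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[lag:], x[:len(x) - lag]) / np.dot(x, x))

print("rho(1): theory", theta / (1 + theta**2), "sample", sample_acf(y, 1))
print("rho(2): theory", 0.0, "sample", sample_acf(y, 2))
```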

17 Population autocorrelation, y_t = ε_t + 0.95ε_{t-1}: ρ(0) = γ(0)/γ(0) = 1; ρ(1) = γ(1)/γ(0) = 0.95/(1+0.95²) = 0.499; ρ(2) = γ(2)/γ(0) = 0/(1+0.95²) = 0; ρ(τ) = γ(τ)/γ(0) = 0 for all τ > 1.

18 MA(1): y_t = ε_t + θε_{t-1} [= (1+θL)ε_t] The partial autocorrelation function for the MA(1) process is a bit more tedious to derive. The PACF of an MA(1), p(τ), will be nonzero for all τ, converging monotonically to zero in absolute value as τ increases. If the MA coefficient θ is positive, the PACF will exhibit damped oscillations as τ increases. If the MA coefficient θ is negative, the PACF will be negative and converge to zero monotonically.

19 Population Partial Autocorrelation: y_t = ε_t + 0.4ε_{t-1}

20 Population Partial Autocorrelation: y_t = ε_t + 0.95ε_{t-1}

21 Forecasting y_{T+h} E(y_{T+h}│y_T, y_{T-1}, …) = E(y_{T+h}│ε_T, ε_{T-1}, …), since y_T, y_{T-1}, … can be expressed as functions of ε_T, ε_{T-1}, …. E(y_{T+1}│ε_T, ε_{T-1}, …) = E(ε_{T+1} + θε_T│ε_T, ε_{T-1}, …), since y_{T+1} = ε_{T+1} + θε_T, = E(ε_{T+1}│ε_T, ε_{T-1}, …) + E(θε_T│ε_T, ε_{T-1}, …) = θε_T. E(y_{T+2}│ε_T, ε_{T-1}, …) = E(ε_{T+2} + θε_{T+1}│ε_T, ε_{T-1}, …) = E(ε_{T+2}│ε_T, ε_{T-1}, …) + E(θε_{T+1}│ε_T, ε_{T-1}, …) = 0. … So E(y_{T+h}│y_T, y_{T-1}, …) = E(y_{T+h}│ε_T, ε_{T-1}, …) = θε_T for h = 1, and 0 for h > 1.
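In practice θ and the innovations are not observed, so they must be estimated before these forecasts can be computed. A rough sketch of that step, assuming statsmodels as the estimation tool (the simulated data and θ = 0.4 are illustrative choices, not from the slides): the one-step-ahead forecast is non-zero, while forecasts for h > 1 revert to the unconditional mean of zero.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
eps = rng.normal(size=501)
y = eps[1:] + 0.4 * eps[:-1]                    # simulated MA(1) with theta = 0.4

fit = ARIMA(y, order=(0, 0, 1), trend="n").fit()   # MA(1) with no constant
print(fit.params)                                  # estimates of theta ('ma.L1') and sigma^2
print(fit.forecast(steps=3))                       # h=1 is roughly theta*eps_T; h=2,3 are 0 (the mean)
```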

22 MA(q): 1. E(y_t) = 0. 2. Var(y_t) = (1 + b_1² + … + b_q²)σ². 3. γ(τ) and ρ(τ) will be equal to 0 for all τ > q. [The behavior of these functions for 1 ≤ τ ≤ q will depend on the signs and magnitudes of b_1, …, b_q in a complicated way.] 4. The partial autocorrelation function, p(τ), will be nonzero for all τ. [Its behavior will depend on the signs and magnitudes of b_1, …, b_q in a complicated way.]

23 MA(q): 5. E(y_{T+h}│y_T, y_{T-1}, …) = E(y_{T+h}│ε_T, ε_{T-1}, …) = ? y_{T+1} = ε_{T+1} + θ_1 ε_T + θ_2 ε_{T-1} + … + θ_q ε_{T-q+1}. So E(y_{T+1}│ε_T, ε_{T-1}, …) = θ_1 ε_T + θ_2 ε_{T-1} + … + θ_q ε_{T-q+1}. More generally, E(y_{T+h}│ε_T, ε_{T-1}, …) = θ_h ε_T + … + θ_q ε_{T-q+h} for h ≤ q, and 0 for h > q.

24 Autoregressive Models (AR(p)) In certain circumstances, the Wold form for y_t can be "inverted" into a finite-order autoregressive form, i.e., y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t. This is called a p-th order autoregressive process, AR(p). Note that it has p unknown coefficients: φ_1, …, φ_p. Note too that the AR(p) model looks like a standard linear regression model with zero-mean, homoskedastic, and serially uncorrelated errors.

25 AR(1): y_t = φy_{t-1} + ε_t

26 AR(1): y_t = φy_{t-1} + ε_t The "stationarity condition": if y_t is a stationary time series with an AR(1) form, then the AR coefficient φ must be less than one in absolute value, i.e., │φ│ < 1. To see how the AR(1) model is related to the Wold form: y_t = φy_{t-1} + ε_t = φ(φy_{t-2} + ε_{t-1}) + ε_t, since y_{t-1} = φy_{t-2} + ε_{t-1}, = φ²y_{t-2} + φε_{t-1} + ε_t = φ²(φy_{t-3} + ε_{t-2}) + φε_{t-1} + ε_t = φ³y_{t-3} + φ²ε_{t-2} + φε_{t-1} + ε_t = … = ε_t + φε_{t-1} + φ²ε_{t-2} + φ³ε_{t-3} + … (since │φ│ < 1 and Var(y_t) < ∞, the remainder term φ^k y_{t-k} vanishes as k grows). So the AR(1) model is appropriate for a covariance stationary process with Wold form y_t = Σ_{i=0}^∞ φ^i ε_{t-i}.

27 AR(1): y_t = φy_{t-1} + ε_t Mean of y_t: E(y_t) = E(φy_{t-1} + ε_t) = φE(y_{t-1}) + E(ε_t) = φE(y_t) + E(ε_t), by stationarity. So E(y_t) = E(ε_t)/(1−φ) = 0, since ε_t ~ WN. Variance of y_t: Var(y_t) = E(y_t²), since E(y_t) = 0. E(y_t²) = E[(φy_{t-1} + ε_t)²] = φ²E(y_{t-1}²) + E(ε_t²) + 2φE(y_{t-1}ε_t) = φ²E(y_t²) + σ², so (1 − φ²)E(y_t²) = σ² and E(y_t²) = σ²/(1 − φ²).

28 AR(1): y_t = φy_{t-1} + ε_t γ(1) = Cov(y_t, y_{t-1}) = E[(y_t − E(y_t))(y_{t-1} − E(y_{t-1}))] = E(y_t y_{t-1}) = E[(φy_{t-1} + ε_t)y_{t-1}] = φE(y_{t-1}²) + E(ε_t y_{t-1}) = φE(y_{t-1}²), since E(ε_t y_{t-1}) = 0, = φγ(0), since E(y_{t-1}²) = Var(y_t) = γ(0). So ρ(1) = Corr(y_t, y_{t-1}) = γ(1)/γ(0) = φ, which is > 0 if φ > 0 and < 0 if φ < 0.

29 AR(1): y_t = φy_{t-1} + ε_t More generally, for the AR(1) process, ρ(τ) = φ^τ for all τ. So the ACF of the AR(1) process will be nonzero for all values of τ, decreasing monotonically in absolute value to zero as τ increases; it will be strictly positive and decrease monotonically to zero as τ increases if φ is positive, and it will alternate in sign as it decreases to zero if φ is negative. The PACF of an AR(1) will be equal to φ for τ = 1 and 0 otherwise, i.e., p(τ) = φ if τ = 1 and p(τ) = 0 if τ > 1.
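This contrast with the MA(1) is the main identification device: for an AR(1) the ACF decays geometrically while the PACF cuts off after lag 1. The sketch below (illustrative only, using statsmodels; the coefficient 0.95, sample size, and seed are arbitrary choices) simulates an AR(1) and compares the sample ACF and PACF with φ^τ.

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(4)
phi, n = 0.95, 50_000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]                 # AR(1) recursion

print("sample ACF  (lags 1-3):", acf(y, nlags=3)[1:])    # roughly phi, phi**2, phi**3
print("theory              :", [phi, phi**2, phi**3])
print("sample PACF (lags 1-3):", pacf(y, nlags=3)[1:])   # roughly phi, 0, 0
```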

30 Population Autocorrelation Function, AR(1): y_t = 0.4y_{t-1} + ε_t. ρ(0) = γ(0)/γ(0) = 1; ρ(1) = γ(1)/γ(0) = φ = 0.4; ρ(2) = γ(2)/γ(0) = φ² = 0.16; ρ(τ) = γ(τ)/γ(0) = φ^τ for all τ > 1.

31 Population Autocorrelation Function, AR(1): y_t = 0.95y_{t-1} + ε_t. ρ(0) = γ(0)/γ(0) = 1; ρ(1) = γ(1)/γ(0) = φ = 0.95; ρ(2) = γ(2)/γ(0) = φ² = 0.9025; ρ(τ) = γ(τ)/γ(0) = φ^τ for all τ > 1.

32 Population Partial Autocorrelation Function, AR(1): y_t = 0.4y_{t-1} + ε_t

33 Population Partial Autocorrelation Function, AR(1): y_t = 0.95y_{t-1} + ε_t

34 AR(1): y_t = φy_{t-1} + ε_t E(y_{T+h}│y_T, y_{T-1}, …) = E(y_{T+h}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …). 1. E(y_{T+1}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = E(φy_T + ε_{T+1}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = E(φy_T│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) + E(ε_{T+1}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = φy_T. 2. E(y_{T+2}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = E(φy_{T+1} + ε_{T+2}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = E(φy_{T+1}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = φE(y_{T+1}│y_T, y_{T-1}, …, ε_T, ε_{T-1}, …) = φ(φy_T) = φ²y_T. 3. More generally, E(y_{T+h}│y_T, y_{T-1}, …) = φ^h y_T.
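So the h-step-ahead point forecast is simply φ^h times the last observation. A minimal sketch (the values of y_T and φ are hypothetical, chosen only for illustration):

```python
def ar1_forecast(y_T, phi, h_max):
    """Point forecasts E(y_{T+h} | y_T, y_{T-1}, ...) = phi**h * y_T for h = 1, ..., h_max."""
    return [phi**h * y_T for h in range(1, h_max + 1)]

print(ar1_forecast(y_T=2.0, phi=0.95, h_max=5))   # approximately [1.9, 1.805, 1.715, 1.629, 1.548]
```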

35 Properties of the AR(p) Process y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t or, using the lag operator, φ(L)y_t = ε_t, where φ(L) = 1 − φ_1 L − … − φ_p L^p and the ε's are WN(0, σ²).

36 AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t The coefficients of the AR(p) model of a covariance stationary time series must satisfy the stationarity condition: consider the values of x that solve the equation 1 − φ_1 x − … − φ_p x^p = 0. These x's must all be greater than 1 in absolute value. For example, if p = 1 (the AR(1) case), consider the solutions to 1 − φx = 0. The only value of x that satisfies this equation is x = 1/φ, which is greater than one in absolute value if and only if the absolute value of φ is less than one. So │φ│ < 1 is the stationarity condition for the AR(1) model. The condition guarantees that the impact of ε_t on y_{t+τ} decays to zero as τ increases.

37 AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t The autocovariance and autocorrelation functions, γ(τ) and ρ(τ), will be non-zero for all τ. Their exact shapes will depend on the signs and magnitudes of the AR coefficients, though we know that they decay to zero as τ goes to infinity. The partial autocorrelation function, p(τ), will be equal to 0 for all τ > p. The exact shape of the PACF for 1 ≤ τ ≤ p will depend on the signs and magnitudes of φ_1, …, φ_p.

38 Population Autocorrelation Function AR(2): y t = 1.5y t y t-2 + ε t

39 AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t E(y_{T+h}│y_T, y_{T-1}, …) = ? h = 1: y_{T+1} = φ_1 y_T + φ_2 y_{T-1} + … + φ_p y_{T-p+1} + ε_{T+1}, so E(y_{T+1}│y_T, y_{T-1}, …) = φ_1 y_T + φ_2 y_{T-1} + … + φ_p y_{T-p+1}. h = 2: y_{T+2} = φ_1 y_{T+1} + φ_2 y_T + … + φ_p y_{T-p+2} + ε_{T+2}, so E(y_{T+2}│y_T, y_{T-1}, …) = φ_1 E(y_{T+1}│y_T, y_{T-1}, …) + φ_2 y_T + … + φ_p y_{T-p+2}. h = 3: y_{T+3} = φ_1 y_{T+2} + φ_2 y_{T+1} + φ_3 y_T + … + φ_p y_{T-p+3} + ε_{T+3}, so E(y_{T+3}│y_T, y_{T-1}, …) = φ_1 E(y_{T+2}│y_T, y_{T-1}, …) + φ_2 E(y_{T+1}│y_T, y_{T-1}, …) + φ_3 y_T + … + φ_p y_{T-p+3}.

40 AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t In general, E(y_{T+h}│y_T, y_{T-1}, …) = φ_1 E(y_{T+h-1}│y_T, y_{T-1}, …) + φ_2 E(y_{T+h-2}│y_T, y_{T-1}, …) + … + φ_p E(y_{T+h-p}│y_T, y_{T-1}, …), where E(y_{T+h-s}│y_T, y_{T-1}, …) = y_{T+h-s} if h − s ≤ 0. In contrast to the MA(q), it is straightforward to operationalize this forecast. It is also straightforward to estimate this model: apply OLS.
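A bare-bones sketch of both steps, OLS estimation followed by the recursive forecast, using nothing beyond numpy; the function names and the simulated AR(2) used to exercise them are illustrative assumptions rather than anything from the slides:

```python
import numpy as np

def fit_ar_ols(y, p):
    """OLS estimates of (phi_1, ..., phi_p) in y_t = phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t."""
    Y = y[p:]
    X = np.column_stack([y[p - i:-i] for i in range(1, p + 1)])  # columns: lag 1, ..., lag p
    phi, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return phi

def ar_forecast(y, phi, h_max):
    """Recursive point forecasts: unknown future values on the right-hand side are replaced by their forecasts."""
    hist = list(y)
    fcst = []
    for _ in range(h_max):
        fcst.append(float(np.dot(phi, hist[::-1][:len(phi)])))
        hist.append(fcst[-1])
    return fcst

# Illustrative use on a simulated AR(2): y_t = 1.2*y_{t-1} - 0.5*y_{t-2} + eps_t
rng = np.random.default_rng(5)
n = 5_000
y = np.zeros(n)
eps = rng.normal(size=n)
for t in range(2, n):
    y[t] = 1.2 * y[t - 1] - 0.5 * y[t - 2] + eps[t]

phi_hat = fit_ar_ols(y, p=2)
print("OLS estimates:", phi_hat)                # close to [1.2, -0.5]
print("forecasts h=1..4:", ar_forecast(y, phi_hat, 4))
```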

41 Planned exploratory regressions (Series #1 of Problem Set #4)
                MA order 0   MA order 1   MA order 2   MA order 3
AR order 0      ARMA(0,0)    ARMA(0,1)    ARMA(0,2)    ARMA(0,3)
AR order 1      ARMA(1,0)    ARMA(1,1)    ARMA(1,2)    ARMA(1,3)
AR order 2      ARMA(2,0)    ARMA(2,1)    ARMA(2,2)    ARMA(2,3)
AR order 3      ARMA(3,0)    ARMA(3,1)    ARMA(3,2)    ARMA(3,3)
We want to find a regression model (here, the AR and MA orders) such that the residuals look like white noise.

42 Model selection: AIC and SIC values for each combination of AR order (0–3) and MA order (0–3).
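A table like this can be produced by fitting every candidate order and recording the information criteria. The sketch below assumes statsmodels as the tool and a simulated ARMA(1,1) series as a stand-in for Series #1 of the problem set; BIC is reported in place of SIC, since the two differ only in name.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(6)
eps = rng.normal(size=601)
y = np.zeros(600)
for t in range(1, 600):
    y[t] = 0.5 * y[t - 1] + eps[t + 1] + 0.3 * eps[t]   # an illustrative ARMA(1,1)

table = {}
for p in range(4):
    for q in range(4):
        fit = ARIMA(y, order=(p, 0, q), trend="n").fit()
        table[(p, q)] = (fit.aic, fit.bic)               # BIC plays the role of SIC

best_aic = min(table, key=lambda k: table[k][0])
best_sic = min(table, key=lambda k: table[k][1])
print("AIC prefers ARMA", best_aic, "| SIC prefers ARMA", best_sic)
```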

43 ARMA(0,0): y_t = c + ε_t The reported p-value is the probability of observing a test statistic (Q-Stat) as large as the one shown, or larger, under the null that the residual e(t) is white noise. That is, if e(t) is truly white noise, a test statistic this large or larger would occur only with that probability. In this case, we reject the null hypothesis.
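The Q-statistic referred to here is a Ljung-Box test on the residual autocorrelations. A self-contained sketch of the same diagnostic (statsmodels is an assumed tool, and plain white noise stands in for the model residuals):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(7)
resid = rng.normal(size=500)                        # stand-in for residuals from a fitted ARMA model
lb = acorr_ljungbox(resid, lags=[10], return_df=True)
print(lb)                                           # small lb_pvalue => reject "residuals are white noise";
                                                    # here resid really is white noise, so the p-value should be large
```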

44 ARMA(0,0): y_t = c + ε_t The 95% confidence band for the autocorrelations under the null that the residuals e(t) are white noise: if e(t) is truly white noise, then 95% of the time (across many sample realizations) the sample autocorrelations will fall within the band. We reject the null hypothesis if an autocorrelation falls outside the band.

45 ARMA(0,0): y_t = c + ε_t The PACF suggests an AR(1).

46 ARMA(0,1)

47 ARMA(0,2)

48 ARMA(0,3)

49 ARMA(1,0)

50 ARMA(2,0)

51 ARMA(1,1)

52 AR or MA? ARMA(1,0) versus ARMA(0,3). Truth: y_t = 0.5y_{t-1} + ε_t. We cannot reject the null that e(t) is white noise in either model.

53 Approximation Any invertible MA process may be approximated by an AR(p) process for sufficiently large p, and the residuals will appear to be white noise. Likewise, any stationary AR process may be approximated by an MA(q) process for sufficiently large q, and the residuals will appear to be white noise. In fact, a stationary AR(p) process can be written exactly as an MA(∞) process (its Wold form), and an MA(q) process that can be written exactly as an AR(∞) process is called invertible.
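A numerical sketch of the first claim (illustrative only; the MA(1) data-generating process, the AR order of 6, and statsmodels as the tool are all assumptions): fit a moderately long AR to data generated by an MA(1) and check that the residuals pass a white-noise test.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(8)
eps = rng.normal(size=2001)
y = eps[1:] + 0.5 * eps[:-1]                      # true process: MA(1) with theta = 0.5

ar_fit = ARIMA(y, order=(6, 0, 0), trend="n").fit()              # a long AR approximation
print(ar_fit.arparams)                                           # roughly 0.5, -0.25, 0.125, ... (decaying)
print(acorr_ljungbox(ar_fit.resid, lags=[10], return_df=True))   # residuals look like white noise
```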

54 Example: Employment MA(4) model

55 Residual plot

56 Correlogram of sample residual from an MA(4) model

57 Autocorrelation function of sample residual from an MA(4) model

58 Partial autocorrelation function of sample residual from an MA(4) model

59 Model AR(2)

60 Residual plot

61 Correlogram of sample residual from an AR(2) model

62 Model selection criteria for various MA and AR orders: SIC values and AIC values

63 Autocorrelation function of sample residual from an AR(2) model

64 Partial autocorrelation function of sample residual from an AR(2) model

65 ARMA(3,1)

66 Residual plot

67 Correlogram of sample residual from an ARMA(3,1) model

68 Autocorrelation function of sample residual from an ARMA(3,1) model

69 Partial autocorrelation function of sample residual from an ARMA(3,1) model

70 End