Review and Summary: Box-Jenkins Models, Stationary Time Series, AR(p), MA(q), ARMA(p,q)


Models for Stationary Time Series.

The Moving Average Time Series of order q, MA(q). $\{x_t \mid t \in T\}$ is a Moving Average time series of order q, MA(q), if it satisfies the equation:

$x_t = \mu + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q}$

where $\{u_t \mid t \in T\}$ is a white noise time series with variance $\sigma^2$.

The mean, autocovariance function, and autocorrelation function of an MA(q) time series:

The mean: $E[x_t] = \mu$.

The autocovariance function: $\gamma(h) = \sigma^2\left(\alpha_h + \alpha_1\alpha_{h+1} + \cdots + \alpha_{q-h}\alpha_q\right)$ for $0 \le h \le q$ (with $\alpha_0 = 1$), and $\gamma(h) = 0$ for $h > q$.

The autocorrelation function: $\rho(h) = \gamma(h)/\gamma(0)$, which cuts off to zero for $h > q$.
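
This cutoff at lag q is easy to verify numerically. Below is a minimal numpy sketch, not part of the original slides; the MA(2) coefficients $\alpha_1 = 0.6$, $\alpha_2 = 0.4$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# MA(2): x_t = u_t + a1*u_{t-1} + a2*u_{t-2}  (illustrative coefficients)
a = np.array([1.0, 0.6, 0.4])                # (alpha_0, alpha_1, alpha_2), alpha_0 = 1
u = rng.normal(scale=1.0, size=10_000)
x = np.convolve(u, a, mode="valid")          # the simulated MA(2) series

# Theoretical autocovariance: gamma(h) = sigma^2 * sum_j alpha_j alpha_{j+h}, 0 for h > q
gamma = [a[h:] @ a[:len(a) - h] for h in range(len(a))]
rho_theory = np.array(gamma) / gamma[0]

def sample_acf(x, max_lag):
    """Sample autocorrelations rho(0..max_lag)."""
    x = x - x.mean()
    c0 = x @ x / len(x)
    return np.array([1.0] + [(x[h:] @ x[:-h]) / (len(x) * c0) for h in range(1, max_lag + 1)])

print("theoretical rho(0..2):", np.round(rho_theory, 3))
print("sample rho(0..4):     ", np.round(sample_acf(x, 4), 3))  # near zero beyond lag q = 2
```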

The Autoregressive Time Series of order p, AR(p). $\{x_t \mid t \in T\}$ is an Autoregressive time series of order p, AR(p), if it satisfies the equation:

$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t$

where $\{u_t \mid t \in T\}$ is a white noise time series with variance $\sigma^2$.

The mean value of a stationary AR(p) series:

$\mu = E[x_t] = \dfrac{\delta}{1 - \beta_1 - \beta_2 - \cdots - \beta_p}$

The autocovariance function $\gamma(h)$ of a stationary AR(p) series satisfies the equations:

$\gamma(h) = \beta_1\gamma(h-1) + \beta_2\gamma(h-2) + \cdots + \beta_p\gamma(h-p)$ for $h > 0$, with $\gamma(0) = \beta_1\gamma(1) + \cdots + \beta_p\gamma(p) + \sigma^2$

The autocorrelation function $\rho(h) = \gamma(h)/\gamma(0)$ of a stationary AR(p) series satisfies the same recursion,

$\rho(h) = \beta_1\rho(h-1) + \beta_2\rho(h-2) + \cdots + \beta_p\rho(h-p)$,

and for $h > p$ this recursion alone determines $\rho(h)$.

Solving the recursion explicitly:

$\rho(h) = c_1 r_1^{-h} + c_2 r_2^{-h} + \cdots + c_p r_p^{-h}$

where $r_1, r_2, \ldots, r_p$ are the roots of the polynomial $\beta(x) = 1 - \beta_1 x - \beta_2 x^2 - \cdots - \beta_p x^p$ and $c_1, c_2, \ldots, c_p$ are determined by using the starting values of the sequence $\rho(h)$.

Stationarity of an AR(p) time series: consider the polynomial $\beta(x) = 1 - \beta_1 x - \cdots - \beta_p x^p$ with roots $r_1, r_2, \ldots, r_p$.
1. $\{x_t \mid t \in T\}$ is stationary if $|r_i| > 1$ for all $i$.
2. If $|r_i| < 1$ for at least one $i$, then $\{x_t \mid t \in T\}$ exhibits deterministic behaviour.
3. If $|r_i| \ge 1$ for all $i$ and $|r_i| = 1$ for at least one $i$, then $\{x_t \mid t \in T\}$ exhibits non-stationary random behaviour.
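
This root condition is easy to check numerically; here is a minimal numpy sketch (the AR(2) coefficients $\beta_1 = 1.1$, $\beta_2 = -0.3$ are illustrative):

```python
import numpy as np

# AR(2): x_t = 1.1 x_{t-1} - 0.3 x_{t-2} + delta + u_t  (illustrative coefficients)
beta = np.array([1.1, -0.3])

# beta(x) = 1 - beta_1 x - beta_2 x^2; np.roots expects the highest power first
coeffs = np.concatenate(([1.0], -beta))[::-1]
roots = np.roots(coeffs)

print("roots of beta(x):", roots)                  # 2.0 and 1.667 for these coefficients
print("stationary:", bool(np.all(np.abs(roots) > 1)))
```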

The Mixed Autoregressive Moving Average Time Series of order (p,q), ARMA(p,q). A Mixed Autoregressive Moving Average time series $\{x_t \mid t \in T\}$ of order (p,q), ARMA(p,q), satisfies the equation:

$x_t = \beta_1 x_{t-1} + \cdots + \beta_p x_{t-p} + \delta + u_t + \alpha_1 u_{t-1} + \cdots + \alpha_q u_{t-q}$

or, in operator form, $\beta(B)x_t = \delta + \alpha(B)u_t$, where $\{u_t \mid t \in T\}$ is a white noise time series with variance $\sigma^2$.

The mean value of a stationary ARMA(p,q) series:

$\mu = \dfrac{\delta}{1 - \beta_1 - \cdots - \beta_p}$

Stationarity of an ARMA(p,q) series: consider the polynomial $\beta(x) = 1 - \beta_1 x - \cdots - \beta_p x^p$ with roots $r_1, r_2, \ldots, r_p$.
1. $\{x_t \mid t \in T\}$ is stationary if $|r_i| > 1$ for all $i$.
2. If $|r_i| < 1$ for at least one $i$, then $\{x_t \mid t \in T\}$ exhibits deterministic behaviour.
3. If $|r_i| \ge 1$ for all $i$ and $|r_i| = 1$ for at least one $i$, then $\{x_t \mid t \in T\}$ exhibits non-stationary random behaviour.

The autocovariance function $\gamma(h)$ satisfies:

for $h = 0, 1, \ldots, q$: $\gamma(h) = \beta_1\gamma(h-1) + \cdots + \beta_p\gamma(h-p) + \gamma_{ux}(-h) + \alpha_1\gamma_{ux}(1-h) + \cdots + \alpha_q\gamma_{ux}(q-h)$

for $h > q$: $\gamma(h) = \beta_1\gamma(h-1) + \cdots + \beta_p\gamma(h-p)$

h  ux (h)

The partial autocorrelation function at lag k is defined to be $\Phi_{kk}$, the last coefficient in the linear prediction of $x_t$ from $x_{t-1}, \ldots, x_{t-k}$. Using Cramer's Rule it can be written as a ratio of determinants in the autocorrelations:

$\Phi_{kk} = \dfrac{\det P_k^*}{\det P_k}$

where $P_k = [\rho(i-j)]_{i,j=1}^{k}$ and $P_k^*$ is $P_k$ with its last column replaced by $(\rho(1), \ldots, \rho(k))'$.

A recursive formula for $\Phi_{kk}$ (the Durbin-Levinson recursion): starting with $\Phi_{11} = \rho_1$,

$\Phi_{k+1,k+1} = \dfrac{\rho_{k+1} - \sum_{j=1}^{k}\Phi_{kj}\,\rho_{k+1-j}}{1 - \sum_{j=1}^{k}\Phi_{kj}\,\rho_j}$

where $\Phi_{k+1,j} = \Phi_{kj} - \Phi_{k+1,k+1}\Phi_{k,k+1-j}$ for $j = 1, \ldots, k$.
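
The recursion translates directly into code. A minimal numpy sketch (the helper is my own, not from the slides); as a check, an AR(1) with $\beta_1 = 0.5$ has $\rho(h) = 0.5^h$, so its PACF should be 0.5 at lag 1 and zero thereafter:

```python
import numpy as np

def pacf_durbin_levinson(rho):
    """Partial autocorrelations Phi_kk from autocorrelations rho[0..K] (rho[0] = 1)."""
    K = len(rho) - 1
    phi = np.zeros((K + 1, K + 1))
    pacf = np.zeros(K + 1)
    phi[1, 1] = pacf[1] = rho[1]                  # starting value: Phi_11 = rho_1
    for k in range(1, K):
        num = rho[k + 1] - phi[k, 1:k + 1] @ rho[k:0:-1]
        den = 1.0 - phi[k, 1:k + 1] @ rho[1:k + 1]
        phi[k + 1, k + 1] = pacf[k + 1] = num / den
        # update the remaining coefficients: Phi_{k+1,j} = Phi_kj - Phi_{k+1,k+1} Phi_{k,k+1-j}
        phi[k + 1, 1:k + 1] = phi[k, 1:k + 1] - pacf[k + 1] * phi[k, k:0:-1]
    return pacf[1:]

rho = 0.5 ** np.arange(6)                         # AR(1) with beta_1 = 0.5
print(np.round(pacf_durbin_levinson(rho), 6))     # [0.5, 0, 0, 0, 0]
```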

Spectral density function $f(\lambda)$

Let {x t : t  T} denote a time series with auto covariance function  (h) and let f( ) satisfy: then f( ) is called the spectral density function of the time series {x t : t  T}

Linear Filters

Let {x t : t  T} be any time series and suppose that the time series {y t : t  T} is constructed as follows:: The time series {y t : t  T} is said to be constructed from {x t : t  T} by means of a Linear Filter. input x t output y t Linear Filter a s

Spectral theory for Linear Filters: if $\{y_t : t \in T\}$ is obtained from $\{x_t : t \in T\}$ by the linear filter above, then

$f_y(\lambda) = \left|A(e^{-i\lambda})\right|^2 f_x(\lambda)$

where $A(z) = \sum_s a_s z^s$ is the transfer function of the filter.

Applications: the Moving Average Time Series of order q, MA(q). Since $x_t - \mu = \alpha(B)u_t$ is a linear filter applied to white noise, whose spectral density is the constant $\sigma^2/2\pi$:

$f_x(\lambda) = \dfrac{\sigma^2}{2\pi}\left|1 + \alpha_1 e^{-i\lambda} + \cdots + \alpha_q e^{-iq\lambda}\right|^2$

The Autoregressive Time Series of order p, AR(p). Since $\beta(B)x_t = \delta + u_t$, the white noise is itself a filtered version of $x_t$, and hence:

$f_x(\lambda) = \dfrac{\sigma^2}{2\pi}\,\dfrac{1}{\left|1 - \beta_1 e^{-i\lambda} - \cdots - \beta_p e^{-ip\lambda}\right|^2}$

The ARMA(p,q) Time Series of order (p,q). Since $\beta(B)x_t = \delta + z_t$, where $\{z_t \mid t \in T\}$ is an MA(q) time series:

$f_x(\lambda) = \dfrac{\sigma^2}{2\pi}\,\dfrac{\left|1 + \alpha_1 e^{-i\lambda} + \cdots + \alpha_q e^{-iq\lambda}\right|^2}{\left|1 - \beta_1 e^{-i\lambda} - \cdots - \beta_p e^{-ip\lambda}\right|^2}$
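
All three of these spectral densities are special cases of the ARMA formula (take q = 0 or p = 0). A minimal numpy sketch that evaluates it on a frequency grid; the ARMA(1,2) parameters are illustrative:

```python
import numpy as np

def arma_spectral_density(lam, beta, alpha, sigma2=1.0):
    """f(lambda) = (sigma^2 / 2 pi) |alpha(e^{-i lam})|^2 / |beta(e^{-i lam})|^2."""
    z = np.exp(-1j * lam)
    num = np.abs(1 + sum(a * z ** (k + 1) for k, a in enumerate(alpha))) ** 2
    den = np.abs(1 - sum(b * z ** (k + 1) for k, b in enumerate(beta))) ** 2
    return sigma2 / (2 * np.pi) * num / den

lam = np.linspace(0, np.pi, 5)
print(np.round(arma_spectral_density(lam, beta=[0.8], alpha=[0.6, 0.4]), 4))
```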

Three Important Forms of a Stationary Time Series. The Difference Equation Form:

$\beta(B)x_t = \delta + \alpha(B)u_t$

or

$x_t = \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_p x_{t-p} + \delta + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q}$

The Random Shock Form:

$x_t = \mu + \psi(B)u_t$

or

$x_t = \mu + u_t + \psi_1 u_{t-1} + \psi_2 u_{t-2} + \psi_3 u_{t-3} + \cdots$

where $\psi(B) = [\beta(B)]^{-1}\alpha(B) = I + \psi_1 B + \psi_2 B^2 + \psi_3 B^3 + \cdots$ and $\mu = [\beta(B)]^{-1}\delta = \delta/(1 - \beta_1 - \cdots - \beta_p)$.

The Inverted Form:

$\pi(B)x_t = \delta^* + u_t$

or

$x_t = \pi_1 x_{t-1} + \pi_2 x_{t-2} + \pi_3 x_{t-3} + \cdots + \delta^* + u_t$

where $\pi(B) = [\alpha(B)]^{-1}\beta(B) = I - \pi_1 B - \pi_2 B^2 - \pi_3 B^3 - \cdots$ and $\delta^* = [\alpha(B)]^{-1}\delta$.

Models for Non-Stationary Time Series The ARIMA(p,d,q) time series

An important fact: most non-stationary time series have changes that are stationary.

Recall the time series $\{x_t : t \in T\}$ defined by the equation $x_t = \beta_1 x_{t-1} + \delta + u_t$. Then:
1) if $|\beta_1| < 1$, the time series $\{x_t : t \in T\}$ is a stationary time series;
2) if $|\beta_1| = 1$, the time series $\{x_t : t \in T\}$ is a non-stationary time series;
3) if $|\beta_1| > 1$, the time series $\{x_t : t \in T\}$ is deterministic in nature.

In fact, if $\beta_1 = 1$ (and $\delta = 0$), this equation becomes:

$x_t = x_{t-1} + u_t$

This is the equation of a well-known non-stationary time series, called a Random Walk.

Note: $x_t - x_{t-1} = (I - B)x_t = \Delta x_t = u_t$, where $\Delta = I - B$. Thus, by the simple transformation of computing first differences, we can convert the time series $\{x_t : t \in T\}$ into a stationary time series.
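
A small numpy illustration of this fact: differencing a simulated random walk recovers exactly its stationary white-noise increments (the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(size=1000)      # white noise
x = np.cumsum(u)               # random walk: x_t = x_{t-1} + u_t

w = np.diff(x)                 # first differences: Delta x_t = u_t
print(np.allclose(w, u[1:]))   # True: the differences are the noise itself
```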

Now consider the time series $\{x_t : t \in T\}$ defined by the equation:

$\varphi(B)x_t = \delta + \alpha(B)u_t$

where $\varphi(B) = I - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_{p+d}B^{p+d}$. Let $r_1, r_2, \ldots, r_{p+d}$ be the roots of the polynomial $\varphi(x)$, where:

$\varphi(x) = 1 - \varphi_1 x - \varphi_2 x^2 - \cdots - \varphi_{p+d}x^{p+d}$

Then:
1) if $|r_i| > 1$ for $i = 1, 2, \ldots, p+d$, the time series $\{x_t : t \in T\}$ is a stationary time series;
2) if $|r_i| = 1$ for at least one $i$ and $|r_i| > 1$ for the remaining values of $i$, then $\{x_t : t \in T\}$ is a non-stationary time series;
3) if $|r_i| < 1$ for at least one $i$, then $\{x_t : t \in T\}$ is deterministic in nature.

Suppose that d of the roots of the polynomial $\varphi(x)$ are equal to unity. Then $\varphi(x)$ can be written:

$\varphi(x) = (1 - \beta_1 x - \beta_2 x^2 - \cdots - \beta_p x^p)(1 - x)^d$

and $\varphi(B)$ can be written:

$\varphi(B) = (I - \beta_1 B - \beta_2 B^2 - \cdots - \beta_p B^p)(I - B)^d = \beta(B)\Delta^d$

In this case the equation for the time series becomes:

$\varphi(B)x_t = \delta + \alpha(B)u_t$ or $\beta(B)\Delta^d x_t = \delta + \alpha(B)u_t$.

Thus if we let $w_t = \Delta^d x_t$, then the equation for $\{w_t : t \in T\}$ becomes:

$\beta(B)w_t = \delta + \alpha(B)u_t$

Since the roots of $\beta(x)$ are all greater than 1 in absolute value, the time series $\{w_t : t \in T\}$ is a stationary ARMA(p,q) time series. The original time series $\{x_t : t \in T\}$ is called an ARIMA(p,d,q), or Autoregressive Integrated Moving Average, time series. The reason for this terminology is that in the case d = 1, $\{x_t : t \in T\}$ can be expressed as a sum ("integral") of $\{w_t : t \in T\}$:

$x_t = \Delta^{-1}w_t = (I - B)^{-1}w_t = (I + B + B^2 + B^3 + \cdots)w_t = w_t + w_{t-1} + w_{t-2} + w_{t-3} + \cdots$
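
The relationship runs both ways: summing ("integrating") a stationary series once produces an ARIMA series with d = 1, and differencing recovers the stationary part. A minimal numpy sketch with an illustrative AR(1) for $w_t$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta1 = 500, 0.5

# Stationary part: w_t = 0.5 w_{t-1} + u_t  (an AR(1), so x below is ARIMA(1,1,0))
u = rng.normal(size=n)
w = np.zeros(n)
for t in range(1, n):
    w[t] = beta1 * w[t - 1] + u[t]

x = np.cumsum(w)                    # integrate once: x_t = w_t + w_{t-1} + w_{t-2} + ...
w_back = np.diff(x)                 # difference once to undo the integration
print(np.allclose(w_back, w[1:]))   # True
```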

Comments:
1. The operator $\varphi(B) = \beta(B)\Delta^d$ is called the generalized autoregressive operator.
2. The operator $\beta(B)$ is called the autoregressive operator.
3. The operator $\alpha(B)$ is called the moving average operator.
4. If d = 0, the process is stationary and the level of the process is constant: $\beta(B)x_t = \delta + \alpha(B)u_t$
5. If d = 1, the level of the process is randomly changing: $\beta(B)\Delta x_t = \delta + \alpha(B)u_t$
6. If d = 2, the slope of the process is randomly changing: $\beta(B)\Delta^2 x_t = \delta + \alpha(B)u_t$

Forecasting for ARIMA(p,d,q) Time Series

Consider the $m + n$ random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n$ with joint density function $f(x_1, \ldots, x_m, y_1, \ldots, y_n) = f(\mathbf{x}, \mathbf{y})$, where $\mathbf{x} = (x_1, \ldots, x_m)$ and $\mathbf{y} = (y_1, \ldots, y_n)$.

Then the conditional density of $\mathbf{x} = (x_1, \ldots, x_m)$ given $\mathbf{y} = (y_1, \ldots, y_n)$ is defined to be:

$f(\mathbf{x} \mid \mathbf{y}) = \dfrac{f(\mathbf{x}, \mathbf{y})}{f(\mathbf{y})}$

where $f(\mathbf{y})$ denotes the marginal density of $\mathbf{y}$.

In addition, the conditional expectation of $g(\mathbf{x}) = g(x_1, \ldots, x_m)$ given $\mathbf{y} = (y_1, \ldots, y_n)$ is defined to be:

$E[g(\mathbf{x}) \mid \mathbf{y}] = \int g(\mathbf{x})\,f(\mathbf{x} \mid \mathbf{y})\,d\mathbf{x}$

Prediction

Again consider the $m + n$ random variables $x_1, \ldots, x_m, y_1, \ldots, y_n$. Suppose we are interested in predicting $g(x_1, \ldots, x_m) = g(\mathbf{x})$ given $\mathbf{y} = (y_1, \ldots, y_n)$. Let $t(y_1, \ldots, y_n) = t(\mathbf{y})$ denote any predictor of $g(\mathbf{x})$ based on the information in the observations $\mathbf{y}$. Then the mean square error of $t(\mathbf{y})$ in predicting $g(\mathbf{x})$ is defined to be:

$\mathrm{MSE}[t(\mathbf{y})] = E\big[\{t(\mathbf{y}) - g(\mathbf{x})\}^2 \mid \mathbf{y}\big]$

It can be shown that the choice of $t(\mathbf{y})$ that minimizes $\mathrm{MSE}[t(\mathbf{y})]$ is $t(\mathbf{y}) = E[g(\mathbf{x}) \mid \mathbf{y}]$.

Proof: let $v(t) = E[\{t - g(\mathbf{x})\}^2 \mid \mathbf{y}] = E[t^2 - 2t\,g(\mathbf{x}) + g^2(\mathbf{x}) \mid \mathbf{y}] = t^2 - 2t\,E[g(\mathbf{x}) \mid \mathbf{y}] + E[g^2(\mathbf{x}) \mid \mathbf{y}]$. Then $v'(t) = 2t - 2E[g(\mathbf{x}) \mid \mathbf{y}] = 0$ when $t = E[g(\mathbf{x}) \mid \mathbf{y}]$, and since $v''(t) = 2 > 0$ this critical point is a minimum.

Three Important Forms of a Non-Stationary Time Series. The Difference Equation Form:

$\beta(B)\Delta^d x_t = \delta + \alpha(B)u_t$ or $\varphi(B)x_t = \delta + \alpha(B)u_t$

or

$x_t = \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \cdots + \varphi_{p+d}x_{t-p-d} + \delta + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \cdots + \alpha_q u_{t-q}$

The Random Shock Form:

$x_t = \mu(t) + \psi(B)u_t$

or

$x_t = \mu(t) + u_t + \psi_1 u_{t-1} + \psi_2 u_{t-2} + \psi_3 u_{t-3} + \cdots$

where $\psi(B) = [\varphi(B)]^{-1}\alpha(B) = [\beta(B)\Delta^d]^{-1}\alpha(B) = I + \psi_1 B + \psi_2 B^2 + \psi_3 B^3 + \cdots$, $\mu = [\beta(B)]^{-1}\delta = \delta/(1 - \beta_1 - \cdots - \beta_p)$, and $\mu(t) = \Delta^{-d}\mu$.

Note: $\Delta^d \mu(t) = \mu$, i.e. the d-th order differences of $\mu(t)$ are constant. This implies that $\mu(t)$ is a polynomial of degree d in t.

Consider the Difference Equation Form:

$\beta(B)\Delta^d x_t = \delta + \alpha(B)u_t$ or $\varphi(B)x_t = \delta + \alpha(B)u_t$

Multiply both sides by $[\varphi(B)]^{-1}$ to get:

$[\varphi(B)]^{-1}\varphi(B)x_t = [\varphi(B)]^{-1}\delta + [\varphi(B)]^{-1}\alpha(B)u_t$

or

$x_t = [\varphi(B)]^{-1}\delta + [\varphi(B)]^{-1}\alpha(B)u_t = \mu(t) + \psi(B)u_t$

The Inverted Form:

$\pi(B)x_t = \delta^* + u_t$

or

$x_t = \pi_1 x_{t-1} + \pi_2 x_{t-2} + \pi_3 x_{t-3} + \cdots + \delta^* + u_t$

where $\pi(B) = [\alpha(B)]^{-1}\varphi(B) = [\alpha(B)]^{-1}[\beta(B)\Delta^d] = I - \pi_1 B - \pi_2 B^2 - \pi_3 B^3 - \cdots$

Again consider the Difference Equation Form: $\varphi(B)x_t = \delta + \alpha(B)u_t$. Multiply both sides by $[\alpha(B)]^{-1}$ to get:

$[\alpha(B)]^{-1}\varphi(B)x_t = [\alpha(B)]^{-1}\delta + [\alpha(B)]^{-1}\alpha(B)u_t$

or

$\pi(B)x_t = \delta^* + u_t$

Forecasting an ARIMA(p,d,q) Time Series. Let $P_T$ denote $\{\ldots, x_{T-2}, x_{T-1}, x_T\}$ = the "past" up to time T. Then the optimal forecast of $x_{T+l}$ given $P_T$ is denoted by:

$\hat{x}_T(l) = E[x_{T+l} \mid P_T]$

This forecast minimizes the mean square error $E\big[\{x_{T+l} - \hat{x}_T(l)\}^2 \mid P_T\big]$.

Three different forms of the forecast:
1. Random Shock Form
2. Inverted Form
3. Difference Equation Form

Note: given $P_T$, $E[x_{T+h} \mid P_T] = \hat{x}_T(h)$ for $h > 0$ and $= x_{T+h}$ for $h \le 0$; likewise $E[u_{T+h} \mid P_T] = 0$ for $h > 0$ and $= u_{T+h}$ for $h \le 0$.

Random Shock Form of the forecast. Recall:

$x_t = \mu(t) + u_t + \psi_1 u_{t-1} + \psi_2 u_{t-2} + \psi_3 u_{t-3} + \cdots$

so, at time $T + l$:

$x_{T+l} = \mu(T+l) + u_{T+l} + \psi_1 u_{T+l-1} + \psi_2 u_{T+l-2} + \psi_3 u_{T+l-3} + \cdots$

Taking expectations of both sides given $P_T$, and using the note above:

$\hat{x}_T(l) = \mu(T+l) + \psi_l u_T + \psi_{l+1}u_{T-1} + \psi_{l+2}u_{T-2} + \cdots$

To compute this forecast we need to compute $\{\ldots, u_{T-2}, u_{T-1}, u_T\}$ from $\{\ldots, x_{T-2}, x_{T-1}, x_T\}$. Note that $x_t = \mu(t) + u_t + \psi_1 u_{t-1} + \psi_2 u_{t-2} + \psi_3 u_{t-3} + \cdots$, thus:

$u_t = x_t - \mu(t) - \psi_1 u_{t-1} - \psi_2 u_{t-2} - \psi_3 u_{t-3} - \cdots$

which can be calculated recursively.
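
A sketch of this recursion in numpy; the function name and arguments are mine, the ψ weights are assumed truncated at a finite length, and shocks before the start of the sample are set to zero:

```python
import numpy as np

def random_shocks(x, mu, psi):
    """Recover u_t recursively via u_t = x_t - mu(t) - sum_j psi_j u_{t-j}.

    x, mu : observations and trend values mu(t) for t = 0..T
    psi   : truncated psi weights (psi[0] corresponds to psi_1)
    """
    u = np.zeros(len(x))
    for t in range(len(x)):
        tail = sum(p * u[t - j - 1] for j, p in enumerate(psi) if t - j - 1 >= 0)
        u[t] = x[t] - mu[t] - tail
    return u
```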

The error in the forecast:

$e_T(l) = x_{T+l} - \hat{x}_T(l) = u_{T+l} + \psi_1 u_{T+l-1} + \psi_2 u_{T+l-2} + \cdots + \psi_{l-1}u_{T+1}$

Hence the mean square error of the forecast is:

$E\big[e_T(l)^2\big] = \sigma^2\left(1 + \psi_1^2 + \psi_2^2 + \cdots + \psi_{l-1}^2\right)$

Prediction limits for forecasts: $(1 - \alpha)100\%$ confidence limits for $x_{T+l}$ are

$\hat{x}_T(l) \pm z_{\alpha/2}\,\hat\sigma\sqrt{1 + \psi_1^2 + \psi_2^2 + \cdots + \psi_{l-1}^2}$
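
A minimal sketch of these limits in Python (assuming scipy is available for the normal quantile; the helper name and its arguments are mine):

```python
import numpy as np
from scipy.stats import norm

def prediction_limits(forecasts, psi, sigma2, conf=0.95):
    """Limits: xhat_T(l) +/- z * sqrt(sigma2 * (1 + psi_1^2 + ... + psi_{l-1}^2))."""
    z = norm.ppf(0.5 + conf / 2.0)                            # e.g. 1.96 for 95%
    weights = np.concatenate(([1.0], np.asarray(psi)))        # psi_0 = 1
    var = sigma2 * np.cumsum(weights ** 2)[: len(forecasts)]  # Var[e_T(l)], l = 1..L
    half = z * np.sqrt(var)
    return forecasts - half, forecasts + half
```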

Recall the Inverted Form:

$\pi(B)x_t = \delta^* + u_t$ or $x_t = \pi_1 x_{t-1} + \pi_2 x_{t-2} + \pi_3 x_{t-3} + \cdots + \delta^* + u_t$

where $\pi(B) = [\alpha(B)]^{-1}\varphi(B) = [\alpha(B)]^{-1}[\beta(B)\Delta^d]$.

The Inverted Form of the forecast:

$x_t = \pi_1 x_{t-1} + \pi_2 x_{t-2} + \cdots + \delta^* + u_t$

and for $t = T + l$:

$x_{T+l} = \pi_1 x_{T+l-1} + \pi_2 x_{T+l-2} + \cdots + \delta^* + u_{T+l}$

Taking conditional expectations given $P_T$:

$\hat{x}_T(l) = \pi_1 \hat{x}_T(l-1) + \pi_2 \hat{x}_T(l-2) + \cdots + \delta^*$

Note: $\hat{x}_T(h) = x_{T+h}$ for $h \le 0$.

The Difference Equation Form of the forecast:

$x_{T+l} = \varphi_1 x_{T+l-1} + \varphi_2 x_{T+l-2} + \cdots + \varphi_{p+d}x_{T+l-p-d} + \delta + u_{T+l} + \alpha_1 u_{T+l-1} + \alpha_2 u_{T+l-2} + \cdots + \alpha_q u_{T+l-q}$

Taking conditional expectations given $P_T$:

$\hat{x}_T(l) = \varphi_1 \hat{x}_T(l-1) + \cdots + \varphi_{p+d}\hat{x}_T(l-p-d) + \delta + \hat{u}_T(l) + \alpha_1 \hat{u}_T(l-1) + \cdots + \alpha_q \hat{u}_T(l-q)$

where $\hat{x}_T(h) = x_{T+h}$ and $\hat{u}_T(h) = u_{T+h}$ for $h \le 0$, while $\hat{u}_T(h) = 0$ for $h > 0$.
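
A sketch of this recursion (the helper and its argument conventions are mine; x and u are assumed to hold the observed series and the recovered shocks up to time T):

```python
def forecast_difference_form(x, u, phi, alpha, delta, L):
    """Forecasts xhat_T(l), l = 1..L, from the difference-equation form.

    phi   : generalized AR coefficients phi_1..phi_{p+d}
    alpha : MA coefficients alpha_1..alpha_q
    """
    x = list(x)                   # x[T + l] will hold xhat_T(l) for l >= 1
    T = len(u) - 1                # index of the last observed shock
    for l in range(1, L + 1):
        t = T + l
        ar = sum(p * x[t - j] for j, p in enumerate(phi, start=1))
        # uhat_T(h) = 0 for h > 0 and = u_{T+h} for h <= 0
        ma = sum(a * u[t - j] for j, a in enumerate(alpha, start=1) if t - j <= T)
        x.append(ar + delta + ma)
    return x[T + 1:]
```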

Example. The model:

$x_t - x_{t-1} = \beta_1(x_{t-1} - x_{t-2}) + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2}$

or

$x_t = (1 + \beta_1)x_{t-1} - \beta_1 x_{t-2} + u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2}$

or

$\varphi(B)x_t = \beta(B)(I - B)x_t = \alpha(B)u_t$

where $\varphi(x) = 1 - (1 + \beta_1)x + \beta_1 x^2 = (1 - \beta_1 x)(1 - x)$ and $\alpha(x) = 1 + \alpha_1 x + \alpha_2 x^2$; that is, an ARIMA(1,1,2) with $\delta = 0$.

The Random Shock form of the model: $x_t = \psi(B)u_t$, where $\psi(B) = [\beta(B)(I - B)]^{-1}\alpha(B) = [\varphi(B)]^{-1}\alpha(B)$, i.e. $\psi(B)\varphi(B) = \alpha(B)$. Thus:

$(I + \psi_1 B + \psi_2 B^2 + \psi_3 B^3 + \psi_4 B^4 + \cdots)(I - (1 + \beta_1)B + \beta_1 B^2) = I + \alpha_1 B + \alpha_2 B^2$

Equating coefficients of the powers of B:

$\psi_1 - (1 + \beta_1) = \alpha_1$, or $\psi_1 = 1 + \beta_1 + \alpha_1$;
$\psi_2 - \psi_1(1 + \beta_1) + \beta_1 = \alpha_2$, or $\psi_2 = \psi_1(1 + \beta_1) - \beta_1 + \alpha_2$;
$\psi_h - \psi_{h-1}(1 + \beta_1) + \psi_{h-2}\beta_1 = 0$, or $\psi_h = \psi_{h-1}(1 + \beta_1) - \psi_{h-2}\beta_1$ for $h \ge 3$.

The Inverted form of the model: $\pi(B)x_t = u_t$, where $\pi(B) = [\alpha(B)]^{-1}\beta(B)(I - B) = [\alpha(B)]^{-1}\varphi(B)$, i.e. $\pi(B)\alpha(B) = \varphi(B)$. Thus:

$(I - \pi_1 B - \pi_2 B^2 - \pi_3 B^3 - \pi_4 B^4 - \cdots)(I + \alpha_1 B + \alpha_2 B^2) = I - (1 + \beta_1)B + \beta_1 B^2$

Equating coefficients of the powers of B:

$\alpha_1 - \pi_1 = -(1 + \beta_1)$, or $\pi_1 = 1 + \beta_1 + \alpha_1$;
$\alpha_2 - \pi_1\alpha_1 - \pi_2 = \beta_1$, or $\pi_2 = -\pi_1\alpha_1 - \beta_1 + \alpha_2$;
$-\pi_h - \pi_{h-1}\alpha_1 - \pi_{h-2}\alpha_2 = 0$, or $\pi_h = -(\pi_{h-1}\alpha_1 + \pi_{h-2}\alpha_2)$ for $h \ge 3$.

Now suppose that $\beta_1 = 0.80$, $\alpha_1 = 0.60$ and $\alpha_2 = 0.40$. Then the Random Shock Form coefficients $\psi_h$ and the Inverted Form coefficients $\pi_h$ can easily be computed from the recursions above; the first few values are:

h:     1       2        3       4
ψ_h:   2.400   3.920    5.136   6.109
π_h:   2.400   -1.840   0.144   0.650
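
The table is straightforward to reproduce with the two recursions; a minimal Python sketch:

```python
def psi_pi_weights(beta1, alpha1, alpha2, H=12):
    """psi_h and pi_h for the ARIMA(1,1,2) example, via the recursions above."""
    psi1 = 1 + beta1 + alpha1
    psi = [psi1, psi1 * (1 + beta1) - beta1 + alpha2]
    pi_ = [psi1, -psi1 * alpha1 - beta1 + alpha2]      # pi_1 equals psi_1 here
    for h in range(3, H + 1):
        psi.append(psi[-1] * (1 + beta1) - psi[-2] * beta1)
        pi_.append(-(pi_[-1] * alpha1 + pi_[-2] * alpha2))
    return psi, pi_

psi, pi_ = psi_pi_weights(0.80, 0.60, 0.40)
print([round(v, 4) for v in psi[:4]])   # [2.4, 3.92, 5.136, 6.1088]
print([round(v, 4) for v in pi_[:4]])   # [2.4, -1.84, 0.144, 0.6496]
```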

The Forecast Equations

The Difference Form of the Forecast Equation (for the example model):

$\hat{x}_T(l) = (1 + \beta_1)\hat{x}_T(l-1) - \beta_1\hat{x}_T(l-2) + \hat{u}_T(l) + \alpha_1\hat{u}_T(l-1) + \alpha_2\hat{u}_T(l-2)$

where $\hat{x}_T(h) = x_{T+h}$ and $\hat{u}_T(h) = u_{T+h}$ for $h \le 0$, and $\hat{u}_T(h) = 0$ for $h > 0$.

Computation of the Random Shock Series and One-Step Forecasts: the one-step forecasts $\hat{x}_{t-1}(1)$ are computed recursively from the data, and the random shocks are recovered as the one-step forecast errors, $u_t = x_t - \hat{x}_{t-1}(1)$.

Computation of the Mean Square Error of the Forecasts and Prediction Limits: the forecast MSE is $\sigma^2\big(1 + \psi_1^2 + \cdots + \psi_{l-1}^2\big)$, and the prediction limits are $\hat{x}_T(l) \pm z_{\alpha/2}\,\hat\sigma\sqrt{1 + \psi_1^2 + \cdots + \psi_{l-1}^2}$.

Table: MSE of the forecasts up to lead time l = 12 (with $\hat\sigma^2 = 2.56$).

Table: raw observations, one-step-ahead forecasts, and forecast errors.

Table: forecasts with 95% and 66.7% prediction limits.

Graph: Forecasts with 95% and 66.7% Prediction Limits

Next Topic: Modelling Seasonal Time Series