ARIMA Forecasting Lecture 7 and 8 - March 14-16, 2011


BABS 502 ARIMA Forecasting Lecture 7 and 8 - March 14-16, 2011

General Overview
An ARIMA model is a mathematical model for time series data. George Box and Gwilym Jenkins developed a systematic approach for fitting these models to data, so they are often called Box-Jenkins models. In practice we always use statistical or forecasting software to fit them; the programs estimate the models and produce the forecasts for us. It is still beneficial to understand the basic model, so that we can judge whether what the software is doing makes sense, especially if we use an automatic forecasting program.

ARIMA Models
ARIMA stands for AutoRegressive Integrated Moving Average. We also speak of AR models, MA models, ARMA models, and IMA models, which are special cases of this general class. These models generalize regression, but the "independent" variables are past values of the series itself together with unobservable random disturbances. Estimation is based on maximum likelihood, not least squares. We distinguish between seasonal and non-seasonal models.

Notation
Y1, Y2, …, Yt denotes a series of values for a time series; these are observable. e1, e2, …, et denotes a series of random disturbances; these are not observable. They may be thought of as a series of random shocks. Usually they are assumed to be generated from a Normal distribution with mean 0 and standard deviation σ, and to be uncorrelated with each other. They are often called "white noise".

An Autoregressive (AR(1)) Model
AR(1) model: Yt = A1Yt-1 + et, where A1 is an unknown parameter with values between -1 and +1 that is to be estimated from data. As a first approximation we can estimate A1 by linear regression of Yt on Yt-1 with the intercept set equal to 0. (How?) When A1 = 1, the model is called a random walk. In this case Yt = Yt-1 + et, or equivalently Yt - Yt-1 = et. We can show (by back substitution, assuming Y0 = 0) that for a random walk E(Yt) = 0 and Var(Yt) = tσ², so the values become more variable as you move out in the series. This means that when data follow a random walk, the best prediction of the future is the present (a naïve forecast), and predictions become less accurate the further into the future we forecast.
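
A minimal sketch of this estimation idea in Python (everything here is simulated; the seed, sample size, and coefficient value are made-up illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for reproducibility
n, A1, sigma = 500, 0.8, 1.0     # hypothetical values for illustration

# Simulate Yt = A1*Y(t-1) + et with white-noise shocks et ~ N(0, sigma^2)
e = rng.normal(0.0, sigma, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = A1 * y[t - 1] + e[t]

# First-approximation estimate of A1: regress Yt on Yt-1 with no intercept,
# i.e. A1_hat = sum(Yt * Y(t-1)) / sum(Y(t-1)^2)
A1_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
print(f"estimated A1 = {A1_hat:.3f}")   # should land close to the true 0.8
```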

The ACF for a random walk
If Yt is a random walk, it can be represented as Yt = et + et-1 + … + e1. Consequently cov(Yt, Yt-k) = (t-k)σ² and Var(Yt) = tσ², so that Corr(Yt, Yt-k) = (t-k)σ² / √(tσ² · (t-k)σ²) = √((t-k)/t). For lags k that are small relative to t, these correlations are all close to 1, which is why the ACF of a random walk dies off very slowly.
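
A quick simulated check of this slow decay, using the sample ACF from statsmodels (the series below is made up):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=500))   # random walk: cumulative sum of white noise

# The first ten sample autocorrelations are all near 1 and decay very slowly
print(np.round(acf(y, nlags=10), 3))
```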

Random Walk [figure]

Other AR(p) models
The AR(2) model: Yt = A1Yt-1 + A2Yt-2 + et. Here A1 and A2 are unknown parameters. The AR(p) model: Yt = A1Yt-1 + A2Yt-2 + … + ApYt-p + et. Here A1, …, Ap are unknown parameters. To apply these in practice, we estimate the parameters and then use the model for forecasting by substituting past observed values. These models are called ARIMA(p,0,0) models.
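
In software this amounts to a one-line fit. A sketch using Python's statsmodels (the data are simulated, and the coefficients 0.5 and 0.3 are made up for illustration):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(2)   # seed chosen only for reproducibility

# Simulate an AR(2) series Yt = 0.5*Y(t-1) + 0.3*Y(t-2) + et;
# ArmaProcess uses lag-polynomial signs, hence ar=[1, -A1, -A2]
y = ArmaProcess(ar=[1, -0.5, -0.3], ma=[1]).generate_sample(nsample=500)

# Fit an ARIMA(2,0,0) model by maximum likelihood and forecast five steps ahead
res = ARIMA(y, order=(2, 0, 0)).fit()
print(res.params)        # constant, A1, A2, and the error variance
print(res.forecast(5))   # forecasts built by substituting past values
```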

Which Model to Fit?
The autocorrelation function (ACF) and partial autocorrelation function (PACF) give some insight into what model to fit to the data. We work backwards here: given a theoretical model, we can determine what its ACF and PACF should look like, so if the ACF and PACF computed from the data show a recognizable pattern, we try fitting a model that could generate that pattern. What is a PACF? The pth partial autocorrelation is the coefficient of Yt-p in a regression of Yt on Yt-1, Yt-2, …, Yt-p. Thus, if the data were generated by an AR(2) model, in theory the first two PACs would be non-zero and all PACs at lags higher than two would be zero.
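
A sketch of how these diagnostic plots are typically produced in Python with statsmodels (the AR(2) series is simulated as a stand-in for real data):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(3)
y = ArmaProcess(ar=[1, -0.5, -0.3], ma=[1]).generate_sample(nsample=500)

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=20, ax=axes[0])    # for an AR(2), the ACF tails off gradually
plot_pacf(y, lags=20, ax=axes[1])   # for an AR(2), the PACF cuts off after lag 2
plt.show()
```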

Some further comments on ACFs and PACFs
Computing autocorrelations (ACs) is similar to performing a series of simple regressions of Yt on Yt-1, then on Yt-2, then on Yt-3, and so on. Each AC coefficient reflects only the relationship between the two quantities included in that regression. Computing partial autocorrelations (PACs) is more in the spirit of multiple regression: the PACs remove the effects of all lower-order lags before computing the autocorrelation. For example, the 2nd-order PAC is the effect of the observation two periods ago on the current observation, given that the effect of the observation one period ago has been removed. This can be viewed as a multiple regression.
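
This regression view can be verified directly. A sketch on simulated AR(2) data (pacf's "ols" method matches the regression definition up to small-sample details):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import pacf

np.random.seed(4)
y = ArmaProcess(ar=[1, -0.5, -0.3], ma=[1]).generate_sample(nsample=2000)

# Regress Yt on Yt-1 and Yt-2: the coefficient on Yt-2 is the lag-2 PAC
X = np.column_stack([np.ones(len(y) - 2), y[1:-1], y[:-2]])
beta = np.linalg.lstsq(X, y[2:], rcond=None)[0]
print("lag-2 coefficient from the regression:", round(beta[2], 3))
print("lag-2 PAC from pacf():                ", round(pacf(y, nlags=2, method="ols")[2], 3))
```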

Example: AR(1) model, A1 = .8 [figure]

Example: AR(1) model, A1 = -.7 [figure]

Example: AR(2) model [figure]

Monthly Pulp Price Data [figure]

Annual Births Data [figure]

Stationarity
A time series is stationary if:
- its mean is the same at every time,
- its variance is the same at every time, and
- its autocorrelations are the same at every time.
A series of outcomes from independent identical trials is stationary. A series with a trend is not stationary, and a random walk is not stationary. (Why?) If a time series is non-stationary, its ACF dies off slowly and the first partial autocorrelation is near 1. In such cases we can sometimes create a stationary series by differencing the original series; if Yt is a random walk, then its differences are white noise, which is stationary. A unit root test is a formal test for non-stationarity; one such test is the Dickey-Fuller test.
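
A sketch of a unit root check in Python, using statsmodels' augmented Dickey-Fuller implementation on a simulated random walk:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

np.random.seed(5)
y = np.cumsum(np.random.normal(size=300))   # a random walk, hence non-stationary

stat, pvalue = adfuller(y)[:2]
print(f"ADF on the original series: p = {pvalue:.3f}")   # large p: cannot reject a unit root

stat, pvalue = adfuller(np.diff(y))[:2]
print(f"ADF on the differences:     p = {pvalue:.3f}")   # tiny p: differences look stationary
```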

Differenced Births Data [figure]
The PACF suggests that the differences of the births data may follow an AR(1), AR(2), or AR(5) model.

Differenced Pulp Price Data [figure]
The story is less clear here. Perhaps the differences follow an AR(1): the lag-1 PAC is .346 and the lag-2 PAC is .184.

Differenced Models
We let Zt = Yt - Yt-1. When the differenced series is stationary, we can write a model in terms of Zt. If Zt follows an AR(p) model, then Yt follows an ARIMA(p,1,0) model. In practice, ARIMA(1,1,0) and ARIMA(2,1,0) models are quite common.
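
A sketch of fitting an ARIMA(1,1,0) in Python (simulated data, with the AR coefficient 0.35 chosen to echo the pulp example below; the d = 1 in the order handles the differencing internally):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(6)
z = ArmaProcess(ar=[1, -0.35], ma=[1]).generate_sample(nsample=400)  # stationary AR(1)
y = np.cumsum(z)   # integrating once turns it into an ARIMA(1,1,0) series

res = ARIMA(y, order=(1, 1, 0)).fit()
print(res.params)        # the AR coefficient should be near 0.35
print(res.forecast(5))   # five-step-ahead forecasts
```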

Pulp Data
The fit from an ARIMA(1,1,0) model is A1 = .346 (t-value 5.46), so the fitted model is Zt = .346 Zt-1 + et. The residuals appear to have no remaining autocorrelation. The forecasts are pretty flat: 561.7, 562.3, 562.6, 562.6, 562.6.

MA(q) Models
These are less plausible as models but fit many series well.
MA(1) model: Yt = et + W1 et-1
MA(2) model: Yt = et + W1 et-1 + W2 et-2
MA(q) model: Yt = et + W1 et-1 + W2 et-2 + … + Wq et-q
The MA(q) model is referred to as an ARIMA(0,0,q) model. The rationale for MA models is that the effects of disturbances are short-lived (q periods), as opposed to an AR model, where they persist forever. Note that the disturbances are not observable.
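
A quick simulated check of the MA(1) signature (W1 = .7 as in the figure below; the series itself is made up):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

np.random.seed(7)
e = np.random.normal(size=2001)
y = e[1:] + 0.7 * e[:-1]   # MA(1): Yt = et + W1*e(t-1) with W1 = 0.7

# Theory: the lag-1 autocorrelation is W1/(1 + W1^2), roughly 0.47 here,
# and all higher-lag autocorrelations are zero -- the ACF cuts off after lag 1
print(np.round(acf(y, nlags=4), 3))
```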

An MA(1) model: W1 = .7 [figure]

An MA(1) model: W1 = -.7 [figure]

Births Data
Clearly differencing is required. Consider fitting an MA(1) model to the differenced data. We find that the estimated coefficient is -.42 with a t-value of -3.87, but the autocorrelations of the residuals still contain information; note that the lag-2 AC is .349.

Births Data
Try an ARIMA(0,1,2) model. The parameters are -.37 (t = -3.47) and -.59 (t = -5.76), and the residuals appear to be white noise. The forecasts are 338311, 340936, 340936, ….
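
Residual checks like this are routine in software. A hedged sketch with statsmodels' Ljung-Box test, on a simulated ARIMA(0,1,2) series (the coefficients -.37 and -.59 are borrowed from the fit above purely for illustration; the actual births data are not reproduced here):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

np.random.seed(9)
e = np.random.normal(size=502)
z = e[2:] - 0.37 * e[1:-1] - 0.59 * e[:-2]   # MA(2) differences
y = np.cumsum(z)                             # integrated: an ARIMA(0,1,2)-type series

res = ARIMA(y, order=(0, 1, 2)).fit()
print(res.params)   # the two MA coefficients and the error variance
# Ljung-Box test on the residuals: large p-values are consistent with white noise
print(acorr_ljungbox(res.resid, lags=[10], return_df=True))
print(res.forecast(3))
```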

The ARIMA(0,1,1) Model Revisited
This model can be written (letting w = -W1) as Yt - Yt-1 = et - w et-1. The forecast from this model is Ft = Yt-1 - w(Yt-1 - Ft-1) = (1-w)Yt-1 + w Ft-1, which is exactly simple exponential smoothing. The new idea here is that the ARIMA(0,1,1) model is a formal statistical model, while simple exponential smoothing is an ad hoc approach to forecasting. Because the model includes an explicit error term, forecast errors and hypothesis tests come with it.
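
A hedged numerical illustration of this equivalence (simulated ARIMA(0,1,1) data; the two alpha estimates need not agree exactly because the two routines initialize differently, but they should be close):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

np.random.seed(8)
e = np.random.normal(size=501)
z = e[1:] - 0.6 * e[:-1]   # MA(1) differences with W1 = -0.6, i.e. w = 0.6
y = np.cumsum(z) + 100     # an ARIMA(0,1,1) series

arima = ARIMA(y, order=(0, 1, 1)).fit()
W1 = arima.params[0]                 # the fitted MA(1) coefficient
ses = SimpleExpSmoothing(y).fit()    # estimates the smoothing constant directly

print("alpha implied by ARIMA(0,1,1):", round(1 + W1, 3))   # alpha = 1 - w = 1 + W1
print("alpha fitted by SES:          ", round(ses.params["smoothing_level"], 3))
```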

Relationship between MA and AR Models
Any finite AR model can be written as an infinite MA model, and any finite MA model can be written as an infinite AR model. These results can be shown by backward substitution (as we did previously for AR models). Two consequences of these observations:
- Model selection: if your best fit is an AR model with several terms (4 or more), try an MA model with a few terms, and conversely.
- Identification: AR models have ACFs with many non-zero terms and short PACFs; MA models have short ACFs and long PACFs.
A numerical illustration of the duality follows below.
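
This duality can be seen with statsmodels' arma2ma, which expands a model into its infinite-MA weights (the AR(1) coefficient 0.8 is illustrative):

```python
from statsmodels.tsa.arima_process import arma2ma

# The infinite-MA weights of an AR(1) with A1 = 0.8 are 1, 0.8, 0.8^2, 0.8^3, ...
# (arma2ma uses the lag-polynomial convention, hence ar=[1, -0.8])
print(arma2ma(ar=[1, -0.8], ma=[1], lags=6))
# -> [1.      0.8     0.64    0.512   0.4096  0.32768]
```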