# Using SAS for Time Series Data

## Presentation on theme: "Using SAS for Time Series Data"— Presentation transcript:

Using SAS for Time Series Data
LSU Economics Department March 16, 2012

Next Workshop (March 30): Instrumental Variables Estimation

Time-Series Data: Nonstationary Variables

Chapter Contents
12.1 Stationary and Nonstationary Variables
12.2 Spurious Regressions
12.3 Unit Root Tests for Nonstationarity

The aim is to describe how to estimate regression models involving nonstationary variables.
The first step is to examine the time-series concepts of stationarity and nonstationarity, and how we distinguish between them.

12.1 Stationary and Nonstationary Variables

The change in a variable is an important concept. The change in a variable $y_t$, also known as its first difference, is given by $\Delta y_t = y_t - y_{t-1}$. That is, $\Delta y_t$ is the change in the value of the variable $y$ from period $t-1$ to period $t$.
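As a small illustration of the first difference defined above, the following Python sketch computes $\Delta y_t = y_t - y_{t-1}$ for a short series. The function name `first_difference` and the numbers are made up for illustration; they do not come from the presentation's data.

```python
def first_difference(y):
    """Return the changes y[t] - y[t-1]; the result is one element shorter than y."""
    return [curr - prev for prev, curr in zip(y, y[1:])]


# Hypothetical levels of some series (illustrative values only).
levels = [100.0, 102.5, 101.0, 104.0]
changes = first_difference(levels)
print(changes)  # [2.5, -1.5, 3.0]
```

Note that differencing always loses the first observation, since there is no earlier value to subtract.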

FIGURE 12.1 U.S. economic time series

Formally, a time series $y_t$ is stationary if its mean and variance are constant over time, and if the covariance between two values from the series depends only on the length of time separating the two values, and not on the actual times at which the variables are observed.

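That is, the time series $y_t$ is stationary if, for all values and every time period, it is true that:

$$E(y_t) = \mu \quad \text{(constant mean)} \tag{12.1a}$$

$$\operatorname{var}(y_t) = \sigma^2 \quad \text{(constant variance)} \tag{12.1b}$$

$$\operatorname{cov}(y_t, y_{t+s}) = \operatorname{cov}(y_t, y_{t-s}) = \gamma_s \quad \text{(covariance depends on } s \text{, not } t\text{)} \tag{12.1c}$$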

12.1.1 The First-Order Autoregressive Model
FIGURE 12.2 Time-series models

12.2 Spurious Regressions
FIGURE 12.3 Time series and scatter plot of two random walk variables

A simple regression of series one (rw1) on series two (rw2) yields an apparently significant relationship. These results are completely meaningless, or spurious. The apparent significance of the relationship is false.

When nonstationary time series are used in a regression model, the results may spuriously indicate a significant relationship when there is none. In these cases the least squares estimator and least squares predictor do not have their usual properties, and t-statistics are not reliable. Since many macroeconomic time series are nonstationary, it is particularly important to take care when estimating regressions with macroeconomic variables.
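The spurious regression problem can be illustrated with a small simulation. The sketch below is not the presentation's data or code: it generates two independent random walks (the seed, the length of 700, and the helper names `random_walk` and `simple_ols` are all assumptions) and regresses one on the other by ordinary least squares. Because the series are unrelated by construction, a large t-statistic on the slope is a false signal.

```python
import math
import random


def random_walk(n, rng):
    """Cumulative sum of independent N(0, 1) shocks, starting from zero."""
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def simple_ols(x, y):
    """Least squares fit of y = b1 + b2*x; returns (b1, b2, t-statistic of b2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    se_b2 = math.sqrt(sse / (n - 2) / sxx)  # usual OLS standard error
    return b1, b2, b2 / se_b2


rng = random.Random(12345)      # arbitrary seed, for reproducibility only
rw1 = random_walk(700, rng)     # two independent random walks
rw2 = random_walk(700, rng)
b1, b2, t_stat = simple_ols(rw2, rw1)
print(f"slope = {b2:.3f}, t = {t_stat:.2f}")
```

Rerunning with different seeds typically produces t-statistics far outside the usual ±2 range much more often than 5% of the time, which is the symptom the slide warns about.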

12.3 Unit Root Tests for Stationarity
There are many tests for determining whether a series is stationary or nonstationary. The most popular is the Dickey–Fuller test.

12.3.1 Dickey-Fuller Test 1 (No Constant and No Trend)
Consider the AR(1) model:

$$y_t = \rho y_{t-1} + v_t \tag{12.4}$$

We can test for nonstationarity by testing the null hypothesis that $\rho = 1$ against the alternative that $|\rho| < 1$, or simply $\rho < 1$.
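To see what the two hypotheses mean in practice, the sketch below generates paths from the AR(1) recursion $y_t = \rho y_{t-1} + v_t$ for a stationary case ($\rho = 0.7$, an arbitrary choice) and for the unit root case ($\rho = 1$, a random walk). The function name `ar1_path`, the seed, and the sample size are assumptions made for illustration.

```python
import random


def ar1_path(shocks, rho, y0=0.0):
    """Apply y_t = rho * y_{t-1} + v_t to a list of shocks v_t, starting from y0."""
    path, prev = [], y0
    for v in shocks:
        prev = rho * prev + v
        path.append(prev)
    return path


rng = random.Random(2012)                     # arbitrary seed
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

stationary = ar1_path(shocks, rho=0.7)        # |rho| < 1: fluctuates around zero
unit_root = ar1_path(shocks, rho=1.0)         # rho = 1: shocks accumulate forever

print("range of stationary series:", min(stationary), max(stationary))
print("range of random walk:      ", min(unit_root), max(unit_root))
```

Feeding the same shocks to both cases makes the comparison direct: with $\rho = 1$ each shock has a permanent effect, while with $|\rho| < 1$ its effect dies out geometrically.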

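A more convenient form is obtained by subtracting $y_{t-1}$ from both sides:

$$\Delta y_t = (\rho - 1) y_{t-1} + v_t = \gamma y_{t-1} + v_t \tag{12.5a}$$

where $\gamma = \rho - 1$. The hypotheses are:

$$H_0: \rho = 1 \iff H_0: \gamma = 0 \qquad\qquad H_1: \rho < 1 \iff H_1: \gamma < 0$$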

12.3.2 Dickey-Fuller Test 2 (With Constant but No Trend)
The second Dickey–Fuller test includes a constant term in the test equation:

$$\Delta y_t = \alpha + \gamma y_{t-1} + v_t \tag{12.5b}$$

The null and alternative hypotheses are the same as before.

12.3.3 Dickey-Fuller Test 3 (With Constant and With Trend)
The third Dickey–Fuller test includes a constant and a trend in the test equation:

$$\Delta y_t = \alpha + \gamma y_{t-1} + \lambda t + v_t \tag{12.5c}$$

The null and alternative hypotheses are $H_0: \gamma = 0$ and $H_1: \gamma < 0$.

12.3.4 The Dickey-Fuller Critical Values
To test the hypothesis in all three cases, we simply estimate the test equation by least squares and examine the t-statistic for the hypothesis that $\gamma = 0$. Unfortunately, this t-statistic no longer has the t-distribution. Instead, the statistic is often called a $\tau$ (tau) statistic, and its value is compared with the critical values in Table 12.2.

Table 12.2 Critical Values for the Dickey–Fuller Test

To carry out a one-tail test of significance, if $\tau_c$ is the critical value obtained from Table 12.2, we reject the null hypothesis of nonstationarity if $\tau \le \tau_c$. If $\tau > \tau_c$, then we do not reject the null hypothesis that the series is nonstationary.
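The test procedure can be sketched in a few lines of Python for the simplest case (no constant, no trend, no lagged difference terms). This is an illustrative implementation, not the SAS procedure used in the workshop: it regresses $\Delta y_t$ on $y_{t-1}$ without an intercept and returns the tau statistic. The function names are hypothetical, and the critical value must be supplied by the caller from the appropriate table; the value −1.94 in the usage lines is the one quoted later in this transcript for this version of the test.

```python
import math


def dickey_fuller_tau(y):
    """tau statistic for gamma in  dy_t = gamma * y_{t-1} + v_t  (no constant, no trend)."""
    lagged = y[:-1]
    dy = [curr - prev for prev, curr in zip(y, y[1:])]
    s_ll = sum(l * l for l in lagged)
    gamma_hat = sum(l * d for l, d in zip(lagged, dy)) / s_ll
    sse = sum((d - gamma_hat * l) ** 2 for l, d in zip(lagged, dy))
    sigma2 = sse / (len(dy) - 1)          # one estimated parameter
    return gamma_hat / math.sqrt(sigma2 / s_ll)


def reject_unit_root(tau, tau_c):
    """One-tail rule: reject the null of nonstationarity when tau <= tau_c."""
    return tau <= tau_c


# Tiny made-up series, only to show the mechanics.
tau = dickey_fuller_tau([2.0, 1.0, 1.0, 0.0])
print(round(tau, 4), reject_unit_root(tau, tau_c=-1.94))
```

With real data the series would be much longer, and the versions with a constant or trend need a multiple regression and their own critical values.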

An important extension of the Dickey–Fuller test allows for the possibility that the error term is autocorrelated. Consider the model:

$$\Delta y_t = \alpha + \gamma y_{t-1} + \sum_{s=1}^{m} a_s \Delta y_{t-s} + v_t \tag{12.6}$$

where $\Delta y_{t-1} = y_{t-1} - y_{t-2}$, $\Delta y_{t-2} = y_{t-2} - y_{t-3}$, and so on.

12.3.6 The Dickey-Fuller Tests: An Example
As an example, consider the two interest rate series: the federal funds rate ($F_t$) and the three-year bond rate ($B_t$). Following procedures described in Sections 9.3 and 9.4, we find that the inclusion of one lagged difference term is sufficient to eliminate autocorrelation in the residuals in both cases.

The tau statistics from estimating the resulting equations are compared with the 5% critical value for tau ($\tau_c$), which is −2.86. Since in each case $\tau > -2.86$, we do not reject the null hypothesis of nonstationarity.

12.3.7 Order of Integration
Recall that if $y_t$ follows a random walk, then $\gamma = 0$ and the first difference of $y_t$ becomes:

$$\Delta y_t = y_t - y_{t-1} = v_t$$

Series like $y_t$, which can be made stationary by taking the first difference, are said to be integrated of order one, denoted I(1). Stationary series are said to be integrated of order zero, I(0). In general, the order of integration of a series is the minimum number of times it must be differenced to make it stationary.

The Dickey–Fuller test for a random walk is then applied to the first differences. Based on the large negative value of the tau statistic, which lies below the critical value of −1.94, we reject the null hypothesis that $\Delta F_t$ is nonstationary and accept the alternative that it is stationary. We similarly conclude that $\Delta B_t$ is stationary (−7.662 < −1.94).

12.4 Cointegration

As a general rule, nonstationary time-series variables should not be used in regression models, to avoid the problem of spurious regression. There is an exception to this rule.

There is an important case when $e_t = y_t - \beta_1 - \beta_2 x_t$ is a stationary I(0) process. In this case $y_t$ and $x_t$ are said to be cointegrated. Cointegration implies that $y_t$ and $x_t$ share similar stochastic trends and, since the difference $e_t$ is stationary, they never diverge too far from each other.

The test for stationarity of the residuals is based on the test equation:

$$\Delta \hat{e}_t = \gamma \hat{e}_{t-1} + v_t \tag{12.7}$$

The regression has no constant term because the mean of the regression residuals is zero. Because we are basing this test upon estimated values of the residuals, the critical values differ from those in Table 12.2.

Table 12.4 Critical Values for the Cointegration Test

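There are three sets of critical values. Which set we use depends on whether the residuals are derived from a regression with no constant, with a constant, or with a constant and a trend:

$$\hat{e}_t = y_t - b x_t \tag{12.8a}$$

$$\hat{e}_t = y_t - b_2 x_t - b_1 \tag{12.8b}$$

$$\hat{e}_t = y_t - b_2 x_t - b_1 - \hat{\delta} t \tag{12.8c}$$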

12.4.1 An Example of a Cointegration Test
Consider the estimated model in Eq. 12.9. A unit root test for stationarity is then applied to its estimated residuals.

The null and alternative hypotheses in the test for cointegration are:

$$H_0: \text{the series are not cointegrated} \iff \text{the residuals are nonstationary}$$

$$H_1: \text{the series are cointegrated} \iff \text{the residuals are stationary}$$

Similar to the one-tail unit root tests, we reject the null hypothesis of no cointegration if $\tau \le \tau_c$, and we do not reject the null hypothesis that the series are not cointegrated if $\tau > \tau_c$.
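The two-step logic of the residual-based cointegration test can be sketched as follows. This is a simplified illustration under stated assumptions, not the workshop's SAS code: step one regresses $y$ on a constant and $x$ and keeps the residuals; step two computes the tau statistic from $\Delta \hat{e}_t = \gamma \hat{e}_{t-1} + v_t$ with no lagged difference terms. The data are simulated (an arbitrary seed, a made-up cointegrating relation $y = 1 + 0.9x + \text{noise}$), and the function names are hypothetical. The resulting tau must be compared with the cointegration critical values in Table 12.4, not with the Dickey–Fuller values in Table 12.2.

```python
import math
import random


def ols_residuals(x, y):
    """Residuals from the least squares fit y = b1 + b2*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    return [yi - b1 - b2 * xi for xi, yi in zip(x, y)]


def residual_tau(e):
    """tau statistic for gamma in  de_t = gamma * e_{t-1} + v_t  (no constant)."""
    lagged = e[:-1]
    de = [curr - prev for prev, curr in zip(e, e[1:])]
    s_ll = sum(l * l for l in lagged)
    gamma_hat = sum(l * d for l, d in zip(lagged, de)) / s_ll
    sse = sum((d - gamma_hat * l) ** 2 for l, d in zip(lagged, de))
    sigma2 = sse / (len(de) - 1)
    return gamma_hat / math.sqrt(sigma2 / s_ll)


# Simulated example: x is a random walk, y tracks it with stationary noise.
rng = random.Random(2012)
x, level = [], 0.0
for _ in range(300):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y = [1.0 + 0.9 * xi + rng.gauss(0.0, 0.5) for xi in x]

tau = residual_tau(ols_residuals(x, y))
print(f"tau = {tau:.3f}")   # compare with the critical value from Table 12.4
```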

Chapter 9 Regression with Time Series Data: Stationary Variables
Walter R. Paczkowski, Rutgers University

Chapter Contents
9.1 Introduction
9.2 Finite Distributed Lags
9.3 Serial Correlation
9.4 Other Tests for Serially Correlated Errors
9.5 Estimation with Serially Correlated Errors
9.6 Autoregressive Distributed Lag Models
9.7 Forecasting
9.8 Multiplier Analysis

9.1 Introduction

When modeling relationships between variables, the nature of the data that have been collected has an important bearing on the appropriate choice of an econometric model. Two features of time-series data to consider:
1. Time-series observations on a given economic unit, observed over a number of time periods, are likely to be correlated.
2. Time-series data have a natural ordering according to time.

There is also the possible existence of dynamic relationships between variables. A dynamic relationship is one in which the change in a variable now has an impact on that same variable, or other variables, in one or more future time periods. These effects do not occur instantaneously but are spread, or distributed, over future time periods.

FIGURE 9.1 The distributed lag effect

9.1.1 Dynamic Nature of Relationships
Ways to model the dynamic relationship: specify that a dependent variable $y$ is a function of current and past values of an explanatory variable $x$:

$$y_t = f(x_t, x_{t-1}, x_{t-2}, \ldots) \tag{9.1}$$

Because of the existence of these lagged effects, Eq. 9.1 is called a distributed lag model.

Ways to model the dynamic relationship (continued): capture the dynamic characteristics of the time series by specifying a model with a lagged dependent variable as one of the explanatory variables:

$$y_t = f(y_{t-1}, x_t) \tag{9.2}$$

Or have:

$$y_t = f(y_{t-1}, x_t, x_{t-1}, x_{t-2}) \tag{9.3}$$

Such models are called autoregressive distributed lag (ARDL) models, with "autoregressive" meaning a regression of $y_t$ on its own lag or lags.

Ways to model the dynamic relationship (continued): model the continuing impact of change over several periods via the error term:

$$y_t = f(x_t) + e_t, \qquad e_t = f(e_{t-1}) \tag{9.4}$$

In this case $e_t$ is correlated with $e_{t-1}$. We say the errors are serially correlated or autocorrelated.

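9.1.2 Least Squares Assumptions
The primary assumption is Assumption MR4:

$$\operatorname{cov}(y_i, y_j) = \operatorname{cov}(e_i, e_j) = 0 \quad \text{for } i \ne j$$

For time series, this is written as:

$$\operatorname{cov}(y_t, y_s) = \operatorname{cov}(e_t, e_s) = 0 \quad \text{for } t \ne s$$

The dynamic models in Eqs. 9.2, 9.3 and 9.4 imply correlation between $y_t$ and $y_{t-1}$, or $e_t$ and $e_{t-1}$, or both, so they clearly violate assumption MR4.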

9.2 Finite Distributed Lags

Consider a linear model in which, after $q$ time periods, changes in $x$ no longer have an impact on $y$:

$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_q x_{t-q} + e_t \tag{9.5}$$

Note the notation change: $\beta_s$ is used to denote the coefficient of $x_{t-s}$, and $\alpha$ is introduced to denote the intercept.

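Model 9.5 has two uses. The first is forecasting:

$$y_{T+1} = \alpha + \beta_0 x_{T+1} + \beta_1 x_T + \cdots + \beta_q x_{T-q+1} + e_{T+1} \tag{9.6}$$

The second is policy analysis: what is the effect of a change in $x$ on $y$?

$$\frac{\partial E(y_t)}{\partial x_{t-s}} = \frac{\partial E(y_{t+s})}{\partial x_t} = \beta_s \tag{9.7}$$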
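Both uses of the finite distributed lag model can be shown with a short sketch. The coefficient values below are invented for illustration (they are not estimates from the presentation), and the function name `distributed_lag_forecast` is hypothetical. The forecast is the intercept plus the weighted sum of current and lagged $x$; the policy effect of a one-unit change in $x$ lagged $s$ periods is simply $\beta_s$.

```python
def distributed_lag_forecast(alpha, betas, x_recent):
    """Predict y from alpha + sum_s betas[s] * x_recent[s].

    x_recent[0] is x at the forecast date; x_recent[s] is x lagged s periods.
    """
    if len(betas) != len(x_recent):
        raise ValueError("need one x value per lag coefficient")
    return alpha + sum(b * x for b, x in zip(betas, x_recent))


# Hypothetical model with q = 1: y_t = 1.0 + 0.5*x_t + 0.25*x_{t-1} + e_t
alpha, betas = 1.0, [0.5, 0.25]

base = distributed_lag_forecast(alpha, betas, [4.0, 2.0])
print(base)  # 3.5

# Policy question: raise x one period ago by one unit and compare.
shifted = distributed_lag_forecast(alpha, betas, [4.0, 3.0])
print(shifted - base)  # 0.25, which equals betas[1]
```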

9.3 Serial Correlation

When is assumption TSMR5, $\operatorname{cov}(e_t, e_s) = 0$ for $t \ne s$, likely to be violated, and how do we assess its validity? When a variable exhibits correlation over time, we say it is autocorrelated or serially correlated. These terms are used interchangeably.

9.3.1a Computing Autocorrelation
More generally, the $k$-th order sample autocorrelation for a series $y$, which gives the correlation between observations that are $k$ periods apart, is:

$$r_k = \frac{\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2} \tag{9.14}$$

How do we test whether an autocorrelation is significantly different from zero? The null hypothesis is $H_0: \rho_k = 0$. A suitable test statistic, approximately standard normal when the null hypothesis is true, is:

$$Z = \frac{r_k - 0}{\sqrt{1/T}} = \sqrt{T}\, r_k \tag{9.17}$$
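The sample autocorrelation and its test statistic are straightforward to compute by hand. The sketch below assumes the standard definitions (deviations from the full-sample mean in both numerator and denominator, and $Z = \sqrt{T}\, r_k$ compared with a standard normal); the function names and the four-point series are made up for illustration.

```python
import math


def sample_autocorrelation(y, k):
    """k-th order sample autocorrelation r_k, using the full-sample mean."""
    n = len(y)
    y_bar = sum(y) / n
    dev = [v - y_bar for v in y]
    numerator = sum(dev[t] * dev[t - k] for t in range(k, n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def autocorrelation_z(y, k):
    """Test statistic sqrt(T) * r_k for H0: rho_k = 0."""
    return math.sqrt(len(y)) * sample_autocorrelation(y, k)


series = [1.0, 2.0, 3.0, 4.0]          # toy data, far too short for real inference
r1 = sample_autocorrelation(series, 1)
z1 = autocorrelation_z(series, 1)
print(r1, z1)  # 0.25 0.5

# At the 5% level, reject H0 when |Z| >= 1.96.
print(abs(z1) >= 1.96)  # False
```

Computing `sample_autocorrelation(y, k)` for k = 1, 2, 3, … gives the sequence that a correlogram plots.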

9.3.1b The Correlogram
The correlogram, also called the sample autocorrelation function, is the sequence of autocorrelations $r_1, r_2, r_3, \ldots$ It shows the correlation between observations that are one period apart, two periods apart, three periods apart, and so on.

FIGURE 9.6 Correlogram for G

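9.3.2a A Phillips Curve
To determine if the errors are serially correlated, we compute the least squares residuals:

$$\hat{e}_t = INF_t - b_1 - b_2\, DU_t \tag{9.20}$$

where $INF_t$ is the inflation rate and $DU_t$ is the change in the unemployment rate.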

The $k$-th order autocorrelation for the residuals can be written as:

$$r_k = \frac{\sum_{t=k+1}^{T} \hat{e}_t \hat{e}_{t-k}}{\sum_{t=1}^{T} \hat{e}_t^2} \tag{9.21}$$

The least squares equation is given as Eq. 9.22.

The values at the first five lags are:

9.4 Other Tests for Serially Correlated Errors

9.4.1 A Lagrange Multiplier Test
If $e_t$ and $e_{t-1}$ are correlated, then one way to model the relationship between them is to write:

$$e_t = \rho e_{t-1} + v_t \tag{9.23}$$

We can substitute this into a simple regression equation:

$$y_t = \beta_1 + \beta_2 x_t + \rho e_{t-1} + v_t \tag{9.24}$$

To derive the relevant auxiliary regression for the autocorrelation LM test, we write the test equation as:

$$y_t = \beta_1 + \beta_2 x_t + \rho \hat{e}_{t-1} + v_t \tag{9.25}$$

But since we know that $y_t = b_1 + b_2 x_t + \hat{e}_t$, we get:

$$b_1 + b_2 x_t + \hat{e}_t = \beta_1 + \beta_2 x_t + \rho \hat{e}_{t-1} + v_t$$

Rearranging, we get:

$$\hat{e}_t = (\beta_1 - b_1) + (\beta_2 - b_2) x_t + \rho \hat{e}_{t-1} + v_t = \gamma_1 + \gamma_2 x_t + \rho \hat{e}_{t-1} + v_t \tag{9.26}$$

If $H_0: \rho = 0$ is true, then $LM = T \times R^2$ has an approximate $\chi^2_{(1)}$ distribution, where $T$ and $R^2$ are the sample size and goodness-of-fit statistic, respectively, from least squares estimation of Eq. 9.26.

9.5 Estimation with Serially Correlated Errors

Three estimation procedures are considered:
1. Least squares estimation.
2. An estimation procedure that is relevant when the errors are assumed to follow what is known as a first-order autoregressive model.
3. A general estimation strategy for estimating models with serially correlated errors.

We will encounter models with a lagged dependent variable, such as:

$$y_t = \delta + \theta_1 y_{t-1} + \delta_0 x_t + \delta_1 x_{t-1} + v_t$$

ASSUMPTION FOR MODELS WITH A LAGGED DEPENDENT VARIABLE
TSMR2A: In the multiple regression model

$$y_t = \beta_1 + \beta_2 x_{t2} + \cdots + \beta_K x_{tK} + v_t$$

where some of the $x_{tk}$ may be lagged values of $y$, $v_t$ is uncorrelated with all $x_{tk}$ and their past values.

9.5.1 Least Squares Estimation
Suppose we proceed with least squares estimation without recognizing the existence of serially correlated errors. What are the consequences?
1. The least squares estimator is still a linear unbiased estimator, but it is no longer best.
2. The formulas for the standard errors usually computed for the least squares estimator are no longer correct.
3. Confidence intervals and hypothesis tests that use these standard errors may be misleading.

It is possible to compute correct standard errors for the least squares estimator: HAC (heteroskedasticity and autocorrelation consistent) standard errors, also known as Newey-West standard errors. These are analogous to the heteroskedasticity-consistent standard errors.

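Consider the model $y_t = \beta_1 + \beta_2 x_t + e_t$. The variance of $b_2$ is:

$$\operatorname{var}(b_2) = \sum_t w_t^2 \operatorname{var}(e_t) + \sum_{t \ne s} \sum w_t w_s \operatorname{cov}(e_t, e_s) = \sum_t w_t^2 \operatorname{var}(e_t) \left[ 1 + \frac{\sum_{t \ne s} \sum w_t w_s \operatorname{cov}(e_t, e_s)}{\sum_t w_t^2 \operatorname{var}(e_t)} \right] \tag{9.27}$$

where $w_t = (x_t - \bar{x}) / \sum_t (x_t - \bar{x})^2$.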

When the errors are not correlated, $\operatorname{cov}(e_t, e_s) = 0$, and the term in square brackets is equal to one. The resulting expression is the one used to find heteroskedasticity-consistent (HC) standard errors. When the errors are correlated, the term in square brackets is estimated to obtain HAC standard errors.

If we call the quantity in square brackets $g$ and its estimate $\hat{g}$, then the relationship between the two estimated variances is:

$$\widehat{\operatorname{var}}_{HAC}(b_2) = \widehat{\operatorname{var}}_{HC}(b_2) \times \hat{g} \tag{9.28}$$

9.5.2b Nonlinear Least Squares Estimation
Substituting, we get:

$$y_t = \beta_1 (1 - \rho) + \beta_2 x_t + \rho y_{t-1} - \rho \beta_2 x_{t-1} + v_t \tag{9.43}$$

The coefficient of $x_{t-1}$ equals $-\rho \beta_2$. Although Eq. 9.43 is a linear function of the variables $x_t$, $y_{t-1}$ and $x_{t-1}$, it is not a linear function of the parameters $(\beta_1, \beta_2, \rho)$. The usual linear least squares formulas cannot be obtained by using calculus to find the values of $(\beta_1, \beta_2, \rho)$ that minimize the sum of squared errors $S_v = \sum_t v_t^2$. Values that minimize $S_v$ must instead be found numerically; these are nonlinear least squares estimates.
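One simple way to see how such estimates can be found numerically is a grid search over $\rho$. This is a swapped-in teaching device, not the iterative algorithm that SAS or other packages use: for a fixed $\rho$, the model can be rewritten as $y_t - \rho y_{t-1} = \beta_1(1-\rho) + \beta_2 (x_t - \rho x_{t-1}) + v_t$, which is linear in the remaining parameters, so ordinary least squares gives the best $\beta_1, \beta_2$ for that $\rho$. Searching over a grid of $\rho$ values and keeping the smallest sum of squares then approximates the nonlinear least squares solution. The function names, the grid step of 0.01, and the noise-free data (built so the true values $\beta_1 = 1$, $\beta_2 = 2$, $\rho = 0.5$ are known) are all assumptions for illustration.

```python
def ols(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


def nls_ar1_errors(x, y):
    """Grid search over rho in (-1, 1); returns (beta1, beta2, rho) minimizing S_v."""
    best = None
    for i in range(-99, 100):
        rho = i / 100
        # Quasi-differenced data: linear in beta1*(1 - rho) and beta2 for fixed rho.
        y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a, beta2, sse = ols(x_star, y_star)
        if best is None or sse < best[0]:
            best = (sse, a / (1.0 - rho), beta2, rho)
    _, beta1, beta2, rho = best
    return beta1, beta2, rho


# Constructed data: y = 1 + 2x + e with e_t = 0.5 * e_{t-1} exactly (v_t = 0).
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
e = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

beta1_hat, beta2_hat, rho_hat = nls_ar1_errors(x, y)
print(beta1_hat, beta2_hat, rho_hat)
```

A finer grid, or a numerical optimizer started from the grid minimum, would refine the estimate of $\rho$ when the true value does not fall on a grid point.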