Presentation on theme: "ECON 251 Research Methods 11. Time Series Analysis and Forecasting."— Presentation transcript:
1ECON 251Research Methods11. Time Series Analysisand Forecasting
2IntroductionAny variable that is measured over time in sequential order is called a time series.We analyze time series to detect patterns.The patterns help in forecasting future values of the time series.Future expected valueThe time series exhibit adownward trend pattern.
3Components of a Time Series A time series can consist of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)A trend is a long term relativelysmooth pattern or direction, thatpersists usually for more than one year.
4Components of a Time Series A time series can consists of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)A cycle is a wavelike pattern describing a long term behavior (for more than one year).Cycles are seldom regular, andoften appear in combination withother components.
5Components of a Time Series A time series can consists of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)The seasonal component of the time-series exhibits a short term (less than one year) calendar repetitive behavior.
6Components of a Time Series A time series can consists of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)Random variation comprises the irregular unpredictable changes in the time series. It tends to hide the other (more predictable) components.We try to remove random variationthereby, identify the other components.
7Time-series models There are two commonly used time series models: The additive modelThe multiplicative modelyt = Tt + Ct + St + Rtyt = Tt x Ct x St x Rt
8Smoothing TechniquesTo produce a better forecast we need to determine which components are present in a time seriesTo identify the components present in the time series, we need first to remove the random variationThis can be easily done by smoothing techniques:moving averagesexponential smoothing
9Moving AveragesA k-period moving average for time period t is the arithmetic average of the time series values where the period t is the center observation.Example: A 3-period moving average for period t is calculated by(yt+1 + yt + yt-1)/3.Example: A 5-period moving average for period t is calculated by(yt+2 + yt+1 + yt + yt-1 + yt-2)/5.
10Example: Gasoline Sales To forecast future gasoline sales, the last four years quarterly sales were recorded.Calculate the three-quarter and five-quarter moving average.Data
11(yt+2 + yt+1 + yt + yt-1 + yt-2)/5 3 period moving average(yt+1 + yt + yt-1)/35 period moving average(yt+2 + yt+1 + yt + yt-1 + yt-2)/5SolutionSolving by hand*
12Example: Gasoline Sales Notice how the averaging process removes some ofthe random variation.There is some trend component present,as well as seasonality.
13Example: Gasoline Sales The 5-period moving average removes morevariation than the 3-period moving average.Too much smoothing may eliminate patterns of interest. Here, the seasonalitycomponent is removed when using 5-period moving average.Too little smoothing leaves much of the variation, which disguises the real patterns.5-period moving average3-period moving average
14Example – Armani’s Pizza Sales Below are the daily sales figures for Armani’s pizza. Compute a 3-day moving average. (Armani TS.xls)Week Week 2MondayTuesdayWednesdayThursdayFridaySaturdaySundayThe moving average that is associated with Monday of week 2 is?
15Centered moving average With even number of observations included in the moving average, the average is placed between the two periods in the middle.To place the moving average in an actual time period, we need to center it.Two consecutive moving averages are centered by taking their average, and placing it in the middle between them.
16Example – Armani’s Pizza Sales Calculate the 4-period moving average and center it, for the data given below:Period Time series Moving Avg. Centerd Mov.Avg3 204 145 256 1119.0(2.5)20.2521.5(3.5)19.5017.5(4.5)
17Example – Armani’s Pizza Sales Armani’s pizza is back. Compute a 2-day centered moving average.Week 1Monday 35Tuesday 42Wednesday 56Thursday 46Friday 67Saturday 51Sunday 39The centered moving average that is associated with Friday of week 1 is?
18Components of a Time Series A time series can consist of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)A trend is a long term relativelysmooth pattern or direction, thatpersists usually for more than one year.
19Trend AnalysisThe trend component of a time series can be linear or non-linear.It is easy to isolate the trend component by using linear regression.For linear trend use the model y = b0 + b1t + e.For non-linear trend with one (major) change in slope use the quadratic model y = b0 + b1t + b2t2 + e
20Example – Pharmaceutical Sales (Identify trend) Annual sales for a pharmaceutical company are believed to change linearly over time.Based on the last 10 year sales records, measure the trend component.Start by renaming your years 1, 2, 3, etc.
21Example – Pharmaceutical Sales (Identify trend) SolutionUsing Excel we haveForecast for period 1111
22Example – Cigarette Consumption Determine the long-term trend of the number of cigarettes smoked by Americans 18 years and older.Data (Cigarettes.xls)SolutionCigarette consumption increased between and 1963, but decreased thereafter.A quadratic model seems to fit this pattern.
23Example – Cigarette Consumption The quadratic model fitsthe data very well.
24ExampleForecast the trend components for the previous two examples for periods 12, and 41 respectively.
25Components of a Time Series A time series can consists of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)A cycle is a wavelike pattern describing a long term behavior (for more than one year).Cycles are seldom regular, andoften appear in combination withother components.
26Measuring the Cyclical Effects Although often unpredictable, cycles need to be isolated.To identify cyclical variation we use the percentage of trend.Determine the trend line (by regression).Compute the trend value for each period t.Calculate the percentage of trend by
27Example – Energy Demand Does the demand for energy in the US exhibit cyclic behavior over time? Assume a linear trend and calculate the percentage of trend.Data (Energy Demand.xls)The collected data included annual energy consumption from 1970 through 1993.Find the values for trend and percentage of trend for years 1975 and 1976.
28Example – Energy Demand Now consider the multiplicative model(Assuming no seasonal effects).No trend is observed, but cyclicality and randomness still exist.The regression line represents trend.
29(66.4/69.828)100 When groups of the percentage of trend alternate around 100%,the cyclical effectsare present.
30Components of a Time Series A time series can consists of four components.Long - term trend (T)Cyclical effect (C)Seasonal effect (S)Random variation (R)The seasonal component of the time-series exhibits a short term (less than one year) calendar repetitive behavior.
31Measuring the Seasonal effects Seasonal variation may occur within a year or even within a shorter time interval.To measure the seasonal effects we can use two common methods:constructing seasonal indexesusing indicator variablesWe will do each in turn
32Example – Hotel Occupancy Calculate the quarterly seasonal indexes for hotel occupancy rate in order to measure seasonal variation.Data (Hotel Occupancy.xls)
33Example – Hotel Occupancy Perform regression analysis for the model y = b0 + b1t + e where t represents the chronological time, and y represents the occupancy rate.Time (t) RateThe regression line represents trend.
34Example – Hotel Occupancy Now let’s consider the multiplicative model, and let’s DETREND data(Assuming no cyclical effects)The regression line represents trend.No trend is observed, butseasonality and randomness still exist.
35Example – Hotel Occupancy To remove most of the random variation but leave the seasonal effects, average the terms StRt for each season.Average ratio for quarter 1:( )/5 = .878Average ratio for quarter 2: ( )/5 = 1.076Average ratio for quarter 3: ( )/5 = 1.171Average ratio for quarter 4: ( )/ 5 = .875
36Example – Hotel Occupancy Interpreting the resultsThe seasonal indexes tell us what is the ratio between the time series value at a certain season, and the overall seasonal average.In our problem:17.1% above theannual averageQuarter 3Quarter 387.8%107.6%117.1%87.5%7.6% above theannual average12.5% below theannual averageAnnual averageoccupancy (100%)Quarter 2Quarter 212.2% below theannual averageQuarter 1Quarter 1Quarter 4Quarter 4
37Creating Seasonal Indexes Normalizing the ratios:The sum of all the ratios must be 4, such that the average ratio per season is equal to 1.If the sum of all the ratios is not 4, we need to normalize (adjust) them proportionately. Suppose the sum of ratios equaled 4.1. Then each ratio will be multiplied by 4/4.1(Seasonal averaged ratio) (number of seasons)Sum of averaged ratiosSeasonal index =In our problem the sum of all the averaged ratios is equal to 4:= 4.0.No normalization is needed. These ratios become the seasonal indexes.
38ExampleUsing a different example on quarterly GDP, assume that the averaged ratios were:Quarter 1: .97Quarter 2: 1.03Quarter 3: .87Quarter 4: 1.00Determine the seasonal indexes.
39Deseasonalizing time series Seasonally adjusted time series =Actual time seriesSeasonal indexBy removing the seasonality, we can:compare data across different seasonsidentify changes in the other components of the time - series.
40Example – Hotel Occupancy A new manager was hired in Q3 1994, when hotel occupancy rate was 80.6%. In the next quarter, the occupancy rate was 63.2%. Should the manager be fired because of poor performance?Method 1:compare occupancy rates in Q (0.632) and Q (0.806)the manager has underperformed he should be firedMethod 2:compare occupancy rates in Q (0.632) and Q (0.600)the manager has performed very well he should get a raise
41Example – Hotel Occupancy Method 3:compare deseasonalized occupancy rates in Q and Q3 1994Seasonally adjusted time series =Actual time seriesSeasonal indexRecall:Seasonally adjusted occupancy rate in Q4 1994Actual occupancy rate in Q4 1994Seasonal index for Q4=Seasonally adjusted occupancy rate in Q3 1994the manager has indeed performed well he should be maybe get a raise
42Recomposing the time series Recomposition will recreate the time series, based only on the trend and seasonal components – you can then deduce the value of the random component.In period #1 ( quarter 1):In period #2 ( quarter 2):Actual seriesSmoothed seriesThe linear trend (regression) line
43Example – Recomposing the Time Series Recompose the smoothed time series in the above example for periods 3 and 4 assuming our initial seasonal index figures of:Quarter 1: Quarter 3: 1.171Quarter 2: Quarter 4: 0.875and equation of:
44Seasonal Time Series with Indicator Variables We create a seasonal time series model by using indicator variables to determine whether there is seasonal variation in data.Then, we can use this linear regression with indicator variables to forecast seasonal effects.If there are n seasons, we use n-1 indicator variables to define the season of period t:[Indicator variable n] = 1 if season n belongs to period t[Indicator variable n] = 0 if season n does not belong to period t
45Example – Hotel Occupancy Let’s again build a model that would be able to forecast quarterly occupancy rates at a luxurious hotel, but this time we will use a different technique.Data (Hotel Occupancy.xls)
46Example – Hotel Occupancy Use regression analysis and indicator variables to forecast hotel occupancy rate in DataQi =1 if quarter i occur at t0 otherwiseQuarter 1 belongs to t = 1Quarter 2 does notbelong to t = 1Quarter 3 does notbelong to t = 1Quarters 1, 2, 3, do notbelong to period t = 4.
47The regression model y = b0 + b1t + b2Q1 + b3Q2 + b4Q3 +e There is insufficient evidence to conclude that seasonality causes occupancy rate in quarter 1 be different than this of quarter 4!There is sufficient evidence toconclude that trend is present.Good fitSeasonality effects on occupancy rate in quarter 2 and 3 are different than this of quarter 4!
48The estimated regression model y = b0 + b1t + b2Q1 + b3Q2 + b4Q3 +e1234b2b3b4Trend lineb0
49Time-Series Forecasting with Autoregressive (AR) models Autocorrelation among the errors of the regression model provides opportunity to produce accurate forecasts.Correlation between consecutive residuals leads to the following autoregressive model:This is type of model is termed a first-order autoregressive model, and is denoted AR(1). A model which also includes both yt-1 and yt-2 is a second order model AR(2) and so on.Selecting between them is done on the basis of the significance of the highest order term. If you reject the highest order term, re-run the model without it. Apply the same criterion for the new highest order term, and so on.yt = b0 + b1yt-1 + et
50Example – Consumer Price Index (CPI) Forecast the increase in the Consumer Price Index (CPI) for the year 1994, based on the data collected for the yearsData (CPI.xls)
51Example – Consumer Price Index (CPI) Regress the Percent increase (yt) on the time variable (t) it does not lead to good results.The residualsform a patternover time. Also,Yt = tr2 = 15.0%this indicates thepotentialpresence of first orderautocorrelation betweenconsecutive residuals.(+)(-)(-)
52Example – Consumer Price Index (CPI) An autoregressive model appears to be a desirable technique.The increase in the CPI for periods 1, 2, 3,… are predictors of the increase in the CPI for periods 2, 3, 4,…, respectively.
54Selecting Among Alternative Forecasting Models There are many forecasting models available?*oo*ooooooModel 1Model 2Which modelperforms better?
55Model selectionTo choose a forecasting method, we can evaluate forecast accuracy using the actual time series.The two most commonly used measures of forecast accuracy are:Mean Absolute DeviationSum of Squares for Forecast ErrorIn practice, SSE is SSFE are used interchangeably.
56Model selection Procedure for model selection. Use some of the observations to develop several competing forecasting models.Run the models on the rest of the observations.Calculate the accuracy of each model using both MAD and SSE criterion.Use the model which generates the lowest MAD value, unless it is important to avoid (even a few) large errors - in this case use best model as indicated by the lowest SSE.
57Example – Model selection I Assume we have annual data from 1950 to 2001.We used data from 1950 to to develop three alternate forecasting models (two different regression models and an autoregressive model),Use MAD and SSE to determine which model performed best using the model forecasts versus the actual data for
58Example – Model selection I SolutionFor model 1Summary of resultsActual yin 1998Forecast for yin 1998
59Example – Model selection II For the actual values and the forecast values of a time series shown in the following table, calculate MAD and SSE.Forecast Value Actual Value MAD SSEFt yt