Trends, time series and forecasting
Paul Fryers, East Midlands KIT

Overview
– Introduction
– Understanding trends and time series
  – Seasonality
  – Transformations
– Methods for analysing time series
  – Regression
  – Moving averages
  – Autocorrelation
– Overview of forecasting
– Forecasting methods
  – Extrapolation of regression
  – Holt’s method
– Uses for forecasting
  – Setting and monitoring targets
  – Estimating current values
– General methodological points

What is a time series? A set of well-defined measures collected through time, for example:
– Mortality
– Diagnoses
– Temperature
– Rainfall
– Share price
– Sunspots
– Ice cream sales
– Air passengers
– Road accidents

What is special about time series data?
– There is an implicit order to the data, with a first, second, third, ..., nth value
– Previous observations may be important determinants of later observations; this has implications for analysis
– Trend and/or seasonal effects may be present: a trend is a tendency for observations to rise or fall over time; seasonal effects are regular repeating patterns of rises and falls
– Different techniques are needed for analysing historical data and for producing forecasts

Continuous time: electrocardiogram trace

Monthly emphysema deaths

Understanding trends and time series
– First plot the data
– Is the time series consistent? Look for step changes in level or trend
– Is there any visual evidence of a pattern or trend?
– Is there evidence of a regular ‘seasonal’ pattern?
– If there is a trend, is it linear? (probably not!)

Is the time series consistent? – change in trend

Is the time series consistent? – step changes

Handling inconsistency
– Usually we simply break the time series at the point where the trend change or step change occurs, and analyse only the data since that point, or analyse the different parts of the time series separately
– Or use a method/software that does this automatically, e.g. by weighting more recent points more heavily
– We may be able to adjust or transform the data prior to a step change, but only if we understand the reason for the change and are confident that the adjustment makes the data consistent, e.g. adjusting for a coding change (ICD coding, definition of unemployment, etc.)
– But it’s not always clear cut...

Is the time series consistent? – step changes?

Is the time series consistent? – outlier

Handling outliers
– Normally we ignore outliers, i.e. exclude them from the analysis; this can be a nuisance for some analyses
– But again, it’s not always clear cut: we need to identify plausible reasons for the outlier(s), e.g. known issues with data collection, or a specific factor that has influenced the outcome

Is there any visual evidence of any pattern or trend?

Is there any visual evidence of any pattern or trend?

Is there any visual evidence of any pattern or trend?

Is there any visual evidence of any pattern or trend?

Is there any visual evidence of any pattern or trend?

Is there any visual evidence of any pattern or trend? Graph of an indicator showing a seasonal pattern plus a rising trend

Handling seasonality
– Seasonality can be additive or multiplicative, i.e. each period in the cycle has an extra factor added to (or subtracted from), or multiplied by, the overall average level
– We can adjust the data by applying the inverse factor to each period
– It is easier to use an integrated method that adjusts for the seasonality within the analysis (a minimal sketch follows)
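To illustrate, a minimal sketch in R (one of the packages recommended later in this deck) of classical seasonal decomposition; the monthly series is simulated, and the variable names are my own, not the presenter’s:

    # Simulate a monthly series with a rising trend and an additive seasonal pattern
    set.seed(1)
    t <- 1:120
    x <- ts(100 + 0.5 * t + 10 * sin(2 * pi * t / 12) + rnorm(120, 0, 3),
            start = c(2005, 1), frequency = 12)
    dec <- decompose(x)            # classical decomposition; type = "multiplicative" also available
    adjusted <- x - dec$seasonal   # seasonally adjusted series (additive case)
    plot(dec)                      # shows the trend, seasonal and random components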

Is the trend linear?

Example of a falling rate – straight line

Example of a falling rate – exponential curve

Transformations – non-linear trends
– In many cases it is meaningless for forecasts to fall below zero
– In public health we are most commonly dealing with counts, rates or proportions
– We routinely transform the data in order to ‘make the data linear’ and constrain them to be no less than zero
– By default, we should use a log transformation for counts or rates, fitting an exponential curve, which assumes a constant rate of change rather than a constant numerical increase or decrease
– We should use a logit transformation for proportions (or percentages), which constrains the variable to lie between 0 and 1 (or 0% and 100%)

Transformations – falling exponential curve
– A rapidly falling trend
– The indicator looks to be heading rapidly towards zero, but the log transformation ensures that it stays positive: the rate or count ‘tends towards’ zero but can never quite get there
– It represents a constant rate of change (i.e. reducing by x% each year rather than by a set amount each year)
– This should be the default option for analysis of counts or rates

Transformations – rising exponential curve
– A rapidly increasing trend
– For a count or rate it is mathematically preferable to use an exponential curve, but beware of other practical constraints: there will usually be some practical limit to a count or rate
– If the continued rise in the count or rate is implausible, then it is better to use a linear model or logit...

Transformations – log-transform counts and rates
Fitting an exponential curve. Equation of the curve:
ln(y) = ln(a) + t·ln(b), or equivalently y = a × b^t
where
– y = value of the variable being studied
– a = intercept on the y-axis (nominal value of the indicator at time 0)
– t = time value
– b = ‘gradient’ (amount y is multiplied by for each increase of 1 in time)
Note that ln(0) = –∞ and ln(∞) = ∞, so the fitted curve can approach zero but never cross it
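A minimal R sketch of this approach, assuming a rate that falls by a roughly constant percentage each year; the data are simulated and the names are mine:

    set.seed(1)
    t <- 1:15
    rate <- 90 * 0.95^t * exp(rnorm(15, 0, 0.02))       # rate falling by about 5% per year
    fit <- lm(log(rate) ~ t)                            # a straight line on the log scale
    a <- exp(coef(fit)[1])                              # intercept on the original scale
    b <- exp(coef(fit)[2])                              # multiplicative 'gradient' per time step
    exp(predict(fit, newdata = data.frame(t = 16:20)))  # back-transformed forecasts stay above zero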

Transformations – logistic curve
– Proportions cannot go below zero or above 1
– The tails are equivalent: e.g. proportion surviving = 1 – proportion dying
– Particularly important for proportions that span a large range, from under 0.5 to nearly 1, e.g. percentage achievement on QOF scores
– For proportions or percentages close to zero, the logit is equivalent to the log
– For proportions always close to 1, we could subtract from 1 and use the log

Transformations – logit-transform proportions
The logit function: logit(y) = ln(y/(1–y)) = ln(y) – ln(1–y)
logit(0) = –∞; logit(½) = 0; logit(1) = ∞
– We transform proportions by applying the logit function, then fit a regression line to the transformed data
– For rates or counts which have a practical limit, if we have a sound basis for estimating that realistic maximum, then we could treat the rate or count as a proportion of that upper limit
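A matching R sketch for proportions, using R’s built-in logit (qlogis) and inverse logit (plogis); the data are simulated:

    set.seed(1)
    t <- 1:12
    p <- plogis(-1 + 0.3 * t + rnorm(12, 0, 0.1))          # a proportion rising towards 1
    fit <- lm(qlogis(p) ~ t)                               # regression on the logit scale
    plogis(predict(fit, newdata = data.frame(t = 13:15)))  # forecasts constrained to (0, 1)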

Methods for analysing time series
Regression is the most common method: simply fit a line or curve to the data, treating ‘time’ like any other explanatory variable
– Gives equal weight to all points in the time series
– Assumes points are independent, identically distributed observations
– The gradient has confidence intervals: if the CIs don’t include zero, the gradient is significant
Two other concepts are used as the basis for analysing time series:
– Moving averages
– Autocorrelation

Linear regression – fitting a straight line y = a + bt through the data points

Confidence intervals for the gradient (a minimal sketch of both follows)
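A minimal R sketch covering the last two slides: fit a linear trend and read off the confidence interval for the gradient; the annual data are simulated and the names are mine:

    set.seed(1)
    year <- 2000:2012
    rate <- 80 - 1.5 * (year - 2000) + rnorm(13, 0, 2)  # a steadily falling rate
    fit <- lm(rate ~ year)
    summary(fit)$coefficients           # gradient estimate and its standard error
    confint(fit, "year", level = 0.95)  # if this interval excludes zero, the trend is significant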

Moving average
– Familiar as a method of presenting data
– For annual data, rather than presenting figures for 2004, 2005, 2006, 2007 and 2008, we may present three-year figures: 2004–06, 2005–07 and 2006–08
– Smooths out fluctuations in the data, making trends easier to see
– Also called ‘rolling averages’
– Moving averages of different periods can be used to highlight different features of a time series (example follows, and see the sketch below)
– BUT moving averages must not be used as the basis for regression, time series analysis or forecasting, as they are not independent observations (they share their data with their neighbours)
– [Note: time series methods such as Holt’s Method and Box–Jenkins (ARIMA) models use moving averages within the analysis, but the data from which the model is derived should not be moving averages]
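A minimal R sketch of the 3-point and 13-point moving averages used in the charts that follow, via stats::filter; the monthly data are simulated:

    set.seed(1)
    x <- ts(40 + 10 * sin(2 * pi * (1:120) / 12) + rnorm(120, 0, 4), frequency = 12)
    ma3  <- stats::filter(x, rep(1/3, 3))    # 3-point: smooths noise but keeps the seasonal pattern
    ma13 <- stats::filter(x, rep(1/13, 13))  # 13-point: spans a full year, isolating the trend
    # For presentation only: never feed ma3/ma13 into regression or forecasting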

Monthly emphysema deaths

3-point moving average, highlighting seasonality

13-point moving average, highlighting trend

Autocorrelation
– In time series, observations can often be predicted by combinations of previous observations
– If the observations are correlated with their immediate predecessors, we can calculate the Pearson correlation coefficient between them: this is called autocorrelation of lag 1
– Observations can also be correlated with predecessors from further back in the time series – autocorrelation of lag k (where k is the number of observations back in the series)
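A minimal R sketch of the sample autocorrelation function, on a simulated series with strong lag-1 dependence:

    set.seed(1)
    x <- arima.sim(model = list(ar = 0.7), n = 200)  # each value correlated with its predecessor
    acf(x, lag.max = 20)                             # correlogram: bar at lag k is the lag-k autocorrelation
    acf(x, plot = FALSE)$acf[2]                      # the lag-1 autocorrelation coefficient itself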

Forecasting
– Why do we need to forecast?
– Extrapolating
– Forecasting methods
– Examples
– Holt’s Method
– Interval forecasts
– How far back and how far forward?
– Using forecasts: to set and monitor progress against targets, and to estimate current health outcomes/indicators

Why do we need to forecast?
– To inform planning by estimating future needs: the health of the population tends to change slowly and react slowly to public health interventions, so we need to look ahead
– To anticipate future major events, e.g. outbreaks
– To set and monitor progress against targets: where are we likely to be on current progress? Are we on track to meet targets?
– To estimate current health outcomes: our most recent data tend to be a year or more out of date, so if we want to know where we are now, or even where we were last year, we have to forecast

Forecasting from past trends
If we have a time series for a health outcome, health service output indicator or risk factor, we can use it to forecast future values, e.g.:
– mortality rates
– teenage pregnancy rates
– hospital activity rates
– prevalence estimates
This assumes:
– consistent definitions and measurement, past and future
– either that nothing significant changes, or that changes/improvements continue at the same rate

Extrapolating from regression lines
– A common method is to fit a regression line (or curve) to the historic data and extrapolate it into the future
– This is OK for a short time into the future, as long as the historic data are stable, i.e. changing at a steady rate
But:
– The regression line is fitted across the whole of the historic data and gives equal weight to all points: e.g. the value for last year is given the same weight as one from 20 years ago, so it doesn’t give the best estimate of ‘current trends’
– We cannot give realistic confidence intervals for future values (‘prediction intervals’ or ‘forecast intervals’)

Forecasting methods
– There is a range of methods intended for forecasting, e.g. moving average methods, autocorrelation methods, Box–Jenkins methods
– These methods take into account fluctuations from year to year, trends (i.e. gradual changes over time) and seasonal variations
– They tend to give greater weight to more recent values, hence they ‘start from where we are’
– They give confidence intervals for forecasts, which tend to get wider as we move further into the future
– The most useful methods for public health applications tend to be Holt’s Method (which includes a trend component) and Holt–Winters (which adds a seasonal component)
– Note: as with regression analysis, the points in the time series must be independent of each other; rolling averages must never be used for forecasting

Teenage conceptions – England http://www.empho.org.uk/pages/viewResource.aspx?id=11285

Teenage conceptions – London GOR http://www.empho.org.uk/pages/viewResource.aspx?id=11285

Teenage conceptions – Newham http://www.empho.org.uk/pages/viewResource.aspx?id=11285

Teenage conceptions – Harrow http://www.empho.org.uk/pages/viewResource.aspx?id=11285

Alcohol-related admission rates – Bassetlaw PCT Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories

Stroke mortality rates – Nottinghamshire PCT Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories

Fractured neck of femur admission rates – Nottinghamshire PCT Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories

Deaths occurring at home – Bassetlaw PCT Data provided to Nottinghamshire and Bassetlaw PCTs for WCC trajectories

Emergency admission rates for stroke/TIA – East Midlands – males Report to East Midlands Cardiac & Stroke Network

Emergency admission rates for acute coronary syndrome – East Midlands – males Report to East Midlands Cardiac & Stroke Network

Emergency admission rates for acute coronary syndrome – East Midlands – females Report to East Midlands Cardiac & Stroke Network

Holt’s Method
Holt’s exponential smoothing (aka double exponential smoothing) is a moving average method. There are two equations involved in fitting the model:
L_t = a·x_t + (1 – a)·(L_{t–1} + T_{t–1})
T_t = g·(L_t – L_{t–1}) + (1 – g)·T_{t–1}
where
– x_t is the observed value at time t
– L_t is the forecast at time t (the ‘level’ parameter)
– T_t is the estimated slope at time t (the ‘trend’ parameter)
– a is the first smoothing constant, used to smooth the level
– g is the second smoothing constant, used to smooth the trend
The model is fitted iteratively from the start of the time series, usually setting L_1 initially to x_1 and T_1 to x_2 – x_1. A software package optimises the constants a and g such that the squared differences between the observed values and the forecasts are minimised.
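The two recursions translate directly into code. A minimal R sketch of the equations above, with hand-chosen smoothing constants (a real analysis would optimise a and g, as the slide says; the data and names are mine):

    set.seed(1)
    holt_fit <- function(x, a, g) {
      n <- length(x)
      lev <- numeric(n); tr <- numeric(n)
      lev[1] <- x[1]; tr[1] <- x[2] - x[1]   # the usual starting values, as above
      for (t in 2:n) {
        lev[t] <- a * x[t] + (1 - a) * (lev[t - 1] + tr[t - 1])
        tr[t]  <- g * (lev[t] - lev[t - 1]) + (1 - g) * tr[t - 1]
      }
      list(level = lev, trend = tr,
           forecast = function(h) lev[n] + h * tr[n])  # h-step-ahead point forecast
    }
    x <- 50 + 2 * (1:20) + rnorm(20, 0, 3)   # simulated series with an upward trend
    fit <- holt_fit(x, a = 0.4, g = 0.2)     # constants chosen by hand for illustration
    fit$forecast(1:5)                        # point forecasts 1 to 5 steps ahead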

Holt’s Method in practice
Several statistical packages will do this:
– ForecastPro – not free, but very easy to use
– Stata – not free and needs code, but PHE has a corporate licence
– R – open source software which requires code
– Excel – you can put the equations into Excel, but you have to optimise the parameters manually
If you use Stata, R or Excel, you need to put some effort into optimising the parameters, which requires some expertise and time. ForecastPro has very clever optimisation routines, which always seem to result in sensible forecasts and forecast intervals. BUT every forecast should be graphed and checked – even the most expert of automated ‘expert systems’ cannot and should not be totally relied on.
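For the R route, a minimal sketch: base R’s HoltWinters() with the seasonal term switched off gives Holt’s method, and it optimises the smoothing constants automatically by minimising the squared one-step prediction errors (simulated data again):

    set.seed(1)
    x <- ts(50 + 2 * (1:20) + rnorm(20, 0, 3), start = 2000)
    fit <- HoltWinters(x, gamma = FALSE)   # gamma = FALSE drops the seasonal component
    c(fit$alpha, fit$beta)                 # the optimised level and trend constants
    predict(fit, n.ahead = 5, prediction.interval = TRUE, level = 0.95)
    plot(fit, predict(fit, n.ahead = 5))   # always graph and check the forecast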

Interval forecasts not point forecasts
– When we forecast the future we give a single figure for each forecast: our best estimate of the future value
– However, there is of course uncertainty about that prediction
– Forecast intervals give an indication of the degree of uncertainty, and are far more valuable than the point forecasts themselves
– These forecast intervals are calculated by the forecasting software

How far back and how far forward?
– As discussed earlier, if the graph shows a distinct change in trend or a step change, then we should ignore the data before the current trend
– If we use Holt’s Method or similar this is less critical, because the method tends to give more weight to recent data and largely ignores earlier points; but if the change is clear from the graph, it is still wise to use only the data which exhibit the current trend
– If the change is very recent, then we probably don’t have a sound basis for forecasting; this would be reflected in the forecast intervals (discussed above)
How far ahead can we forecast – the ‘forecast horizon’?
– A rule of thumb is often quoted that you can forecast around half as far forward as you have data going back; however, it depends on the stability of the series, and common sense should be applied
– The question is less critical if you present forecast intervals: these become extremely wide as you get further into the future, demonstrating when the forecasts are meaningless

Using forecasts to set and monitor targets – 1
– Our Healthier Nation (OHN) set targets to reduce circulatory disease death rates by 40% between 1995–97 and 2010
– This seemed reasonable at the time
– However, if everywhere reduces circulatory disease by 40%, the gap between affluent and deprived parts of the country remains the same (e.g. Doncaster’s SMR will be the same in 2010 as it was in 1997)

Using forecasts to set and monitor targets – 2
– In 2004 there was a view that the OHN targets were being achieved more easily in more affluent areas
– New Spearhead targets were set, to reduce the gap between the Spearhead group of local authorities and the national average by 40% between 1995–97 and 2010
– In fact rates were dropping much faster than the OHN targets required

Using forecasts to set and monitor targets – 3
– Spearhead targets are set relative to national rates, so they have to be updated annually, taking into account national forecasts
– We have to forecast the England rate in 2010 (current forecast: 61 deaths per 100,000 population)
– Then set the local Spearhead target to give the required 40% narrowing of the gap (target is 65 deaths per 100,000)
– We can forecast the local rate to assess whether we’re on target

Within district target
– In 1995–97, Doncaster’s deprived communities had an all-age all-cause mortality rate 16.7% greater than Doncaster as a whole
– To reduce the gap by 10%, it must be only 15.0% above the Doncaster average by 2010 (16.7% × 0.9 ≈ 15.0%)
– Doncaster is forecast to have 713 deaths per 100,000 person-years in 2010
– The target for the deprived quintile is therefore 821 (approximately 713 × 1.15)
– This represents a reduction in death rates of 18%
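A quick R check of the arithmetic above, using the slide’s rounded figures (the small discrepancy against the quoted 821 will be rounding in the presenter’s unrounded data):

    gap_1997   <- 0.167              # deprived quintile 16.7% above the Doncaster average
    gap_target <- gap_1997 * 0.9     # gap narrowed by 10%: about 15.0%
    doncaster  <- 713                # forecast deaths per 100,000 person-years in 2010
    doncaster * (1 + gap_target)     # quintile target: ~820, quoted as 821 on the slide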

Within district target
– Using forecasting with realistic forecast intervals often serves to demonstrate the impossibility of measuring outcomes at PCT/local authority level or below
– It is essential always to plot as long a time series as possible, to illustrate the wide year-on-year fluctuations
– Forecast intervals, even for the next year, are very broad

Within district target
– It is clearer to graph the annual excess mortality as a percentage
– Is this measurable?
– One weakness of this method is that it only uses the deprived quintile: it ignores most of the distribution

Using forecasting to estimate ‘current’ rates
– By ‘current’ we normally mean ‘the average of the last three years for which data are available’
– For deaths, the ‘current’ values used for analysing our current mortality rates, for example, are based on 2010–2012 data, i.e. data from between 4½ and 1½ years ago: on average 3 years out of date
– For small areas, even with 3 years’ data, we still have very few deaths or cases to work with, and hence our baseline can be pretty arbitrary
– We may be able to use forecasting methodology to help with both of these problems: if we forecast 2014 values based on a time series from 2000 to 2012, then we have a) a more robust baseline, based on 13 years’ data rather than 3, and b) a baseline which reflects ‘now’ rather than 3 years ago
– Forecasts of ‘current’ periods can give us robust ‘underlying’ values or rates

Example – rapidly changing rates
– Circulatory disease death rates are falling dramatically
– The 2004–06 average rate was 91 deaths per 100,000 population-years; the 2008 forecast was 74
– In 2008, by taking the average of 2004–06 as our ‘current’ rate, we were potentially overestimating it, and hence the apparent impact of interventions, by 23% (91/74 ≈ 1.23)

Summary – key points
– Look at a graph of the data, and think about the data you are working with, considering whether there are reasons why past trends may not be a sound basis for future changes
– Decide how far back you should start
– Transform the data to ensure that the shape of the graph and any logical limits on variability (e.g. >0, <100%) are reflected in the mathematical assumptions
– Use regression to analyse past changes
– Use forecasting methods such as Holt’s Method (or Holt–Winters for seasonal data) to make predictions of future rates with realistic forecast intervals
– Ensure that data points are independent of one another: no rolling averages
– Always graph the results, to ensure that the maths hasn’t had an off day

Contact Paul Fryers paul.fryers@phe.gov.uk