SADC Course in Statistics Time Series: An Introduction (Session 01)
Slide 2: Time Series Learning Objectives
By the end of the next 4 sessions, devoted to time series, you will be able to
  – appreciate the broader concept of data where time is a factor
  – understand basic time series concepts and terminology
  – decompose a time series to look at trends and seasonal effects, and do simple forms of forecasting
  – concisely summarize results of time series analysis in writing
Slide 3: Learning Objectives – this session
By the end of this session, you will be able to
  – give examples of data collected over time
  – state objectives of a time series analysis
  – appreciate the importance of graphing data
  – interpret key features emerging from an examination of a time series
  – report main findings from a graphical presentation of time series data
Slide 4: Basics: Definitions and Notation
A time series is a collection of observations made sequentially through time.
Such observations may be denoted by Y_1, Y_2, Y_3, …, Y_t, …, Y_T, where Y_t is the observation at time t, since data are usually collected at discrete points in time.
The interval between observations can be any time interval (hours within days, days, weeks, months, years, etc.).
Slide 5: Some areas of applications
Time series can occur in a wide range of fields, from economics to sociology, meteorology to financial investment, etc.
Some examples of time series are:
  – Monthly closings of the stock exchange index
  – Malaria incidence or deaths over calendar years
  – Daily maximum temperatures
  – Hourly records of babies born at a maternity hospital
Can you suggest other examples?
Slide 6: Basics: Types of time series
Observations made continually in time give rise to a Continuous Time Series, e.g.
  – Thermometer readings at a Met station (continuously measured)
  – Measurement of whether air pollution reached increasing levels of unacceptability at an industrial site (air pollution levels are continuous)
More often, observations are taken only at specific points in time, giving rise to a Discrete Time Series, e.g.
  – annual number of road accidents (discrete)
  – maximum daily temperature (continuous)
  – whether or not there was daily rain (binary)
Slide 7: Objectives of a time series analysis
Description (often with monitoring data)
  – Merely to describe the patterns over time
Explanation
  – Can the pattern observed over time be explained in terms of other factors or causes? This helps in understanding the behaviour of the series
Prediction (forecasting)
  – Can past records help us to predict what will happen in the future?
Improving the past system/behaviour
  – If factors affecting the behaviour of a variable over time can be identified, action may be taken to improve the system, e.g. action over increasing levels of air pollution
Slide 8: Analysing Series with time element
Where the time element is just incidental, it may not be necessary to use a formal time series analysis approach
  – e.g. start of the rainy season each year at a tobacco farm
The analysis used depends on the objective(s) of the study.
It can vary from just descriptive methods to more advanced analysis approaches.
In these time series sessions, we will largely concentrate on simple approaches.
Slide 9: Approach in this session
We begin with some examples showing the importance of graphing the data to get an insight into the distribution over time.
For other examples, refer to 2.1.1 in CAST for SADC – Higher Level.
We then summarise some lessons that can be learnt from graphing the data in time.
Slide 10: Jumping to conclusions from raw data
Data (interval-scale): Company profits (000 dollars)
Objective: To study changes in profit figures over consecutive quarters

Year   Quarter 1   Quarter 2   Quarter 3   Quarter 4
1      667         631         675         699
2      739         695         751         779
3      823         795         835         875
4      931         855         939         967

Impression is that the 4th quarter is always higher than the 1st quarter.
Slide 11: Take a look again…
Previous impression is largely because there is a general increase over time.
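The point about the general increase can be checked numerically as well as graphically. The sketch below is only an illustration using the profit figures from slide 10; the variable names are chosen here and are not part of the course material. It compares the change within each year with the change across each year boundary and the change for the same quarter one year apart.

```python
# Quarterly company profits (000 dollars) from the table on slide 10.
profits = {
    1: [667, 631, 675, 699],
    2: [739, 695, 751, 779],
    3: [823, 795, 835, 875],
    4: [931, 855, 939, 967],
}
years = sorted(profits)

# Flatten into one series Y_1, ..., Y_16 in time order.
series = [value for year in years for value in profits[year]]

# Within-year change: quarter 4 minus quarter 1 of the same year.
q4_minus_q1 = [profits[y][3] - profits[y][0] for y in years]

# Across the year boundary: next year's quarter 1 minus this year's quarter 4.
next_q1_minus_q4 = [profits[b][0] - profits[a][3] for a, b in zip(years, years[1:])]

# Year-on-year change: same quarter, one year (4 observations) apart.
year_on_year = [later - earlier for earlier, later in zip(series, series[4:])]

print("Q4 - Q1 within each year:   ", q4_minus_q1)
print("Next Q1 - Q4 across years:  ", next_q1_minus_q4)
print("Year-on-year changes:       ", year_on_year)
```

Quarter 4 exceeds quarter 1 in every year, but the following quarter 1 is higher still, and every quarter is higher than the same quarter a year earlier. This is consistent with an overall upward trend rather than a special fourth-quarter effect.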
Slide 12: Jumping to conclusions from summaries
Objective: to emphasize the need for graphing distributions in order to get a clearer understanding of the data distribution
Data source: Petruccelli, J; MSOR Connections Vol 7 No 2, 2007
Data (interval-scale): Breaking strengths of parcel string tested on a piece selected every 5 minutes from one spool during production. 100 samples from each of 3 different days (simulated data)

Summary for Day 1, Day 2 and Day 3:
Mean      20.81
Std Dev   0.72

Summary statistics identical!
Slide 13: Take a look again…
The distributions are definitely different!
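One reason summaries can hide such differences is that the mean and standard deviation ignore the order of the observations. The sketch below uses small made-up values (hypothetical, not the Petruccelli data) arranged in three different time orders. The summaries agree exactly, while a simple comparison of the first and second half of each day shows the patterns over time are different.

```python
import statistics

# Hypothetical breaking strengths in time order (NOT the data from the slide).
day_a = [20, 22, 21, 19, 22, 20, 21, 19, 21, 20, 22, 19]  # no obvious pattern
day_b = sorted(day_a)              # same values, drifting steadily upwards
day_c = day_b[6:] + day_b[:6]      # same values, with a sudden drop halfway


def summary(values):
    """Mean and sample standard deviation; neither depends on the order."""
    return statistics.mean(values), statistics.stdev(values)


def first_vs_second_half(values):
    """Mean of the first half and of the second half of the series."""
    half = len(values) // 2
    return statistics.mean(values[:half]), statistics.mean(values[half:])


for name, day in [("A", day_a), ("B", day_b), ("C", day_c)]:
    print(name, "summary:", summary(day), "halves:", first_vs_second_half(day))
```

A time plot of each day would show the same thing at a glance, which is the reason for graphing the data before relying on summaries.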
Slide 14: Discussion exercise
Level of data for analysis depends on objectives
  – Level: time period
      » Botswana hours of sunshine data
  – Level: Local, National, International
      » Malaria incidence with rainfall pattern relationship (between variables)
      » Malaria incidence comparisons (between countries)
In small groups, study the information on slides 15-20. Discuss what the graphs indicate and report back to the whole class after 20 minutes.
Slide 15: Zambia Rainfall Data
Problem: Farmers in Southern Zambia are moving out of the province because they believe that climate change is affecting farm production. A local NGO promoting Conservation Farming insists that the problem is due to bad farming practice.
A study was commissioned to investigate the problem; one of the events investigated was the Start of the Rains (defined as >20 mm of rainfall in 3 days, after 15 November).
Slide 16: Start of Rains
Objective: to investigate if there has been any change in the start of the rainy season in Southern Zambia
Data source: Moorings Station, Monze, Southern Zambia
Data (interval-scale): Start of Rains calculated as day number (from July 1st) of the first 3-day spell with >20 mm rain after November 15th
What is your answer to the question?
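The definition of Start of Rains can be turned into a small calculation on daily rainfall records. The following is a minimal sketch under stated assumptions, not the method used in the study: the function name `start_of_rains` and its parameters are hypothetical, the 3-day spell is assumed to begin strictly after 15 November, the reported day number is that of the first day of the spell (1 July = day 1), and days with no record are treated as 0 mm.

```python
from datetime import date, timedelta


def start_of_rains(daily_rain_mm, season_year, threshold_mm=20.0, spell_days=3):
    """Day number (1 July = day 1) of the first day of the first qualifying spell.

    daily_rain_mm maps a date to rainfall in mm. Missing dates count as 0 mm,
    which is an assumption made for this sketch. Returns None if no spell of
    `spell_days` days totals more than `threshold_mm` before the next 30 June.
    """
    day_one = date(season_year, 7, 1)
    current = date(season_year, 11, 16)   # first day strictly after 15 November
    last = date(season_year + 1, 6, 30)
    while current + timedelta(days=spell_days - 1) <= last:
        total = sum(
            daily_rain_mm.get(current + timedelta(days=k), 0.0)
            for k in range(spell_days)
        )
        if total > threshold_mm:
            return (current - day_one).days + 1
        current += timedelta(days=1)
    return None


# Made-up records: heavy rain on 10 November is too early to count.
rain = {
    date(1990, 11, 10): 30.0,
    date(1990, 11, 20): 8.0,
    date(1990, 11, 21): 9.0,
    date(1990, 11, 22): 5.0,
}
print(start_of_rains(rain, 1990))  # spell of 20-22 November totals 22 mm
```

Applying such a function to each season gives one value per year, and those yearly values form the time series to be plotted.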
Slide 17: Case for using Solar cookers
Problem: Quality of life of women in Botswana is affected by the need to look for firewood for cooking. Extensive use of firewood also degrades the environment.
With 1 hr of sunshine, water can be boiled; with at least 3 hrs of sunshine, it would be possible to cook the midday meal.
A study was designed to investigate the feasibility of using solar cookers as an alternative to using firewood.
Slide 18: Cooking with Solar Cookers
Objective: to investigate what proportion of days in February will have at least (a) 1 hr, (b) 3 hrs of morning sunshine per day
Data source: Gaborone, Botswana, courtesy Botswana Met service
Data/interval: daily sunshine hours in the morning using summarised February hourly sunshine records, 1978 – 1997
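The objective above reduces to counting the days that reach each threshold. A minimal sketch follows; the sunshine values are made up for illustration (they are not the Gaborone records) and the function name `proportion_at_least` is hypothetical.

```python
def proportion_at_least(sunshine_hours, minimum_hours):
    """Proportion of days with at least `minimum_hours` of morning sunshine."""
    days = list(sunshine_hours)
    if not days:
        raise ValueError("no days supplied")
    return sum(1 for hours in days if hours >= minimum_hours) / len(days)


# Hypothetical morning sunshine hours for ten February days.
february_mornings = [0.0, 0.5, 1.0, 2.5, 3.0, 4.2, 5.0, 0.8, 3.6, 1.9]

print("At least 1 hr: ", proportion_at_least(february_mornings, 1))
print("At least 3 hrs:", proportion_at_least(february_mornings, 3))
```

Computing the same proportions separately for each year would show how much they vary from one February to the next.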
Slide 19: Malaria data from Uganda
Objective: to compare two apparently related time series
Data source: Onapa & Mouchet (1996), www.who.int/malaria/
Data/interval: rainfall totals (mm) & malaria, outpatient cases treated per month at Kisizi Mission Hospital between 1992 & 1995
(See WHO disclaimer on final slide)
Slide 20: Reported Malaria Cases
Objective: to investigate trends in reported malaria cases in selected countries
Data source: UNEP/DEWA/GRID-Europe, GEO Data Portal; compiled from WHO/UNICEF, World Malaria Report 2005
Data (interval): total (per 100,000 population) reported malaria cases
Slide 21: Lessons summarised
The level to which the data needs to be summarised before analysis depends on the objective(s) of the study.
The specific analysis depends on the objectives; a descriptive analysis will often be sufficient.
Different levels of data will be needed depending on whether the problem is being examined at the international, national or local level.
It is imperative, however, that quality data be made accessible to ensure that conclusions arising from the analysis are correct.
Slide 22: Time Plots
This is a plot of the measurement of interest against the time of the observation.
No matter what you decide is the appropriate way to analyse your data, the time factor must not be ignored.
As we have seen in the examples considered in this session, it is very important to start the exploration of a time series with a graphical representation of the data.
However, there are a number of points to be kept in mind when drawing such a plot, as discussed in the next two slides.
Slide 23: Choice of sampling interval
The two figures are of an ECG of a healthy woman. The bottom one is measured at a shorter interval, whereas the top one is measured at a longer interval and misses the characteristic peak of the heartbeat.
So the choice of the sampling interval is quite important: sampling too frequently can be costly, and sampling too infrequently might miss essential characteristics.
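The effect of too long a sampling interval can be reproduced with a toy signal. The sketch below uses a made-up series (not real ECG data): a flat baseline with one sharp spike every 20 time steps. Sampling every time step records the spikes, while sampling every fourth step happens to miss all of them.

```python
# Hypothetical signal: baseline of 1 with a one-step spike of 10 every 20 steps.
signal = [10 if t % 20 == 7 else 1 for t in range(100)]


def sample_every(values, step):
    """Keep every `step`-th observation, starting from the first."""
    return values[::step]


fine = sample_every(signal, 1)     # every observation
coarse = sample_every(signal, 4)   # one observation in four

print("Fine sampling:  ", len(fine), "points, maximum", max(fine))
print("Coarse sampling:", len(coarse), "points, maximum", max(coarse))
```

The coarse series looks completely flat here because none of the spike positions falls on a sampled time point. With a different starting point some spikes could be caught, so the outcome depends on how the sampling times line up with the feature of interest.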
Slide 24: Choice of aspect ratio
Notice: different aspect ratios emphasize different characteristics of the series. The top one brings out the differences in the peaks, while the lower one highlights the way the peaks rise and fall.
Slide 25: To join or not to join
Same data as in slide 18 but without the points joined up.
Slide 26: To join or not to join
Advantage of joining: usually easier to digest
Disadvantage: gives impression of continuity; definitely a risk when missing values exist
Return now to example on slide 15 for some practical work in order to ensure learning objectives are achieved… (Details are outlined in Practical 01)
Slide 27: With reference to slide 19, note that:
The World Health Organisation does not warrant that the information contained in the web site is complete and correct and shall not be liable whatsoever for any damages incurred as a result of its use.
The WHO website further adds: Extracts of WHO information can be used for private study or for educational purposes without permission. Wider use requires permission to be obtained from WHO.