Simultaneous estimation of monotone trends and seasonal patterns in time series of environmental data By Mohamed Hussian and Anders Grimvall.


1 Simultaneous estimation of monotone trends and seasonal patterns in time series of environmental data By Mohamed Hussian and Anders Grimvall

2 Outline
• Examples of monotone relationships in environmental data
• Monotone regression in one or more independent variables
• Simultaneous estimation of monotone trends and seasonal patterns
  ◦ Monte Carlo methods for constrained least squares regression
  ◦ Simple averaging techniques

3 Tot-P concentrations (Brunsbüttel) versus water discharge (Neu Darchau) in the Elbe River, mean values for April 1985-2000

4 Average monthly ozone concentrations versus humidity at Ähtäri in central Finland

5 Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: Tot-N concentration (mg/l) by year and month]

6 Tot-N concentrations (Brunsbüttel) in the Elbe River, mean values for July 1985-2000

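7 Monotone regression (isotonic or antitonic regression)
• Given a set of two-dimensional data $(x_i, y_i)$, $i = 1, \dots, n$:
  ◦ Sort the data by x into $x_1 \le x_2 \le \dots \le x_n$
  ◦ Minimise $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ under the constraints $\hat{y}_1 \le \hat{y}_2 \le \dots \le \hat{y}_n$ or $\hat{y}_1 \ge \hat{y}_2 \ge \dots \ge \hat{y}_n$
• A well-known algorithm for solving this problem is the PAV (Pool-Adjacent-Violators) algorithm (Ayer, 1955; Barlow et al., 1972; Hanson et al., 1973)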

8 Tot-N concentrations (Brunsbüttel) in the Elbe River, mean values for July 1985-2000

9 The PAV Algorithm
• If the data are already monotone, then the PAV algorithm will reproduce them.
• The solution is a step function.
• If there are outliers, then the PAV algorithm will produce long, flat levels.
• The impact of outliers can be reduced by first smoothing the data (Friedman and Tibshirani, 1984).
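
The pooling step behind the PAV algorithm can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names `pav` and `pav_decreasing`, and the optional `weights` argument (assumed strictly positive), are introduced here for illustration only.

```python
def pav(y, weights=None):
    """Least-squares nondecreasing fit to y, assumed already sorted by x."""
    w = [1.0] * len(y) if weights is None else list(weights)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks for as long as they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the blocks back into one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def pav_decreasing(y, weights=None):
    """Nonincreasing (antitonic) fit, obtained by flipping signs twice."""
    return [-v for v in pav([-v for v in y], weights)]
```

With this sketch, an already monotone input is returned unchanged, and a single outlier such as the 10 in `[1, 2, 10, 3, 4]` is pooled with its right-hand neighbours into one flat level, which is the behaviour listed on the slide.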

10 Monotone regression in two independent variables
• An algorithm for computing the least squares regression function constrained to be nondecreasing in each of several independent variables was developed by R. Dykstra & T. Robertson, 1984.
  ◦ The algorithm was written specifically for two independent variables, and it produces the solution of $\min_{G \in K} \sum_{i,j} w_{ij} (y_{ij} - g_{ij})^2$
  ◦ where $Y = (y_{ij})$ is a given two-dimensional array of the original values; $W = (w_{ij})$ is a nonnegative array of weights; and K is the class of two-dimensional arrays $G = (g_{ij})$ such that $g_{ij} \le g_{kl}$ whenever $i \le k$ and $j \le l$
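
One way to illustrate monotone regression in two variables is to alternate one-dimensional PAV fits along the rows and the columns of the array while carrying correction terms, as in Dykstra-style cyclic projections. The Python sketch below assumes unit weights and a complete rectangular array; the names `pav`, `monotone_2d` and `n_iter` are hypothetical, and the code is an illustration in the spirit of the algorithm cited above, not a reproduction of it.

```python
def pav(y):
    """Unweighted least-squares nondecreasing fit to a sequence."""
    blocks = []  # each block is [mean, count]
    for v in y:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    return [m for m, c in blocks for _ in range(c)]


def project_rows(a):
    # Make every row nondecreasing from left to right.
    return [pav(row) for row in a]


def project_cols(a):
    # Make every column nondecreasing from top to bottom.
    cols = [pav(col) for col in zip(*a)]
    return [list(row) for row in zip(*cols)]


def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[u - v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def monotone_2d(y, n_iter=200):
    """Approximate least-squares fit that is nondecreasing in both indices."""
    x = [[float(v) for v in row] for row in y]
    p = [[0.0] * len(row) for row in y]  # correction term for the row step
    q = [[0.0] * len(row) for row in y]  # correction term for the column step
    for _ in range(n_iter):
        t = add(x, p)
        a = project_rows(t)
        p = sub(t, a)
        t = add(a, q)
        x = project_cols(t)
        q = sub(t, x)
    return x
```

For the array `[[2, 1], [1, 2]]`, the three cells that violate the ordering are pulled towards their common mean of 4/3 while the last cell stays at 2. The returned array is exactly monotone along columns and monotone along rows only up to the convergence tolerance reached after `n_iter` sweeps.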

11 Limitations of classical algorithms for monotone regression in two or more explanatory variables
• Inefficient for relatively small data sets
• Cannot handle typical multiple regression data where at least one of the explanatory variables is continuous
• Unclear how seasonality can be handled

12 Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: Tot-N concentration (mg/l) by year and month]

13 Example of linear trend with a superimposed trigonometric seasonal pattern [Figure: axes Month and Year]

14 Method I: Monte Carlo methods for constrained least squares regression
• Let $y_1, \dots, y_n$ denote a time series of data collected over m seasons
• Let $z_i$ denote the sum of the trend and seasonal components at time i
• Determine $z_1, \dots, z_n$ by minimising $S = \sum_{i=1}^{n} (y_i - z_i)^2$ under the following constraints:

15 Method I: Monte Carlo methods for constrained least squares regression
• Monotonicity constraints: $z_i$ is either decreasing or increasing for each season, i.e., $z_{i+m} \le z_i$ for all i or $z_{i+m} \ge z_i$ for all i
• Seasonality constraints: the seasonal pattern is composed of convex and concave curve pieces, i.e., $z_{i-1} - 2 z_i + z_{i+1} \ge 0$ or $z_{i-1} - 2 z_i + z_{i+1} \le 0$ for all time points i belonging to a given season.

16 Method I: Algorithm, general information
• The problem is a classical quadratic optimisation problem
• The computational burden increases rapidly with the number of variables and constraints
• This burden can be a serious problem if the suggested algorithms do not take into consideration the special features of the constraints

17 Method I: Algorithm, theoretical solution
• Given a crude initial estimate $z^{(0)}$ of $z = (z_1, \dots, z_n)$,
• form new estimates $z^{(k)}$, k = 1, 2, …, by employing an updating formula $z^{(k)} = z^{(k-1)} + c_k u_k$
  ◦ $u_k$ is a vector defining the shape of the adjustment
  ◦ $c_k$ is a scaling factor

18 Method I: Algorithm [Figure: shapes of the functions used for updating the response surface]

19 Method I: Algorithm
• The scaling factor $c_k$ is determined in such a way that $S(z^{(k)}) = \sum_{i} (y_i - z_i^{(k)})^2$ is minimised and the desired constraints are satisfied.
  ◦ Applying such a solution reduces the original multivariate optimisation problem to a sequence of univariate optimisation problems.
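
The updating scheme of Method I can be illustrated with a small Python sketch. Several details below are assumptions rather than the authors' choices: the initial estimate is the overall mean, each shape vector is the indicator of a randomly chosen contiguous time window, only the constraint of an increasing trend within each season (fitted value at time i + m not below the fitted value at time i) is enforced, and the names `monte_carlo_fit`, `n_iter` and `seed` are hypothetical. The convexity constraints on the seasonal pattern are omitted for brevity.

```python
import random


def monte_carlo_fit(y, m, n_iter=5000, seed=0):
    """Random-direction least squares with z[i + m] >= z[i] for all i."""
    rng = random.Random(seed)
    n = len(y)
    mean = sum(y) / n
    z = [mean] * n  # crude initial estimate; a constant is always feasible
    for _ in range(n_iter):
        # Assumed shape vector: 1 on a random window [start, end), 0 elsewhere.
        start = rng.randrange(n)
        end = rng.randrange(start + 1, n + 1)
        u = [1.0 if start <= i < end else 0.0 for i in range(n)]
        # Unconstrained univariate optimum of sum((y - z - c * u) ** 2).
        c = sum(y[i] - z[i] for i in range(start, end)) / (end - start)
        # Interval of scaling factors that keeps every constraint satisfied.
        lo, hi = float("-inf"), float("inf")
        for i in range(n - m):
            dz = z[i + m] - z[i]
            du = u[i + m] - u[i]
            if du > 0:
                lo = max(lo, -dz / du)
            elif du < 0:
                hi = min(hi, dz / -du)
        # Clip the step to the feasible interval, then update the estimate.
        c = min(max(c, lo), hi)
        for i in range(start, end):
            z[i] += c
    return z
```

Each step solves a univariate least squares problem for the scaling factor and then clips it to the interval that keeps all constraints satisfied, so the residual sum of squares does not increase. The sketch makes no claim about convergence speed or about reaching the exact constrained optimum.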

20 Method I: Response surface satisfying monotonicity and convexity constraints [Figure: fitted Tot-N concentration (mg/l) by year and month]

21 Method II: Simple averaging techniques
• Consider data $(x_j, y_j)$, $j = 1, \dots, n$, satisfying $y_j = f(x_j) + \varepsilon_j$,
• where $x_j$ denotes a vector of m explanatory variables,
• $f$ is assumed to be monotone in $x$ (nondecreasing or nonincreasing).
• Nondecreasing case: let $\tilde{f}$ be an initial estimate of $f$, which could be the data itself, then consider $M_1(x) = \max_{u \le x} \tilde{f}(u)$ and $M_2(x) = \min_{u \ge x} \tilde{f}(u)$

22 Method II: Simple averaging techniques [Figure: M1 values and M2 values]

23 Method II: Simple averaging techniques
• For $0 \le \alpha \le 1$, the set of estimators $\alpha M_1(x) + (1 - \alpha) M_2(x)$ are nondecreasing in $x$, and work well for light-tailed error distributions (Strand, 2003; Mukerjee & Stern, 1994).
• The estimate of $f(x)$ is the value $c$ that minimises $(c - M_1(x))^2 + (c - M_2(x))^2$, which is $(M_1(x) + M_2(x))/2$

24 Method II: Simple averaging techniques
• Nonincreasing case: apply the same steps to estimates based on the negated initial estimate $-\tilde{f}$ instead of $\tilde{f}$, then change the signs of the estimates back at the end to get the nonincreasing function.
• Seasonality was handled by defining two monotone functions with respect to the seasons having high and low concentration values.
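
For a single explanatory variable, the averaging step of Method II amounts to a running maximum from the left, a running minimum from the right, and the mean of the two. The Python sketch below assumes the initial estimate is given as a list ordered by the explanatory variable; the function names are hypothetical.

```python
def average_nondecreasing(f0):
    """Mean of the running max from the left and running min from the right."""
    n = len(f0)
    # M1: largest initial estimate at or below each position.
    m1, running = [], float("-inf")
    for v in f0:
        running = max(running, v)
        m1.append(running)
    # M2: smallest initial estimate at or above each position.
    m2, running = [0.0] * n, float("inf")
    for i in range(n - 1, -1, -1):
        running = min(running, f0[i])
        m2[i] = running
    return [(a + b) / 2 for a, b in zip(m1, m2)]


def average_nonincreasing(f0):
    """Nonincreasing case: negate, fit a nondecreasing curve, negate back."""
    flipped = average_nondecreasing([-v for v in f0])
    return [-v for v in flipped]
```

For the initial estimate `[1, 3, 2, 4]`, the running maximum is `[1, 3, 3, 4]` and the running minimum is `[1, 2, 2, 4]`, so the averaged estimate is `[1.0, 2.5, 2.5, 4.0]`.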

25 Method II: Simple averaging techniques [Figure: Maximum and Minimum]

26 Method II: Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: monthly mean Tot-N concentration (mg/l) by year and month]

27 Method II: Fitted monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: fitted monthly mean Tot-N concentration (mg/l), based on the observed data, by year and month]

28 Method II: Smoothed monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: lightly smoothed monthly mean Tot-N concentration (mg/l), bandwidth 0.05, by year and month]

29 Method II: Smoothed monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River [Figure: strongly smoothed monthly mean Tot-N concentration (mg/l), bandwidth 0.3, by year and month]

30 Method II: Simple averaging techniques in the multidimensional case [Figure: smoothed Tot-N transport (kton/month), February values 1985-2000, by year and water discharge level (10^9 m^3/month)]

31 Method II: Simple averaging techniques in the multidimensional case [Figure: fitted Tot-N transport (kton/month), February values 1985-2000, by year and water discharge level (10^9 m^3/month)]
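
With two explanatory variables on a complete regular grid (for example year by water discharge class), the maximum over all grid points below a cell and the minimum over all grid points above it can be accumulated in two sweeps. The Python sketch below assumes that data layout; `average_nondecreasing_2d` is a hypothetical name, and gridding of the raw data is left out.

```python
def average_nondecreasing_2d(f0):
    """Average of M1 (max over cells below) and M2 (min over cells above)."""
    rows, cols = len(f0), len(f0[0])
    m1 = [[0.0] * cols for _ in range(rows)]
    m2 = [[0.0] * cols for _ in range(rows)]
    # Forward sweep: maximum over all cells (k, l) with k <= i and l <= j.
    for i in range(rows):
        for j in range(cols):
            best = f0[i][j]
            if i > 0:
                best = max(best, m1[i - 1][j])
            if j > 0:
                best = max(best, m1[i][j - 1])
            m1[i][j] = best
    # Backward sweep: minimum over all cells (k, l) with k >= i and l >= j.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            best = f0[i][j]
            if i < rows - 1:
                best = min(best, m2[i + 1][j])
            if j < cols - 1:
                best = min(best, m2[i][j + 1])
            m2[i][j] = best
    return [[(m1[i][j] + m2[i][j]) / 2 for j in range(cols)]
            for i in range(rows)]
```

For the grid `[[1, 4], [3, 2]]` this gives `[[1.0, 3.0], [2.5, 3.0]]`, which is nondecreasing along both rows and columns.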

32 Results
• The two algorithms performed satisfactorily on water quality data from the Elbe River and other rivers.
• Regardless of the features of the data sets examined, the sequences of fitted surfaces converged to a function that could be interpreted as a sum of trend and seasonal components.
• The components representing irregular variation provided a good starting point for the detection of outliers.
• The major drawback of the Monte Carlo algorithm was its computational burden.

33 Results
• Simple averaging techniques are efficient and work well for initial estimates that have light-tailed error distributions.
• Simple averaging techniques are sensitive to outliers, and can have problems with sparse data.
• For monotone functions, an alternative to using a large bandwidth is to use a slightly smaller bandwidth and then improve the accuracy by making the estimates monotone.
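
The last point can be illustrated by smoothing with a moderate bandwidth and then applying the max-min averaging step. In the Python sketch below, the Gaussian kernel smoother, the bandwidth expressed in the units of x, and all function names are assumptions; the slides do not specify which smoother or bandwidth scale was used.

```python
import math


def kernel_smooth(x, y, bandwidth):
    """Gaussian kernel-weighted average of y, evaluated at each point of x."""
    out = []
    for x0 in x:
        w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        out.append(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    return out


def make_nondecreasing(f0):
    """Mean of the running max from the left and running min from the right."""
    m1, running = [], float("-inf")
    for v in f0:
        running = max(running, v)
        m1.append(running)
    m2, running = [0.0] * len(f0), float("inf")
    for i in range(len(f0) - 1, -1, -1):
        running = min(running, f0[i])
        m2[i] = running
    return [(a + b) / 2 for a, b in zip(m1, m2)]


def smooth_then_monotonise(x, y, bandwidth):
    # x is assumed to be sorted in increasing order.
    return make_nondecreasing(kernel_smooth(x, y, bandwidth))
```

A call such as `smooth_then_monotonise(x, y, bandwidth=0.8)` returns one fitted value per observation; the monotonising step removes any remaining local reversals that the lighter smoothing left in place.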

34 Conclusions
• It is possible to combine non-parametric procedures with very natural constraints on the trend and seasonal components of time series of environmental data.
• The proposed procedures are so generally applicable that they can form the basis of fully automatic systems for quality assessment and decomposition of time series of environmental data.
• Applications involving several explanatory variables or sparse data sets require further methodological work.

