Use of linear discriminant methods for calibration of seasonal probability forecasts Andrew Colman, Richard Graham. © Crown copyright 2007 07/0145 Met.

Met Office and the Met Office logo are registered trademarks.
Met Office Hadley Centre, FitzRoy Road, Exeter, Devon EX1 3PB, United Kingdom. Tel: 01392 884509 Fax: 0870 9005050 Email: andrew.colman@metoffice.gov.uk

1. Introduction

Seasonal forecasting often requires combining information from a number of sources to produce an optimum forecast. The forecasts are presented as probabilities to reflect the uncertainty in modelling the physical processes involved in the atmosphere and ocean. Here we present two examples where discriminant analysis is used to produce probability forecasts from a combination of Global Circulation Model (GCM) output and statistical predictions from Sea Surface Temperature (SST) indices.

2. Discriminant analysis

Discriminant analysis is similar to regression in that a relationship is defined between one or more predictor (independent) variables and a predictand (dependent) variable using a set of data called training data. An equation is derived into which predictor values are substituted to predict the predictand (dependent) variable. The difference is that the final product is not a point estimate of some variable, but the probability of an event occurring.

In the seasonal forecasting case the training data are trial GCM forecasts and/or historical values of a climate index. These are paired with a database of corresponding observed temperature or rainfall tercile or quintile categories, which are the events being predicted.

Discriminant probabilities for a given event (t) are inversely related to the standardised distance D_t in predictor space between the predictor values at the time of forecast (x) and the mean predictor values prior to occurrences of the event during the training period (m_t).
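The standardised distance described above is usually written as a Mahalanobis distance. The following is a sketch only: it assumes the usual linear discriminant form, with S the covariance matrix of the predictors, equal prior probabilities for all categories, and a Gaussian conversion from distance to probability. The poster text does not state the exact formula used, so the second equation in particular is an assumption.

```latex
D_t^2 = (x - m_t)^{\mathsf{T}} \, S^{-1} \, (x - m_t),
\qquad
P_t = \frac{\exp\left(-\tfrac{1}{2} D_t^2\right)}{\sum_{k} \exp\left(-\tfrac{1}{2} D_k^2\right)}
```

Here k runs over all the categories (three terciles or five quintiles), so the probabilities sum to 1 and the category whose mean lies closest to the forecast point receives the highest probability.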
The covariance matrix S is used to standardise the distances and to take account of correlation between predictors. Event probabilities are evaluated from the distances for all the possible categories, i.e. three tercile categories or five quintile categories.

Figure 2

Findings from figure 2:
– The largest improvements in ROC skill come from adding the NAO predictor to the GloSea predictor. Adding the extra ECMWF and Meteo-France models seems to have no benefit in this case.
– Combining discriminant probabilities with raw ensemble probabilities results in a modest improvement in ROC skill (e.g. over S Scandinavia, SE UK) but no reduction in reliability.

Discussion

From these results, combining the discriminant and raw ensemble probabilities is worthwhile. The assessments may underestimate the skill of the GloSea model as currently run, because current runs have the benefit of improved ocean observation data from instruments such as ARGO, which have become available over the last 5 years. The skill of the NAO predictor may be overestimated, as there is overlap between the data used to identify the statistical predictor and the data used to produce and assess the forecasts. Overestimation or underestimation of skill cannot be proven due to insufficient data, but both factors support the averaging of discriminant and raw ensemble probabilities. Finally, it is probable that skill is data dependent and may be higher in certain years. The identification of such cases and the development of improved dynamical models are topics of current research. ENSEMBLES hindcasts are likely to be an important tool for this.

Example 1. Prediction of north-west European winter temperature from the Met Office GloSea model combined with an NAO related North Atlantic SST index.

Example of a region where, while maintaining reliability, ROC skill from GloSea is improved by adding a statistical predictor and by adding extra signal from the ensemble.
Rodwell and Folland (2003) identified an index of May N Atlantic SST which could be used to predict the subsequent winter (DJF) NAO, which in turn is strongly associated with concurrent temperature and rainfall in NW Europe. The predictor has been combined with output from the GloSea model to make direct predictions of grid box temperatures using discriminant analysis. The discriminant equations are calculated using trial GloSea forecasts for 1959-2001 produced as part of the DEMETER project (www.ecmwf.int/demeter), a North Atlantic SST index calculated using HadISST (referred to also as NAO as it was initially produced to predict winter NAO) and temperature observations from the CRU TS 2.1 dataset (www.cru.uea.ac.uk/~timm/grid/CRU_TS_2_1.html).

Figure 1 is a schematic diagram showing how discriminant analysis is used to predict the temperature tercile category probabilities for a given grid box from 2 predictors, GloSea model output (vertical axis) and the NAO statistical predictor (horizontal axis). The orange, green and blue solid circles surround the predictor means for the warm, average and cold tercile categories marked W, A and C respectively. Probabilities are inversely related to the distance between the forecast value FC and the 3 category means. The three smaller solid circles represent distances of 1 standardised unit away from the means (m_t). Where these circles intersect, the probabilities for the 2 categories are the same provided the predictors are uncorrelated. (The larger blue solid circle represents a distance of 2 standardised units from the cold tercile category mean.) The dashed blue circle shows the effect of compensating for correlation between the 2 predictors. The compensation in this example is based on a correlation of 0.33 between the GloSea and NAO predictors. The circle is elongated into an ellipse along the axis of positive correlation.
So if 2 positively correlated predictors agree, probabilities are enhanced because the predictors support each other, but if the positively correlated predictors disagree, as they do in the example (NAO is negative, GloSea is positive), probabilities are reduced.

Findings from Figure 3:
– The multivariate forecast (c), which we usually issue as our best estimate, was strongly favouring the wet tercile category. The probabilities for the wet category were substantially higher than those predicted using the statistical predictors alone (a), GloSea alone (b) or an average of the two (d).
– The high probability is explained by (i) the pre-season value for the Atlantic predictor being very close to the wet category mean and (ii) high correlation between GloSea forecasts and the SST predictor indices.
– A moderated forecast (e) is evaluated by adding an uncertainty estimate of 1 standardised unit to the category means (so the category means are represented by the solid circles in figure 1 rather than by point values in the centre). Distances from points within the circle to the centre are set to 1; otherwise they are calculated as before.
– 2005 was dry to average over the region so, in the event, (d) would have been the best choice.
– ROC skill of forecasting systems (d) and (e) is very slightly lower than that of (c) for 1959-2001.

General conclusions
– Discriminant analysis is a good tool for seasonal forecasting, but there are clearly cases when modifying the probabilities as in the examples here can improve forecast skill and the usefulness of the forecasts.

The performance of trial forecasts (hindcasts) of temperature tercile categories, measured by the ROC score and reliability, is presented in figure 2. The plots show the impact of:
– Adding the NAO predictor to GloSea to make a 2 predictor equation.
– Adding output from 2 extra DEMETER models, the ECMWF and Meteo-France GCMs, to make a 4 predictor equation.
– Combining discriminant forecasts with raw ensemble probabilities using 2 different weights.

References
Afifi, A.A. and Azen, S.P. Statistical analysis, a computer orientated approach. 2nd edition. Academic Press, New York (1979).
Folland, C.K., Colman, A.W., Rowell, D.P. and Davey, M.K. Predictability of northeast Brazil rainfall and real-time forecast skill, 1987-98. J. Climate 14, 1937-1958 (2001).
Rodwell, M. and Folland, C.K. Quarterly Journal of the Royal Meteorological Society, 128, 1413-1443 (2003).
Stanski, H.R., Wilson, L.J. and Burrows, W.R. "Survey of common verification methods in Meteorology". World Weather Watch Technical Report No* WMO/TD 358, WMO, Geneva, Switzerland (1989).

Figure 3. Example precipitation tercile category probability forecasts for Feb-May 2005

Seasonal predictability in extra-tropical regions is low or non-existent and Europe is no exception. In such regions, discriminant probabilities tend to be close to climatology (chance) levels as signal is lost due to noise in the training data. Signal can be enhanced by combining the discriminant probabilities with raw ensemble probabilities.

Raw ensemble probability = (number of ensemble members within the predicted category, as defined from the model climate) / (total number of ensemble members).

Hence, unlike discriminant probabilities, raw ensemble probabilities do not take account of performance in trial forecasts in predicting observed events.

Example 2. Prediction of north-east Brazil rainfall from the Met Office GloSea model combined with two tropical SST indices.

Example of a forecast where the closeness of one of the predictors to a category mean value has resulted in some dubiously high probabilities.

Forecasts are issued by the Met Office every year for the N E Brazil rainy season (Feb-May). Predictions used to be made using 2 SST indices, an index of Pacific SST representing ENSO and an index of tropical Atlantic SST representing an interhemispheric dipole.
For more information about these predictions see Folland et al (2001). In recent years GloSea model output has been used as a third predictor following the identification of high skill in Folland et al (2001). Here we focus on the forecast for the 2005 season issued in January 2005 using SST data for the preceding November and December. Tercile category probability forecasts for 11 grid boxes are presented in figure 3.

Figure 3 panels:
(a) Statistics only
(b) GloSea model only
(c) GloSea + Stat multivariate discriminant prediction
(d) Average of GloSea + Stat predictions
(e) GloSea + Stat multivariate discriminant prediction with 1 standardised unit error margin applied to predictor averages
(f) ROC skill 1959-2001 of (c) type predictions (* 95% significant), (d) type predictions and (e) type predictions

Figure 1
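The calculations described in section 2 and in the two examples can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Met Office implementation: the function names, the category means, the forecast point, the ensemble values, the tercile boundaries and the weight of 0.5 are all hypothetical, and the conversion from distance to probability assumes equal priors and a Gaussian form. Only the correlation of 0.33 and the 1 standardised unit moderation come from the text.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov_inv):
    """Squared standardised distance between forecast x and one category mean."""
    d = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(d @ cov_inv @ d)

def discriminant_probs(x, means, cov, min_distance=0.0):
    """Category probabilities from distances to the category means.

    Assumption: equal priors, probability proportional to exp(-D^2 / 2).
    min_distance > 0 gives the moderated forecast: any distance smaller
    than min_distance is set to min_distance.
    """
    cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    d2 = np.array([mahalanobis_sq(x, m, cov_inv) for m in means])
    d2 = np.maximum(d2, min_distance ** 2)
    weights = np.exp(-0.5 * (d2 - d2.min()))  # subtract min for stability
    return weights / weights.sum()

def raw_ensemble_probs(members, lower, upper):
    """Fraction of ensemble members in each tercile of the model climate."""
    members = np.asarray(members, dtype=float)
    below = np.mean(members < lower)
    above = np.mean(members > upper)
    return np.array([below, 1.0 - below - above, above])

def combine(p_disc, p_raw, weight=0.5):
    """Weighted average of discriminant and raw ensemble probabilities."""
    return weight * np.asarray(p_disc) + (1.0 - weight) * np.asarray(p_raw)

# Hypothetical category means (lower, middle, upper tercile) in a
# two-predictor space, e.g. (SST index, GloSea output) in standardised units.
means = np.array([[-0.8, -0.6], [0.0, 0.0], [0.8, 0.6]])
cov = np.array([[1.0, 0.33], [0.33, 1.0]])  # correlation 0.33, as in figure 1
forecast = [0.7, 0.5]                       # hypothetical, close to the upper mean

p_disc = discriminant_probs(forecast, means, cov)
p_mod = discriminant_probs(forecast, means, cov, min_distance=1.0)

# Hypothetical 9-member ensemble and tercile boundaries of the model climate.
members = [-1.2, -0.3, 0.1, 0.5, 0.9, 1.4, 0.7, -0.6, 1.1]
p_raw = raw_ensemble_probs(members, lower=-0.43, upper=0.43)
p_comb = combine(p_disc, p_raw, weight=0.5)

print("discriminant:", np.round(p_disc, 3))
print("moderated:   ", np.round(p_mod, 3))
print("raw ensemble:", np.round(p_raw, 3))
print("combined:    ", np.round(p_comb, 3))
```

Because the hypothetical forecast point lies well inside the 1 standardised unit circle around the upper category mean, the moderated forecast gives that category a lower probability than the unmoderated one, which is the behaviour described for forecast (e).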
