Download presentation

Presentation is loading. Please wait.

Published byJennifer Horton Modified over 2 years ago

1
Training Course 2009 – NWP-PR: Calibration of EPSs 1/42 Calibration of EPSs Renate Hagedorn European Centre for Medium-Range Weather Forecasts

2
Training Course 2009 – NWP-PR: Calibration of EPSs 2/42 Outline Motivation Methods Training data sets Results

3
Training Course 2009 – NWP-PR: Calibration of EPSs 3/42 Motivation EPS forecasts are subject to forecast bias and dispersion errors, i.e. uncalibrated The goal of calibration is to correct for such known model deficiencies, i.e. to construct predictions with statistical properties similar to the observations A number of statistical methods exist for post-processing ensembles Calibration needs a record of prediction-observation pairs Calibration is particularly successful at station locations with long historical data record (-> downscaling)

4
Training Course 2009 – NWP-PR: Calibration of EPSs 4/42 Calibration methods for EPSs Bias correction Multiple implementation of deterministic MOS Ensemble dressing Bayesian model averaging Non-homogenous Gaussian regression Logistic regression Analog method

5
Training Course 2009 – NWP-PR: Calibration of EPSs 5/42 Bias correction As a simple first order calibration a bias correction can be applied: This correction factor is applied to each ensemble member, i.e. spread is not affected Particularly useful/successful at locations with features not resolved by model and causing significant bias with: e i = ensemble mean of the i th forecast o i = value of i th observation N = number of observation-forecast pairs

6
Training Course 2009 – NWP-PR: Calibration of EPSs 6/42 Bias correction OBS DET EPS

7
Training Course 2009 – NWP-PR: Calibration of EPSs 7/42 Multiple implementation of det. MOS A possible approach for calibrating ensemble predictions is to simply correct each individual ensemble member according to its deterministic model output statistic (MOS) BUT: this approach is conceptually inappropriate since for longer lead-times the MOS tends to correct towards climatology all ensemble members tend towards climatology with longer lead-times decreased spread with longer lead-times in contradiction to increasing uncertainty with increasing lead-times Experimental product at but no objective verification yet…

8
Training Course 2009 – NWP-PR: Calibration of EPSs 8/42 Define a probability distribution around each ensemble member (dressing) A number of methods exist to find appropriate dressing kernel (best- member dressing, error dressing, second moment constraint dressing, etc.) Average the resulting n ens distributions to obtain final pdf Ensemble dressing

9
Training Course 2009 – NWP-PR: Calibration of EPSs 9/42 Ensemble Dressing (Gaussian) ensemble dressing calculates the forecast probability for the quantiles q as: key parameter is the standard deviation of the Gaussian dressing kernel error variance of the ensemble-mean FC average of the ensemble variances over the training data with: Φ = CDF of standard Gaussian distribution x i = bias-corrected ensemble-member ~

10
Training Course 2009 – NWP-PR: Calibration of EPSs 10/42 Bayesian Model Averaging BMA closely linked to ensemble dressing Differences: dressing kernels do not need to be the same for all ensemble members different estimation method for kernels Useful for giving different ensemble members (models) different weights: Estimation of weights and kernels simultaneously via maximum likelihood, i.e. maximizing the log-likelihood function: with: w 1 + w e (n ens - 1) = 1 g 1, g e = Gaussian PDFs

11
Training Course 2009 – NWP-PR: Calibration of EPSs 11/42 BMA: example 90% prediction interval of BMA OBS single model ensemble members Ref: Raftery et al., 2005, MWR

12
Training Course 2009 – NWP-PR: Calibration of EPSs 12/42 BMA: recovered ensemble members OBS single model ensemble members 100 equally likely values drawn from BMA PDF Ref: Raftery et al., 2005, MWR

13
Training Course 2009 – NWP-PR: Calibration of EPSs 13/42 Non-homogenous Gaussian regression In order to account for existing spread-skill relationships we model the variance of the error term as a function of the ensemble spread s ens : The parameter a,b,c,d are fit iteratively by minimizing the CRPS of the training data set Interpretation of parameters: bias & general performance of ens-mean are reflected in a and b large spread-skill relationship: c 0.0, d 1.0 small spread-skill relationship: d 0.0 Calibration provides mean and spread of Gaussian distribution (called non-homogenous since variances of regression errors not the same for all values of the predictor, i.e. non-homogenous)

14
Training Course 2009 – NWP-PR: Calibration of EPSs 14/42 Logistic regression Logistic regression is a statistical regression model for Bernoulli- distributed dependent variables P is bound by 0,1 and produces an s-shaped prediction curve steepness of curve ( β 1 ) increases with decreasing spread, leading to sharper forecasts (more frequent use of extreme probabilities) parameter β 0 corrects for bias, i.e. shifts the s-shaped curve

15
Training Course 2009 – NWP-PR: Calibration of EPSs 15/42 How does logistic regression work? + training data 100 cases (EM) height of obs y/n + test data (51 members) height of raw prob event threshold event observed yes/no (0/1) calibrated prob GP: 51N, 9E, Date: , Lead: 96h

16
Training Course 2009 – NWP-PR: Calibration of EPSs 16/42 LR-Probability worse! + training data 100 cases (EM) height of obs y/n + test data (51 members) height of raw prob event threshold event observed yes/no (0/1) calibrated prob GP: 51N, 9E, Date: , Lead: 168h

17
Training Course 2009 – NWP-PR: Calibration of EPSs 17/42 LR-Probability better! + training data 100 cases (EM) height of obs y/n + test data (51 members) height of raw prob event threshold event observed yes/no (0/1) calibrated prob GP: 15.5S, 149.5W, Date: , Lead: 168h

18
Training Course 2009 – NWP-PR: Calibration of EPSs 18/42 Analog method Full analog theory assumes a nearly infinite training sample Justified under simplifying assumptions: Search only for local analogs Match the ensemble-mean fields Consider only one model forecast variable in selecting analogs General procedure: Take the ensemble mean of the forecast to be calibrated and find the n ens closest forecasts to this in the training dataset Take the corresponding observations to these n ens re-forecasts and form a new calibrated ensemble Construct probability forecasts from this analog ensemble

19
Training Course 2009 – NWP-PR: Calibration of EPSs 19/42 Analog method Ref: Hamill & Whitaker, 2006, MWR Forecast to be calibrated Closest re-forecasts Corresponding obs Probabilities of analog-ens Verifying observation

20
Training Course 2009 – NWP-PR: Calibration of EPSs 20/42 Training datasets All calibration methods need a training dataset, containing a number of forecast-observation pairs from the past The more training cases the better The model version used to produce the training dataset should be as close as possible to the operational model version For research applications often only one dataset is used to develop and test the calibration method. In this case cross-validation has to be applied. For operational applications one can use: Operational available forecasts from e.g. past days Data from a re-forecast dataset covering a larger number of past forecast dates / years

21
Training Course 2009 – NWP-PR: Calibration of EPSs 21/ FebMarchApr Perfect Reforecast Data Set

22
Training Course 2009 – NWP-PR: Calibration of EPSs 22/42 Early motivating results from Hamill et al., 2004 Bias corrected with refc data LR-calibrated ensemble Bias corrected with 45-d data Achieved with perfect reforecast system! Raw ensemble

23
Training Course 2009 – NWP-PR: Calibration of EPSs 23/42 The 32-day unified VAREPS Unified VarEPS/Monthly system enables the production of unified reforecast data set, to be used by: EFI model climate day EPS calibration Monthly forecasts anomalies and verification Efficient use of resources (computational and operational) Perfect reforecast system would produce for every forecast a substantial number of years of reforecast Realistic reforecast system has to be an optimal compromise between affordability and needs of all three applications

24
Training Course 2009 – NWP-PR: Calibration of EPSs 24/42 Unified VarEPS/Monthly Reforecasts 2009 FebMarchApr

25
Training Course 2009 – NWP-PR: Calibration of EPSs 25/42 Unified VarEPS/Monthly Reforecasts 2009 FebMarchApr

26
Training Course 2009 – NWP-PR: Calibration of EPSs 26/42 Calibration of medium-range forecasts A limited set of reforecasts has been produced for a preliminary assessment of the value of reforecasts for calibrating the medium- range EPS Test Reforecast data set consists of: 14 refc cases (01/09/2005 – 01/12/2005) 20 refc years (1982 – 2001) 15 refc ensemble members (1 ctrl pert.) Model cycle 29r2, T255, ERA-40 initial conditions Used to calibrate the period Sep-Nov 2005 (91 cases) Calibrating upper air model fields vs. analysis demonstrated less scope for calibration Greater impact for surface variables, in particular at station locations

27
Training Course 2009 – NWP-PR: Calibration of EPSs 27/42 I.ECMWF forecasts, though better than GFS forecasts, can be improved through calibration II.Main improvement through bias correction (60-80%), but when using advanced methods (e.g. NGR) calibration of spread adds to general improvements, in particular at early lead times III.Improvements occur mainly at locations with low skill IV.Operational training data can be used for short-lead forecasts and/or light precipitation events, however, reforecasts beneficial at long leads and/or extremer precipitation events V.Usually, near-surface multi-model forecasts are better than single model forecasts, however, reforecast calibrated ECMWF forecasts are competitive for short lead times and even better than the MM for longer lead times Main messages

28
Training Course 2009 – NWP-PR: Calibration of EPSs 28/42 I. ECMWF vs. GFS Results from cross-validated reforecast data: 14 weekly start dates, Sep-Dec (280 cases) Hagedorn et al., 2008

29
Training Course 2009 – NWP-PR: Calibration of EPSs 29/42 I.ECMWF forecasts, though better than GFS forecasts, can be improved through calibration II.Main improvement through bias correction (60-80%), but when using advanced methods (e.g. NGR) calibration of spread adds to general improvements, in particular at early lead times III.Improvements occur mainly at locations with low skill IV.Operational training data can be used for short-lead forecasts and/or light precipitation events, however, reforecasts beneficial at long leads and/or extremer precipitation events V.Usually, near-surface multi-model forecasts are better than single model forecasts, however, reforecast calibrated ECMWF forecasts are competitive for short lead times and even better than the MM for longer lead times Main messages

30
Training Course 2009 – NWP-PR: Calibration of EPSs 30/42 II. Bias-correction vs. NGR-calibration NGR-calibration Bias-correction DMO REFC-data: 15 members, 20 years, 5 weeks 2m temperature forecasts (1 Sep – 30 Nov 2005), 250 European stations CRPSS BC-spread = DMO-spread NGR-spread RMSE & spread

31
Training Course 2009 – NWP-PR: Calibration of EPSs 31/42 I.ECMWF forecasts, though better than GFS forecasts, can be improved through calibration II.Main improvement through bias correction (60-80%), but when using advanced methods (e.g. NGR) calibration of spread adds to general improvements, in particular at early lead times III.Improvements occur mainly at locations with low skill IV.Operational training data can be used for short-lead forecasts and/or light precipitation events, however, reforecasts beneficial at long leads and/or extremer precipitation events V.Usually, near-surface multi-model forecasts are better than single model forecasts, however, reforecast calibrated ECMWF forecasts are competitive for short lead times and even better than the MM for longer lead times Main messages

32
Training Course 2009 – NWP-PR: Calibration of EPSs 32/42 III. Individual locations CRPSS: 2m-Temperature, Sep-Nov 2005, Lead: 48h DMO NGR NGR - DMO

33
Training Course 2009 – NWP-PR: Calibration of EPSs 33/42 I.ECMWF forecasts, though better than GFS forecasts, can be improved through calibration II.Main improvement through bias correction (60-80%), but when using advanced methods (e.g. NGR) calibration of spread adds to general improvements, in particular at early lead times III.Improvements occur mainly at locations with low skill IV.Operational training data can be used for short-lead forecasts and/or light precipitation events, however, reforecasts beneficial at long leads and/or extremer precipitation events V.Usually, near-surface multi-model forecasts are better than single model forecasts, however, reforecast calibrated ECMWF forecasts are competitive for short lead times and even better than the MM for longer lead times Main messages

34
Training Course 2009 – NWP-PR: Calibration of EPSs 34/42 IV. Operational training data vs REFC data Operational training data can give similar benefit for short lead times REFC data much more beneficial for longer lead times ECMWF GFS Hagedorn et al., 2008

35
Training Course 2009 – NWP-PR: Calibration of EPSs 35/42 IV. Operational training data vs REFC data NGR calibrated NGR calibration method particularly sensitive to available training data Bias correction only 20y REFC training data 45-day operational data DMO 20y REFC training data 45-day operational data DMO

36
Training Course 2009 – NWP-PR: Calibration of EPSs 36/42 IV. REFC beneficial for extreme precip Precipitation > 1mm Precipitation > 10mm REFC data much more beneficial for extreme precipitation events Daily reforecast data not more beneficial than weekly REFC Hamill et al., 2008

37
Training Course 2009 – NWP-PR: Calibration of EPSs 37/42 I.ECMWF forecasts, though better than GFS forecasts, can be improved through calibration II.Main improvement through bias correction (60-80%), but when using advanced methods (e.g. NGR) calibration of spread adds to general improvements, in particular at early lead times III.Improvements occur mainly at locations with low skill IV.Operational training data can be used for short-lead forecasts and/or light precipitation events, however, reforecasts beneficial at long leads and/or extremer precipitation events V.Usually, near-surface multi-model forecasts are better than single model forecasts, however, reforecast calibrated ECMWF forecasts are competitive for short lead times and even better than the MM for longer lead times Main messages

38
Training Course 2009 – NWP-PR: Calibration of EPSs 38/42 V. TIGGE multi-model Multi-Model ECMWF Met Office NCEP Solid: no BC T-2m, 250 European stations – (60 cases)

39
Training Course 2009 – NWP-PR: Calibration of EPSs 39/42 V. TIGGE multi-model Multi-Model ECMWF Met Office NCEP Dotted: 30d-BC Solid: no BC T-2m, 250 European stations – (60 cases)

40
Training Course 2009 – NWP-PR: Calibration of EPSs 40/42 V. TIGGE multi-model Multi-Model ECMWF Met Office NCEP Dashed: REFC-NGR Dotted: 30d-BC Solid: no BC T-2m, 250 European stations – (60 cases)

41
Training Course 2009 – NWP-PR: Calibration of EPSs 41/42 Summary The goal of calibration is to correct for known model deficiencies A number of statistical methods exist to post-process ensembles Every methods has its own strengths and weaknesses Analog methods seems to be useful when large training dataset available Logistic regression can be helpful for extreme events not seen so far in training dataset NGR method useful when strong spread-skill relationship exists, but relative expensive in computational time Greatest improvements can be achieved on local station level Bias correction constitutes a large contribution for all calibration methods ECMWF reforecasts very valuable training dataset for calibration

42
Training Course 2009 – NWP-PR: Calibration of EPSs 42/42 References and further reading Gneiting, T. et al, 2005: Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation. Monthly Weather Review, 133, Hagedorn, R, T. M. Hamill, and J. S. Whitaker, 2008: Probabilistic forecast calibration using ECMWF and GFS ensemble forecasts. Part I: 2-meter temperature. Monthly Weather Review, 136, Hamill T.M. et al., 2004: Ensemble Reforcasting: Improving Medium-Range Forecast Skill Using Retrospective Forecasts. Monthly Weather Review, 132, Hamill, T.M. and J.S. Whitaker, 2006: Probabilistic Quantitative Precipitation Forecasts Based on Reforecast Analogs: Theory and Application. Monthly Weather Review, 134, Hamill, T. M., R. Hagedorn, and J. S. Whitaker, 2008: Probabilistic forecast calibration using ECMWF and GFS ensemble forecasts. Part II: precipitation. Monthly Weather Review, 136, Raftery, A.E. et al., 2005: Using Bayesian Model Averaging to Calibrate Forecast Ensembles. Monthly Weather Review, 133, Wilks, D. S., 2006: Comparison of Ensemble-MOS Methods in the Lorenz 96 Setting. Meteorological Applications, 13,

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google