What is a good ensemble forecast? Chris Ferro University of Exeter, UK With thanks to Tom Fricker, Keith Mitchell, Stefan Siegert, David Stephenson, Robin Williams

What are ensemble forecasts? An ensemble forecast is a set of possible values for the predicted quantity, often produced by running a numerical model with perturbed initial conditions. A forecast’s quality should depend on the meaning of the forecast that is claimed by the forecaster or adopted by the user, not the meaning preferred by the verifier. For example, our assessment of a point forecast of 30°C for today’s maximum temperature would depend on whether this is meant to be the value that will be exceeded with probability 50% or 10%.

What are ensembles meant to be? Ensembles are intended to be simple random samples. “the perturbed states are all equally likely” — ECMWF website “each member should be equally likely” — Met Office website “equally likely candidates to be the true state” — Leith (1974)

How are ensembles used?
- Turned into probability forecasts, e.g. the proportion of members predicting a certain event
- Turned into point forecasts, e.g. the ensemble mean
What use is a random sample of possible values?!
- As inputs for an impact (e.g. hydrological) model
- Easy for users to turn into probability or point forecasts of their choice, e.g. sample statistics (sketched in the code below)
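These conversions are straightforward to sketch in code. The following is a minimal illustration with a made-up ensemble; the member values and the 30 °C event threshold are hypothetical, chosen only to echo the temperature example earlier in the talk.

```python
import numpy as np

# Hypothetical 10-member ensemble forecast of today's maximum temperature (deg C).
ensemble = np.array([27.1, 28.4, 29.0, 29.3, 29.8, 30.2, 30.6, 31.1, 31.9, 33.0])

# Probability forecast: proportion of members predicting the event {T > 30 C}.
prob_above_30 = np.mean(ensemble > 30.0)

# Point forecast: the ensemble mean.
point_forecast = ensemble.mean()

print(f"P(T > 30 C) = {prob_above_30:.2f}, ensemble mean = {point_forecast:.1f} C")
```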

How should we verify ensembles? Should we verify... probability forecasts derived from the ensemble, point forecasts derived from the ensemble, or the ensemble forecast as a random sample? The first two approaches are more common, but ensembles that do well as random samples need not produce the best probability or point forecasts.

Results depend on what you verify The ideal probability forecast should behave as if the verification is sampled from the forecast distribution (Diebold et al. 1998). The ideal ensemble forecast should behave as if it and the verification are sampled from the same distribution (Anderson 1997). An ideal ensemble will not appear optimal if we verify a derived probability forecast because the latter will only estimate the ideal probability forecast.

An illustrative example Expected Continuous Ranked Probability Score (CRPS; lower is better) for probability forecasts (empirical distributions) derived from two ensembles, each with ten members. Ideal ensemble: CRPS = 0.62. Underdispersed ensemble: CRPS = 0.61 (apparently better!)
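The exact setup behind these numbers is not given on the slide, but the effect is easy to reproduce. The sketch below is a Monte Carlo illustration, assuming verifications drawn from N(0,1), an "ideal" ensemble drawn from the same distribution, and an underdispersed ensemble with standard deviation 0.8; each ensemble's empirical distribution is scored with the usual (unfair) ensemble CRPS. The underdispersed ensemble typically attains the lower, apparently better, average score.

```python
import numpy as np

rng = np.random.default_rng(0)

def crps_empirical(ens, y):
    """CRPS of the ensemble's empirical distribution (the 'unfair' ensemble CRPS)."""
    ens = np.asarray(ens, dtype=float)
    m = len(ens)
    term1 = np.mean(np.abs(ens - y))                                     # mean |x_i - y|
    term2 = np.sum(np.abs(ens[:, None] - ens[None, :])) / (2 * m * m)    # spread term
    return term1 - term2

m, n_cases = 10, 20000
y = rng.standard_normal(n_cases)                 # verifications ~ N(0, 1)
ideal = rng.standard_normal((n_cases, m))        # members ~ N(0, 1): ideal ensemble
under = 0.8 * rng.standard_normal((n_cases, m))  # members ~ N(0, 0.8^2): underdispersed

print("ideal:", np.mean([crps_empirical(e, yi) for e, yi in zip(ideal, y)]))
print("under:", np.mean([crps_empirical(e, yi) for e, yi in zip(under, y)]))
```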

How should we verify ensembles? Verify an ensemble according to its intended use. If we want its empirical distribution to be used as a probability forecast then we should verify the empirical distribution with a proper score. If we want its ensemble mean to be used as a mean point forecast then we should verify the ensemble mean with the squared error. If we want the ensemble to be used as a random sample then what scores should we use?

Fair scores for ensembles Fair scores favour ensembles that behave as if they and the verification are sampled from the same distribution. A score, s(x,y), for an ensemble forecast, x, sampled from p, and a verification, y, is fair if, for all p, the expected score E_{x,y}[s(x,y)] is optimized when y ~ p. In other words, the expected value of the score over all possible ensembles is a proper score. Fricker et al. (2013)

Characterization: binary case Let y = 1 if an event occurs, and let y = 0 otherwise. Let s_{i,y} be the (finite) score when i of m ensemble members forecast the event and the verification is y. The (negatively oriented) score is fair if (m − i)(s_{i+1,0} − s_{i,0}) = i(s_{i−1,1} − s_{i,1}) for i = 0, 1, ..., m, and s_{i+1,0} ≥ s_{i,0} for i = 0, 1, ..., m − 1. Ferro (2013)
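This characterization can be checked numerically for a particular score. The sketch below, assuming the fair Brier score defined on the next slide, verifies the recurrence for i = 1, ..., m − 1 (at i = 0 and i = m one side carries a zero factor) together with the monotonicity condition.

```python
import numpy as np

m = 10

def s(i, y):
    """Fair Brier score s_{i,y}: i of m members forecast the event, verification y is 0 or 1."""
    return (i / m - y) ** 2 - i * (m - i) / (m ** 2 * (m - 1))

# Fairness recurrence: (m - i)(s_{i+1,0} - s_{i,0}) = i(s_{i-1,1} - s_{i,1}).
for i in range(1, m):
    lhs = (m - i) * (s(i + 1, 0) - s(i, 0))
    rhs = i * (s(i - 1, 1) - s(i, 1))
    assert np.isclose(lhs, rhs), (i, lhs, rhs)

# Monotonicity: s_{i+1,0} >= s_{i,0} for i = 0, ..., m - 1.
assert all(s(i + 1, 0) >= s(i, 0) for i in range(m))

print("fairness conditions hold for the fair Brier score with m =", m)
```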

Examples of fair scores Verification y and m ensemble members x_1, ..., x_m. The fair Brier score is s(x,y) = (i/m − y)² − i(m − i)/{m²(m − 1)}, where i members predict the event {y = 1}. The fair CRPS is s(x,y) = ∫ [ {i(t)/m − H(t − y)}² − i(t){m − i(t)}/{m²(m − 1)} ] dt, where i(t) members predict the event {y ≤ t} and H is the Heaviside function.
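A minimal implementation of these two scores, assuming the standard kernel representation of the fair ensemble CRPS, (1/m) Σ_i |x_i − y| − (1/(2m(m − 1))) Σ_{i,j} |x_i − x_j|, which for a finite ensemble is equivalent to the threshold integral above. Function names and example values are illustrative.

```python
import numpy as np

def fair_brier(x, y):
    """Fair Brier score: x is a 0/1 ensemble of event forecasts, y is the 0/1 verification."""
    x = np.asarray(x)
    m = len(x)
    i = np.sum(x)
    return (i / m - y) ** 2 - i * (m - i) / (m ** 2 * (m - 1))

def fair_crps(x, y):
    """Fair CRPS via the kernel form: mean|x_i - y| - sum_{i,j}|x_i - x_j| / (2 m (m - 1))."""
    x = np.asarray(x, dtype=float)
    m = len(x)
    term1 = np.mean(np.abs(x - y))
    term2 = np.sum(np.abs(x[:, None] - x[None, :])) / (2 * m * (m - 1))
    return term1 - term2

# Illustrative 5-member ensembles and single verifications.
print(fair_brier([1, 0, 1, 1, 0], y=1))
print(fair_crps([1.2, 0.4, -0.3, 0.9, 2.1], y=0.5))
```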

Example: CRPS Verifications y ~ N(0,1) and m ensemble members x_i ~ N(0,σ²) for i = 1, ..., m. Expected values of the fair and unfair CRPS against σ. The fair CRPS is always optimized when the ensemble is well dispersed (σ = 1). [Figure: expected unfair CRPS for m = 2, 4, 8 and fair CRPS (all m), plotted against σ.]
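The slide's comparison can be checked by Monte Carlo. The sketch below is a rough check rather than a reproduction of the figure: it evaluates the fair and unfair ensemble CRPS for m = 4 over a few values of σ, using common random numbers so the comparison across σ is stable. The unfair score is expected to be smallest at some σ < 1, the fair score near σ = 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def ens_crps(x, y, fair=True):
    """Ensemble CRPS: fair version if fair=True, otherwise the 'unfair' empirical-CDF CRPS."""
    x = np.asarray(x, dtype=float)
    m = len(x)
    pair = np.sum(np.abs(x[:, None] - x[None, :]))
    denom = 2 * m * (m - 1) if fair else 2 * m * m
    return np.mean(np.abs(x - y)) - pair / denom

m, n_cases = 4, 50000
y = rng.standard_normal(n_cases)           # verifications ~ N(0, 1)
z = rng.standard_normal((n_cases, m))      # common random numbers reused for every sigma

for sigma in (0.6, 0.8, 1.0, 1.2):
    x = sigma * z                          # members ~ N(0, sigma^2)
    fair_mean = np.mean([ens_crps(xi, yi, fair=True) for xi, yi in zip(x, y)])
    unfair_mean = np.mean([ens_crps(xi, yi, fair=False) for xi, yi in zip(x, y)])
    print(f"sigma={sigma:.1f}  fair={fair_mean:.3f}  unfair={unfair_mean:.3f}")
```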

Example: CRPS [Figure: relative underdispersion of the ensemble that optimizes the unfair CRPS.]

If the verification is an ensemble? A score, s(p,y), for a probability forecast, p, and an n-member ensemble verification, y, is n-proper if, for all p, the expected score E_y[s(p,y)] is optimized when y_1, ..., y_n ~ p (Thorarinsdottir et al. 2013). A score, s(x,y), for an m-member ensemble forecast, x, sampled from p, and an n-member ensemble verification, y, is m/n-fair if, for all p, the expected score E_{x,y}[s(x,y)] is optimized when y_1, ..., y_n ~ p.

If the verification is an ensemble? If s(x,y_i) is (m-)fair then the mean over the verification ensemble, (1/n) Σ_{i=1}^{n} s(x,y_i), is m/n-fair. The m/n-fair Brier score is s(x,y) = (i/m − j/n)² − i(m − i)/{m²(m − 1)}, where i members and j verifications predict {y = 1}. The m/n-fair CRPS is s(x,y) = ∫ [ {i(t)/m − j(t)/n}² − i(t){m − i(t)}/{m²(m − 1)} ] dt, where i(t) members and j(t) verifications predict {y ≤ t}.
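A minimal sketch of the m/n-fair Brier score, following the formula above; the function name and the binary ensembles in the example are hypothetical.

```python
import numpy as np

def mn_fair_brier(x, y):
    """m/n-fair Brier score: x is a 0/1 forecast ensemble (m members),
    y is a 0/1 verification ensemble (n members)."""
    x, y = np.asarray(x), np.asarray(y)
    m, n = len(x), len(y)
    i, j = np.sum(x), np.sum(y)
    return (i / m - j / n) ** 2 - i * (m - i) / (m ** 2 * (m - 1))

# Hypothetical example: 8 forecast members, 3 verification members.
x = [1, 1, 0, 1, 0, 1, 1, 0]
y = [1, 0, 1]
print(mn_fair_brier(x, y))
```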

Summary Verify an ensemble according to its intended use. Verifying derived probability or point forecasts will not favour ensembles that behave as if they and the verifications are sampled from the same distribution. Fair scores should be used to verify ensemble forecasts that are to be interpreted as random samples. Fair scores are analogues of proper scores, and rank histograms are analogues of PIT histograms. Current work: analogues of reliability diagrams etc.

References
Anderson JL (1997) The impact of dynamical constraints on the selection of initial conditions for ensemble predictions: low-order perfect model results. Monthly Weather Review, 125.
Diebold FX, Gunther TA, Tay AS (1998) Evaluating density forecasts with applications to financial risk management. International Economic Review, 39.
Ferro CAT (2013) Fair scores for ensemble forecasts. Quarterly Journal of the Royal Meteorological Society, in press.
Fricker TE, Ferro CAT, Stephenson DB (2013) Three recommendations for evaluating climate predictions. Meteorological Applications, 20.
Leith CE (1974) Theoretical skill of Monte Carlo forecasts. Monthly Weather Review, 102.
Thorarinsdottir TL, Gneiting T, Gissibl N (2013) Using proper divergence functions to evaluate climate models. SIAM/ASA Journal on Uncertainty Quantification, 1.
Weigel AP (2012) Ensemble forecasts. In: Forecast Verification: A Practitioner's Guide in Atmospheric Science, IT Jolliffe, DB Stephenson (eds). Wiley.