Published by Zachariah Stockley. Modified over 4 years ago.

1
Fair scores for ensemble forecasts

Chris Ferro, University of Exeter

13th EMS Annual Meeting and 11th ECAM (10 September 2013, Reading, UK)

2
Evaluating ensemble forecasts

Multiple predictions, e.g. model simulations from several initial conditions.

Want scores that favour ensembles whose members behave as if they and the observation are drawn from the same probability distribution.

3
Current practice is unfair

Current practice evaluates a proper scoring rule for the empirical distribution function of the ensemble.

A scoring rule, $s(p,y)$, for a probability forecast, $p$, and an observation, $y$, is proper if (for all $p$) the expected score, $E_y\{s(p,y)\}$, is optimized when $y$ has distribution $p$.

Proper scoring rules favour probability forecasts that behave as if the observations are randomly sampled from the forecast distributions.
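The propriety condition can be checked numerically for a simple case. The sketch below assumes a binary observation with event probability `q` and uses the Brier score, (p - y)^2; the function names and the grid search are illustrative choices, not part of the talk.

```python
def expected_brier(p, q):
    """Expected Brier score E_y{(p - y)^2} when y = 1 with probability q."""
    # y = 1 contributes (p - 1)^2 with weight q; y = 0 contributes p^2 with weight 1 - q.
    return q * (p - 1) ** 2 + (1 - q) * p ** 2


def best_forecast(q, steps=1000):
    """Forecast probability on a regular grid that minimizes the expected score."""
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda p: expected_brier(p, q))


if __name__ == "__main__":
    for q in (0.1, 0.3, 0.8):
        print(q, best_forecast(q))
```

For each `q` the minimizing forecast coincides with `q` itself (up to the grid resolution), which is what propriety requires of a negatively oriented score.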

4
Examples of proper scoring rules

Brier score: $s(p,y) = (p - y)^2$ for observation $y = 0$ or $1$, and probability forecast $0 \le p \le 1$.

Ensemble Brier score: $s(x,y) = (i/n - y)^2$, where $i$ of the $n$ ensemble members predict the event $\{y = 1\}$.

CRPS: $s(p,y) = \int \{p(t) - H(t - y)\}^2 \, dt$ for real $y$ and forecast $p(t) = \Pr(y \le t)$, where $H$ is the Heaviside function.

Ensemble CRPS: $s(x,y) = \int \{i(t)/n - H(t - y)\}^2 \, dt$, where $i(t)$ of the $n$ ensemble members predict the event $\{y \le t\}$.
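A minimal sketch of the two ensemble scores, assuming the ensemble CRPS is the integral over t of {i(t)/n - H(t - y)}^2, i.e. the CRPS applied to the ensemble's empirical distribution function. Because the integrand is piecewise constant between the sorted members and the observation, the integral can be summed exactly. The function names are illustrative.

```python
import numpy as np


def ensemble_brier(i, n, y):
    """Ensemble Brier score (i/n - y)^2; i of the n members predict the event, y is 0 or 1."""
    return (i / n - y) ** 2


def ensemble_crps(x, y):
    """Ensemble CRPS: integral over t of {i(t)/n - H(t - y)}^2 for members x and observation y."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # The integrand only changes at the members and at the observation.
    knots = np.sort(np.append(x, y))
    left = knots[:-1]
    widths = np.diff(knots)
    # i(t)/n on each interval: fraction of members <= the interval's left end.
    frac = np.searchsorted(x, left, side="right") / n
    # H(t - y) on each interval: 1 once t has reached the observation.
    step = (left >= y).astype(float)
    # Outside [min knot, max knot] the integrand is zero, so this sum is the full integral.
    return float(np.sum((frac - step) ** 2 * widths))


if __name__ == "__main__":
    print(ensemble_brier(3, 4, 1))
    print(ensemble_crps([0.0, 1.0], 0.5))
```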

5
Example: ensemble CRPS

Observations $y \sim N(0,1)$ and $n$ ensemble members $x_i \sim N(0,\sigma^2)$ for $i = 1, \ldots, n$.

Plot the expected value of the ensemble CRPS against $\sigma$. The ensemble CRPS is optimized when the ensemble is underdispersed ($\sigma < 1$).

[Figure: expected ensemble CRPS against σ, with curves for n = 2, n = 4 and n = 8.]
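The plotted curves can be approximated without simulation. The sketch below is not from the talk: it assumes independent members and observation, writes the ensemble CRPS in its equivalent kernel form (mean of |x_i - y| minus the sum of |x_i - x_j| over all pairs divided by 2n^2), and uses E|Z| = τ·sqrt(2/π) for Z ~ N(0, τ^2). Function names and the search grid are illustrative.

```python
import numpy as np


def expected_ensemble_crps(sigma, n):
    """Expected ensemble CRPS for y ~ N(0, 1) and n independent members x_i ~ N(0, sigma^2)."""
    sigma = np.asarray(sigma, dtype=float)
    # x_i - y ~ N(0, 1 + sigma^2), so E|x_i - y| = sqrt(2 (1 + sigma^2) / pi).
    obs_term = np.sqrt(2.0 * (1.0 + sigma ** 2) / np.pi)
    # x_i - x_j ~ N(0, 2 sigma^2) for i != j; n(n - 1) such pairs, weighted by 1 / (2 n^2).
    spread_term = (n - 1) / n * sigma / np.sqrt(np.pi)
    return obs_term - spread_term


def optimal_sigma(n, grid=None):
    """Grid value of sigma that minimizes the expected ensemble CRPS."""
    if grid is None:
        grid = np.linspace(0.01, 2.0, 1991)  # step of 0.001
    return float(grid[np.argmin(expected_ensemble_crps(grid, n))])


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, optimal_sigma(n))
```

Under these assumptions the minimizing σ lies below 1 for every finite n and moves towards 1 as n grows, consistent with the underdispersion described on the slide.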

6
Fair scoring rules for ensembles

Interpret the ensemble as a random sample. Fair scoring rules favour ensembles that behave as if the observations are sampled from the same distribution.

A scoring rule, $s(x,y)$, for an ensemble forecast, $x$, sampled from $p$, and an observation, $y$, is fair if (for all $p$) the expected score, $E_{x,y}\{s(x,y)\}$, is optimized when $y \sim p$.

Fricker, Ferro, Stephenson (2013) Three recommendations for evaluating climate predictions. Meteorological Applications, 20, 246-255 (open access)
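To see what the fairness condition rules out, the sketch below evaluates the expected ensemble Brier score, (i/n - y)^2, exactly in a simple binary setting. The setup is an assumption for illustration: each of the n members independently predicts the event with probability `p`, and the event occurs with probability `q`. A fair score would be minimized in expectation at p = q.

```python
from math import comb


def expected_ensemble_brier(p, q, n):
    """E_{x,y}{(i/n - y)^2} with i ~ Binomial(n, p) and y = 1 with probability q."""
    total = 0.0
    for i in range(n + 1):
        weight = comb(n, i) * p ** i * (1 - p) ** (n - i)
        # Average the score over the two possible outcomes of y.
        score = q * (i / n - 1) ** 2 + (1 - q) * (i / n) ** 2
        total += weight * score
    return total


def best_member_probability(q, n, steps=1000):
    """Grid value of p that minimizes the expected ensemble Brier score."""
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda p: expected_ensemble_brier(p, q, n))


if __name__ == "__main__":
    print(best_member_probability(q=0.2, n=4))
```

With q = 0.2 and n = 4 the minimizing p differs from q, so under these assumptions the ensemble Brier score does not satisfy the fairness definition.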

7
Characterization: binary case

Let $y = 1$ if an event occurs, and let $y = 0$ otherwise.

Let $s_{i,y}$ be the (finite) score when $i$ of $n$ ensemble members forecast the event and the observation is $y$.

The (negatively oriented) score is fair if

$(n - i)(s_{i+1,0} - s_{i,0}) = i(s_{i-1,1} - s_{i,1})$ for $i = 0, 1, \ldots, n$

and $s_{i+1,0} \ge s_{i,0}$ for $i = 0, 1, \ldots, n - 1$.

Ferro (2013) Fair scores for ensemble forecasts. Submitted.
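The two conditions are easy to test mechanically. The sketch below checks them in exact rational arithmetic for the ensemble Brier score and for the fair Brier score given on slide 8. The helper `is_fair` is a hypothetical name, and the out-of-range terms at i = 0 and i = n are skipped because their coefficients are zero.

```python
from fractions import Fraction


def ensemble_brier(i, n, y):
    """Ensemble Brier score (i/n - y)^2 as an exact fraction."""
    return (Fraction(i, n) - y) ** 2


def fair_brier(i, n, y):
    """Fair Brier score: ensemble Brier score minus i(n - i) / {n^2 (n - 1)}; needs n >= 2."""
    return ensemble_brier(i, n, y) - Fraction(i * (n - i), n * n * (n - 1))


def is_fair(score, n):
    """Check the binary-case fairness conditions for score(i, n, y)."""
    for i in range(n + 1):
        # s_{n+1,0} and s_{-1,1} are undefined but multiplied by zero, so treat those sides as 0.
        lhs = (n - i) * (score(i + 1, n, 0) - score(i, n, 0)) if i < n else 0
        rhs = i * (score(i - 1, n, 1) - score(i, n, 1)) if i > 0 else 0
        if lhs != rhs:
            return False
    # Monotonicity: the score for y = 0 must not decrease as more members forecast the event.
    return all(score(i + 1, n, 0) >= score(i, n, 0) for i in range(n))


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, is_fair(ensemble_brier, n), is_fair(fair_brier, n))
```

For each ensemble size tried, the fair Brier score passes both conditions and the ensemble Brier score fails the equality already at i = 0.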

8
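Examples of fair scoring rules

Ensemble Brier score: $s(x,y) = (i/n - y)^2$, where $i$ of the $n$ ensemble members predict the event $\{y = 1\}$.

Fair Brier score: $s(x,y) = (i/n - y)^2 - i(n - i)/\{n^2(n - 1)\}$.

Ensemble CRPS: $s(x,y) = \int \{i(t)/n - H(t - y)\}^2 \, dt$, where $i(t)$ of the $n$ ensemble members predict the event $\{y \le t\}$.

Fair CRPS: if $(x_1, \ldots, x_n)$ are the $n$ ensemble members, $s(x,y) = \frac{1}{n} \sum_{i=1}^{n} |x_i - y| - \frac{1}{2n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} |x_i - x_j|$.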

9
Example: ensemble CRPS

Observations $y \sim N(0,1)$ and $n$ ensemble members $x_i \sim N(0,\sigma^2)$ for $i = 1, \ldots, n$.

Plot the expected value of the fair CRPS against $\sigma$. The fair CRPS is always optimized when the ensemble is well dispersed ($\sigma = 1$).

[Figure: expected score against σ. Unfair score: separate curves for n = 2, n = 4 and n = 8. Fair score: a single curve for all n.]
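A sketch of why the fair curve is the same for every n and has its minimum at σ = 1. It is not taken from the talk: it assumes independent members and observation, the kernel form of the fair CRPS (mean of |x_i - y| minus the sum of |x_i - x_j| over all pairs divided by 2n(n - 1)), and the identity E|Z| = τ·sqrt(2/π) for Z ~ N(0, τ^2).

```latex
\begin{align*}
E_{x,y}\{s(x,y)\}
  &= E|x_1 - y| - \tfrac{1}{2}\, E|x_1 - x_2|
     && \text{($n$ cancels: $n$ terms over $n$, $n(n-1)$ pairs over $2n(n-1)$)} \\
  &= \sqrt{\frac{2(1 + \sigma^2)}{\pi}} - \frac{\sigma}{\sqrt{\pi}}
     && \text{($x_1 - y \sim N(0, 1 + \sigma^2)$, $x_1 - x_2 \sim N(0, 2\sigma^2)$)} \\
\frac{d}{d\sigma} E_{x,y}\{s(x,y)\}
  &= \sqrt{\frac{2}{\pi}}\, \frac{\sigma}{\sqrt{1 + \sigma^2}} - \frac{1}{\sqrt{\pi}} = 0
  \iff 2\sigma^2 = 1 + \sigma^2
  \iff \sigma = 1 .
\end{align*}
```

The second derivative, $\sqrt{2/\pi}\,(1 + \sigma^2)^{-3/2}$, is positive, so under these assumptions the stationary point is the minimum.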

10
Summary

Evaluate ensemble forecasts (not only probability forecasts) to learn about ensemble prediction systems.

Use fair scoring rules to favour ensembles whose members behave as if they and the observation are drawn from the same probability distribution.

Unfair scoring rules will favour ensembles whose members are drawn from mis-calibrated distributions.

12
Dependent ensemble members

A scoring rule, $s(x,y)$, for an exchangeable ensemble, $x$, with marginal distribution $p$, and an observation, $y$, is fair if (for all $p$) the expected score is optimized when $y \sim p$.

Fair scores exist only for some dependence structures.

We rarely know the 'correct' dependence structure for an ensemble, and using an estimate sacrifices fairness.

Use scores that are fair for those dependence structures that may be adopted when using the ensemble.
