Measuring the performance of climate predictions
Chris Ferro, Tom Fricker, David Stephenson
Mathematics Research Institute, University of Exeter, UK
IMA Conference on the Mathematics of the Climate System, Reading, 14 September 2011

How good are climate predictions?
Predictions are useless without some information about their quality. Focus on the information contained in hindcasts, i.e. retrospective forecasts of past events.
1. How should we measure the performance of climate predictions?
2. What does past performance tell us about future performance?

Hindcasts
Thanks: Doug Smith (Met Office Hadley Centre)

Challenges
Sample sizes are small, e.g. the CMIP5 core hindcast experiments give 10 predictions for each lead time.
Some external forcings (e.g. greenhouse gases and volcanoes) are prescribed, not predicted.
The quality of measurements of predictands varies over time and space.
Observations from the hindcast period are used (to some extent) to construct the prediction system.

Common practice
Choice of predictand:
Evaluate predictions only after removing biases
Evaluate predictions of only long-term averages
Choice of performance measure:
Evaluate only the ensemble mean predictions
Evaluate using correlation or mean square error
Resample to estimate the sampling uncertainty

Conventional reasoning
We can’t predict weather at long lead times, so don’t compare predicted and observed weather. Instead, compare predicted and observed climate, e.g. multi-year averages. This reduces noise and increases evaluation precision.

Evaluate weather, not climate!
The foregoing argument is wrong for two reasons. First, we should evaluate predictands that are relevant to users. Second, evaluating climate averages reduces signal-to-noise ratios and so decreases evaluation precision. It is better to evaluate predictions as weather forecasts and then average the resulting scores over time to improve precision.

Evaluate weather, not climate!
D_i = prediction error for lead time i = 1, ..., n
D̄ = error after averaging over the n lead times
S_1 = mean of the squared errors D_1, ..., D_n
S_n = square of the mean error D̄
Under moderate conditions, the signal-to-noise ratio E(S_n)^2 / var(S_n) of S_n becomes increasingly small relative to the signal-to-noise ratio of S_1 as the averaging length, n, increases.
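A minimal Monte Carlo sketch of this claim (an illustration added here, not part of the talk), assuming the errors D_1, ..., D_n are independent standard-normal draws; the signal-to-noise ratios are estimated from replicate synthetic hindcasts:

```python
import numpy as np

rng = np.random.default_rng(0)

def snr(s):
    """Estimate the signal-to-noise ratio E(S)^2 / var(S) from replicates."""
    return s.mean() ** 2 / s.var()

n_reps = 100_000  # number of replicate synthetic hindcasts
for n in (1, 2, 5, 10, 20):
    d = rng.standard_normal((n_reps, n))  # errors D_1, ..., D_n (assumed iid N(0, 1))
    s1 = (d ** 2).mean(axis=1)            # S_1: mean of the squared errors
    sn = d.mean(axis=1) ** 2              # S_n: square of the mean error
    print(f"n = {n:2d}   SNR(S_1) = {snr(s1):5.2f}   SNR(S_n) = {snr(sn):4.2f}")
```

Under these assumptions SNR(S_1) grows like n/2 while SNR(S_n) stays near 1/2, so averaging the errors before evaluation loses precision, as argued above.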

Common practice
Choice of predictand:
Evaluate predictions only after removing biases
Evaluate predictions of only long-term averages
Choice of performance measure:
Evaluate only the ensemble mean predictions
Evaluate using correlation or mean square error
Resample to estimate the sampling uncertainty

Skill inflation
Predictions initialized along trending observations.

Skill inflation
A trend can produce a strong association even if the predictions fail to follow the observations over the lead time. Performance measures can therefore mislead and mask differences between prediction systems.
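A small synthetic example (added here with made-up numbers, not from the talk) of how a shared trend can inflate correlation even when the prediction carries no information about the year-to-year variations:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(50)
trend = 0.02 * t                                    # shared warming trend
obs = trend + rng.normal(scale=0.1, size=t.size)    # observed series
pred = trend + rng.normal(scale=0.1, size=t.size)   # "prediction" that only follows the trend

print(f"correlation:            {np.corrcoef(obs, pred)[0, 1]:.2f}")                  # high, trend-inflated
print(f"correlation, detrended: {np.corrcoef(obs - trend, pred - trend)[0, 1]:.2f}")  # near zero
```

The raw correlation is large only because both series share the trend; once the trend is removed there is essentially no association, which is the masking effect described above.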

Avoiding skill inflation
Observations X_t and predictions P_t are sampled over time t from a joint distribution function F. A performance measure is a real-valued functional s(F). Suppose that the joint distribution, F_t, of (X_t, P_t) changes with t, so that F is a mixture distribution. There is no skill inflation if s satisfies the following property: s(F_t) = s_0 for all t implies s(F) = s_0 for all mixtures F.

Avoiding skill inflation
All convex properties of real-valued scoring rules, σ(X, P), are immune to skill inflation. These include s(F) = expected value of σ(X, P), e.g. the mean square error, and s(F) = any quantile of σ(X, P), e.g. the median absolute deviation. Monotonic functions of these, e.g. the RMSE, are also immune.
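As a quick check (added here, not on the slide), the mixture property of the previous slide is immediate for expectation-based measures: for a discrete mixture F = Σ_t w_t F_t with weights w_t summing to one, linearity of expectation gives

s(F) = E_F[σ(X, P)] = Σ_t w_t E_{F_t}[σ(X, P)] = Σ_t w_t s(F_t) = s_0 Σ_t w_t = s_0.

The quantile case follows similarly, because the probability inequalities that define a quantile are preserved under mixing.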

Summary
Measuring performance can help to improve predictions and to guide responses to predictions.
Evaluating climate predictions is hard because of small sample sizes, unpredicted forcings, etc.
Evaluate as weather forecasts, then average!
Use performance measures, such as scoring rules, that are immune to skill inflation from trends!

Related questions
How does performance vary with the timescale of the predictand and of variations in the predictand?
What can we learn by evaluating across a range of lead times and evaluation periods?
What does past performance tell us about future performance?
How should hindcast experiments be designed to yield as much information as possible?

References
Ferro CAT, Fricker TE (2011) An unbiased decomposition of the Brier score. Submitted.
Fricker TE, Ferro CAT (2011) A framework for evaluating climate predictions. In preparation.
Goddard L and co-authors (2011) A verification framework for interannual-to-decadal prediction experiments. In preparation.
Jolliffe IT, Stephenson DB (2011) Forecast Verification: A Practitioner's Guide in Atmospheric Science. 2nd edition. Wiley. In press.
Smith DM and co-authors (2007) Improved surface temperature prediction for the coming decade from a global climate model. Science, 317, 796-799.
The EQUIP project: