4IWVM - Tutorial Session - June 2009 Verification of categorical predictands Anna Ghelli ECMWF.



Outline
- What is a categorical forecast?
- Basic concepts
- Basic scores
- Exercises
- Conclusions

Basic concepts
- Categorical: exactly one of a set of possible events will occur.
- A categorical forecast contains no expression of uncertainty.
- There is typically a one-to-one correspondence between the forecast values and the observed values.
- The simplest possible situation is the 2x2 case, i.e. verification of a categorical yes/no forecast: 2 possible forecasts (yes/no) and 2 possible outcomes (event observed / event not observed).

Contingency tables

How do we build a contingency table?

Date       24-hour forecast   Observed
Jan. 1st   0.0                -
Jan. 2nd   0.4                yes
Jan. 3rd   1.5                yes
Jan. 4th   ...                ...
...
Mar. 28th  5.0                -
Mar. 29th  0.0                yes

Each (forecast, observed) pair is counted into exactly one of the four cells: a (hit), b (false alarm), c (miss) or d (correct rejection).
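The tallying step above can be written as a minimal sketch. The 0.1 mm event threshold and the sample days are assumptions for illustration, not values from the slides:

```python
# Sketch: tallying a 2x2 contingency table from paired forecasts and
# observations. The 0.1 mm event threshold and the sample data below
# are illustrative assumptions.
EVENT_THRESHOLD = 0.1  # mm of 24-hour precipitation defining the "yes" event

def tally(pairs):
    """pairs: iterable of (forecast_mm, observed_event) tuples.
    Returns the cells a (hit), b (false alarm), c (miss), d (correct rejection)."""
    a = b = c = d = 0
    for fc_mm, obs_event in pairs:
        fc_event = fc_mm >= EVENT_THRESHOLD
        if fc_event and obs_event:
            a += 1          # forecast yes, observed yes
        elif fc_event:
            b += 1          # forecast yes, observed no
        elif obs_event:
            c += 1          # forecast no, observed yes
        else:
            d += 1          # forecast no, observed no
    return a, b, c, d

# Rows in the spirit of the slide's example: (24-h forecast in mm, rain observed?)
sample = [(0.0, False), (0.4, True), (1.5, True), (5.0, False), (0.0, True)]
print(tally(sample))  # every day lands in exactly one cell
```

Note that a + b + c + d always equals the number of forecast/observation pairs: each case is counted exactly once.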

Contingency tables

Tornado forecast example:

              Tornado observed
Forecast      yes      no      Total fc
yes            30      70        100
no             20    2680       2700
Total obs      50    2750       2800

Marginal probability: the sum of a row or column divided by the total sample size. For example, the marginal probability of a yes forecast is:
p_x = Pr(X=1) = 100/2800 ~ 0.036

Contingency tables (continued)

Joint probability: represents the intersection of two events in the cross-tabulation table. For example, the joint probability of a yes forecast and a yes observation is:
p_x,y = Pr(X=1, Y=1) = 30/2800 ~ 0.01
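A short sketch of the marginal and joint probabilities, using the tornado cell counts implied by the worked scores later in this talk (a=30, b=70, c=20, d=2680):

```python
# Marginal and joint probabilities for the tornado contingency table.
# Cell counts are the ones used in the 'verification history' example.
a, b, c, d = 30, 70, 20, 2680
n = a + b + c + d                 # total sample size, 2800

p_fc_yes = (a + b) / n            # marginal: Pr(forecast = yes) = 100/2800
p_obs_yes = (a + c) / n           # marginal: Pr(observed = yes) = 50/2800
p_joint_yes = a / n               # joint: Pr(forecast = yes, observed = yes) = 30/2800

print(round(p_fc_yes, 3), round(p_obs_yes, 3), round(p_joint_yes, 3))
```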

Basic measures/scores

Frequency Bias Index (Bias): FBI = (a+b)/(a+c)
- FBI > 1: over-forecasting; FBI < 1: under-forecasting
- Range: 0 to infinity; perfect score = 1

Proportion Correct: PC = (a+d)/n
- simple and intuitive
- yes and no forecasts are rewarded equally
- can be maximised by forecasting the most likely event all the time
- Range: 0 to 1; perfect score = 1
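As a sketch, both scores applied to the tornado counts used in the worked example later in the talk (a=30, b=70, c=20, d=2680):

```python
# Frequency Bias Index and Proportion Correct for a 2x2 table.
def fbi(a, b, c, d):
    """Frequency Bias Index: forecast frequency over observed frequency."""
    return (a + b) / (a + c)

def pc(a, b, c, d):
    """Proportion Correct: fraction of all forecasts (yes or no) that verified."""
    return (a + d) / (a + b + c + d)

# Tornado example: FBI = 100/50 = 2 (over-forecasting), PC = 2710/2800
print(fbi(30, 70, 20, 2680), round(pc(30, 70, 20, 2680), 3))
```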

Basic measures/scores

Hit Rate (Probability Of Detection, prefigurance): H = POD = a/(a+c)
- sensitive to hits and misses only
- can be improved by over-forecasting
- complement score: Miss Rate, MS = 1-H = c/(a+c)
- Range: 0 to 1; perfect score = 1

False Alarm Ratio: FAR = b/(a+b)
- a function of false alarms and hits only
- can be improved by under-forecasting
- Range: 0 to 1; perfect score = 0
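A minimal sketch of these three measures, checked against the tornado counts from the worked example (a=30, b=70, c=20, d=2680):

```python
# Hit Rate, Miss Rate and False Alarm Ratio for a 2x2 table.
def pod(a, b, c, d):
    """Hit Rate / Probability Of Detection: fraction of observed events forecast."""
    return a / (a + c)

def miss_rate(a, b, c, d):
    """Miss Rate: fraction of observed events missed; equals 1 - POD."""
    return c / (a + c)

def far(a, b, c, d):
    """False Alarm Ratio: fraction of yes-forecasts that did not verify."""
    return b / (a + b)

# Tornado example: POD = 30/50, MS = 20/50, FAR = 70/100
print(pod(30, 70, 20, 2680), miss_rate(30, 70, 20, 2680), far(30, 70, 20, 2680))
```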

Basic measures/scores

Post Agreement: PAG = 1-FAR = a/(a+b)
- not widely used
- sensitive to false alarms and hits
- Range: 0 to 1; perfect score = 1

False Alarm Rate (Probability Of False Detection): F = b/(b+d)
- sensitive to false alarms and correct negatives
- can be improved by under-forecasting
- generally used together with H (POD) to produce the ROC for probability forecasts (see later in the week)
- Range: 0 to 1; perfect score = 0
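A sketch of both measures on the same tornado counts (a=30, b=70, c=20, d=2680); note F divides by the non-events (b+d), not the yes-forecasts:

```python
# Post Agreement and False Alarm Rate (Probability of False Detection).
def pag(a, b, c, d):
    """Post Agreement: fraction of yes-forecasts that verified; equals 1 - FAR."""
    return a / (a + b)

def pofd(a, b, c, d):
    """False Alarm RATE: fraction of non-events that were forecast as events."""
    return b / (b + d)

# Tornado example: PAG = 30/100; F = 70/2750 is tiny because non-events dominate
print(pag(30, 70, 20, 2680), round(pofd(30, 70, 20, 2680), 3))
```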

Basic measures/scores

Threat Score (Critical Success Index): TS = a/(a+b+c)
- takes into account hits, misses and false alarms
- correct negatives are not considered
- sensitive to the climatological frequency of the event
- Range: 0 to 1; perfect score = 1; no-skill level = 0

Equitable Threat Score (Gilbert Skill Score, GSS): ETS = (a-a_r)/(a+b+c-a_r), where a_r = (a+b)(a+c)/n
- the TS adjusted for the hits expected from a random forecast
- Range: -1/3 to 1; perfect score = 1; no-skill level = 0
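A sketch of TS and ETS, again checked against the tornado counts (a=30, b=70, c=20, d=2680):

```python
# Threat Score and Equitable Threat Score for a 2x2 table.
def ts(a, b, c, d):
    """Threat Score / CSI: hits over all cases where the event was forecast or observed."""
    return a / (a + b + c)

def ets(a, b, c, d):
    """Equitable Threat Score: TS with the random-chance hits a_r removed."""
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n   # hits expected from a random forecast
    return (a - a_r) / (a + b + c - a_r)

# Tornado example: TS = 30/120; a_r = 100*50/2800, so ETS is slightly below TS
print(ts(30, 70, 20, 2680), round(ets(30, 70, 20, 2680), 2))
```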

Basic measures/scores

Hanssen & Kuipers' Skill Score (True Skill Statistic, TSS; Peirce's Skill Score): KSS = POD - F
- popular combination of H and F
- measures the ability to separate the yes cases (H) from the no cases (F)
- for rare events d is very large, so F is small and KSS (TSS) is close to POD (H)
- related to the ROC (Relative Operating Characteristic)
- Range: -1 to 1; perfect score = 1; no-skill level = 0

Heidke Skill Score: HSS = 2(ad-bc)/[(a+c)(c+d)+(a+b)(b+d)]
- measures the fractional improvement over random chance
- usually used to score multi-category events
- Range: -infinity to 1; perfect score = 1; no-skill level = 0
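A sketch of both skill scores on the tornado counts (a=30, b=70, c=20, d=2680); a perfect forecast (b = c = 0) gives HSS = 1:

```python
# Hanssen & Kuipers' Skill Score and Heidke Skill Score for a 2x2 table.
def kss(a, b, c, d):
    """KSS / TSS / Peirce's Skill Score: POD minus False Alarm Rate."""
    return a / (a + c) - b / (b + d)

def hss(a, b, c, d):
    """Heidke Skill Score: fractional improvement of PC over random chance."""
    return 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))

# Tornado example: KSS ~ POD because F is tiny (rare event, d very large)
print(round(kss(30, 70, 20, 2680), 2), round(hss(30, 70, 20, 2680), 2))
```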

Basic measures/scores

Odds Ratio: OR = ad/bc
- measures the ratio of the odds of scoring a hit (H) to the odds of giving a false alarm (F)
- independent of biases
- unbounded
- Range: 0 to infinity; perfect score = infinity; no-skill level = 1

Odds Ratio Skill Score: ORSS = (OR-1)/(OR+1)
- typically produces very high absolute skill values (because of its definition)
- not widely used in meteorology
- Range: -1 to 1; perfect score = 1; no-skill level = 0
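A sketch of OR and ORSS on the tornado counts (a=30, b=70, c=20, d=2680), illustrating how large OR gets and how close ORSS then sits to 1:

```python
# Odds Ratio and Odds Ratio Skill Score for a 2x2 table.
def odds_ratio(a, b, c, d):
    """Odds Ratio: odds of a hit divided by odds of a false alarm."""
    return (a * d) / (b * c)

def orss(a, b, c, d):
    """Odds Ratio Skill Score: maps OR from (0, inf) onto (-1, 1)."""
    theta = odds_ratio(a, b, c, d)
    return (theta - 1) / (theta + 1)

# Tornado example: OR is large, so ORSS is already very close to its maximum
print(round(odds_ratio(30, 70, 20, 2680), 2), round(orss(30, 70, 20, 2680), 2))
```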

Verification history

Forecaster's table:

              Tornado observed
Forecast      yes      no      Total fc
yes            30      70        100
no             20    2680       2700
Total obs      50    2750       2800

PC = (30+2680)/2800 = 96.8%
H = 30/50 = 60%
FAR = 70/100 = 70%
B = 100/50 = 2

"Never forecast a tornado":

              Tornado observed
Forecast      yes      no      Total fc
yes             0       0          0
no             50    2750       2800
Total obs      50    2750       2800

PC = (0+2750)/2800 = 98.2%
H = 0/50 = 0%
FAR = 0%
B = 0/50 = 0
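The comparison can be reproduced in a short sketch: the forecaster's table against a "never forecast a tornado" strategy over the same observed events. It shows why PC alone is misleading for rare events:

```python
# Reproducing the slide's comparison: real forecaster vs. never-forecast strategy.
def scores(a, b, c, d):
    n = a + b + c + d
    return {"PC": (a + d) / n,                      # Proportion Correct
            "POD": a / (a + c),                     # Hit Rate
            "FAR": b / (a + b) if a + b else 0.0,   # undefined with no yes-forecasts
            "B": (a + b) / (a + c)}                 # Frequency Bias

forecaster = scores(30, 70, 20, 2680)
never_yes = scores(0, 0, 50, 2750)   # all 50 tornadoes become misses

# PC improves (0.968 -> 0.982) while POD collapses to zero:
# PC rewards forecasting the common non-event all the time.
print(round(forecaster["PC"], 3), round(never_yes["PC"], 3), never_yes["POD"])
```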

Multi-category events

The 2x2 table can be extended to several mutually exclusive and exhaustive categories, for example:
- Precipitation type: rain / snow / freezing rain
- Wind warning: strong gale / gale / no gale
- Cloud cover: 1-3 okta / 4-7 okta / >7 okta

Only PC (Proportion Correct) can be generalised directly; the other verification measures must be converted into a series of 2x2 tables. Generalised versions of HSS and KSS measure the improvement over a random forecast.
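A sketch of the generalised PC, HSS and KSS for a k-category table, following the standard multi-category definitions (as in Wilks, cited in the references). The 3x3 precipitation-type table below is an illustrative assumption, not the one shown on the exercise slide:

```python
# Generalised PC, HSS and KSS for a k x k contingency table of
# relative frequencies. table[i][j] = count(forecast category i,
# observed category j).
def multicat_scores(table):
    n = sum(sum(row) for row in table)
    k = len(table)
    p_diag = sum(table[i][i] for i in range(k)) / n              # correct forecasts
    row_m = [sum(table[i]) / n for i in range(k)]                # forecast marginals
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]  # observed marginals
    chance = sum(row_m[i] * col_m[i] for i in range(k))          # expected correct by chance
    pc = p_diag
    hss = (p_diag - chance) / (1 - chance)
    kss = (p_diag - chance) / (1 - sum(m * m for m in col_m))
    return pc, hss, kss

# Hypothetical rain/snow/freezing-rain table (forecast rows, observed columns)
rain_type = [[50, 10, 5],
             [8, 40, 6],
             [2, 4, 20]]
print(tuple(round(s, 2) for s in multicat_scores(rain_type)))
```

A quick sanity check: a purely diagonal (perfect) table yields PC = HSS = KSS = 1.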

Multi-category events: exercise

Calculate for the table shown: PC = ? KSS = ? HSS = ?
Results: PC = 0.61, KSS = 0.41, HSS = 0.37

Summary scores

Left panel: contingency table for five months of categorical warnings of gale-force winds (wind speed > 14 m/s). Right panel: tornado verification statistics.

                                          GALE    TORNADO
B    = (a+b)/(a+c)                        ______  ______
PC   = (a+d)/n                            ______  ______
POD  = a/(a+c)                            ______  ______
FAR  = b/(a+b)                            ______  ______
PAG  = a/(a+b)                            ______  ______
F    = b/(b+d)                            ______  ______
KSS  = POD-F                              ______  ______
TS   = a/(a+b+c)                          ______  ______
ETS  = (a-a_r)/(a+b+c-a_r)                ______  ______
HSS  = 2(ad-bc)/[(a+c)(c+d)+(a+b)(b+d)]   ______  ______
OR   = ad/bc                              ______  ______
ORSS = (OR-1)/(OR+1)                      ______  ______

Summary scores -- answer

Left panel: contingency table for five months of categorical warnings of gale-force winds (wind speed > 14 m/s). Right panel: tornado verification statistics.

                                          GALE    TORNADO
B    = (a+b)/(a+c)                         0.65    2.00
PC   = (a+d)/n                             0.91    0.97
POD  = a/(a+c)                             0.58    0.60
FAR  = b/(a+b)                             0.12    0.70
PAG  = a/(a+b)                             0.88    0.30
F    = b/(b+d)                             0.02    0.03
KSS  = POD-F                               0.56    0.57
TS   = a/(a+b+c)                           0.54    0.25
ETS  = (a-a_r)/(a+b+c-a_r)                 0.48    0.24
HSS  = 2(ad-bc)/[(a+c)(c+d)+(a+b)(b+d)]    0.65    0.39
OR   = ad/bc                              83.86   57.43
ORSS = (OR-1)/(OR+1)                       0.98    0.97
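As a sketch, the whole TORNADO column can be recomputed from the cell counts used throughout this talk (a=30, b=70, c=20, d=2680):

```python
# Recomputing the TORNADO column of the summary table from the 2x2 cells.
a, b, c, d = 30, 70, 20, 2680
n = a + b + c + d
a_r = (a + b) * (a + c) / n          # random-chance hits, used by ETS
odds = a * d / (b * c)               # Odds Ratio

tornado = {
    "B":    (a + b) / (a + c),
    "PC":   (a + d) / n,
    "POD":  a / (a + c),
    "FAR":  b / (a + b),
    "PAG":  a / (a + b),
    "F":    b / (b + d),
    "KSS":  a / (a + c) - b / (b + d),
    "TS":   a / (a + b + c),
    "ETS":  (a - a_r) / (a + b + c - a_r),
    "HSS":  2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
    "OR":   odds,
    "ORSS": (odds - 1) / (odds + 1),
}
for name, value in tornado.items():
    print(f"{name:4s} {value:6.2f}")
```

Rounded to two decimals, every value matches the TORNADO column above.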

Example 1

Example 1 -- answer

Example 2

Example 2 -- answer

Example 3

Example 3 -- answer

Example 4

Example 4 -- answer

Example 5

Example 5 -- answer

Example 6

Example 6 -- answer

Example 7

Example 7 -- answer

Example 8

Example 8 -- answer
- Correct: rain occurs with a frequency of only about 20% (74/346) at this station.
- True: the frequency bias is 1.31, greater than 1, indicating over-forecasting.
- Correct: the over-forecasting is accompanied by a high false alarm ratio, but the false alarm rate depends on the observed frequencies, and it is low because the climate is relatively dry.
- Probably true: PC gives credit for all those "easy" correct forecasts of non-occurrence, and such forecasts are easy when non-occurrence is common.
- The POD is high most likely because the forecaster has chosen to forecast the occurrence of the event too often, which has also increased the false alarms.
- Yes: both the KSS and the HSS are well within the positive range. Remember that the standard for the HSS is a chance forecast, which is easy to beat.

Conclusions
- Definition of a categorical forecast
- How to build a contingency table; marginal and joint probabilities
- Basic scores for the simple 2x2 table
- Extension to multi-category events
- Practical exercises

References
- Jolliffe, I.T. and D.B. Stephenson (2003): Forecast Verification: A Practitioner's Guide. Wiley & Sons.
- Nurmi, P. (2003): Recommendations on the verification of local weather forecasts. ECMWF Technical Memorandum No. 430.
- Wilks, D.S. (2005): Statistical Methods in the Atmospheric Sciences, Ch. 7. Academic Press.
- JWGFVR (2009): Recommendations on the verification of precipitation forecasts. WMO/TD Report No. 1485, WWRP.