
1 Robin Hogan, Julien Delanoë, Ewan O'Connor, Anthony Illingworth, Jonathan Wilkinson (University of Reading, UK): Quantifying the skill of cloud forecasts from the ground and from space

2 Other areas of interest
– Representing the effects of cloud structure in radiation schemes: horizontal inhomogeneity, overlap, 3D effects
– Mixed-phase clouds: why are they so poorly represented in models?
– Convection: estimating microphysical properties and fluxes of mass and momentum from observations

3 Overview
– The Cloudnet processing of ground-based radar and lidar observations: continuous evaluation of the climatology of clouds in models
– Testing the skill of cloud forecasts from seven models: desirable properties of skill scores; good and bad scores; skill versus cloud fraction, height, scale and forecast lead time; estimating the forecast half-life
– Cloud fraction evaluation using a spaceborne lidar simulator: evaluation of the ECMWF model with the ICESat/GLAS lidar
– Synergistic retrievals of ice cloud properties from the A-train: variational methodology; testing of the Met Office and ECMWF models

4 Project
Original aim: to retrieve and evaluate the crucial cloud variables in forecast and climate models.
– Seven models: 5 NWP and 2 regional climate models run in NWP mode
– Variables: cloud fraction, LWC, IWC, plus a number of others
– Four sites across Europe: UK, Netherlands, France, Germany
– Period: several years, to avoid unrepresentative case studies
Ongoing/future work (dependent on sources of funding):
– Apply to ARM data worldwide
– Run in near-real-time for rapid feedback to NWP centers
– Evaluate multiple runs of single-column versions of models

5 Level 1b
Minimum instrument requirements at each site: cloud radar, lidar, microwave radiometer, rain gauge, and model or sondes.
(Figures: example radar and lidar time-height displays.)

6 Level 1c
Instrument Synergy product: example of the target classification (ice, liquid, rain, aerosol) and data quality fields.

7 Level 2a/2b
Cloud products on the (L2a) observational and (L2b) model grids: water content and cloud fraction.
(Figures: L2a IWC on the radar/lidar grid; L2b cloud fraction on the model grid.)

8 Cloud fraction
(Figure: Chilbolton observations compared with the Met Office mesoscale model, ECMWF global model, Météo-France ARPEGE model, KNMI RACMO model and Swedish RCA model.)

9 Statistics from the AMF
Murgtal, Germany, 2007: a 140-day comparison with the Met Office 12-km model.
Dataset released to the COPS community; includes the German DWD model at multiple resolutions and forecast lead times.

10 NCEP over SGP in 2007
Hot off the press! Produced directly from ARM's ARSCL product, so could easily be automated.
The NCEP model appears to under-predict low- and mid-level cloud.

11 How skilful is a forecast?
Most model comparisons evaluate the cloud climatology; what about individual forecasts?
The standard measure, the ECMWF 500-hPa geopotential anomaly correlation, shows a forecast half-life of ~6 days in 1980 and ~9 days in 2000 – but it is virtually insensitive to clouds!

12 Joint PDFs of cloud fraction
– Raw (1-hr) resolution: 1 year of data from Murgtal, DWD COSMO model
– 6-hr averaging
…or use a simple contingency table.
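
A minimal sketch of how such a joint PDF can be computed, assuming co-located hourly series of observed and modelled cloud fraction (the synthetic arrays below are stand-ins for illustration, not the Murgtal data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical co-located hourly cloud-fraction series (stand-ins for
# the observed and DWD COSMO model values discussed on the slide).
f_obs = rng.uniform(0.0, 1.0, 8760)
f_mod = np.clip(f_obs + rng.normal(0.0, 0.2, 8760), 0.0, 1.0)

edges = np.linspace(0.0, 1.0, 11)                  # ten cloud-fraction bins
hist, _, _ = np.histogram2d(f_obs, f_mod, bins=[edges, edges])
joint_pdf = hist / hist.sum()                      # normalized joint PDF
```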

13 Five desirable properties of skill scores
1. Equitable: all random forecasts score zero. This is essential! Note that forecasting the right climatology versus height, but with no other skill, should also score zero.
2. Proper: not possible to hedge your bets. Some scores reward under- or over-prediction (e.g. hit rate); Jolliffe and Stephenson showed that it is impossible to be both equitable and strictly proper!
3. Independent of how often cloud occurs: almost all scores asymptote to 0 or 1 for vanishingly rare events.
4. Dependent on the full joint PDF, not just the 2x2 contingency table: a difference between cloud fractions of 0.9 and 1 is as important for radiation as a difference between 0 and 0.1.
5. Linear, so that an inverse exponential can be fitted: some scores (e.g. Yule's Q) saturate at the high-skill end.

14 Possible skill scores
Cloud is deemed to occur when the cloud fraction f exceeds some threshold f_thresh, giving a contingency table:
                     Observed cloud    Observed clear sky
Modelled cloud       a (hit)           b (false alarm)
Modelled clear sky   c (miss)          d (correct negative)
To ensure equitability and linearity, we can use the concept of the generalized skill score (x − x_random)/(x_perfect − x_random):
– where x is any number derived from the joint PDF
– resulting scores vary linearly from random = 0 to perfect = 1
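
As a sketch, the table can be built from co-located cloud-fraction series like so (function and variable names are hypothetical; the 0.05 default matches the threshold used later in the talk):

```python
import numpy as np

def contingency_table(f_obs, f_mod, f_thresh=0.05):
    """Reduce co-located cloud-fraction series to a 2x2 contingency table:
    cloud is deemed to occur when f > f_thresh."""
    obs = np.asarray(f_obs) > f_thresh
    mod = np.asarray(f_mod) > f_thresh
    a = int(np.sum(mod & obs))      # hits
    b = int(np.sum(mod & ~obs))     # false alarms
    c = int(np.sum(~mod & obs))     # misses
    d = int(np.sum(~mod & ~obs))    # correct negatives
    return a, b, c, d
```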

15 Possible skill scores (continued)
Example counts for the DWD model, with the corresponding perfect and random forecasts:
– DWD model: a = 7194, b = 4098, c = 4502, d = 41062
– Perfect forecast: a_p = 11696, b_p = 0, c_p = 0, d_p = 45160
– Random forecast: a_r = 2581, b_r = 8711, c_r = 9115, d_r = 36449
Simplest example: the Heidke skill score (HSS) uses x = a + d; we will use this as a reference to test other scores.
The Brier skill score uses x = the mean squared cloud-fraction difference; the linear Brier skill score (LBSS) uses x = the mean absolute difference, which is sensitive to errors in the model for all values of cloud fraction.
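
A sketch of the generalized-skill calculation using the counts above (the ≈0.52 value is just the arithmetic of those numbers):

```python
def generalized_skill(x, x_random, x_perfect):
    """Generalized skill score (x - x_random)/(x_perfect - x_random):
    0 for a random forecast, 1 for a perfect one, linear in between."""
    return (x - x_random) / (x_perfect - x_random)

# DWD example from the slide, with x = a + d for the Heidke skill score:
a, b, c, d = 7194, 4098, 4502, 41062     # model counts
a_r, d_r = 2581, 36449                   # random-forecast counts
a_p, d_p = 11696, 45160                  # perfect-forecast counts
hss = generalized_skill(a + d, a_r + d_r, a_p + d_p)   # about 0.52
```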

16 Some simpler scores
Hit rate (probability of detection): H = a/(a + c), the fraction of cloudy events correctly forecast; e.g. Mace et al. (1998) for cloud occurrence.
Problems:
– Not equitable
– Easy to hedge: forecasting cloud all the time guarantees a perfect score, so it favours models that overpredict cloud; this is linked to its asymmetry
Log of odds ratio: LOR = ln(ad/bc); e.g. Stephenson (2008) for tornado forecasts.
Properties:
– Equitable
– Not easy to hedge
– Unbounded: a perfect score is infinity!

17 Skill versus cloud-fraction threshold
Consider 7 models evaluated over 3 European sites in 2003-2004:
– LOR implies that skill increases for larger cloud-fraction thresholds
– HSS implies that skill decreases significantly for larger cloud-fraction thresholds

18 Extreme dependency score
Stephenson et al. (2008) explained this behaviour: almost all scores have a meaningless limit as the base rate p → 0; HSS tends to zero and LOR tends to infinity.
They proposed the extreme dependency score,
EDS = 2 ln[(a + c)/n] / ln(a/n) − 1, where n = a + b + c + d.
It can be shown that this score tends to a meaningful limit:
– Rewrite it in terms of the hit rate H = a/(a + c) and base rate p = (a + c)/n
– Then assume a power-law dependence of H on p as p → 0: H ∝ p^δ
– In the limit p → 0 we find EDS → (1 − δ)/(1 + δ)
This is meaningful because random forecasts have a hit rate converging to zero at the same rate as the base rate (δ = 1, so EDS = 0), whereas perfect forecasts have a hit rate that is constant with base rate (δ = 0, so EDS = 1).

19 Symmetric extreme dependency score
Problems with EDS:
– Easy to hedge by predicting cloud all the time, so that c = 0
– Not equitable
These are solved by defining a symmetric version,
SEDS = ln[(a + b)(a + c)/n²] / ln(a/n) − 1.
All the benefits of EDS, none of the drawbacks!
Hogan, O'Connor and Illingworth (2009, submitted to QJRMS)
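
Both scores written out as code, following the definitions above (a sketch; the SEDS formula is as given in the Hogan et al. paper cited on the slide):

```python
import numpy as np

def eds(a, b, c, d):
    """Extreme dependency score of Stephenson et al. (2008)."""
    n = a + b + c + d
    return 2.0 * np.log((a + c) / n) / np.log(a / n) - 1.0

def seds(a, b, c, d):
    """Symmetric extreme dependency score of Hogan et al. (2009):
    replaces one factor of the base rate by the forecast rate, making
    the score symmetric, equitable and hard to hedge."""
    n = a + b + c + d
    return np.log(((a + b) / n) * ((a + c) / n)) / np.log(a / n) - 1.0

# DWD counts from the earlier slide:
print(eds(7194, 4098, 4502, 41062), seds(7194, 4098, 4502, 41062))
```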

20 Skill versus cloud-fraction threshold
SEDS shows much flatter behaviour with threshold for all models (except the Met Office, which significantly underestimates the occurrence of high cloud).

21 Skill versus height
Most scores are not reliable near the tropopause, because the cloud fraction tends to zero there.
The new score reveals that:
– Skill tends to decrease slowly towards the tropopause
– Mid-level clouds (4-5 km) are the most skilfully predicted, particularly by the Met Office
– Boundary-layer clouds are the least skilfully predicted

22 A surprise? Is mid-level cloud well forecast?
– The frequency of occurrence of these clouds is commonly too low (e.g. from Cloudnet: Illingworth et al. 2007), and the specification of cloud phase is cited as a problem
– Higher skill could be because large-scale ascent has its largest amplitude at mid-levels, so the cloud response to large-scale dynamics is clearest there
– Higher skill for the Met Office models (global and mesoscale) may be because they have arguably the most sophisticated microphysics, with separate liquid and ice water content (Wilson and Ballard 1999)
Low skill for boundary-layer cloud is not a surprise!
– A well-known problem for forecasting (Martin et al. 2000)
– Occurrence and height are a subtle function of subsidence rate, stability, free-troposphere humidity, surface fluxes, entrainment rate...

23 Skill versus lead time
Only possible for the UK Met Office 12-km model and the German DWD 7-km model:
– Steady decrease of skill with lead time
– Both models appear to improve between 2004 and 2007
Generally the UK model is best over the UK and the German model best over Germany; an exception is Murgtal in 2007, where the UK model wins.

24 Forecast half-life
Fit an inverse exponential of the form S(t) = S₁ 2^(−(t−1)/τ½), where S₁ is the score after 1 day and τ½ is the half-life.
A noticeably longer half-life is fitted after 36 hours; the same thing was found for Met Office rainfall forecasts (Roberts 2008):
– The first timescale is due to data assimilation and convective events
– The second is due to more predictable large-scale weather systems
(Fitted half-lives for the Met Office and DWD models over 2004 and 2007 range from about 2.4 to 4.3 days.)
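
A sketch of the half-life fit, assuming the functional form above (inferred from the slide's description that S₁ is the score after 1 day and τ½ the half-life); the lead-time and score arrays are hypothetical placeholders, not the Met Office or DWD data:

```python
import numpy as np
from scipy.optimize import curve_fit

def inv_exp(t, s1, tau_half):
    """Assumed form S(t) = S1 * 2**(-(t - 1)/tau_half), so that
    S(1 day) = S1 and the score halves every tau_half days."""
    return s1 * 2.0 ** (-(t - 1.0) / tau_half)

# Hypothetical (lead time in days, skill score) pairs for illustration:
t = np.array([0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3.25])
s = np.array([0.60, 0.55, 0.52, 0.45, 0.41, 0.37, 0.33])
(s1, tau_half), _ = curve_fit(inv_exp, t, s, p0=(0.5, 3.0))
```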

25 Why is the half-life less for clouds than for pressure? Different spatial scales? Convection?
– Average temporally before calculating the skill scores
– Both the absolute score and the half-life increase with the number of hours averaged

26 Cloud is noisier than geopotential height Z because it is separated from it by around two orders of differentiation:
cloud ~ vertical wind ~ relative vorticity ~ ∇²(streamfunction) ~ ∇²(pressure)
This suggests that cloud observations should be used routinely to evaluate models.
(Figures: geopotential height anomaly; vertical velocity.)

27 Alternative approach
How valid is it to estimate 3D cloud fraction from a 2D slice? Henderson and Pincus (2009) imply that it is reasonable, although presumably not in convective conditions.
Alternative: treat cloud fraction as a probability forecast.
– Each time the model forecasts a particular cloud fraction, calculate the fraction of time that cloud was observed instantaneously over the site
– This leads to a reliability diagram (Jakob et al. 2004), with perfect, no-skill and no-resolution reference lines
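
A minimal sketch of the binning behind such a reliability diagram, assuming arrays of forecast cloud fraction and binary observed occurrence (function name and binning choices are illustrative):

```python
import numpy as np

def reliability_curve(f_forecast, occurred, n_bins=10):
    """Bin the forecast cloud fraction (treated as a probability) and
    return, per bin, the mean forecast value and the observed frequency
    of instantaneous cloud occurrence over the site."""
    f_forecast = np.asarray(f_forecast)
    occurred = np.asarray(occurred, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(f_forecast, edges) - 1, 0, n_bins - 1)
    mean_fc, obs_freq = [], []
    for i in range(n_bins):
        sel = idx == i
        if sel.any():                       # skip empty bins
            mean_fc.append(f_forecast[sel].mean())
            obs_freq.append(occurred[sel].mean())
    return np.array(mean_fc), np.array(obs_freq)  # perfect reliability: 1:1 line
```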

28 Satellite observations: ICESat
Cloud observations from the ICESat 0.5-μm lidar (first data February 2004).
Global coverage, but the lidar is attenuated by thick clouds, so direct model comparison is difficult: optically thick liquid cloud obscures the view of any clouds beneath.
Solution: forward-model the measurements (including the attenuation) using the ECMWF variables.
(Figure: lidar apparent backscatter coefficient (m⁻¹ sr⁻¹) versus latitude.)

29 Simulating the lidar backscatter
– Create subcolumns with maximum-random overlap
– Forward-model the lidar backscatter from the ECMWF water content and particle size
– Remove signals below the lidar sensitivity
(Figures: ECMWF raw cloud fraction; ECMWF cloud fraction after processing; ICESat cloud fraction.)
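
A minimal sketch of the first step, generating binary subcolumns under maximum-random overlap from a profile of layer cloud fractions ordered top to bottom (this is a common implementation of the scheme, not necessarily the exact one used for the paper):

```python
import numpy as np

def maxrand_subcolumns(cf, n_sub=50, seed=None):
    """Generate n_sub binary cloud subcolumns from layer cloud fractions cf:
    vertically contiguous cloudy layers overlap maximally (same random rank),
    while layers separated by clear air are randomly overlapped."""
    rng = np.random.default_rng(seed)
    cf = np.asarray(cf, dtype=float)
    n_lev = len(cf)
    x = np.empty((n_lev, n_sub))
    x[0] = rng.uniform(size=n_sub)
    for k in range(1, n_lev):
        if cf[k - 1] > 0.0:                  # contiguous cloud: keep same rank
            x[k] = x[k - 1]
        else:                                # clear gap above: re-randomize
            x[k] = rng.uniform(size=n_sub)
    return x < cf[:, None]                   # True where a subcolumn is cloudy
```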

30 Global cloud fraction comparison
Results for October 2003 (Wilkinson, Hogan, Illingworth and Benedetti, MWR 2008):
– Tropical convection peaks too high
– Too much polar cloud
– Elsewhere agreement is good
Results can be ambiguous: an apparent low-cloud underestimate could be a real error, or could be due to the high cloud above being too thick.

31 Testing the model climatology
– Reduction in model cloud due to lidar attenuation
– Error due to the uncertain extinction-to-backscatter ratio

32 Testing the model skill from space
– Lowest skill: tropical boundary-layer clouds
– Tropical skill appears to peak at mid-levels, but cloud is very infrequent there (an unreliable region)
– Highest skill in the northern mid-latitude and polar upper troposphere
– Is some of the reduction of skill at low levels because of lidar attenuation?
Clearly we need to apply SEDS to cloud estimated from spaceborne lidar and radar!

33 Ice cloud retrievals from the A-train
Advantages of combining radar, lidar and radiometers:
– Radar reflectivity scales as Z ∝ D⁶ and lidar backscatter as D², so their ratio varies as D⁴ and the combination provides particle size
– Radiances ensure that the retrieved profiles can be used for radiative transfer studies
How do we combine them optimally? Use a variational framework:
– Takes full account of the observational errors
– Straightforward to add extra constraints and extra instruments
– Allows seamless retrieval between regions of different instrument sensitivity
Retrievals will be compared to Met Office and ECMWF forecasts under the A-train.

34 Formulation of variational scheme
For each ray of data we define:
Observation vector y (elements may be missing):
– Radar reflectivity factor profile (on a different grid)
– Attenuated lidar backscatter profile
– Visible optical depth
– Infrared radiance, or radiance difference
State vector x (logarithms prevent unphysical negative values):
– Ice visible extinction coefficient profile
– Ice normalized number concentration profile
– Extinction-to-backscatter ratio for ice (TBD)
– Aerosol visible extinction coefficient profile (TBD)
– Liquid water path and number concentration for each liquid layer

35 Solution method
An iterative method is required to minimize the cost function. For each new ray of data:
1. Locate cloud with the radar and lidar, define the elements of x, and make a first guess of x.
2. Forward model: predict the measurements y from the state vector x using the forward model H(x), and predict the Jacobian H_ij = ∂y_i/∂x_j.
3. Apply the χ² convergence test; if the solution has not converged, take a Gauss-Newton iteration step,
x_{k+1} = x_k + A⁻¹ {Hᵀ R⁻¹ [y − H(x_k)] − B⁻¹ (x_k − b) − T x_k}, where the Hessian is A = Hᵀ R⁻¹ H + B⁻¹ + T,
and return to step 2.
4. Once converged, calculate the error in the retrieval and proceed to the next ray.
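
The iteration step transcribed directly from the formula above (a sketch with hypothetical argument names; R⁻¹ and B⁻¹ are the inverse observation and a priori error covariances, b the a priori state, T any additional constraint matrix):

```python
import numpy as np

def gauss_newton_step(x, y, b, H_jac, y_fwd, R_inv, B_inv, T):
    """One Gauss-Newton step for the variational cost function:
    x_{k+1} = x_k + A^-1 {H^T R^-1 [y - H(x_k)] - B^-1 (x_k - b) - T x_k},
    with Hessian A = H^T R^-1 H + B^-1 + T.
    H_jac is the Jacobian dy_i/dx_j at x; y_fwd is H(x)."""
    A = H_jac.T @ R_inv @ H_jac + B_inv + T
    g = H_jac.T @ R_inv @ (y - y_fwd) - B_inv @ (x - b) - T @ x
    return x + np.linalg.solve(A, g)   # solve A dx = g rather than forming A^-1
```

Solving the linear system rather than explicitly inverting A is the usual numerically stable choice.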

36 CloudSat-CALIPSO-MODIS example
(Figures, for a 1000-km section: lidar observations; radar observations; radar forward model.)

37 CloudSat-CALIPSO-MODIS example
(Figures: lidar observations and lidar forward model; radar observations and radar forward model.)

38 Radar-lidar retrieval
(Figures: retrieved extinction coefficient, ice water content and effective radius; forward model versus MODIS 10.8-μm observations.)

39 …add infrared radiances
The radiances are matched by increasing the extinction near cloud top.
(Figure: forward model versus MODIS 10.8-μm observations.)

40 Radar-lidar complementarity
(Figure: CloudSat radar, CALIPSO lidar and the MODIS 11-μm channel, with the retrieved extinction (m⁻¹) as a function of height (km) and time since the start of the orbit (s).)
– Cirrus detected only by the lidar
– Mid-level liquid clouds
– Deep convection penetrated only by the radar

41 Comparison with ECMWF
(Figure: log₁₀(IWC [kg m⁻³]).)

42 Comparison with model IWC
Global forecast model data were extracted underneath the A-train, and the A-train ice water content was averaged to the model grid and plotted against temperature (°C) for the Met Office and ECMWF models:
– The Met Office model lacks the observed variability
– The ECMWF model has an artificial threshold for snow at around 10⁻⁴ kg m⁻³

43 Summary and outlook
Defined five key properties of a good skill score:
– Plenty of bad scores are in use (hit rate, false-alarm rate etc.)
– The new symmetric extreme dependency score is equitable and nearly independent of the occurrence of the quantity being forecast
Model comparisons reveal:
– The half-life of a cloud forecast is between 2.5 and 4 days, much less than the ~9 days for ECMWF 500-hPa geopotential height forecasts
– Longer-timescale predictability after 1.5 days
– Higher skill for mid-level cloud and lower skill for boundary-layer cloud
– A proposal has been submitted to apply some of these metrics (including probabilistic ones) to NWP and single-column models over the ARM sites
Further work with the radar-lidar-radiometer retrieval:
– Being used to test the new ice cloud scheme in the ECMWF model, as well as high-resolution simulations of tropical convection in the Cascade project
– Retrieve liquid clouds and precipitation at the same time, to provide a truly seamless retrieval from the thinnest to the thickest clouds
– Adapt for the EarthCARE satellite (ESA/JAXA; launch 2013)


45 Cloud fraction in 7 models
Mean and PDF of cloud fraction for 2004, 0-7 km, for Chilbolton, Paris and Cabauw (Illingworth et al., BAMS 2007):
– Uncertain above 7 km, as undetectable clouds must be removed from the model
– All models except DWD underestimate mid-level cloud
– Some have separate radiatively inactive snow (ECMWF, DWD); the Met Office has combined ice and snow but still underestimates cloud fraction
– Wide range of low cloud amounts in the models
– Not enough overcast gridboxes, particularly in the Met Office model

46 Contingency tables
Comparison with the Met Office model over Chilbolton, October 2003:
                     Observed cloud    Observed clear sky
Model cloud          A: cloud hit      B: false alarm
Model clear sky      C: miss           D: clear-sky hit

47 Monthly skill versus time
A measure of the skill of forecasting cloud fraction > 0.05:
– Models compared using similar forecast lead times
– Compared with the persistence forecast (yesterday's measurements)
– Lower skill in summer convective events

48 Why N₀*/α^0.6?
– In-situ aircraft data show that N₀*/α^0.6 has a temperature dependence that is independent of IWC
– Therefore we have a good a priori estimate to constrain the retrieval
– We also assume vertical correlation, to spread information in height, particularly to parts of the profile detected by only one instrument

49 Why N₀*?
– We need to be able to forward-model Z and the other variables from x
– The large scatter between extinction and Z implies that a 2D lookup table would be required
– When normalized by N₀*, there is a near-unique relationship between α/N₀* and Z/N₀* (as well as r_e, IWC/N₀* etc.)

50 Ice cloud: non-variational retrieval
The Donovan et al. (2000) algorithm can only be applied where both the lidar and the radar have signal.
The retrieval is accurate, but not perfectly stable where the lidar loses signal.
(Figures: observations, state variables and derived variables, for aircraft-simulated profiles with noise from Hogan et al. 2006.)

51 Variational radar/lidar retrieval
Noise in the lidar backscatter feeds through to the retrieved extinction: the lidar noise is matched by the retrieval, and the noise feeds through to the other variables.
(Figures: observations, state variables and derived variables.)

52 …add smoothness constraint
Smoothness constraint: add a term to the cost function that penalizes curvature in the solution, of the form J' = λ Σᵢ (d²αᵢ/dz²)².
– The retrieval reverts to the a priori N₀*
– Extinction and IWC are too low in the radar-only region
(Figures: observations, state variables and derived variables.)
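
A sketch of one way to build such a penalty as a matrix T for the Gauss-Newton step shown earlier, assuming the standard sum-of-squared-second-differences (Twomey-Tikhonov) construction:

```python
import numpy as np

def curvature_penalty(n, dz, lam):
    """Return T = lam * D2.T @ D2, where D2 is the second-difference
    operator on an n-point grid of spacing dz, so that x.T @ T @ x
    approximates lam * sum_i (d2 x_i / dz2)**2."""
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = np.array([1.0, -2.0, 1.0]) / dz**2
    return lam * D2.T @ D2
```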

53 …add a priori error correlation
Use B (the a priori error covariance matrix) to smooth the N₀* information in the vertical.
– Vertical correlation of the error in N₀*
– Extinction and IWC are now more accurate
(Figures: observations, state variables and derived variables.)

54 Effective radius versus temperature
(Figure: effective radius versus temperature for all clouds.) An effective radius parameterization?

55 Comparison of mean effective radius
July 2006 mean value of r_e = 3 IWP/(2 ρ_i τ) from CloudSat-CALIPSO only (just the top 500 m of cloud), compared with the MODIS/Aqua standard product.
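To illustrate the magnitude of this formula (with values chosen here purely for the arithmetic, not taken from the slide): an ice water path of 100 g m⁻², an optical depth τ = 10 and a solid-ice density ρ_i = 917 kg m⁻³ give r_e = 3 × 0.1/(2 × 917 × 10) ≈ 1.6 × 10⁻⁵ m, i.e. about 16 μm.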

56 Comparison of ice water path
Mean over all skies and mean over clouds, CloudSat-CALIPSO versus MODIS.
A longer period than just one month (July 2006) is needed to obtain adequate statistics, given the poorer sampling of the radar and lidar.

57 Comparison of optical depth
Mean over all skies and mean over clouds, CloudSat-CALIPSO versus MODIS.
The mean optical depth from CloudSat-CALIPSO is lower than that from MODIS simply because CALIPSO detects many more optically thin clouds not seen by MODIS; hence the need to compare PDFs as well.

