Presentation on theme: "Climate change detection and attribution methods Exploratory Workshop DADA, Buenos Aires, 15-18 Oct 2012 Francis Zwiers, Pacific Climate Impacts Consortium,"— Presentation transcript:
Climate change detection and attribution methods Exploratory Workshop DADA, Buenos Aires, Oct 2012 Francis Zwiers, Pacific Climate Impacts Consortium, University of Victoria, Canada Photo: F. Zwiers
Introduction
Two types of approaches are currently in use: non-optimal and “optimal”
Both rely heavily on climate models
The objective is always to assess the evidence contained in the observations
Methods are simple, yet complex
Optimal approach
Originally developed in a couple of different ways
–Optimal filtering (North and colleagues, early 1980s)
–Optimal fingerprinting (Hasselmann, 1979; Hegerl et al., 1996, 1997)
They are equivalent (Hegerl and North, 1997) and amount to generalized linear regression
Subsequently have OLS, TLS and EIV variants
–OLS: Allen and Tett (1999)
–TLS: Allen and Stott (2003)
–EIV: Huntingford et al. (2006)
Recent development concerns regularization of the regression problem
–Ribes et al., 2009, 2012a, 2012b
An early detection study - Bell (1982)
“Signal in noise” problem
Observed field T(x,t) at locations x and times t is made up of
–Climate’s deterministic response to an “external” forcing (such as G, S, Sol, Vol)
–Natural internal variability (i.e., the “noise”)
Bell (1982)
Simple linear space-time “filtering” to remove the noise
Notation
–extended temperature field
–weights
–noise field
–signal field
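The symbols on this slide did not survive in the transcript. As a sketch only, the signal-in-noise setup and the linear filter can be written as below, where the letters S, N, w and the form of the filtered statistic A_t are assumed notation rather than Bell's own.

```latex
% Assumed notation: T = temperature field, S = signal field,
% N = noise field, w = weights, A_t = filtered (weighted) statistic
T(x,t) = S(x,t) + N(x,t),
\qquad
A_t = \sum_{x} w(x)\, T(x,t)
```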
Optimal detection statistic...
Maximizes the signal-to-noise ratio subject to the constraint that the weights sum to one
The constant c is unimportant, so can set c=1
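A minimal sketch of the idea, assuming the usual result that the signal-to-noise ratio of a weighted average is maximized by weights proportional to the inverse noise covariance applied to the signal. The three-region numbers and the function names are hypothetical, chosen only for illustration.

```python
import numpy as np

def snr(w, s, cov):
    """Signal-to-noise ratio of the weighted average w @ T."""
    return (w @ s) / np.sqrt(w @ cov @ w)

def optimal_weights(s, cov):
    """Weights proportional to cov^{-1} s, normalised to sum to one.

    Assumes the unnormalised weights do not sum to zero.
    """
    w = np.linalg.solve(cov, s)
    return w / w.sum()

# Hypothetical three-region example (illustrative values, not Bell's data)
s = np.array([0.2, 0.4, 0.8])        # expected signal in each region
cov = np.diag([0.05, 0.20, 0.90])    # noise covariance, zero between regions

w_opt = optimal_weights(s, cov)
w_equal = np.full(3, 1.0 / 3.0)      # simple equal-weight average for comparison

print("optimal S/N:", snr(w_opt, s, cov))
print("equal-weight S/N:", snr(w_equal, s, cov))
```

Because the ratio does not change when the weights are rescaled, the normalisation constant has no effect on the S/N ratio.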
After Hasselmann (1979)
A simple detection test
Assume that A_t is Gaussian
Can test the null hypothesis of no signal with a standardized statistic Z
Reject when Z^2 > 4
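For orientation, the significance level implied by rejecting when Z^2 > 4 can be computed under the Gaussian assumption. This short sketch assumes Z is a standard normal variable under the null hypothesis; the function name is illustrative.

```python
import math

def two_sided_p(z):
    """P(|Z| > |z|) for a standard normal variable Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Rejecting when Z**2 > 4 is the same as rejecting when |Z| > 2
level = two_sided_p(2.0)
print(f"implied two-sided significance level: {level:.4f}")
```

The rule is therefore close to a conventional 5% two-sided test.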
Bell’s application
Estimate S/N ratio for NH seasonal mean temperature circa 1972
–divided NH into 3 latitude zones: equator to 30N, 30-60N, 60-90N
–assumed covariance between zones is zero
Got signal from a 4xCO2 equilibrium run
–CO2 = 1200 ppm (Manabe and Stouffer, 1980)
–estimated warming for 1972 (10% increase in CO2)
GHG signal (DJF and JJA panels): estimate of expected warming due to a 10% increase in CO2. After Bell (1982)
Estimated S/N ratio (after Bell, 1982)
Compared: optimal weighting vs. area-weighted average (~25% gain)
S/N ratio is large, but the signal was not detected. Why?
–poor estimate of variance
–ocean delay
–other signals
Flow of a typical analysis (Weaver and Zwiers, 2000)
–Observations and Model
–Filtering and projection onto reduced dimension space
–Total least squares regression in reduced dimension space
–Evaluate amplitude estimates
–Evaluate goodness of fit
The regression model
Evolution of methods since the IPCC SAR (“the balance of evidence suggests…”)
Most studies now use an errors-in-variables approach
Terms of the model
–Observations
–Signals (estimated from climate models)
–Signal errors
–Scaling factors
–Errors
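The equation itself is missing from the transcript. A common way of writing an errors-in-variables regression with these five ingredients is sketched below; the symbols X_i, ν_i and ε are assumed notation, and only Y and β appear elsewhere in the slides.

```latex
% Assumed notation: Y = observations, X_i = simulated signal i,
% \nu_i = signal error, \beta_i = scaling factor, \varepsilon = errors
Y = \sum_{i=1}^{m} \beta_i \,\bigl(X_i - \nu_i\bigr) + \varepsilon
```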
Observations represented in a dimension-reduced space
–Typically
Filtered spatially (to retain large scales)
Filtered temporally (to retain decadal variability)
Projected onto low-order space-time EOFs
Signals estimated from
–Multi-model ensembles of historical simulations
With different combinations of external forcings
–Anthropogenic (GHG, aerosols, etc.)
–Natural (volcanic, solar)
IPCC WG1 AR4 Fig. TS-23
Examples of signals: 20th century response to forcing simulated by PCM
Panels: Solar, Volcanic, GHGs, Ozone, Direct SO4 aerosol, All
IPCC WG1 AR4 Fig. 9.1
Signal error term represents effects of
–Internal variability (ensemble sizes are finite)
–Structural error
Know that the multi-model mean is often a better representation of current climate
Do not know how model space has been sampled
Ultimate small sample inference problem: observations provide very little information about the error variance-covariance structure
Scaling factor
–Alters amplitude of simulated response pattern
Error term
–Sampling error in observations (hopefully small)
–Internal variability (substantial, particularly at smaller scales)
–Misfit between model-simulated signal and real signal (hopefully small … a scaling factor near unity would support this)
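When the signals carry their own error, total least squares is one way to estimate the scaling factors. The following is a minimal sketch, assuming the data have already been pre-whitened and rescaled so that the noise in every signal column and in the observations has the same variance. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def tls_fit(X, y):
    """Total least squares estimate of beta in y ~ X @ beta.

    Assumes equal-variance, uncorrelated noise in all columns of X and in y.
    """
    m = X.shape[1]
    Z = np.column_stack([X, y])
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    v = vt[-1]                 # right singular vector of the smallest singular value
    return -v[:m] / v[m]       # assumes v[m] is not zero

# Synthetic illustration: noise added to both the signals and the observations
rng = np.random.default_rng(0)
X_true = rng.standard_normal((30, 2))
beta_true = np.array([1.2, 0.7])
X_noisy = X_true + 0.1 * rng.standard_normal(X_true.shape)
y_noisy = X_true @ beta_true + 0.1 * rng.standard_normal(30)

print("TLS estimate:", tls_fit(X_noisy, y_noisy))
```

With noise-free inputs the estimate reproduces the true scaling factors exactly, which is a convenient sanity check.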
Typical D&A problem setup
Typical approach in a global analysis of surface temperature
–Often start with HadCRU data (5°x5°), monthly mean anomalies
–Calculate annual or decadal mean anomalies
–Filter to retain only large scales
Spectrally transform (T4: 25 spectral coefficients), or
Average into large grid boxes (e.g., 30°x40°: up to 6x9=54 boxes)
–For a 110-yr global analysis performed with T4 spectral filtering and decadal mean anomalies, dim(Y) = 25x11 = 275
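The dimension counts quoted above follow from simple arithmetic, sketched here. The function name is illustrative, and the count assumes a triangular spectral truncation.

```python
def spectral_coefficients(truncation):
    # A triangular truncation at total wavenumber N retains
    # sum over n = 0..N of (2n + 1) = (N + 1)**2 real coefficients
    return (truncation + 1) ** 2

n_space = spectral_coefficients(4)       # T4 filtering
n_time = 110 // 10                       # decadal means in a 110-yr record
dim_y = n_space * n_time                 # length of the observation vector Y

n_boxes = (180 // 30) * (360 // 40)      # 30° x 40° boxes covering the globe

print(n_space, n_time, dim_y, n_boxes)
```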
The OLS form of the estimator of the scaling factors β is $\hat{\beta} = (X^T \hat{\Sigma}^{-1} X)^{-1} X^T \hat{\Sigma}^{-1} Y$, where X is the matrix of signals and $\hat{\Sigma}$ is the estimated variance-covariance matrix of the observations Y
Even with T4 filtering, $\hat{\Sigma}$ would be 275x275
Need further dimension reduction
Constraints on dimensionality
–Need to be able to invert covariance matrix
–Covariance needs to be well estimated on retained space-time scales
–Should only keep scales on which climate model represents internal variability reasonably well
–Should be able to represent signal vector reasonably well
Further constraint
–To avoid bias, optimization and uncertainty analysis should be performed separately
Require two independent estimates of internal variability
–An estimate for the optimization step and to estimate scaling factors β
–An estimate to quantify uncertainties and make inferences
Residuals from the regression model are used to assess misfit and model-based estimates of internal variability
Basic procedure
1.Determine space-time scale of interest (e.g., global, T4 smoothing, decadal time scale, past 50 years)
2.Gather all data
–Observations
–Ensembles of historical climate runs (might use runs with ALL and ANT forcing to separate effects of ANT and NAT forcing in observations)
–Control runs (no forcing, needed to estimate internal variability)
3.Process all data
–Observations: homogenize, center, grid, identify where missing
–Historical climate runs: “mask” to duplicate missingness of observations, process each run as the observations (no need to homogenize), ensemble average to estimate signals
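The masking and ensemble-averaging part of step 3 can be sketched as follows. This assumes fields stored as (time, location) arrays with NaN marking missing observations; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def mask_like_obs(run, obs):
    """Set a model run to missing wherever the observations are missing."""
    return np.where(np.isnan(obs), np.nan, run)

def centre(field):
    """Remove the time mean at each location, ignoring missing values."""
    return field - np.nanmean(field, axis=0, keepdims=True)

def ensemble_signal(runs, obs):
    """Mask and centre each run as the observations, then average the ensemble."""
    processed = [centre(mask_like_obs(run, obs)) for run in runs]
    return np.mean(processed, axis=0)

# Synthetic illustration: 6 time steps, 4 locations, 5 ensemble members
rng = np.random.default_rng(0)
obs = rng.standard_normal((6, 4))
obs[2, 1] = np.nan                       # one missing observation
runs = [rng.standard_normal((6, 4)) for _ in range(5)]

signal = ensemble_signal(runs, obs)
```

Missing values propagate into the signal estimate, so the signal has the same coverage as the observations.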
Basic procedure ….
Process all data - continued
Control run(s), within-ensemble variability for individual models
–Divide into two parts
–Organize each part into “chunks” covering the same period as the observations
–Typically allow chunks to overlap: 2000 yr run, 2x1000 yr pieces, 2x94x60 yr chunks
–Process each chunk as the observations
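Splitting a control run into two halves and cutting each into overlapping chunks might look like the sketch below. The stride between chunk start times is not given on the slide, so the value used here is an assumption, as are the function name and the toy series lengths.

```python
import numpy as np

def overlapping_chunks(series, length, step):
    """Cut a series (time along axis 0) into overlapping chunks of a given length."""
    starts = range(0, series.shape[0] - length + 1, step)
    return np.stack([series[s:s + length] for s in starts])

# Toy control run of 200 "years"; the stride of 10 is an assumed value
control = np.arange(200.0)
half1, half2 = control[:100], control[100:]      # two independent parts

chunks1 = overlapping_chunks(half1, length=60, step=10)
chunks2 = overlapping_chunks(half2, length=60, step=10)
print(chunks1.shape, chunks2.shape)
```

Keeping the two halves separate is what later provides two independent estimates of internal variability.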
Basic procedure …
4.Filtering step
Apply space and time filtering to all processed data sets
–Suppose doing an analysis using observations, ALL and ANT ensembles of size 5 from one model, and a 2000 yr control run
–1 obs + 2x5 forced + 2x94 control = 199 (about 200) datasets to process
5.Optimization step
Use 1st sample of control run chunks to estimate the covariance matrix $\hat{\Sigma}_1$
Select an EOF truncation
Calculate the Moore-Penrose inverse $\hat{\Sigma}_1^{+}$
6.Fit the regression model in the reduced space
OLS scaling factor estimates are $\hat{\beta} = (X^T \hat{\Sigma}_1^{+} X)^{-1} X^T \hat{\Sigma}_1^{+} Y$
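Steps 5 and 6 can be sketched in a few lines. This is an illustration under stated assumptions rather than the workflow of any particular study: the control chunks are treated as zero-mean rows of a matrix, and all names, sizes and synthetic data are hypothetical.

```python
import numpy as np

def truncated_pinv(cov, k):
    """Pseudo-inverse of cov built from its k leading EOFs (eigenvectors)."""
    evals, evecs = np.linalg.eigh(cov)
    idx = np.argsort(evals)[::-1][:k]            # indices of the k largest eigenvalues
    P = evecs[:, idx]
    return P @ np.diag(1.0 / evals[idx]) @ P.T

def fit_scaling_factors(Y, X, cov_pinv):
    """Scaling factors beta = (X' C+ X)^{-1} X' C+ Y."""
    A = X.T @ cov_pinv @ X
    return np.linalg.solve(A, X.T @ cov_pinv @ Y)

# Synthetic illustration
rng = np.random.default_rng(0)
n, m, n_chunks, k = 20, 2, 94, 8
control = rng.standard_normal((n_chunks, n))     # first sample of control chunks
cov1 = control.T @ control / n_chunks            # covariance estimate (zero-mean rows)

X = rng.standard_normal((n, m))                  # signals, one column each
beta_true = np.array([1.0, 0.5])
Y = X @ beta_true + rng.standard_normal(n)       # pseudo-observations

beta_hat = fit_scaling_factors(Y, X, truncated_pinv(cov1, k))
print("estimated scaling factors:", beta_hat)
```

Only the k retained EOFs contribute to the fit, which is why the choice of truncation matters for the result.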
7.Rudimentary residual diagnostics on the fit
Is residual variance consistent with model estimated internal variability?
–Allen and Tett (1999) test
–Ignores sampling variability in the optimization (Allen and Stott, 2003)
–Ribes et al. (2012a) therefore show that a modified form of the test would be more appropriate
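A rudimentary version of the residual check is sketched below. It assumes the data are already in the reduced k-dimensional space and uses a chi-square reference distribution with k - m degrees of freedom, which ignores sampling uncertainty in the covariance estimates. The identity covariances and all names are hypothetical.

```python
import numpy as np

def residual_statistic(y, X, beta_hat, cov2):
    """Quadratic form r' C2^{-1} r of the regression residuals r."""
    r = y - X @ beta_hat
    return float(r @ np.linalg.solve(cov2, r))

# Synthetic illustration in a reduced space of dimension k with m signals
rng = np.random.default_rng(1)
k, m = 10, 2
X = rng.standard_normal((k, m))
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(k)   # unit-variance noise

cov1 = np.eye(k)     # hypothetical estimate used for the fit
cov2 = np.eye(k)     # hypothetical independent estimate used for the test

W = np.linalg.inv(cov1)
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

stat = residual_statistic(y, X, beta_hat, cov2)
df = k - m
# Monte Carlo estimate of the 95th percentile of chi-square(df)
threshold = np.quantile(rng.chisquare(df, size=200_000), 0.95)
consistent = stat <= threshold
print(stat, threshold, consistent)
```

A statistic far above the threshold suggests residual variance larger than the model-estimated internal variability.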
Dependent upon models, but we think models represent internal surface temperature variability reasonably well on global scales …
Figure: variability of observed and simulated annual global mean surface temperature, ALL forcings, 58 simulations, 14 models (IPCC WG1 AR4 Fig. 9.7)
… and also on continental scales IPCC WG1 AR4 Fig. 9.8
Basic procedure ….
8.Repeat 6-7 for a range of EOF truncations k=1,2,….
Figure (Min et al., 2011, Fig. S8b, right): residual consistency test as a function of EOF truncation
–Space-time analysis of transformed extreme precipitation
–Obs are 5-year means averaged over Northern mid-latitude and tropical bands
–Dashed: estimate of internal variance doubled
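Repeating the fit and the residual check over a range of truncations can be sketched as a simple loop. The synthetic data, the sizes, and the use of the residual statistic divided by its degrees of freedom as a summary are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, n_chunks = 15, 1, 94
X = rng.standard_normal((n, m))                    # one signal
Y = X @ np.array([1.0]) + rng.standard_normal(n)   # pseudo-observations

ctl1 = rng.standard_normal((n_chunks, n))          # sample for the optimization
ctl2 = rng.standard_normal((n_chunks, n))          # independent sample for testing
cov1 = ctl1.T @ ctl1 / n_chunks
cov2 = ctl2.T @ ctl2 / n_chunks

# EOFs of the first covariance estimate, sorted by decreasing eigenvalue
evals, evecs = np.linalg.eigh(cov1)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

results = []
for k in range(m + 1, n + 1):
    P = evecs[:, :k]                               # k leading EOFs
    Xk, Yk = P.T @ X, P.T @ Y                      # project onto the reduced space
    W = np.diag(1.0 / evals[:k])
    beta = np.linalg.solve(Xk.T @ W @ Xk, Xk.T @ W @ Yk)
    r = Yk - Xk @ beta
    stat = float(r @ np.linalg.solve(P.T @ cov2 @ P, r))
    results.append((k, float(beta[0]), stat / (k - m)))

for k, b, ratio in results:
    print(k, round(b, 3), round(ratio, 3))
```

Scanning the output by truncation shows how stable the scaling factor is and where the residuals start to look inconsistent.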
Basic procedure …. 9.Make inferences about scaling factors OLS expression that ignores uncertainty in the basis looks like…
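The expression is missing from the transcript. A standard form, sketched here under assumed notation, propagates a second, independent covariance estimate through the fitted linear operator. The subscripts 1 and 2 for the two estimates of internal variability and the symbol X for the signal matrix are assumptions.

```latex
% Assumed notation: \hat\Sigma_1^{+} = truncated inverse from the first control sample,
% \hat\Sigma_2 = covariance estimate from the second, independent sample
\widehat{\mathrm{Var}}(\hat\beta)
  = \bigl(X^{T}\hat\Sigma_1^{+}X\bigr)^{-1}
    X^{T}\hat\Sigma_1^{+}\,\hat\Sigma_2\,\hat\Sigma_1^{+}X
    \bigl(X^{T}\hat\Sigma_1^{+}X\bigr)^{-1}
```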
A “typical” detection result
Figure (Min et al., 2011, Fig. S8a, right): scaling factor estimates as a function of EOF truncation
–Space-time analysis of transformed annual extreme precipitation
–Obs are 5-year means averaged over Northern mid-latitude and tropical bands
–* Residual consistency test fails
–O Residual consistency test fails with doubled internal variance
How should we regularize the problem?
Approach to date has been ad hoc
–Filtering + sample covariance matrix (may not be well conditioned) + EOF truncation (Moore-Penrose inverse)
–Neither EOFs nor eigenvalues are well estimated
–Truncation criteria are not clear
Results can be ambiguous in some cases
Filtering occurs both external to the analysis and within the analysis
How should we regularize the problem?
Ribes et al. (2009, 2012a, 2012b) have suggested using the well-conditioned regularized estimator of Ledoit and Wolf (2004)
–Weighted average of the sample covariance matrix and a structured covariance matrix, which in this case is the identity matrix
–This estimate is always well conditioned, is consistent, and has better accuracy than the sample estimator
–Separates the filtering problem from the D&A analysis
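A minimal sketch of a shrinkage estimator of this kind, following the formulas of Ledoit and Wolf (2004) as I understand them and assuming zero-mean samples stored as rows. The function name and synthetic data are hypothetical, and this is not the code used by Ribes et al.

```python
import numpy as np

def ledoit_wolf(samples):
    """Shrink the sample covariance towards a scaled identity matrix.

    samples: array of shape (n_samples, p), rows assumed to have zero mean.
    Returns the regularized covariance and the shrinkage weight on the identity.
    """
    n, p = samples.shape
    S = samples.T @ samples / n                       # sample covariance
    mu = np.trace(S) / p                              # scale of the identity target
    d2 = np.sum((S - mu * np.eye(p)) ** 2) / p        # distance of S from the target
    b2_bar = sum(np.sum((np.outer(x, x) - S) ** 2) for x in samples) / (n ** 2 * p)
    b2 = min(b2_bar, d2)                              # estimated error of S
    shrinkage = b2 / d2
    sigma = shrinkage * mu * np.eye(p) + (1.0 - shrinkage) * S
    return sigma, shrinkage

# Synthetic illustration with fewer samples than dimensions
rng = np.random.default_rng(0)
samples = rng.standard_normal((30, 50))
sigma, shrinkage = ledoit_wolf(samples)
print("shrinkage weight:", shrinkage)
print("smallest eigenvalue:", np.linalg.eigvalsh(sigma).min())
```

Even with fewer samples than dimensions, where the sample covariance is singular, the shrunk estimate can be inverted without any EOF truncation.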
How should we regularize the problem?
Ledoit and Wolf (2004) point out that the weighted average has a Bayesian interpretation (with I corresponding to the prior, and the weighted average to a posterior estimate)
Perhaps convergence could be improved by using a more physically appropriate structured estimator in place of I?
Perhaps the other DA can help?
Terms of the model for extremes
–Space-time vector of annual extremes
–Space-time signal matrix (one column per signal)
–Vector of scaling factors
–Vector of scale parameters
–Vector of shape parameters
Note that these are vectors
What about other distributional settings?
Conclusions
The method continues to evolve
Thinking hard about regularization is a good development (but perhaps not the most critical)
Some key questions
–How do we make objective prefiltering choices?
–How should we construct the “Monte Carlo” sample of realizations that is used to estimate internal variability?
–Similar question for signal estimates
–How should we proceed as we push to answer questions about extremes?