1 Ensemble Methods. Daryl Kleist, Univ. of Maryland-College Park, Dept. of Atmos. & Oceanic Science. JCSDA Summer Colloquium on Data Assimilation, Fort Collins, CO, 27 July – 7 August 2015. Thanks to Kayo Ide (UMD) and Massimo Bonavita (ECMWF) for much of the inspiration and/or slides for this lecture. Acknowledgements also to Eugenia Kalnay (UMD) and Jeff Whitaker (NOAA/ESRL).

2 Outline
I. Introduction: Sequential Filtering
   a) Kalman Filter
   b) Extended KF and Reduced-Rank KF
II. The Ensemble Kalman Filter
   a) Stochastic Filter (Perturbed Observations)
   b) Serial/Square Root Filters (EnSRF, EAKF, LETKF)
   c) Technical Challenges
      i. Localization
      ii. Inflation
   d) Toy model example results
III. Ensemble of Vars
IV. Summary

3 I. Introduction
Data Assimilation Basics: an iterative approach to estimation/forecast of current/future states x using computational models and observations of the system.
- Computational model: forecast from t_{k-1} to t_k.
- Observation: measurement over the assimilation window.
- Data assimilation: analysis at t_k.
Courtesy: Kayo Ide

4 I. Introduction
Data Assimilation Basics, Notation, and Challenges: an iterative approach to estimation/forecast of current/future states x using computational models and observations of the system. Each assimilation cycle has two steps:
Step 1. Model forecast. The forecast gives the background x^b_k, with x_k = M_k(x_{k-1}); observations are measurements y^o_{k'}, with y_{k'} = h(x_{k'}).
Step 2. Assimilation. Integration of x^b_k and y^o_k gives the analysis x^a_k = func(x^b_k, y^o_k), relative to the truth x^t.
Challenges: M_k is nonlinear and imperfect, and x can be large; h_{k'} is (or may be) nonlinear and imperfect, and y may be large or too small; y_{k'} is (or may be) insufficient to determine x_k, and is not exactly at t_k.
Courtesy: Kayo Ide

5 I. Introduction: Probabilistic View of Data Assimilation
Step 1. Model forecast: background x^b_k, with x_k = M_k(x_{k-1}); uncertainty evolution p(x_k | Y_{k-1}) = f_k( p(x_{k-1} | Y_{k-1}) ).
Observation: y^o_{k'}, with y_{k'} = h(x_{k'}); uncertainty in the observation p(y_{k'} | x_{k'}).
Step 2. Assimilation: analysis x^a_k = func(x^b_k, y^o_k); uncertainty reduction/refinement p(x_k | Y_k) from p(x_k | Y_{k-1}) and p(y_{k'} | x_{k'}), where Y_k = {y_k, Y_{k-1}}.
Courtesy: Kayo Ide

6 Perspectives of Data Assimilation
Two main perspectives of practical data assimilation, plus a hybrid approach:
- Variational approach: least-squares estimation [maximum likelihood]
  - 3D-Var (3 dimensions in space)
  - 4D-Var (4th dimension is time)
- Sequential (KF) approach: minimum variance estimate [least uncertainty]
  - Optimal Interpolation (OI)
  - (Extended / Ensemble) Kalman Filter
Courtesy: Kayo Ide

7 Introduction: Kalman Filter
Here, x*_k represents the true state at time k. Superscripts (f) and (a) denote the forecast and analysis, respectively. M represents the nonlinear propagator (the NWP model) that describes the evolution in time. Errors will be discussed shortly.

8 Introduction: Kalman Filter
The linear, unbiased analysis equation can be expressed in the following form for the analysis (a), background (b), and time level (k), using the linear observation operator H_k. To find the best linear unbiased analysis, the Kalman gain is expressed as below, where B represents the background error covariance and R the observation error covariance. Under the condition that K is the optimal gain matrix, we can also obtain an equation for the analysis error covariance.
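The equations on this slide were images in the original; a standard reconstruction, consistent with the notation used here, is:

```latex
x^a_k = x^b_k + K_k \left( y^o_k - H_k x^b_k \right)       % analysis equation
K_k   = B_k H_k^T \left( H_k B_k H_k^T + R_k \right)^{-1}  % Kalman gain
A_k   = \left( I - K_k H_k \right) B_k                     % analysis error covariance
```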

9 Introduction: Kalman Filter
Comment on notation:
- You may see f/b superscripts, using "forecast" and "background" interchangeably.
- Here, I use B and A for the background and analysis error covariance matrices. The "unified notation" (Ide et al. 1997) recommends P^b and P^a, respectively.
For the background forecast using the linear model, subtract the true state x* to define the background error.

10 Introduction: Kalman Filter
We can then expand the background error in terms of the analysis error at the previous time. Defining the model error as the difference between the true evolution and the modeled evolution, the background error can be rewritten as the sum of the propagated analysis error and the model error.

11 Introduction: Kalman Filter
Assuming that the analysis error and the model error are uncorrelated, and inserting the model error covariance matrix Q, we obtain the background error covariance propagation equation.
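The derivation on slides 9–11 was shown as image equations; a standard reconstruction, consistent with the notation above, is:

```latex
x^b_k = \mathbf{M}_k x^a_{k-1}   % linear background forecast
\varepsilon^b_k = x^b_k - x^*_k
  = \mathbf{M}_k \left( x^a_{k-1} - x^*_{k-1} \right) - \left( x^*_k - \mathbf{M}_k x^*_{k-1} \right)
  = \mathbf{M}_k \varepsilon^a_{k-1} - \eta_k,
\qquad \eta_k = x^*_k - \mathbf{M}_k x^*_{k-1}   % model error
% Assuming E[\varepsilon^a_{k-1} \eta_k^T] = 0 and defining Q_k = E[\eta_k \eta_k^T]:
B_k = E\left[ \varepsilon^b_k \left( \varepsilon^b_k \right)^T \right]
    = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k
```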

12 Introduction: Kalman Filter (linear)
Complete set of equations for DA cycling (forecast step, then analysis step):
- State and error covariances are propagated forward in time, and updated with observations at time k.
- Under assumptions of linearity (M, H), the KF produces the optimal set of analysis states.
- The analysis is the minimum variance estimate of the state.
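Collecting the pieces above, the complete linear KF cycle (a reconstruction of the slide's image equations):

```latex
% Forecast step
x^b_k = \mathbf{M}_k x^a_{k-1}, \qquad B_k = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k
% Analysis step
K_k = B_k H_k^T \left( H_k B_k H_k^T + R_k \right)^{-1}
x^a_k = x^b_k + K_k \left( y^o_k - H_k x^b_k \right), \qquad A_k = (I - K_k H_k) B_k
```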

13 Extended Kalman Filter
For weakly nonlinear systems, slight modifications can be made. Here, the state update and propagation use the nonlinear operators, but the covariance update and propagation use linearized operators (Jacobians, i.e. the tangent-linear and adjoint models, TL/AD), evaluated about the analysis and background states.
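A standard reconstruction of the EKF equations (the originals were images):

```latex
% State forecast and analysis use the nonlinear operators M_k and h_k
x^b_k = M_k(x^a_{k-1}), \qquad
x^a_k = x^b_k + K_k \left( y^o_k - h_k(x^b_k) \right)
% Covariances use the Jacobians
% \mathbf{M}_k = \partial M_k / \partial x \,|_{x^a_{k-1}}, \quad
% \mathbf{H}_k = \partial h_k / \partial x \,|_{x^b_k}
B_k = \mathbf{M}_k A_{k-1} \mathbf{M}_k^T + Q_k, \qquad
K_k = B_k \mathbf{H}_k^T \left( \mathbf{H}_k B_k \mathbf{H}_k^T + R_k \right)^{-1}, \qquad
A_k = (I - K_k \mathbf{H}_k) B_k
```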

14 Extended Kalman Filter
Model forecast gives (x^b, B); the observation measurement gives (y^o, R), with R prescribed; the EKF analysis gives (x^a, A).
Step 1. Forecast: (x^b_k, B_k) is obtained by integrating from (x^a_{k-1}, A_{k-1}) over [t_{k-1}, t_k].
Step 2. Analysis: (x^a_k, A_k).
Courtesy: Kayo Ide

15 Extended Kalman Filter
Model forecast x^b, observation y^o, and analysis x^a over each assimilation cycle/window. Practical difficulties: M_k is nonlinear and imperfect; the dimension of x may be large; the dimension of B may be (large)^2.
Courtesy: Kayo Ide

16 Kalman Filter for Large Dimensions
Kalman filters (and the EKF) are impractical for large systems like NWP models: for present-day NWP, the state size N can be > O(10^8). However, a variety of Kalman filters have been developed for large-dimensional systems, all of which rely on low-rank approximations of the background and analysis error covariance matrices. Assume that B_k has rank M << N, so that we can write the error covariance in terms of a matrix X^b (N×M), where M can be ~100.
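The low-rank factorization referred to (shown as an image in the original) is, up to the 1/(M-1) normalization that is sometimes absorbed into X^b:

```latex
B_k \approx \frac{1}{M-1} X^b_k \left( X^b_k \right)^T, \qquad
X^b_k \in \mathbb{R}^{N \times M}, \quad M \ll N
```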

17 Reduced-Rank KF
The Kalman gain can then be rewritten so that the increment is a linear combination of the columns of X^b_k (and thus confined to that subspace). It can be shown that the covariance propagation can likewise be rewritten, requiring only M integrations of M_k. Note that the full B_k never appears, and H_k operates on the smaller dimension. The analysis update is then applied as before.
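A plausible reconstruction of the reduced-rank gain (the slide's equations were images), obtained by substituting the factorization above into the standard gain formula:

```latex
K_k = \frac{1}{M-1}\, X^b_k \left( \mathbf{H}_k X^b_k \right)^T
\left[ \frac{1}{M-1} \left( \mathbf{H}_k X^b_k \right) \left( \mathbf{H}_k X^b_k \right)^T + R_k \right]^{-1}
```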

18 Ensemble Kalman Filters
Ensemble Kalman Filters (EnKF) are Monte Carlo approximations/implementations of the KF, using sample covariances from an ensemble (an overbar denotes the ensemble mean). Here X^b_k is the N×M matrix of ensemble forecast perturbations, and the full B_e is never explicitly computed; instead, it is represented implicitly in the M-dimensional ensemble subspace.
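A minimal NumPy sketch of this bookkeeping (the names and shapes are my own, not from the slides): form the perturbation matrix and apply B_e to a vector without ever building the N×N covariance.

```python
import numpy as np

def perturbation_matrix(ens):
    """ens: (N, M) array, one column per ensemble member.
    Returns the ensemble mean (N,) and perturbation matrix X (N, M)."""
    mean = ens.mean(axis=1)
    X = ens - mean[:, None]
    return mean, X

def apply_Be(X, v):
    """Apply the sample covariance B_e = X X^T / (M-1) to a vector v
    in O(N*M) work, without forming the N x N matrix."""
    M = X.shape[1]
    return X @ (X.T @ v) / (M - 1)

# Usage: 40-variable state, 20 members
rng = np.random.default_rng(0)
ens = rng.standard_normal((40, 20))
mean, X = perturbation_matrix(ens)
v = rng.standard_normal(40)
print(apply_Be(X, v).shape)  # (40,)
```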

19 Ensembles
Represent the pdf of the state with discrete sampling (ensemble members). The mean and covariance of the ensemble members define the evolved pdf as it is propagated from t_0 to t_1.

20 Ensemble Approach to Represent p(x)
Ensemble members: the spread determines the mean and covariance.
Issues:
- Sampling of p(x) by the ensemble can be poor, especially for small M.
- The rank of P is at most M-1 (demonstrated in the sketch below).
- There are infinitely many ΔX that have the same P = (1/(M-1)) ΔX (ΔX)^T.
Courtesy: Kayo Ide
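A quick NumPy check of the rank statement (a toy demonstration, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 500, 20                             # state dim >> ensemble size
ens = rng.standard_normal((N, M))
X = ens - ens.mean(axis=1, keepdims=True)  # perturbations sum to zero
P = X @ X.T / (M - 1)                      # sample covariance, N x N

# Centering removes one degree of freedom: rank is at most M - 1
print(np.linalg.matrix_rank(P))            # 19
```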

21 Sampling & Reconstruction by Ensemble: 1D
Different realizations of M samples drawn from the original p(x) (assuming Gaussian pdfs) give a reconstructed p*(x). If the sampling is well done, then p*(x) ≈ p(x), but the 'fitness' of p*(x) to p(x) varies case by case, particularly for small M. In all cases here, N < M.
Courtesy: Kayo Ide

22 Sampling & Reconstruction by Ensemble: 2D
As in 1D: different realizations of M samples drawn from the original p(x) (assuming Gaussian pdfs) give a reconstructed p*(x). If the sampling is well done, then p*(x) ≈ p(x), but the fitness of p*(x) to p(x) varies case by case, particularly for small M. In all cases here, N ≤ M.
Courtesy: Kayo Ide

23 EnKF Methodology
The probabilistic cycle of slide 5, realized with ensembles: the uncertainty evolution p(x_k | Y_{k-1}) = f_k( p(x_{k-1} | Y_{k-1}) ) is carried by an ensemble forecast, and the uncertainty reduction/refinement p(x_k | Y_k) from p(x_k | Y_{k-1}) and p(y_{k'} | x_{k'}), with Y_k = {y_k, Y_{k-1}}, is carried by an ensemble analysis.
Courtesy: Kayo Ide

24 Ensemble Kalman Filters
Recall that the KF (and EKF) propagate the error covariances explicitly using the TL/AD of the model and observation operators. In an EnKF, error covariances are instead evolved implicitly in time through an ensemble of realizations of the nonlinear model. Note the absence of M_k and M^T_k: no TL or AD model is needed.

25 Stochastic EnKF
Starting from the EnKF analysis update equation, where B^e_k is represented by ensemble statistics: the analysis is the mean of the posterior ensemble, with a corresponding analysis error covariance, and the perturbations are updated accordingly (if the observations are the same for all members).

26 Stochastic EnKF
Updating every member with the same observations yields an estimate of the analysis error covariance that is smaller than the one required by BLUE, so the analysis error is underestimated. One solution is to stochastically perturb the observations for each member, with perturbations drawn from a Gaussian distribution with covariance R.
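The two covariance expressions being compared (image equations in the original; a standard reconstruction, assuming the optimal gain K):

```latex
% Updating unperturbed member departures with the gain K gives
A_e = (I - KH)\,B\,(I - KH)^T = (I - KH)B - K R K^T
% whereas BLUE with optimal K gives
A = (I - KH)B
% so A_e underestimates A by K R K^T; perturbing the observations
% with covariance R restores the missing term on average.
```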

27 Stochastic EnKF
In the limit of very large ensemble size, the sample covariance R_p of the observation perturbations coincides with the original, prescribed R. The new Kalman gain K_p is identical to before, but with R replaced by R_p, yielding the correct analysis error covariance matrix. This is known as the perturbed-observation EnKF (Houtekamer and Mitchell, 1998).
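A minimal NumPy sketch of one perturbed-observation EnKF update (the variable names and the linear observation operator H are my own assumptions; serious implementations add localization and inflation):

```python
import numpy as np

def enkf_po_update(Xb, H, yo, R, rng):
    """Perturbed-observation EnKF analysis step.
    Xb: (N, M) background ensemble; H: (p, N) linear obs operator;
    yo: (p,) observations; R: (p, p) obs error covariance."""
    N, M = Xb.shape
    xb_mean = Xb.mean(axis=1, keepdims=True)
    Xp = Xb - xb_mean                       # background perturbations
    Yp = H @ Xp                             # perturbations in obs space
    Pxy = Xp @ Yp.T / (M - 1)               # cross covariance
    Pyy = Yp @ Yp.T / (M - 1) + R           # innovation covariance
    K = Pxy @ np.linalg.inv(Pyy)            # Kalman gain (never forms B)
    # Perturb the observations independently for each member
    eps = rng.multivariate_normal(np.zeros(len(yo)), R, size=M).T
    eps -= eps.mean(axis=1, keepdims=True)  # keep the ensemble mean unbiased
    return Xb + K @ (yo[:, None] + eps - H @ Xb)

# Usage: 40-variable state, 10 members, 20 observations
rng = np.random.default_rng(2)
Xb = rng.standard_normal((40, 10))
H = np.eye(20, 40)                          # observe the first 20 variables
yo = rng.standard_normal(20)
Xa = enkf_po_update(Xb, H, yo, 0.5 * np.eye(20), rng)
print(Xa.shape)                             # (40, 10)
```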

28 Deterministic EnKF [Square Root Filters]
There is another class of filters that does not require perturbing the observations. Starting with the ensemble background perturbation matrix again, we define the analysis to be the background plus a linear combination of background perturbations, where w_k is a vector of coefficients in ensemble space. We then expand the right-hand side and define the departures.

29 Deterministic EnKF [Square Root Filters]
This implies an equation for the weights. Defining a suitable matrix in observation space yields a simplified formulation for the weights.

30 Deterministic EnKF [Square Root Filters]
In other words, the gain is computed in observation space. However, using the Sherman-Morrison-Woodbury formula, this can be rewritten in ensemble space. The perturbations can then be updated with a transform T chosen to satisfy the required covariance relationship, and it can be shown that one such choice accomplishes this.
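A standard ETKF-style reconstruction of these relationships (the slide's equations were images), writing Y^b = H X^b for the background perturbations mapped into observation space:

```latex
% Analysis covariance in ensemble space (M x M), via Sherman-Morrison-Woodbury
\tilde{P}^a = \left[ (M-1)\,I + (Y^b)^T R^{-1} Y^b \right]^{-1}
% Mean-update weights
\bar{w}^a = \tilde{P}^a \,(Y^b)^T R^{-1} \left( y^o - \bar{y}^b \right)
% Symmetric square-root transform for the perturbations
T = \left[ (M-1)\,\tilde{P}^a \right]^{1/2}, \qquad X^a = X^b\,T
% which satisfies A = \frac{1}{M-1} X^a (X^a)^T = X^b\,\tilde{P}^a\,(X^b)^T
```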

31 Deterministic EnKF [Square Root Filters]
There are many implementations of deterministic square root filters, with differences in the handling of observations (solving on local patches, serial assimilation of observations, etc.):
- Ensemble Transform Kalman Filter (ETKF; Bishop et al. 2001)
- Local Ensemble Transform Kalman Filter (LETKF; Ott et al. 2004, Hunt et al. 2007)
- Serial Ensemble Square Root Filter (EnSRF; Whitaker and Hamill 2002)
- Ensemble Adjustment Kalman Filter (EAKF; Anderson 2001)
Overall, all of the above are largely similar and differ in their practical implementation. As a class, they are distinct from the stochastic filter.

32 LETKF: Localization Based on Observations
Perform data assimilation in a local volume, choosing the observations to use. The state estimate is updated at the central grid point (red dot).

33 LETKF: Localization Based on Observations
As before, the state estimate is updated at the central grid point (red dot), but now all observations (purple diamonds) within the local region are assimilated.

34 LETKF (Hunt et al. 2007)
Globally: in the forecast step, run the ensemble of nonlinear forecasts; in the analysis step, construct the background mean and perturbation matrices (in model and observation space).
Locally: choose for each grid point the observations to be used, and compute the local analysis error covariance and perturbations in ensemble space; compute the analysis mean in ensemble space and add it to W^a to get the analysis ensemble in ensemble space. The new ensemble analyses in model space are the columns of X^a; gathering the grid-point analyses forms the new global analysis.
Note that the output of the LETKF is a set of analysis weights and matrices of perturbation weights; these weights multiply the ensemble forecasts.
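A compact NumPy sketch of the LETKF analysis for one local region, following Hunt et al. (2007) (array names are mine; the loop over grid points and any distance-weighting of R are omitted for brevity):

```python
import numpy as np

def letkf_local_analysis(Xb, Yb_dep, yb_mean, yo, Rinv, infl=1.0):
    """LETKF update in ensemble space for one local region.
    Xb: (n, M) local background ensemble; yb_mean: (p,) mean of H(x_i);
    Yb_dep: (p, M) obs-space perturbations H(x_i) - yb_mean;
    yo: (p,) local observations; Rinv: (p, p) inverse obs covariance."""
    n, M = Xb.shape
    xb_mean = Xb.mean(axis=1)
    Xp = Xb - xb_mean[:, None]
    C = Yb_dep.T @ Rinv                                       # (M, p)
    # Pa_tilde = [(M-1) I / infl + C Yb]^{-1}, via symmetric eigendecomposition
    evals, evecs = np.linalg.eigh((M - 1) * np.eye(M) / infl + C @ Yb_dep)
    Pa = evecs @ np.diag(1.0 / evals) @ evecs.T               # (M, M)
    Wa = evecs @ np.diag(np.sqrt((M - 1) / evals)) @ evecs.T  # [(M-1) Pa]^{1/2}
    wa_mean = Pa @ (C @ (yo - yb_mean))                       # mean weights
    W = Wa + wa_mean[:, None]                                 # add mean to each column
    return xb_mean[:, None] + Xp @ W                          # (n, M) analysis ensemble
```

In the full LETKF this function would be called once per grid point, each call with its own local observation selection, and the resulting columns gathered into the global analysis ensemble.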

35 Two Main Branches of EnKF: Analysis Processes at a Fixed t_k
- Stochastic EnKF approach: Perturbed Observations (PO) (Houtekamer & Mitchell, MWR, 1998; Burgers et al., MWR, 1998). Given {x^b_m}, y^o, R, and h/H: apply K to each member x^b_m to obtain {x^a_m}.
- Square-root EnKF approach: Serial Ensemble Square Root Filter (EnSRF) (Whitaker & Hamill, MWR, 2002); [Local] Ensemble Transform KF ([L]ETKF) (Hunt et al., Physica D, 2007). Given {x^b_m}, y^o, R, and h (/H): apply K to (x^b, B) to obtain x^a and A, and hence {x^a_m}.
- For comparison, the Extended Kalman Filter: given x^b, B, y^o, R, and h/H, compute K = B H^T (H B H^T + R)^{-1}; obtain x^a = x^b + Δx^a, with Δx^a = K (y^o - h(x^b)), and A = (I - K H) B.
Courtesy: Kayo Ide

36 What does B_e gain us?
- Allows for flow dependence / "errors of the day".
- Multivariate correlations from the dynamic model, which are quite difficult to incorporate into fixed error covariance models.
- Evolves with the system, and can capture changes in the observing network.
- More information extracted from the observations => better analysis => better forecasts.

37 What does B_e gain us?
Temperature observation near a warm front: analysis increments using the fixed covariance B_f versus the ensemble covariance B_e.
Courtesy: Jeff Whitaker

38 What does B_e gain us?
Surface pressure observation near an "atmospheric river": first-guess surface pressure (white contours) and the precipitable water increment (A-G, red contours) after assimilating a single surface pressure observation (yellow dot) using B_e.
Courtesy: Jeff Whitaker

39 So what's the catch?
- Rank deficiency: the ensemble sizes used for NWP (~40-100+) are not nearly large enough; too few degrees of freedom are available to fit the observations, and the low-rank approximation yields spurious long-distance correlations.
- Mistreatment of system error/uncertainty: sampling (as above), model error, observation operator error, representativeness, etc.
- The state estimate is an ensemble average: this can produce unphysical estimates, smooth out high-fidelity information, etc.

40 Inflation and Localization
- Inflation: used to inflate the ensemble estimate of uncertainty to avoid filter divergence (additive and multiplicative).
- Localization:
  - Domain localization: solve the equations independently for each grid point (LETKF).
  - Covariance localization: performed element-wise (Schur product) on the covariances themselves.

41 Methods for representing model error (inflation)
Multiplicative inflation (see the sketch below):
- Relaxation-to-prior spread (RTPS)
- Relaxation-to-prior perturbation (Zhang et al. 2004)
- Adaptive (Anderson 2007)
Additive inflation: add random samples from a specified distribution to each ensemble member after the analysis step.
- Env. Canada uses random samples of an isotropic 3DVar covariance matrix.
- NCEP uses random samples of 48-h minus 24-h forecast differences (forecasts valid at the same time).
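A minimal sketch of two of these options (the RTPS relaxation factor follows Whitaker and Hamill 2012; the additive sample bank here is a stand-in, not either center's actual perturbation source):

```python
import numpy as np

def rtps_inflation(Xb, Xa, alpha=0.9):
    """Relaxation-to-prior spread: rescale posterior perturbations so the
    spread relaxes toward the prior spread by a factor alpha in [0, 1]."""
    xa_mean = Xa.mean(axis=1, keepdims=True)
    Xap = Xa - xa_mean
    sb = Xb.std(axis=1, ddof=1)             # prior spread per variable
    sa = Xap.std(axis=1, ddof=1)            # posterior spread per variable
    factor = 1.0 + alpha * (sb - sa) / sa   # RTPS multiplicative factor
    return xa_mean + Xap * factor[:, None]

def additive_inflation(Xa, samples, scale, rng):
    """Add scaled random draws from a bank of perturbation samples
    (e.g., lagged forecast differences) after the analysis step."""
    idx = rng.integers(0, samples.shape[1], size=Xa.shape[1])
    add = samples[:, idx]
    add = add - add.mean(axis=1, keepdims=True)  # center the draws
    return Xa + scale * add
```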

42 Imperfect Model (Additive + Multiplicative Inflation Example)
Additive inflation alone outperforms multiplicative inflation alone (compare values on the y-axis to values along the x-axis), and a combination of both is better than either alone. This suggests multiplicative and additive inflation represent different error sources in the DA cycle. From Whitaker and Hamill (2012).

43 Example of Covariance Localization
Estimates of covariances from a small ensemble will be noisy, with a small signal-to-noise ratio especially where the covariance is small. A common remedy, sketched below, is to taper the sample covariance with a compactly supported correlation function.
Courtesy: Jeff Whitaker
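A sketch of the usual element-wise (Schur product) localization using the Gaspari and Cohn (1999) fifth-order compactly supported function (a common choice, though the slide does not name a specific taper):

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn (1999) fifth-order taper: a compactly supported,
    correlation-like function that is exactly zero beyond 2c."""
    r = np.abs(dist) / c
    taper = np.zeros_like(r, dtype=float)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r < 2.0)
    taper[m1] = (-0.25 * r[m1]**5 + 0.5 * r[m1]**4 + 0.625 * r[m1]**3
                 - (5.0 / 3.0) * r[m1]**2 + 1.0)
    taper[m2] = ((1.0 / 12.0) * r[m2]**5 - 0.5 * r[m2]**4 + 0.625 * r[m2]**3
                 + (5.0 / 3.0) * r[m2]**2 - 5.0 * r[m2] + 4.0
                 - (2.0 / 3.0) / r[m2])
    return taper

# Schur (element-wise) product of a sample covariance with the taper
N, M, c = 40, 10, 5.0
dist = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])
rng = np.random.default_rng(3)
X = rng.standard_normal((N, M))
X -= X.mean(axis=1, keepdims=True)
B_loc = (X @ X.T / (M - 1)) * gaspari_cohn(dist, c)  # localized covariance
```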

44 Real-World NWP Example of Localization
Courtesy: Jeff Whitaker

45 Toy Model Experiments
Lorenz-96 40-variable model (F = 8.0), cycling every 0.05 model time units (~6 hours). We investigate aspects of the EnKF and will show various RMSE metrics.
Graphic courtesy Rahul Mahajan
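For reference, a small implementation of the Lorenz-96 model with an RK4 step matching the 0.05 cycling interval (a standard formulation; only N = 40 and F = 8.0 come from the slides):

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step (dt = 0.05 ~ 6 h of 'weather')."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

# Spin up a 40-variable trajectory from a slightly perturbed rest state
x = 8.0 * np.ones(40)
x[19] += 0.01
for _ in range(1000):
    x = rk4_step(x)
```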

46 ETKF – Impact of Covariance Inflation (nobs = 20)
Average ETKF RMSE for the analysis mean as a function of the inflation parameter. The average is taken over 1800 cases, ignoring the first 200 to allow for spin-up. (Reference: EKF RMSE = 12.0429.)

47 Localization
Localization of the observation selection allows for a reduction in ensemble size (inflation kept constant at 1.1 here). For larger ensembles, more work is needed to improve the result (for example, inflating observation error with distance).

48 Comparison between 3DVAR, EKF, and LETKF
Time series (cycles 500 to 1000) showing the analysis RMSE (truth minus analysis) for the case where 20 grid points are observed, alongside 4 unobserved ones. EKF and LETKF are significantly better than 3DVAR.

Scheme   Mean RMSE
3DVAR    51.6823
EKF      14.6979
LETKF    11.9255

49 Ensemble of Data Assimilations
Much like the stochastic EnKF, ECMWF and Meteo-France use an ensemble of data assimilations instead of an EnKF: they perturb the observations and the model, with the ensemble designed to represent and estimate the uncertainty in their deterministic 4DVAR. This provides flow-dependent estimates of analysis error for their EPS, and also flow-dependent estimates of background error for use in DA (either as B_0 or in a hybrid; next lecture). It can be hugely expensive, given that a variational (4DVAR) update has to be executed for each ensemble member!

50 Summary
EnKFs are Monte Carlo implementations of the sequential Kalman filter (minimum variance estimate).
- PROS: easy to implement; no TL/AD needed; flow-dependent estimates of error covariances; the solver can work in ensemble space (computationally efficient).
- CONS: sample sizes are usually much too small for large-dimensional systems such as NWP models; requires ad hoc methods of inflation and localization.
Many variants of EnKFs exist, but they can be split into two classes:
- stochastic, perturbed-observation schemes;
- deterministic, square root filters.
While these variants differ in their details and practical implementation, they are all solving for essentially the same thing. Observation-space versus physical-space localization is a further distinction (when might this matter?). The EnKF has been successfully applied to many high-dimensional problems.

51 Summary (cont.)
As with the variational solutions, similar assumptions need to be made to formulate the EnKF (including Gaussianity), although, as noted, one does not necessarily need linear, differentiable observation operators as in variational schemes. There is another class of "ensemble methods" designed to sample/estimate the full PDF (not just the mean and covariance):
- Particle filters: nonlinear, non-Gaussian DA, toward Bayesian filtering.
- These are expensive, with greater dimensionality issues than the EnKF.

52 Selective References
Anderson, J. L., 2001. An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
Bishop, C. H., Etherton, B. J. and Majumdar, S. J., 2001. Adaptive sampling with the ensemble transform Kalman filter. Part I: theoretical aspects. Mon. Wea. Rev., 129, 420–436.
Burgers, G., Van Leeuwen, P. J. and Evensen, G., 1998. On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
Evensen, G., 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99(C5), 10,143–10,162.
Houtekamer, P. L. and Mitchell, H. L., 1998. Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
Houtekamer, P. L. and Mitchell, H. L., 2001. A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
Hunt, B. R., Kostelich, E. J. and Szunyogh, I., 2007. Efficient data assimilation for spatiotemporal chaos: a local ensemble transform Kalman filter. Physica D, 230, 112–126.
Ott, E., Hunt, B. R., Szunyogh, I., Zimin, A. V., Kostelich, E. J. and co-authors, 2004. A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.
Tippett, M. K., Anderson, J. L., Bishop, C. H., Hamill, T. M. and Whitaker, J. S., 2003. Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.
Whitaker, J. S. and Hamill, T. M., 2002. Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
Whitaker, J. S. and Hamill, T. M., 2012. Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089.
Zhang, F., Snyder, C. and Sun, J., 2004. Impact of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253.

