Data Assimilation Theory
CTCD Data Assimilation Workshop, Nov 2005
Sarah Dance
Data assimilation is often treated as a black-box algorithm: observations and a priori information go IN, an analysis comes OUT (apologies to Rube Goldberg). BUT understanding and developing what goes on inside the box is crucial!
Some DARC DA Theory Group Projects
- Nonlinear assimilation techniques
- Convergence of 4D-Var
- Reduced order modelling
- Stochastic processes in DA
- Treatment of observation error correlations
- Background error modelling and balance
- Multiscale DA
- Phase errors and rearrangement theory
- Model errors and bias correction
- EnKF and bias
- Information content
- Observation targeting
Some of these relate to developing new methods, improving and understanding existing algorithms, evaluating models, and using data assimilation to determine which observations will be most useful for a given objective. A lot of them have been triggered by problems that have occurred in atmospheric and oceanic applications. In the theory group we like to abstract the problem, consider it in a simple model where we can actually understand what is going on, and then apply what we have learnt in the big model. Many of the problems we look at are generic difficulties that hold whatever the application.
Formulations of the Ensemble Kalman Filter and Bias MSc thesis by David Livings, supervised by Sarah Dance and Nancy Nichols
Outline
- Bayesian state estimation and the Kalman Filter
- The EnKF
- Bias and the EnKF
- Conclusions
Prediction (between observations)
e.g. Suppose x_k = M x_{k-1} + eta_k, where M is linear and the prior and model noise are Gaussian:
P(x_{k-1}) ~ N(x_b, P),   eta_k ~ N(0, Q).
Then the predicted distribution is
P(x_k) ~ N(M x_b, M P M^T + Q).
These expressions are deceptively simple, since in general they are very difficult to evaluate.
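The Gaussian prediction step above can be sketched numerically. This is a minimal illustration using NumPy with toy 2x2 matrices (all values here are invented for the example):

```python
import numpy as np

# Toy linear model M, prior covariance P, model-noise covariance Q
M = np.array([[1.0, 0.1],
              [0.0, 1.0]])
P = np.array([[0.5, 0.0],
              [0.0, 0.5]])
Q = 0.01 * np.eye(2)
xb = np.array([1.0, 0.0])   # prior mean

# Predicted distribution is N(M xb, M P M^T + Q)
x_pred = M @ xb
P_pred = M @ P @ M.T + Q
```

The cost of forming M P M^T explicitly is what becomes prohibitive when the state dimension is large, which motivates the ensemble methods later in the talk.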
At an observation we use Bayes' rule:
P(x | y) is proportional to P(y | x) P(x),
i.e. posterior (analysis pdf) ∝ likelihood of the observations (observation error pdf) × prior (background error distribution).
Bayes rule illustrated (two figure slides; graphics not reproduced)
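The Gaussian-Gaussian case of Bayes' rule can be illustrated with a scalar example: for a Gaussian prior and a Gaussian observation likelihood, the posterior is Gaussian with a precision-weighted mean (a standard textbook result; the numbers below are invented for illustration):

```python
import numpy as np

# Scalar Bayes' rule with Gaussians: prior N(xb, sb2), observation y with error variance sr2
xb, sb2 = 0.0, 4.0   # background mean and variance
y, sr2 = 2.0, 1.0    # observation and observation-error variance

# Posterior precision is the sum of the precisions; the mean is precision-weighted
post_var = 1.0 / (1.0 / sb2 + 1.0 / sr2)
post_mean = post_var * (xb / sb2 + y / sr2)
```

Note how the analysis mean is pulled toward the more precise source of information (here the observation), and the posterior variance is smaller than either input variance.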
The Kalman Filter
Use the prediction equation and Bayes' rule, assuming linear models (forecast and observation) and Gaussian statistics: this gives the Kalman filter.
BUT:
- Models are nonlinear
- Evolving large covariance matrices is expensive (10^6 x 10^6 in meteorology)
So use an ensemble (Monte Carlo idea).
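The Monte Carlo idea can be sketched as follows: instead of evolving the covariance matrix, propagate a finite ensemble of states and let the sample statistics stand in for the exact ones. A hedged NumPy sketch with invented toy matrices (matching the earlier prediction example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 2, 5000   # state dimension, ensemble size (large here to make the check tight)

M = np.array([[1.0, 0.1], [0.0, 1.0]])
P = 0.5 * np.eye(n)
Q = 0.01 * np.eye(n)
xb = np.array([1.0, 0.0])

# Draw an ensemble from the prior, propagate each member, add model noise
X0 = rng.multivariate_normal(xb, P, size=N)              # shape (N, n)
noise = rng.multivariate_normal(np.zeros(n), Q, size=N)
X1 = X0 @ M.T + noise

# Sample mean and covariance approximate the exact Kalman prediction
P_exact = M @ P @ M.T + Q
P_sample = np.cov(X1, rowvar=False)
```

In operational settings N is tens to hundreds, not thousands, so sampling error in the ensemble statistics is a serious issue, which is part of why the consistency questions later in the talk matter.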
Results with the ETKF (old formulation) and Peter Lynch's swinging spring model. N = 10, perfect observations. (Figure: red = ensemble mean, blue = ensemble std.; error bars indicate obs. std.) The ensemble statistics are not consistent with the truth!
Bias and the EnKF
Many EnKF algorithms can be put into a "square root" framework. Define an ensemble perturbation matrix whose columns are the deviations of the members from the ensemble mean:
X' = (x_1 - xbar, ..., x_N - xbar).
So, by definition of the ensemble mean, the perturbations sum to zero: X' 1 = 0.
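The zero-sum property of the perturbation matrix follows directly from the definition of the mean, and is easy to verify numerically. A minimal sketch (random toy ensemble; some formulations also scale X' by 1/sqrt(N-1), which does not affect the property):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 3, 10
X = rng.standard_normal((n, N))   # ensemble: one member per column

# Perturbation matrix: subtract the ensemble mean from every member
xbar = X.mean(axis=1, keepdims=True)
Xp = X - xbar

# By definition of the mean, each row of Xp sums to zero (X' @ ones = 0)
row_sums = Xp.sum(axis=1)
```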
Square-root ensemble updates
The mean of the ensemble is updated separately. Ensemble perturbations are updated as
X'^a = X'^f T,
where T is a (non-unique) square root of an update equation for the analysis covariance. Thus, for consistency, the updated perturbations must still sum to zero: X'^a 1 = 0. David discovered that not all implementations preserve this property. We have now found necessary and sufficient conditions for consistency.
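The consistency requirement can be demonstrated with a toy check: since (X' T) 1 = X' (T 1), a transform T preserves the zero-sum property when the vector of ones is an eigenvector of T, and a generic transform need not preserve it. This is an illustrative sketch (a Householder reflection built to fix the ones direction, versus an arbitrary random T; not the specific ETKF transform from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 3, 6
X = rng.standard_normal((n, N))
Xp = X - X.mean(axis=1, keepdims=True)   # zero-row-sum perturbations

# Build an orthogonal T with ones as an eigenvector: reflect across a
# direction v chosen orthogonal to the ones vector
v = rng.standard_normal(N)
v -= v.mean()                            # now v . ones = 0
v /= np.linalg.norm(v)
T_good = np.eye(N) - 2.0 * np.outer(v, v)   # Householder; T_good @ ones = ones

# A generic transform does not map ones to a multiple of ones
T_bad = rng.standard_normal((N, N))

good_sums = (Xp @ T_good).sum(axis=1)    # stays zero
bad_sums = (Xp @ T_bad).sum(axis=1)      # generically nonzero -> biased ensemble
```

An ensemble whose perturbations no longer sum to zero has a sample mean that disagrees with the separately updated analysis mean, which is exactly the bias discussed on the next slide.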
Consequences
If the zero-sum property is not preserved:
- The ensemble will be biased
- The size of the ensemble spread will be too small
- Filter divergence is more likely to occur!
Care must be taken in algorithm choice and implementation.