École Doctorale des Sciences de l'Environnement d'Île-de-France, Année 2009-2010. Modélisation Numérique de l'Écoulement Atmosphérique et Assimilation d'Observations.


École Doctorale des Sciences de l'Environnement d'Île-de-France, Année 2009-2010
Modélisation Numérique de l'Écoulement Atmosphérique et Assimilation d'Observations
Olivier Talagrand, Cours 7, 4 Juin 2010

Two solutions:
- Low-rank filters (Heemink, Pham, …): Reduced Rank Square Root Filters, Singular Evolutive Extended Kalman Filter, …
- Ensemble filters (Evensen, Anderson, …): uncertainty is represented, not by a covariance matrix, but by an ensemble of point estimates in state space, meant to sample the conditional probability distribution for the state of the system (ensemble size N ≈ O( )). The ensemble is evolved in time through the full model, which eliminates any need for a linearity hypothesis on the temporal evolution (see the sketch below).
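As a minimal illustration of the ensemble idea, the sketch below (Python/NumPy, with a generic, hypothetical `model_step` function standing in for the full model) propagates each member through the model and computes the sample mean and covariance that the ensemble implicitly carries.

```python
import numpy as np

def propagate_ensemble(ensemble, model_step):
    """Advance each ensemble member through the full (possibly nonlinear) model.

    ensemble   : (N, n) array, N members of an n-dimensional state
    model_step : function mapping a state vector to the state one step later
    """
    return np.array([model_step(member) for member in ensemble])

def ensemble_statistics(ensemble):
    """Sample mean and covariance of the ensemble (the implicit P^b of the EnKF)."""
    mean = ensemble.mean(axis=0)
    anomalies = ensemble - mean                      # deviations from the mean
    cov = anomalies.T @ anomalies / (len(ensemble) - 1)
    return mean, cov
```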

I. Hoteit, Doctoral Dissertation, Université Joseph Fourier, Grenoble, 2001

How to update the predicted ensemble with new observations?

Predicted ensemble at time $t$: $\{x^b_n\}$, $n = 1, \ldots, N$.
Observation vector at the same time: $y = Hx + \varepsilon$.

Gaussian approach: produce a sample of the probability distribution for the real observed quantity $Hx$:
$y_n = y - \varepsilon_n$, where $\varepsilon_n$ is distributed according to the probability distribution for the observation error $\varepsilon$.

Then use the Kalman formula to produce a sample of 'analysed' states:
$x^a_n = x^b_n + P^b H^T [H P^b H^T + R]^{-1} (y_n - H x^b_n)$, $n = 1, \ldots, N$  (2)
where $P^b$ is the covariance matrix of the predicted ensemble $\{x^b_n\}$.

Remark. If $P^b$ were the exact covariance matrix of the background error, (2) would achieve Bayesian estimation, in the sense that $\{x^a_n\}$ would be a sample of the conditional probability distribution for $x$, given all data up to time $t$.
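A minimal sketch of analysis step (2) in Python/NumPy, for illustration only: $P^b$ and the gain matrix are built explicitly, which is only affordable for small state dimensions (operational implementations work with the ensemble anomalies instead).

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng=None):
    """Stochastic (perturbed-observation) EnKF analysis, formula (2) above.

    ensemble : (N, n) predicted ensemble {x^b_n}
    y        : (p,) observation vector
    H        : (p, n) observation operator
    R        : (p, p) observation-error covariance
    """
    rng = np.random.default_rng() if rng is None else rng
    N = ensemble.shape[0]
    x_mean = ensemble.mean(axis=0)
    Xb = ensemble - x_mean                            # background anomalies
    Pb = Xb.T @ Xb / (N - 1)                          # sample covariance P^b
    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)    # Kalman gain
    # Perturbed observations: y_n = y - eps_n, with eps_n drawn from N(0, R)
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    analysed = np.empty_like(ensemble)
    for n in range(N):
        y_n = y - eps[n]
        analysed[n] = ensemble[n] + K @ (y_n - H @ ensemble[n])
    return analysed
```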

Called the Ensemble Kalman Filter (EnKF). But problems:
- Collapse of the ensemble for small ensemble sizes (less than a few hundred); remedied by empirical 'covariance inflation'.
- Spurious correlations appear at large geographical distances; remedied by empirical 'localization'.
(Both remedies are sketched below.)

In the formula
$x^a_n = x^b_n + P^b H^T [H P^b H^T + R]^{-1} (y_n - H x^b_n)$, $n = 1, \ldots, N$,
$P^b$, which is the covariance matrix of an $N$-member ensemble, has rank $N-1$ at most. This means that the corrections made to the ensemble elements are contained in a subspace of dimension $N-1$: obviously very restrictive if $N \ll p$, $N \ll n$.
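Both remedies can be sketched in a few lines. The inflation factor `rho`, the Gaussian taper and the length scale `L` below are illustrative choices (a Gaspari-Cohn function is the more usual localization taper), not prescriptions from the slides.

```python
import numpy as np

def inflate(ensemble, rho=1.05):
    """Multiplicative covariance inflation: scale the anomalies by a tunable factor rho."""
    mean = ensemble.mean(axis=0)
    return mean + rho * (ensemble - mean)

def localize(Pb, distances, L=1000.0):
    """Schur (element-wise) localization of the sample covariance.

    distances : (n, n) matrix of distances between state variables
    L         : localization length scale (same units as the distances)
    A simple Gaussian taper stands in here for the usual Gaspari-Cohn function.
    """
    taper = np.exp(-0.5 * (distances / L) ** 2)
    return Pb * taper
```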

[Figure: month-long performance of EnKF vs. 3DVar with WRF (Meng and Zhang 2007c, MWR, in review); prior shown solid, posterior dotted.] Better performance of the EnKF than of 3DVar is also seen in both the 12-h forecast and the posterior analysis, in terms of the root-mean-square difference averaged over the entire month.

The situation is still not entirely clear. Houtekamer and Mitchell (1998) use two ensembles, the elements of each of which are updated with the covariance matrix of the other ensemble. The Local Ensemble Transform Kalman Filter (LETKF), defined by Kalnay and colleagues, performs the correction locally in space, on the basis of neighbouring observations. In any case, optimality always requires errors to be independent in time. That constraint can be relaxed to some extent by augmenting the state vector in the temporal dimension.

The variational approach can easily be extended to the time dimension. Suppose for instance the available data consist of:
- Background estimate at time 0: $x^b_0 = x_0 + \zeta^b_0$, with $E(\zeta^b_0 \zeta^{bT}_0) = P^b_0$
- Observations at times $k = 0, \ldots, K$: $y_k = H_k x_k + \varepsilon_k$, with $E(\varepsilon_k \varepsilon_j^T) = R_k \delta_{kj}$
- Model (supposed for the time being to be exact): $x_{k+1} = M_k x_k$, $k = 0, \ldots, K-1$

Errors are assumed to be unbiased and uncorrelated in time; $H_k$ and $M_k$ are linear.

Then the objective function is
$\xi_0 \in \mathcal{S} \;\mapsto\; J(\xi_0) = \frac{1}{2}\,(x^b_0 - \xi_0)^T [P^b_0]^{-1} (x^b_0 - \xi_0) + \frac{1}{2} \sum_k [y_k - H_k \xi_k]^T R_k^{-1} [y_k - H_k \xi_k]$
subject to $\xi_{k+1} = M_k \xi_k$, $k = 0, \ldots, K-1$.
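The sketch below evaluates this objective function for the linear case, assuming the model and observation operators are available as matrices (hypothetical argument names, for illustration only).

```python
import numpy as np

def cost_4dvar(xi0, x0b, P0b_inv, M_list, H_list, R_inv_list, y_list):
    """Evaluate the strong-constraint 4D-Var objective function J(xi_0).

    xi0        : candidate initial state
    M_list[k]  : linear model matrix M_k advancing the state from time k to k+1
    H_list[k]  : observation operator H_k at time k (k = 0, ..., K)
    R_inv_list : inverses of the observation-error covariances R_k
    y_list[k]  : observation vector at time k
    """
    dxb = x0b - xi0
    J = 0.5 * dxb @ P0b_inv @ dxb                       # background term
    xi = xi0
    for k, (H, Rinv, y) in enumerate(zip(H_list, R_inv_list, y_list)):
        d = y - H @ xi                                  # innovation at time k
        J += 0.5 * d @ Rinv @ d                         # observation term
        if k < len(M_list):                             # advance: xi_{k+1} = M_k xi_k
            xi = M_list[k] @ xi
    return J
```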

J (  0 ) = (1/2) (x 0 b -  0 ) T [P 0 b ] -1 (x 0 b -  0 ) + (1/2)  k [y k - H k  k ] T R k -1 [y k - H k  k ] Background is not necessary, if observations are in sufficient number to overdetermine the problem. Nor is strict linearity. How to minimize objective function with respect to initial state u =  0 (u is called the control variable of the problem) ? Use iterative minimization algorithm, each step of which requires the explicit knowledge of the local gradient  u J  (  J /  u i ) of J with respect to u. Gradient computed by adjoint method.

How to numerically compute the gradient $\nabla_u J$? By direct perturbation, in order to obtain the partial derivatives $\partial J / \partial u_i$ by finite differences? That would require as many explicit computations of the objective function $J$ as there are components in $u$: practically impossible.
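The sketch below makes that cost explicit: a one-sided finite-difference gradient requires dim(u) + 1 evaluations of J, which rules the approach out when the control variable has millions of components.

```python
import numpy as np

def finite_difference_gradient(J, u, h=1e-6):
    """Gradient of J at u by one-sided finite differences.

    Requires dim(u) + 1 evaluations of J, which is what makes this approach
    impractical for high-dimensional control variables.
    """
    J0 = J(u)
    grad = np.zeros_like(u)
    for i in range(u.size):
        u_pert = u.copy()
        u_pert[i] += h
        grad[i] = (J(u_pert) - J0) / h
    return grad
```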

Adjoint Method

Input vector $u = (u_i)$, $\dim u = n$.
Numerical process, implemented on computer (e.g. integration of a numerical model):
$u \mapsto v = G(u)$
$v = (v_j)$ is the output vector, $\dim v = m$.

Perturbation $\delta u = (\delta u_i)$ of the input. Resulting first-order perturbation on $v$:
$\delta v_j = \sum_i (\partial v_j / \partial u_i)\, \delta u_i$
or, in matrix form,
$\delta v = G' \delta u$
where $G' \equiv (\partial v_j / \partial u_i)$ is the local matrix of partial derivatives, or Jacobian matrix, of $G$.
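A toy illustration (the two-variable function G below is an arbitrary stand-in for a model integration): the hand-coded Jacobian applied to a small perturbation reproduces the first-order change of the output.

```python
import numpy as np

def G(u):
    """Toy nonlinear process standing in for a model integration."""
    return np.array([u[0] * u[1], np.sin(u[0]) + u[1] ** 2])

def G_tangent(u, du):
    """Tangent-linear operator G' applied to a perturbation du (hand-coded Jacobian of the toy G)."""
    J = np.array([[u[1],          u[0]],
                  [np.cos(u[0]),  2 * u[1]]])
    return J @ du

u = np.array([0.3, 1.2])
du = 1e-6 * np.array([1.0, -2.0])
# First-order check: G(u + du) - G(u) should match G' du
print(G(u + du) - G(u))
print(G_tangent(u, du))
```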

Adjoint Method (continued 1)

$\delta v = G' \delta u$   (D)

Scalar function of the output: $J(v) = J[G(u)]$.
Gradient $\nabla_u J$ of $J$ with respect to the input $u$?
'Chain rule':
$\partial J / \partial u_i = \sum_j (\partial J / \partial v_j)(\partial v_j / \partial u_i)$
or
$\nabla_u J = G'^T \nabla_v J$   (A)
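Continuing the toy example, this sketch applies relation (A) and checks the duality $\langle G'\delta u, w\rangle = \langle \delta u, G'^T w\rangle$, the usual test for a correctly written adjoint.

```python
import numpy as np

def jacobian(u):
    """Jacobian G' of the toy process G(u) = (u0*u1, sin(u0) + u1**2)."""
    return np.array([[u[1],         u[0]],
                     [np.cos(u[0]), 2 * u[1]]])

u = np.array([0.3, 1.2])
du = np.array([1.0, -2.0])        # input perturbation
w = np.array([0.5, 0.7])          # gradient of J with respect to the output v
grad_u = jacobian(u).T @ w        # relation (A): grad_u J = G'^T grad_v J
# Adjoint (duality) test: <G' du, w> equals <du, G'^T w>
print(jacobian(u) @ du @ w, du @ grad_u)
```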

Adjoint Method (continued 2)

$G$ is the composition of a number of successive steps:
$G = G_N \circ \ldots \circ G_2 \circ G_1$
'Chain rule':
$G' = G_N' \ldots G_2' G_1'$
Transpose:
$G'^T = G_1'^T G_2'^T \ldots G_N'^T$
Transpose, or adjoint, computations are performed in the reverse order of the direct computations.

If $G$ is nonlinear, the local Jacobian $G'$ depends on the local value of the input $u$. Any quantity which is an argument of a nonlinear operation in the direct computation will be used again in the adjoint computation. It must be kept in memory from the direct computation (or else be recomputed in the course of the adjoint computation). If everything is kept in memory, the total operation count of the adjoint computation is at most 4 times the operation count of the direct computation (in practice, about 2).
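A sketch of this reverse-order rule (generic, hypothetical function arguments): the forward pass stores every intermediate state, and the adjoint pass applies the transposed step Jacobians from last to first.

```python
import numpy as np

def forward_and_adjoint(u, steps, step_jacobians, grad_v):
    """Gradient through a composition G = G_N o ... o G_1 by the adjoint method.

    steps[k](x)          : direct computation of step G_{k+1}
    step_jacobians[k](x) : Jacobian G'_{k+1} evaluated at the input of that step
    grad_v               : gradient of J with respect to the final output
    """
    # Direct (forward) pass: store every intermediate state, needed by the adjoint
    states = [u]
    for step in steps:
        states.append(step(states[-1]))
    # Adjoint pass: transposed Jacobians applied in reverse order
    w = grad_v
    for jac, x in zip(reversed(step_jacobians), reversed(states[:-1])):
        w = jac(x).T @ w
    return w          # = grad_u J
```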

Adjoint Approach

$J(\xi_0) = \frac{1}{2}\,(x^b_0 - \xi_0)^T [P^b_0]^{-1} (x^b_0 - \xi_0) + \frac{1}{2} \sum_k [y_k - H_k \xi_k]^T R_k^{-1} [y_k - H_k \xi_k]$
subject to $\xi_{k+1} = M_k \xi_k$, $k = 0, \ldots, K-1$.

Control variable $\xi_0 = u$.

Adjoint equations:
$\lambda_K = H_K^T R_K^{-1} [H_K \xi_K - y_K]$
$\lambda_k = M_k^T \lambda_{k+1} + H_k^T R_k^{-1} [H_k \xi_k - y_k]$,  $k = K-1, \ldots, 1$
$\lambda_0 = M_0^T \lambda_1 + H_0^T R_0^{-1} [H_0 \xi_0 - y_0] + [P^b_0]^{-1} (\xi_0 - x^b_0)$
$\nabla_u J = \lambda_0$

The result of the direct integration ($\xi_k$), which appears in the quadratic terms of the objective function, must be kept in memory from the direct integration.
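A direct transcription of this backward recursion for the linear case, written as a companion to the `cost_4dvar` sketch given earlier (same hypothetical argument names):

```python
import numpy as np

def grad_4dvar(xi0, x0b, P0b_inv, M_list, H_list, R_inv_list, y_list):
    """Gradient of the 4D-Var objective function via the adjoint recursion above.

    Companion of cost_4dvar: same (linear) M_k, H_k, R_k^{-1} and y_k,
    with observations at times 0, ..., K and K model steps.
    """
    K = len(M_list)
    # Direct integration: store all xi_k (needed by the adjoint)
    xis = [xi0]
    for M in M_list:
        xis.append(M @ xis[-1])
    # Backward (adjoint) integration
    lam = H_list[K].T @ R_inv_list[K] @ (H_list[K] @ xis[K] - y_list[K])
    for k in range(K - 1, 0, -1):
        lam = (M_list[k].T @ lam
               + H_list[k].T @ R_inv_list[k] @ (H_list[k] @ xis[k] - y_list[k]))
    lam0 = (M_list[0].T @ lam
            + H_list[0].T @ R_inv_list[0] @ (H_list[0] @ xis[0] - y_list[0])
            + P0b_inv @ (xi0 - x0b))
    return lam0          # = grad_u J
```

On a small test problem, the output can be checked against the finite-difference sketch given above.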

Adjoint Approach (continued 2)

Nonlinearities?

$J(\xi_0) = \frac{1}{2}\,(x^b_0 - \xi_0)^T [P^b_0]^{-1} (x^b_0 - \xi_0) + \frac{1}{2} \sum_k [y_k - H_k(\xi_k)]^T R_k^{-1} [y_k - H_k(\xi_k)]$
subject to $\xi_{k+1} = M_k(\xi_k)$, $k = 0, \ldots, K-1$.

Control variable $\xi_0 = u$.

Adjoint equations:
$\lambda_K = H_K'^T R_K^{-1} [H_K(\xi_K) - y_K]$
$\lambda_k = M_k'^T \lambda_{k+1} + H_k'^T R_k^{-1} [H_k(\xi_k) - y_k]$,  $k = K-1, \ldots, 1$
$\lambda_0 = M_0'^T \lambda_1 + H_0'^T R_0^{-1} [H_0(\xi_0) - y_0] + [P^b_0]^{-1} (\xi_0 - x^b_0)$
$\nabla_u J = \lambda_0$

Not heuristic (it gives the exact gradient $\nabla_u J$), and really used as described here.

How to write the adjoint of a code?

Operation: $a = b \times c$
Input: $b, c$. Output: $a$, but also $b, c$. For clarity, we write
$a = b \times c$
$b' = b$
$c' = c$

$\partial J/\partial a$, $\partial J/\partial b'$, $\partial J/\partial c'$ are available. We want to determine $\partial J/\partial b$ and $\partial J/\partial c$.

Chain rule:
$\partial J/\partial b = (\partial J/\partial a)(\partial a/\partial b) + (\partial J/\partial b')(\partial b'/\partial b) + (\partial J/\partial c')(\partial c'/\partial b)$
with $\partial a/\partial b = c$, $\partial b'/\partial b = 1$, $\partial c'/\partial b = 0$, hence
$\partial J/\partial b = (\partial J/\partial a)\, c + \partial J/\partial b'$
Similarly
$\partial J/\partial c = (\partial J/\partial a)\, b + \partial J/\partial c'$
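In code, the two resulting adjoint statements might read as in the sketch below (variable names are illustrative; bookkeeping conventions of full adjoint compilers are not covered here).

```python
def adjoint_of_product(b, c, a_ad, b_ad, c_ad):
    """Adjoint of the direct statement  a = b * c.

    a_ad is dJ/da (already available); b_ad and c_ad hold dJ/db' and dJ/dc'.
    Returns the updated dJ/db and dJ/dc, as derived above.
    """
    b_ad = b_ad + a_ad * c       # dJ/db = (dJ/da) * c + dJ/db'
    c_ad = c_ad + a_ad * b       # dJ/dc = (dJ/da) * b + dJ/dc'
    return b_ad, c_ad
```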