About data assimilation. Compiled by Henrik Vedel, Center for Meteorological Models (CMM), Danish Meteorological Institute.

NWP is an "initial value problem". We need an NWP model field from which to start an NWP model simulation. This field should represent the current state of the atmosphere as precisely as possible, given the variables of the model and its resolution. To determine this field we need observations.

Observations from standard meteorological ground stations

Radiosonde observations

Observations from starting, cruising and landing aircraft

Observations from drifting buoys

Radiance observations from polar orbiting satellites

Scatterometer data from polar orbiting satellite

Atmospheric motion vectors (based on geostationary satellites)

Atmospheric motion vectors (based on polar satellites)

Ground based GNSS

GNSS radio occultations

Aircraft data – number varies with the time of day

Aircraft data

Ingredients of DMI's numerical weather prediction system: observations; boundary values from an external model; the old model state; the data assimilation system, which provides the "analysis" (= initial conditions); and the numerical weather model. All of this is combined and run on a very powerful computer. The results are computer-generated forecasts (e.g. "byvejr", the city weather forecast) and forecasts by forecasters (e.g. "landsudsigt" and "regionaludsigter", the national and regional outlooks).

Examples of DMI NWP model areas NWP model “HIRLAM”

Example: HIRLAM S03/SKA details. HIRLAM = High Resolution Limited Area Model. Boundary conditions = ECMWF global model. Horizontal grid resolution = 0.03° (approx. 3 km). Vertical layers = 65. Grid points = 978 * 818 * 65. Approximately 8 variables. Forecast length = 54 h, time step = 90 s. Runs on 50 nodes (700 cores) using MPI parallelization; asynchronous I/O on 8 cores.
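To give a feel for the size of the problem, the numbers on this slide can be multiplied out. This is only back-of-the-envelope arithmetic; the variable names are chosen here for illustration, and "8 variables" is the approximate figure quoted on the slide.

```python
# Size of the HIRLAM S03/SKA setup, using only the numbers quoted on the slide.
nx, ny, nz = 978, 818, 65        # grid points in x, y and number of vertical layers
n_vars = 8                       # "approximately 8 variables" (approximate)
forecast_hours = 54
time_step_s = 90

grid_points = nx * ny * nz                           # 52,000,260
state_size = grid_points * n_vars                    # 416,002,080 values
n_time_steps = forecast_hours * 3600 // time_step_s  # 2160 steps per forecast

print(f"grid points:      {grid_points:,}")
print(f"model state size: {state_size:,}")
print(f"time steps:       {n_time_steps:,}")
```

The model state thus holds roughly 4 * 10^8 numbers, which is why the number of observations is small in comparison.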

Data assimilation. The initial state is not based solely on observations, because: 1. The problem is vastly underdetermined in terms of available observations. 2. The model has skill that should not be thrown away; it is far better than interpolation between observations. 3. Many newer observation types do not correspond directly to model variables (wind, temperature, surface pressure, specific humidity, or similar), but rather to a combination thereof. They can only be properly assimilated with the help of the model field.

Data assimilation. The process of determining the initial state is called data assimilation. The result, the new initial state, is called the analysis.

PDF on data assimilation, variational part.

In a data assimilation system based on variational analysis (VAR), the observations and the NWP model first guess field are combined in a statistically optimal way. Provided the assumptions about the size of the errors and their distributions are correct, the result is the maximum likelihood estimate of the atmospheric state. It is much better than the estimate provided by any single observing system. A main benefit of the VAR formulation is that observations do not need to correspond to model variables to be assimilated. Any observation type that can be estimated from the model data can be assimilated. This includes very "abstract" observations, such as satellite radiances, GNSS radio occultations, and GNSS zenith delays, which depend on many model variables from extended parts of the model. The use of variational data assimilation has led to a huge increase in the types of observations used, and is responsible for a significant part of the improvement in NWP skill over the last decades.
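A minimal sketch of the idea, assuming a linear observation operator and Gaussian errors: in that case the minimum of the variational cost function has the closed form x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b). The example below uses a toy state of two temperatures and a single observation of their mean, i.e. an observation that is not itself a model variable. The function name and all numbers are made up for illustration; operational systems minimise the cost function iteratively instead of using this formula directly.

```python
def analysis_single_obs(x_b, B, h, y, r):
    """Analysis for ONE observation with a linear observation operator.

    x_b : background (first guess) state, list of floats
    B   : background error covariance matrix, list of lists
    h   : row of the linear observation operator (y_model = sum(h[i] * x[i]))
    y   : the observed value
    r   : observation error variance
    """
    n = len(x_b)
    hx = sum(h[i] * x_b[i] for i in range(n))                       # model equivalent of obs
    bht = [sum(B[i][j] * h[j] for j in range(n)) for i in range(n)]  # B H^T
    hbht = sum(h[i] * bht[i] for i in range(n))                     # H B H^T
    gain = [bht[i] / (hbht + r) for i in range(n)]                  # weight per state element
    return [x_b[i] + gain[i] * (y - hx) for i in range(n)]


# Toy example: two correlated temperatures, observation of their mean.
x_b = [280.0, 284.0]
B = [[1.0, 0.5],
     [0.5, 1.0]]
h = [0.5, 0.5]
x_a = analysis_single_obs(x_b, B, h, y=283.0, r=0.25)
print(x_a)  # [280.75, 284.75]
```

Both state elements are adjusted by the single observation, and the off-diagonal element of B controls how the correction is shared between them.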

Slide from Pierre Brousseau, Météo-France: impact of different observations on different scales in the Météo-France AROME model.

Handling of observation errors. On the NWP side we hope that the observation data providers have done basic checks of the observations. In practice this varies, and we do not rely on it. There is a somewhat rudimentary check of the data in the reading-in phase: checking whether the expected information is found, sometimes a check against climatology, etc. The main quality estimation is done in the first guess check. Here the observation is compared to the expectation value based on the first guess model state. If the deviation is large, e.g. 5 times the assumed observation error, the observation is considered likely to be erroneous and is not included as active in the data assimilation. If the deviation is close to this limit, the observation is given a lower weight. This setup is used for a preset number of iteration steps in the minimisation of the cost function J. Then the check against the deviation from the NWP model estimate is done again, possibly resulting in a change of the active stations and/or their weights, depending on the setup. This works fine in general. But there are examples where one, or a few, correct observations that could have corrected a faulty forecast of severe weather were thrown out by the first guess check.
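The first guess check can be sketched as follows. The rejection limit of 5 times the observation error comes from the text above; the point where down-weighting starts (4 times) and the linear ramp between the two limits are assumptions made for illustration only, since the actual weighting differs between systems.

```python
def first_guess_check(obs, first_guess_equivalent, sigma_o,
                      reject_factor=5.0, downweight_from=4.0):
    """Return the weight (0.0 to 1.0) given to an observation.

    obs                    : observed value
    first_guess_equivalent : value expected from the first guess model state
    sigma_o                : assumed observation error (standard deviation)
    reject_factor          : departures at or beyond this many sigma_o are rejected
    downweight_from        : hypothetical start of the down-weighting zone
    """
    normalised_departure = abs(obs - first_guess_equivalent) / sigma_o
    if normalised_departure >= reject_factor:
        return 0.0                      # considered erroneous, not active
    if normalised_departure <= downweight_from:
        return 1.0                      # fully active
    # Close to the limit: reduce the weight linearly towards zero.
    return (reject_factor - normalised_departure) / (reject_factor - downweight_from)


print(first_guess_check(2.0, 0.0, 1.0))   # 1.0  (well within limits)
print(first_guess_check(4.5, 0.0, 1.0))   # 0.5  (close to the limit)
print(first_guess_check(10.0, 0.0, 1.0))  # 0.0  (rejected)
```

Note that the check only measures disagreement with the first guess, so a correct observation of an event the model missed is treated exactly like a faulty one.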

Data assimilation time window, cut-off time, and NWP cycling frequency. The "data assimilation time window" is the period from within which observations are selected for data assimilation in a given model setup. In 3DVar only one observation per site and type within the time window is chosen for assimilation: the one closest to the valid time of the analysis (= valid time of the first guess). Hence, for high-frequency observations, many are not used. In 4DVar the full DA time window is broken into sub-windows, each typically 1 h in length. Within each sub-window the selection is as for 3DVar. This results in the use of more observations, but not necessarily all. In global models and large-region LAM models the interest is in the larger scales, which do not vary quickly. In this case the NWP is cycled every 6 or 12 hours. The DA time windows used are correspondingly long, 6 to 12 hours (not necessarily equal to the cycling time).
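The selection rule described above can be sketched like this. The tuple layout and function names are hypothetical, and the choice of the sub-window centre as the reference time in the 4DVar case is an assumption; the text only says that the selection within a sub-window is "as for 3DVar".

```python
def closest_per_site_and_type(observations, valid_time):
    """3DVar-style selection: keep one observation per (site, type),
    the one closest in time to valid_time.

    observations: list of (site, obs_type, time_in_minutes, value)
    """
    best = {}
    for ob in observations:
        key = (ob[0], ob[1])
        if key not in best or abs(ob[2] - valid_time) < abs(best[key][2] - valid_time):
            best[key] = ob
    return sorted(best.values())


def select_in_sub_windows(observations, window_start, window_end, sub_length=60):
    """4DVar-style selection: split the window into sub-windows and apply
    the 3DVar rule in each (reference time = sub-window centre, an assumption)."""
    selected = []
    start = window_start
    while start < window_end:
        end = min(start + sub_length, window_end)
        in_sub = [ob for ob in observations if start <= ob[2] < end]
        selected += closest_per_site_and_type(in_sub, (start + end) / 2)
        start = end
    return selected


# One GNSS site reporting every 15 min in a 3 h window around analysis time 0.
obs = [("SITE_A", "ztd", t, 2.4) for t in range(-90, 90, 15)]
used_3d = closest_per_site_and_type(obs, valid_time=0)
used_4d = select_in_sub_windows(obs, window_start=-90, window_end=90)
print(len(obs), len(used_3d), len(used_4d))  # 12 1 3
```

In this toy case 3DVar uses 1 of the 12 observations and 4DVar with 1 h sub-windows uses 3, which illustrates "more observations, but not necessarily all".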

The longer the DA time window, the more important it is to take into account the variation with time of the properties assimilated. The global models typically use 12 hour windows and 4DVar. In 3DVar a specific setup, "first guess at appropriate time" (FGAT), can be used. Typical DA time windows in small-region LAMs are significantly shorter, from an hour to a few hours. The "cut-off" time is the difference between the wall clock time at the start of a forecast sequence and the valid time of the corresponding analysis. To base the analysis on fresh observations, in particular when using 3DVar, it helps to reduce the cut-off time. However, it takes time for many types of observations to reach the met offices, and some observations are not very frequent (e.g. polar orbiting satellite radiances). Hence, even if running with an NWP cycling frequency of 1 h, it can be necessary or beneficial to run with a longer cut-off time, of 1½ h or longer. As we get quicker access to more types of observations, cycling with reduced cut-off times is likely to become beneficial for certain types of NWP nowcasting.

Example of 6-hourly 3DVar and 4DVar cycling at the Canadian met office (from S. Laroche et al.).

4D-Var at ECMWF Courtesy ECMWF

Iteration step

Observation operator for ZTD. The ZTD (zenith total delay, or zenith tropospheric delay) is defined as: [equation not reproduced in this transcript]. This can be cast into typical variables of an NWP model in different ways, using some of these relations: [relations not reproduced in this transcript].

Substitution to integrate over pressure provides a simple formulation that is very robust for both NWP and radiosonde data.
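For orientation, a commonly used textbook form of these relations is sketched below. It is not taken from the original slides and may differ from them in detail. The symbols are assumptions introduced here: N is the refractivity, k_1, k_2' and k_3 are empirical refractivity constants, R_d is the gas constant of dry air, ε is the ratio of the gas constants of dry air and water vapour (about 0.622), ρ is the total air density, e the water vapour partial pressure, q the specific humidity, and z_a and p_a the height and pressure at the GNSS antenna.

```latex
% Definition and refractivity
\mathrm{ZTD} = 10^{-6} \int_{z_a}^{\infty} N \, dz ,
\qquad
N = k_1 R_d \, \rho + k_2' \, \frac{e}{T} + k_3 \, \frac{e}{T^2}

% Hydrostatic balance and the gas law for water vapour
dp = -\rho \, g \, dz ,
\qquad
\frac{e}{T} = \frac{R_d}{\varepsilon} \, \rho \, q

% Substituting, the integral over height becomes an integral over pressure
\mathrm{ZTD} = \mathrm{ZHD} + \mathrm{ZWD} ,
\qquad
\mathrm{ZHD} = 10^{-6} \, k_1 R_d \int_{0}^{p_a} \frac{dp}{g} ,
\qquad
\mathrm{ZWD} = 10^{-6} \, \frac{R_d}{\varepsilon} \int_{0}^{p_a} \frac{q}{g} \left( k_2' + \frac{k_3}{T} \right) dp
```

The density ρ drops out in the substitution, which is why the pressure form needs only the quantities that an NWP model or a radiosonde provides directly.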

For ZHD, the hydrostatic component, it is important to include the contribution from above the top of the NWP model, important to include the variation of g with location (latitude), and fine to include the variation of g with height. ZHD is easily integrated numerically in a model column, and the top contribution can be approximated analytically in terms of p1 and g1, where p1 is the top pressure level in the model column and g1 the gravitational acceleration at that point. However, doing the ZHD integral for radiosonde and NWP profiles demonstrates that the Saastamoinen analytical approximation to ZHD works very well. Using the simplest precise version of the observation operator limits the risk of mistakes when making derivatives and adjoints, and it runs faster.

For ZWD it is important to include the variation of g with latitude; because of the low scale height of humidity, the variation of g with height is not important. This leads to a sum over the model layers [equation not reproduced in this transcript], where p_i+1/2 and p_i-1/2 are the model pressures on either side of model gridbox i in the column, and T_i and q_i are the temperature and specific humidity at the middle of the box (in terms of pressure). Note that several other formulations of the observation operator for ZTD exist; it varies from NWP model to NWP model.
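A minimal sketch of such a column operator is given below, assuming the standard pressure-integral form: each layer contributes 10^-6 k1 Rd Δp / g to ZHD and 10^-6 (Rd/ε) q_i (k2' + k3/T_i) Δp / g to ZWD, and the part of the atmosphere above the model top adds 10^-6 k1 Rd p1 / g1 to ZHD. The refractivity constants are commonly quoted literature values and are assumptions here, not taken from the slides. For brevity a single constant g is used, whereas the text above stresses that a real operator should at least vary g with latitude.

```python
K1 = 0.776       # K/Pa    (77.6 K/hPa), assumed literature value
K2_PRIME = 0.221 # K/Pa    (22.1 K/hPa), assumed literature value
K3 = 3.739e3     # K^2/Pa  (3.739e5 K^2/hPa), assumed literature value
R_D = 287.05     # J/(kg K), gas constant of dry air
EPSILON = 0.622  # ratio of gas constants, dry air / water vapour


def ztd_from_column(p_half, t_mid, q_mid, g=9.80665):
    """Return (ZHD, ZWD) in metres for one model column.

    p_half : pressures at layer interfaces in Pa, ordered from model top to
             surface (length n + 1); p_half[0] is the top pressure level p1
    t_mid  : mid-layer temperatures in K (length n)
    q_mid  : mid-layer specific humidities in kg/kg (length n)
    g      : gravitational acceleration, here one constant value (simplification)
    """
    zhd = 1e-6 * K1 * R_D * p_half[0] / g   # contribution from above the model top
    zwd = 0.0
    for i in range(len(t_mid)):
        dp = p_half[i + 1] - p_half[i]      # layer thickness in pressure
        zhd += 1e-6 * K1 * R_D * dp / g
        zwd += 1e-6 * (R_D / EPSILON) * q_mid[i] * (K2_PRIME + K3 / t_mid[i]) * dp / g
    return zhd, zwd


# Two-layer toy column: dry upper layer, moist lower layer.
zhd, zwd = ztd_from_column(p_half=[1000.0, 50000.0, 101325.0],
                           t_mid=[220.0, 280.0],
                           q_mid=[0.0, 0.005])
print(f"ZHD = {zhd:.3f} m, ZWD = {zwd:.3f} m, ZTD = {zhd + zwd:.3f} m")
```

With constant g the hydrostatic part depends only on the surface pressure, giving about 2.3 m for a standard surface pressure, while the wet part is an order of magnitude smaller but much more variable.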

Several important things remain. First, correction for location offsets between the NWP model and the GNSS antenna, to provide an NWP estimate of the column above the GNSS site. Horizontal interpolation is used. In the vertical, the different NWP models apply widely different methods. The more refined the method, the larger the offset that can be accepted between the NWP orography and the actual GNSS site altitude. Second, assessment of the error characteristics of the individual GNSS sites and AC solutions. Do the O-B characteristics have a structure beneficial for use in Var data assimilation (approximately Gaussian)? Is there a bias to be corrected for? (This could be due to a poor observation operator, in which case it affects all sites in a region; due to specific problems of the NWP or GNSS at a certain location; or due to problems at the AC deriving the GNSS ZTDs.) The GNSS network is constantly evolving, with new sites and processing centers appearing. This is good. But the current NWP setups require a fair bit of manual work to handle this, in the form of constructing a "white list" (of sites to choose data from), the associated biases to correct for, and the observation errors to use in the DA. This is something we need to work on.

ZTD gradients and slant delays. Some DA systems already enable assimilation of slant delays. Work is under way to enable this in even more systems. The schemes differ in particular in the sophistication of the derivation of the slant path in the NWP field. Besides the precision of the NWP slant delay estimate, this affects the complexity of the observation operator as well as the CPU cost. Limited work has been done on comparing GNSS ZTD gradients with NWP ZTD gradients. Assimilation of GNSS ZTD gradients has not yet been introduced in DA. Work on that could possibly prove successful. Work on both subjects fits well within the scope of GNSS4SWEC.

More on the background and observation errors, B & R. B is vital for a DA system. Through the error correlations, it determines both how the alteration of one variable due to an observation influences the same variable in the neighborhood, and how it influences other variables. It also, along with the resolution of the DA system, sets the scale of the region affected by an observation. It should do this in a statistically and dynamically consistent way. Several problems prevent the use of a "proper" B: lack of knowledge about the true state of the atmosphere; lack of capacity to handle a true B on the computer; and lack of capacity to properly determine weather-dependent variations of B.

Several approaches to estimating B exist. NMC method: assume that at long forecast lengths the deviations between forecasts with the same valid time are solely due to the forecast errors, e.g. compare 24 h and 48 h forecasts. Easily done, but errors and error correlations may evolve differently late and early in the forecasts, and the basic assumption might not be true. Ensemble method: considers an ensemble of analyses and subsequent short forecasts. Diagnoses the statistics of the actual DA system, but requires that the observation errors are well known, and is more prone to errors if the assumptions are not met. Separation of background and observation errors from statistics of O-F (observation minus first guess): introduced by Hollingsworth and Lönnberg (1986). The separation is based on the assumption that the observation errors are spatially uncorrelated, not correlated with the forecast errors either, and that the errors are isotropic and homogeneous. Good for giving insight into error sizes and correlation lengths, but the result is dominated by small regions with many observations and is valid in observation space. These methods produce static B's. In reality the forecast errors are flow dependent. "Ensemble Kalman filtering" is being introduced to overcome this.
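The NMC method can be sketched as the sample covariance of differences between pairs of forecasts valid at the same time. The function name and the tiny two-variable data set are hypothetical; in practice the statistics are computed over many cases and in a transformed space, and the result is often rescaled, none of which is shown here.

```python
def nmc_covariance(long_forecasts, short_forecasts):
    """Estimate a static B from forecast differences (NMC method sketch).

    long_forecasts, short_forecasts: lists of state vectors, where entry k of
    each list is valid at the same time (e.g. a 48 h and a 24 h forecast).
    Returns the sample covariance matrix of the differences.
    """
    diffs = [[a - b for a, b in zip(long_fc, short_fc)]
             for long_fc, short_fc in zip(long_forecasts, short_forecasts)]
    n_cases = len(diffs)
    n_state = len(diffs[0])
    mean = [sum(d[j] for d in diffs) / n_cases for j in range(n_state)]
    return [[sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in diffs) / (n_cases - 1)
             for j in range(n_state)]
            for i in range(n_state)]


# Three cases, two state variables (made-up numbers).
fc_48h = [[1.0, 2.0], [2.0, 4.0], [3.0, 3.0]]
fc_24h = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
B = nmc_covariance(fc_48h, fc_24h)
print(B)  # [[1.0, 0.5], [0.5, 1.0]]
```

The off-diagonal elements are what spread the information from an observation to neighbouring points and to other variables.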

Forecast quality (comparison of forecasts with measurements in Denmark, Greenland and Faeroe Islands)

Nudging. See the last page of the PDF.

Compared to variational DA, nudging suffers from not being statistically optimal, from less advanced quality control, and not least from difficulties in using observations that are not easily related to a model variable in a specific gridbox. A benefit is that the whole time sequence of observations is used. A few NWP models are based solely on nudging for the DA. At some met offices nudging is used to assimilate specific data related to low-predictability, high-local-variability phenomena such as convective showers, typically based on high-frequency radar precipitation and satellite cloud observations. To circumvent the problem that such observations do not correspond directly to model variables, special schemes such as "latent heat nudging" and "nudging of velocity divergence" have been invented. This is used in NWP nowcasting setups, and improves model skill regarding the strength and location of heavy showers in short-term forecasting. Since humidity is strongly linked to precipitation, it would be of interest to see the whole time sequence of GNSS observations, and possibly other high-frequency humidity observations, being used as well.
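The basic principle of nudging (Newtonian relaxation) is to add a term to the model tendency that pulls the model value towards the observation. A minimal scalar sketch is given below, using a simple forward Euler step; the function name, the relaxation coefficient and the numbers are made up for illustration, and real schemes such as latent heat nudging act on derived quantities, not directly on the observed variable.

```python
def nudged_step(x, model_tendency, observation, relaxation_coeff, dt):
    """Advance x by one time step with a nudging term added to the tendency.

    dx/dt = model_tendency + relaxation_coeff * (observation - x)
    """
    return x + dt * (model_tendency + relaxation_coeff * (observation - x))


# Toy run: no model tendency, so only the nudging term acts.
x = 0.0
trajectory = []
for _ in range(3):
    x = nudged_step(x, model_tendency=0.0, observation=10.0,
                    relaxation_coeff=0.5, dt=1.0)
    trajectory.append(x)
print(trajectory)  # [5.0, 7.5, 8.75]
```

The model state approaches the observation gradually over several time steps, which is how a whole time sequence of observations can be fed in while the model runs.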

The DMI NWP nowcasting data assimilation is a two-step procedure. 1: Traditional 3DVAR, done hourly, with a cut-off time of approx. 1.5 h. 2: Nudging, done hourly with a very small cut-off time (a few minutes). Elements of the diagram: DMI radars; quality control; 2D composite, made specifically for NWP data assimilation; satellite data on clouds/humidity (Nowcasting SAF + DMI cloud product); SYNOPs, TEMPs, aircraft data, satellite radiances etc.; ground-based GNSS data; quality control by intercomparison (to be developed); 3D-VAR; nudging algorithm for radar rain rates and cloud data (modified divergence field); forecast model, including nudging terms; new forecast. New forecasts are made at least once an hour, possibly several times per hour, each time including new observations in the nudging, and are made available shortly after the valid time of the observations. The forecast takes about 5 min.

Final remarks on assimilation. NWP data assimilation is tailored to handle a range of specific issues. The number of NWP variables is much greater than the number of observations. The types of observations are very different, from observations of model variables (e.g. radiosondes) to highly abstract measures (like the delay and bending of radio waves passing through the atmosphere). The variational data assimilation systems enable combination of these observations and the NWP first guess state in a statistically optimal way, leading to the maximum likelihood estimate of the current state of the atmosphere. This estimate is far better than what any single observing system could provide, provided the statistical assumptions are correct. The advanced systems are costly to run, requiring computer resources and detailed knowledge about many different types of observations, but they are beneficial to NWP model skill.
We also see indications that, for short-term forecasts of important low-predictability phenomena, it can be preferable to run more frequent forecasts using the very latest observations, which may require, at least at present, including less advanced assimilation to be computationally feasible.

This is Linux. Should I really bring him here for the summer school?
