Quality control of daily data on example of Central European series of air temperature, relative humidity and precipitation P. Štěpánek (1), P. Zahradníček.

Slides:



Advertisements
Similar presentations
Comparing statistical downscaling methods: From simple to complex
Advertisements

CORRECTION/COMPLETION OF RAINFALL DATA PRIMARY & SECONDARY VALIDATION –FLAGGING : SUSPECT OR INCORRECT VALUES –MISSING : NON-OBSERVANCE OR LOSS OF DATA.
1 Alberta Agriculture and Food (AF) Surface Meteorological Stations and Data Quality Control Procedures.
Extreme precipitation Ethan Coffel. SREX Ch. 3 Low/medium confidence in heavy precip changes in most regions due to conflicting observations or lack of.
Use of regression analysis Regression analysis: –relation between dependent variable Y and one or more independent variables Xi Use of regression model.
Homogenization of monthly Benchmark temperature series of network no. 3 – using ProClimDB software COST Benchmark meeting in Zürich September 2010.
ENVIRONMENTAL AGENCY OF THE REPUBLIC OF SLOVENIA A Method for Daily Temperature Data Interpolation and Quality Control Based on the Selected.
Mechanistic crop modelling and climate reanalysis Tom Osborne Crops and Climate Group Depts. of Meteorology & Agriculture University of Reading.
SECONDARY VALIDATION - RAINFALL DATA PRIMARY VALIDATION ALREADY DONE *ON INDIVIDUAL STATION BASIS SECONDARY VALIDATION *IDENTIFY SUSPECT VALUES BY HAVING.
Outline of Talk Introduction Toolbox functionality Results Conclusions and future development.
Operational Quality Control in Helsinki Testbed Mesoscale Atmospheric Network Workshop University of Helsinki, 13 February 2007 Hannu Lahtela & Heikki.
Assessment of Future Change in Temperature and Precipitation over Pakistan (Simulated by PRECIS RCM for A2 Scenario) Siraj Ul Islam, Nadia Rehman.
A Procedure for Automated Quality Control and Homogenization of historical daily temperature and precipitation data (APACH). Part 1: Quality Control of.
1 NATS 101 Lecture 3 Climate and Weather. 2 Review and Missed Items Pressure and Height-Exponential Relationship Temperature Profiles and Atmospheric.
MOS Performance MOS significantly improves on the skill of model output. National Weather Service verification statistics have shown a narrowing gap between.
Assessment of Extreme Rainfall in the UK Marie Ekström
Daily Stew Kickoff – 27. January 2011 First Results of the Daily Stew Project Ralf Lindau.
SECC – CCSP Meeting November 7, 2008 Downscaling GCMs to local and regional levels Institute of Food and Agricultural Sciences Guillermo A. Baigorria
CARPE DIEM Centre for Water Resources Research NUID-UCD Contribution to Area-3 Dusseldorf meeting 26th to 28th May 2003.
A Radar Data Assimilation Experiment for COPS IOP 10 with the WRF 3DVAR System in a Rapid Update Cycle Configuration. Thomas Schwitalla Institute of Physics.
Nynke Hofstra and Mark New Oxford University Centre for the Environment Trends in extremes in the ENSEMBLES daily gridded observational datasets for Europe.
Spatial Interpolation of monthly precipitation by Kriging method
Geostatistical approach to Estimating Rainfall over Mauritius Mphil/PhD Student: Mr.Dhurmea K. Ram Supervisors: Prof. SDDV Rughooputh Dr. R Boojhawon Estimating.
Analysis of extreme precipitation in different time intervals using moving precipitation totals Tiina Tammets 1, Jaak Jaagus 2 1 Estonian Meteorological.
Climate Change Tendencies in Georgia under Global Warming Conditions Mariam Elizbarashvili 1 Marika Tatishvili 2 Ramaz Meskhia 2 Nato Kutaladze 3 1. Ivane.
South Eastern Latin America LA26: Impact of GC on coastal areas of the Rio de la Plata: Sea level rise and meteorological effects LA27: Building capacity.
Tailored climate indices for DRR (infrastructure) Elena Akentyeva Main Geophysical Observatory, ST. PETERSBURG, RF.
A Statistical Comparison of Weather Stations in Carberry, Manitoba, Canada.
Benchmark dataset processing P. Štěpánek, P. Zahradníček Czech Hydrometeorological Institute (CHMI), Regional Office Brno, Czech Republic, COST-ESO601.
June 19, 2007 GRIDDED MOS STARTS WITH POINT (STATION) MOS STARTS WITH POINT (STATION) MOS –Essentially the same MOS that is in text bulletins –Number and.
Quantifying the uncertainty in spatially- explicit land-use model predictions arising from the use of substituted climate data Mike Rivington, Keith Matthews.
Dataset Development within the Surface Processes Group David I. Berry and Elizabeth C. Kent.
TOPTHERM TopTask TopTask Competition Modern meteorological forecasting tools for gliding activities.
Experiences with homogenization of daily and monthly series of air temperature, precipitation and relative humidity in the Czech Republic, P.
Latest results in verification over Poland Katarzyna Starosta, Joanna Linkowska Institute of Meteorology and Water Management, Warsaw 9th COSMO General.
Quality control and homogenization of the COST benchmark dataset Petr Štěpánek Pavel Zahradníček Czech Hydrometeorological Institute, regional office Brno.
Correction of daily values for inhomogeneities P. Štěpánek Czech Hydrometeorological Institute, Regional Office Brno, Czech Republic
Meteorological Data Analysis Urban, Regional Modeling and Analysis Section Division of Air Resources New York State Department of Environmental Conservation.
Using cluster analysis for Identifying outliers and possibilities offered when calculating Unit Value Indices OECD NOVEMBER 2011 Evangelos Pongas.
Modern Era Retrospective-analysis for Research and Applications: Introduction to NASA’s Modern Era Retrospective-analysis for Research and Applications:
The climate and climate variability of the wind power resource in the Great Lakes region of the United States Sharon Zhong 1 *, Xiuping Li 1, Xindi Bian.
Adjustment of Global Gridded Precipitation for Orographic Effects Jennifer C. Adam 1 Elizabeth A. Clark 1 Dennis P. Lettenmaier 1 Eric F. Wood 2 1.Dept.
Uncertainty assessment in European air quality mapping and exposure studies Bruce Rolstad Denby, Jan Horálek 2, Frank de Leeuw 3, Peter de Smet 3 1 Norwegian.
Panut Manoonvoravong Bureau of research development and hydrology Department of water resources.
Evapotranspiration Estimates over Canada based on Observed, GR2 and NARR forcings Korolevich, V., Fernandes, R., Wang, S., Simic, A., Gong, F. Natural.
NDFDClimate: A Computer Application for the National Digital Forecast Database Christopher Mello WFO Cleveland.
The observational dataset most RT’s are waiting for: the WP5.1 daily high-resolution gridded datasets HadGHCND – daily Tmax Caesar et al., 2001 GPCC -
COSMO General Meeting Zurich, 2005 Institute of Meteorology and Water Management Warsaw, Poland- 1 - Simple Kalman filter – a “smoking gun” of shortages.
Validation of Satellite-derived Clear-sky Atmospheric Temperature Inversions in the Arctic Yinghui Liu 1, Jeffrey R. Key 2, Axel Schweiger 3, Jennifer.
The ENSEMBLES high- resolution gridded daily observed dataset Malcolm Haylock, Phil Jones, Climatic Research Unit, UK WP5.1 team: KNMI, MeteoSwiss, Oxford.
1 Detection of discontinuities using an approach based on regression models and application to benchmark temperature by Lucie Vincent Climate Research.
Development of an Ensemble Gridded Hydrometeorological Forcing Dataset over the Contiguous United States Andrew J. Newman 1, Martyn P. Clark 1, Jason Craig.
1. Session Goals 2 __________________________________________ FAMINE EARLY WARNING SYSTEMS NETWORK Become familiar with the available data sources for.
Homogenization of daily data series for extreme climate index calculation Lakatos, M., Szentimey T. Bihari, Z., Szalai, S. Meeting of COST-ES0601 (HOME)
HYDROCARE Kick-Off Meeting 13/14 February, 2006, Potsdam, Germany HYDROCARE Actions 2.1Compilation of Meteorological Observations, 2.2Analysis of Variability.
Introduction  In situ measurements examine the phenomenon exactly in place where it occurs.  The most accurate of soil moisture measurements are.
Norwegian Meteorological Institute met.no QC2 Status
(Srm) model application: SRM was developed by Martinec (1975) in small European basins. With the progress of satellite remote sensing of snow cover, SRM.
1 Černikovský, Krejčí, Volná (ETC/ACM): Air pollution by ozone in Europe during the summer 2012 & comparison with previous years 17th EIONET Workshop on.
Actions & Activities Report PP8 – Potsdam Institute for Climate Impact Research, Germany 2.1Compilation of Meteorological Observations, 2.2Analysis of.
UERRA User Workshop Toulouse, 4 th Feb 2016 Questions to the users.
U.S.-India Partnership for Climate Resilience
Using satellite data and data fusion techniques
IBIS Weather generator
CLIMATE CHANGE IN TRINIDAD AND TOBAGO Willis Mills.
Looking for universality...
MOS Developed by and Run at the NWS Meteorological Development Lab (MDL) Full range of products available at:
Post Processing.
An Analysis of Possibilities of Forecasting the AVISO Agrometeorological Model Input   Münster P. Chuchma F.
Facultad de Ingeniería, Centro de Cálculo
Presentation transcript:

Quality control of daily data on example of Central European series of air temperature, relative humidity and precipitation P. Štěpánek (1), P. Zahradníček (1) (1) Czech Hydrometeorological Institute, Regional Office Brno, Czech Republic, COST-ESO601 meeting and Sixth Seminar for Homogenization and Quality Control in Climatological Databases

Processing before any data analysis Software AnClim, ProClimDB

Data Quality Control Finding Outliers Two main approaches: Using limits derived from interquartile ranges (time series) comparing values to values of neighbouring stations (spatial analysis)

1. Using limits derived from interquartile range –relatively, series of diffs./ratios (logarithms) of tested and reference series reference series created as an average of 5 mostly correlated stations, max. distance 35 km (precipitation) limits: coefficient (multiple) = 3.0 –absolutely, in the past when only one station is available in cases when less than three neighbours have been found limits: coefficient (multiple) = 5.0 Data Quality Control Finding Outliers

for monthly, daily data (each month individually) weighted /unweighted mean from neighbouring stations criterions used for stations selection (or combination of it): –best correlated / nearest neighbours (correlations – from the first differenced series) –limit correlation, limit distance –limit difference in altitudes neighbouring stations series should be standardized to test series AVG and / or STD (temperature - elevation, precipitation - variance) - missing data are not so big problem then Creating Reference Series

Example: Proposed list of stations used for creating reference series Selection according to correlations, distances and altitudes

Data Quality Control Finding Outliers 2. comparing values to values of neighbouring stations –comparing to min. 3 to 10 best correlated (nearest) stations –calculating series of standardized differences (logarithms of ratios) –number of cases exceeding 95% confidence limits is counted

Example: Comparing base station to its neighbours

Data Quality Control Finding Outliers 2. comparing values to values of neighbouring stations –comparing to min. 3 to 10 best correlated (nearest) stations –calculating series of standardized differences (logarithms of ratios) –number of cases exceeding 95% confidence limits is counted –Standardization of neighbours to base station values (AVG, STD, Altitude), –calcualting various characterestics from these values –Comparison with „expected“ value – (calculated as weighted mean from standardized neighbours values) < - interpolation method

Data Quality Control Neighbours values Standardization Standardizing to average and/or standard deviation of base station (for each month individually) To altitude of the base station Linear regression calculated for each month individually For each row (particular month or day) individually (GIS)

Data Quality Control Neighbours values Standardization Characteristics calculated from the standardized values: coefficient of Interquartile range (ranges are estimated from standardized neighbours values) difference of base station and median from neighbours values (probability): CDF for ( (base station – median_from_standardized_neighbors_values)/STD_base_station ) „Expected“ value (as weighted mean with weights 1/distance or correlations, arbitrary power; possibility of using trimmed mean)

QC, Settings in the software processing the whole database 2. Calculation: 1. Finding neighbours:

Combination of all characteristics

Example of outputs for outliers assessment Altitudes and distances of neighbours List of neighbours Neighbour stations values Expected value Suspicious values

Spatial distribution Quality control Run for period , dialy data (measured values in observation hours) All stations (200 climatological stations, 800 precipitation stations) All meteorological elements (T, TMA, TMI, TPM, SRA, SCE, SNO, E, RV, H, F) – parameters set individually Historical records will follow now

Spatial distribution Spatial distribution of climatological stations period stations mean minimum distance: 12 km

Settings in the software Air temperature Spatial correlations, max. temperature Annual values …

Spatial correlations, max. temperature January July

Air temperature, number of outliers , from station-days T – air temperature at obs. hour, TMA – daily maximum temp., TMI – daily min. temp., TPM – daily ground minimum temp.

Air temperature, number of outliers , from station-days Air temperature at obs. hour, AVG – daily average temp.

Air temperature, number of outliers , from station-days TMA – daily maximum temp., TMI – daily min. temp., TPM – daily ground minimum temp.

Air temperature, number of outliers , Number of outliers per one station (all observation hours, AVG)

Settings in the software Water vapor pressure Spatial correlations, water vapor pressure Annual values …

Spatial correlations, water vapor pressure January July

Water vapor pressure, number of outliers , from station-days Water vapor pressure at obs. hour, AVG – daily average

Spatial distribution Spatial distribution of precipitation stations period stations mean minimum distance: 7.5 km

Settings in the software Precipitation Spatial correlations, precipitation Annual values …

Spatial correlations, precipitation January July

Problematic detections (heavy rainfall)

Problematic detections (heavy rainfall), Radar information

Precipitation, number of outliers , from station-days.

Precipitation, number of outliers , Number of outliers per one station

Detecting 15 minute data (automated weather stations) for Temperature – it works well Precipitation – big problems – spatial variability needs to be combined with further information (e.g. meteorological elements observed at station)

Presented method can be further applied for Filling missing values (the “expected” value – from interpolation) Calculation of technical series (e.g. for grid points - to be used for RCM validations or correction, EC FP6 project CECILIA ), …

Conclusions Only combination of several methods for outliers detection leads to satisfying results (“real” outliers detection, supressing fault detection -> Emsemble approach) Parameters (settings) has to be found individually for each meteorological element, maybe also region (terrain complexity) and part of a year (noticeable annual cycle in number of outliers) Similar to homogenization of time series, it is important to use measured value (e.g. from observation hours) - outliers are masked in daily average (and even more in monthly or annual ones) Errors found in all elements and investigated countries (AT, CZ, SK, HU)

Outlook Improving methods by applying kriging (co-kriging) Including (combining) further information (meteorological phenomenon to precipitation, wind direction to wind speed, …) Connecting with R-software and utilization of other already programmed functions

Software used for data processing LoadData - application for downloading data from central database (e.g. Oracle) ProClimDB software for processing whole dataset (finding outliers, combining series, creating reference series, preparing data for homogeneity testing, extreme value analysis, RCM outputs validation, correction, …) AnClim software for homogeneity testing

ProcData software ProClimDB software

ProcData software ProClimDB software