Quality control checks description
First Data Management Training Workshop, 12-17 February 2007, Oostende, Belgium


Slide 1. Quality control checks description
Sissy IONA
First Data Management Training Workshop, February 2007, Oostende, Belgium

Slide 2. Outline
- Objectives of the Quality Control
- Requirements for Data Validation
- Delayed-Mode QC (IOC/CEC Manual and Guides #26, MEDAR/MEDATLAS Protocol)
- Real-Time QC (IOC/CEC Manual and Guides #22, Operational Oceanography)
- QC and processing of historical data (World Ocean Data Centre, NODC Ocean Climate Laboratory)
- References

Slide 3. Objectives of the Quality Control
“to ensure the data consistency within a single dataset and within a collection of data sets and to ensure that the quality and the errors of the data are apparent to the user, who has sufficient information to assess its suitability for a task” (IOC/CEC Manual and Guides #26)

Slide 4. Requirements for Data Validation
The procedures that ensure the quality of the data and the metadata are:
- Instrumentation checks and calibrations.
- Full documentation of the field measurements (location, duration of measurements, methods of deployment, sampling schemes, etc.).
- Processing and validation by the source laboratories according to international standards and methods.
- Transmission of the validated data to the National Data Centres for further quality control, standardization, documentation and permanent archiving in a central database system for further use.

Slide 5. Delayed-Mode QC (IOC/CEC Manual and Guides #26, MEDAR/MEDATLAS Protocol)

Slide 6. QC Procedures - Overview
The QC procedures for oceanographic data management according to IOC, ICES and EU recommendations include automatic and visual controls on the data and their metadata.
Data measured by the same instrument and coming from the same cruise are organized in the same file, transcoded to the same exchange format and then subjected to a series of quality tests:
1. check of the Format
2. check of the Cruise and the Stations metadata
3. check of Data points
The results of the automatic control are added as QC flags to each data value. Validation or correction is made manually to the QC flags and NOT to the data. In case of uncertainties, the data originator is contacted.

Slide 7. MEDATLAS Quality Flag values (based on the GTSPP Flag Scale definition)
0: No QC
1: Correct value
2: Out of statistics but not obviously wrong
3: Doubtful value
4: Bad value
5: Modified value (only for the location, date, bottom depth)
9: Missing value
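The flag scale above, together with the rule that corrections are applied to the flags and not to the data, can be sketched in Python. This is only an illustration: the names `MedatlasFlag`, `Observation` and `set_flag` are hypothetical and do not come from any MEDATLAS software.

```python
from dataclasses import dataclass
from enum import IntEnum


class MedatlasFlag(IntEnum):
    """Flag values as listed on the slide (hypothetical member names)."""
    NO_QC = 0
    CORRECT = 1
    OUT_OF_STATISTICS = 2
    DOUBTFUL = 3
    BAD = 4
    MODIFIED = 5   # only for location, date, bottom depth
    MISSING = 9


@dataclass(frozen=True)
class Observation:
    value: float
    flag: MedatlasFlag = MedatlasFlag.NO_QC


def set_flag(obs, new_flag):
    """Return a copy carrying a new flag; the measured value is untouched."""
    return Observation(obs.value, MedatlasFlag(new_flag))
```

For example, `set_flag(Observation(38.95), 3)` yields the same value marked as doubtful.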

Slide 8. QC Software
Several software packages have been developed in the framework of European and international projects to perform QC. Within SeaDataNet, the Ocean Data View software (Schlitzer, R.) is used for QC by participants who are not equipped with the appropriate facilities.

Slide 9. QC checks description
1. Format Check
- This check detects anomalies in the format, such as wrong ship codes or names, wrong parameter names or units, and incomplete information.
- No further control should be made before the exchange format has been corrected and validated.

Slide 10. QC checks description
2. Cruise and Station metadata checks
For vertical profiles (CTD, bottles, bathythermographs, etc.):
- duplicate entries: cruises, or stations within a cruise, detected using a space-time radius (e.g., for duplicate cruises: 1 mile, 15 min, or 1 day if the time is unknown)
- date: reasonable date, station date within the begin and end dates of the cruise
- ship velocity between two consecutive stations (e.g., a speed > 15 knots means a wrong station date or a wrong station location)
- location/shoreline: position on land
- bottom sounding: out of the regional scale, compared with the reference surroundings
For time series from fixed moorings (current meters/profilers, sea level, sediment traps, etc.):
- sensor depth checks: less than the bottom depth
- series duration checks: consistency with the start and end dates of the dataset
- duplicate moorings checks
- land position checks
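A minimal sketch of the ship velocity check between consecutive stations, assuming a great-circle (haversine) distance and the 15-knot example limit quoted above. The function names and the `(time, lat, lon)` tuple layout are illustrative assumptions, not part of the MEDATLAS protocol.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def speed_check(stations, max_knots=15.0):
    """stations: list of (datetime, lat, lon) sorted by time.

    Returns the indices i for which the implied speed between station
    i-1 and station i exceeds max_knots (date or position is suspect).
    """
    suspect = []
    for i in range(1, len(stations)):
        t0, lat0, lon0 = stations[i - 1]
        t1, lat1, lon1 = stations[i]
        hours = (t1 - t0).total_seconds() / 3600.0
        dist = distance_nm(lat0, lon0, lat1, lon1)
        if hours <= 0:
            # No elapsed time: any displacement is impossible.
            if dist > 0:
                suspect.append(i)
        elif dist / hours > max_knots:
            suspect.append(i)
    return suspect


stations = [
    (datetime(2007, 2, 12, 0, 0), 36.0, 25.0),
    (datetime(2007, 2, 12, 2, 0), 36.2, 25.0),  # about 12 nm in 2 h
    (datetime(2007, 2, 12, 3, 0), 37.2, 25.0),  # about 60 nm in 1 h
]
print(speed_check(stations))
```

A flagged index only says that one of the two stations is inconsistent; deciding whether the date or the location is wrong remains a manual step.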

Slide 11. QC checks description
3. Data points main checks
- presence of at least two parameters: vertical/time reference + measurement
- pressure/time must be monotonically increasing
- the profile/time series must not be constant (sensor jammed)
- broad range checks: check for extreme regional values by comparison with the minimum and maximum values for the region. The broad range check is performed before the narrow range check.
- data points below the bottom depth
- spikes detection: usually requires visual inspection. For time series, a filter is applied first to remove the effect of tides and internal waves.
- narrow range check: comparison with pre-existing climatological statistics. Time series are compared with internal statistics.
- density inversion test (potential density anomaly; Fofonoff and Millard, 1983; Millero and Poisson, 1981)
- Redfield ratio for nutrients: ratio of the oxygen, nitrate and alkalinity (carbonates) concentrations over the phosphate concentration (172, 16 and 122)
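Three of the simpler data point checks (monotonic pressure, constant profile, broad range) can be sketched as follows. The function names are hypothetical, the limits passed to `broad_range_flags` are placeholders rather than real regional values, and assigning flag 4 to out-of-range points is an assumption made for this sketch.

```python
def pressure_is_increasing(pressure):
    """True when every pressure (or time) value is strictly greater than the previous one."""
    return all(b > a for a, b in zip(pressure, pressure[1:]))


def is_constant(values):
    """True when a profile or series repeats one single value (sensor possibly jammed)."""
    return len(values) > 1 and len(set(values)) == 1


def broad_range_flags(values, vmin, vmax, good=1, bad=4):
    """Flag each value against the regional minimum and maximum.

    good/bad follow the MEDATLAS scale (1 = correct, 4 = bad); using 4
    for out-of-range values is an assumption of this sketch.
    """
    return [good if vmin <= v <= vmax else bad for v in values]


# Placeholder limits, not taken from the MEDAR/MEDATLAS tables.
print(broad_range_flags([14.0, 45.0, 13.5], vmin=0.0, vmax=35.0))
```

Running the broad range check first keeps grossly wrong values out of the statistics-based narrow range comparison.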

Slide 12. Regional parameterization in MEDAR/MEDATLAS II (plus depth parameterization)

Slide 13. Spikes detection
The test is sensitive to the vertical/time resolution. It requires at least 3 consecutive good/acceptable values.
Algorithm to detect spikes, taking into account the difference in values (for regularly spaced data like CTD):
    |V2 - (V3 + V1)/2| - |V1 - V3|/2 > THRESHOLD VALUE
For irregularly spaced values (like bottle data), a better algorithm to detect spikes, taking into account the difference in gradients, is:
    | |(V2 - V1)/(P2 - P1) - (V3 - V1)/(P3 - P1)| - |(V3 - V1)/(P3 - P1)| | > THRESHOLD VALUE
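Both spike formulas translate directly into code. The sketch below assumes that V1, V2, V3 are three consecutive values with V2 the tested point and P1, P2, P3 their pressures; the function names and the threshold used in the example are illustrative only, and the filtering on previously assigned flags is left out.

```python
def spike_regular(v1, v2, v3):
    """Test value for regularly spaced data: |V2-(V3+V1)/2| - |V1-V3|/2."""
    return abs(v2 - (v3 + v1) / 2.0) - abs(v1 - v3) / 2.0


def spike_irregular(v1, v2, v3, p1, p2, p3):
    """Test value for irregularly spaced data, based on gradients."""
    g12 = (v2 - v1) / (p2 - p1)
    g13 = (v3 - v1) / (p3 - p1)
    return abs(abs(g12 - g13) - abs(g13))


def find_spikes(values, threshold):
    """Indices of interior points whose regular-spacing test value exceeds threshold."""
    return [
        i
        for i in range(1, len(values) - 1)
        if spike_regular(values[i - 1], values[i], values[i + 1]) > threshold
    ]


temperature = [15.0, 15.1, 18.0, 15.2, 15.3]
print(find_spikes(temperature, threshold=2.0))  # threshold chosen for illustration
```

Subtracting the |V1 - V3|/2 term keeps a steady gradient from being mistaken for a spike, which is why the neighbours of the bad point are not flagged.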

Slide 14. Standard deviation tests with respect to the climatological statistics
Reference climatologies used in the Medar/Medatlas II Project:
- MEDATLAS, 1997, for temperature and salinity, averaged on 1x1 degree squares
- LEVITUS 1998, for nutrients
The automatic comparison is made by linearly interpolating the references to the level of the observation. Data points are detected as outliers if they differ from the references by more than:
- 5 x standard deviation over the shelf (depth < 200 m)
- 4 x standard deviation in the slope and straits regions (200 m < depth < 400 m)
- 3 x standard deviation in the deep sea (depth > 400 m)
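The narrow range comparison can be sketched as a linear interpolation of the reference mean and standard deviation followed by a depth-dependent multiplier. Two assumptions are made here and are not stated on the slide: "depth" is read as the bottom depth at the station, and the boundary values 200 m and 400 m are assigned to the slope and deep-sea classes respectively. All names and reference numbers are hypothetical.

```python
def sigma_multiplier(bottom_depth_m):
    """Number of standard deviations tolerated: 5 (shelf), 4 (slope/straits), 3 (deep sea)."""
    if bottom_depth_m < 200:
        return 5
    if bottom_depth_m < 400:
        return 4
    return 3


def interpolate(levels, values, z):
    """Linear interpolation of a reference profile to depth z (levels increasing)."""
    for i in range(len(levels) - 1):
        z0, z1 = levels[i], levels[i + 1]
        if z0 <= z <= z1:
            w = (z - z0) / (z1 - z0)
            return values[i] + w * (values[i + 1] - values[i])
    raise ValueError("observation depth outside the reference levels")


def is_outlier(obs, z, ref_levels, ref_mean, ref_std, bottom_depth_m):
    """True when obs differs from the climatology by more than n standard deviations."""
    mean = interpolate(ref_levels, ref_mean, z)
    std = interpolate(ref_levels, ref_std, z)
    return abs(obs - mean) > sigma_multiplier(bottom_depth_m) * std


# Made-up reference values for illustration only.
ref_levels, ref_mean, ref_std = [0.0, 100.0], [20.0, 16.0], [1.0, 0.5]
print(is_outlier(21.0, 50.0, ref_levels, ref_mean, ref_std, bottom_depth_m=1000.0))
print(is_outlier(21.0, 50.0, ref_levels, ref_mean, ref_std, bottom_depth_m=150.0))
```

The same observation can pass over the shelf and fail in the deep sea, reflecting the larger natural variability tolerated in shallow water.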

Slide 15. Density inversion
The importance of the visual check: example of a density inversion due to a temperature increase with depth.
Threshold value = 0.03 for CTD, 0.05 for near-surface and bottle data.
[Figure: profile with levels z1 and z2. Annotations: "Wrong value detected automatically"; "Wrong value detected automatically, but it is the correct value; the previous one is manually corrected".]
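A minimal sketch of the automatic part of this test, assuming the potential density anomaly has already been computed at each level (the equation of state is not reproduced here) and that the thresholds are expressed in the same units as those densities. The function name and sample values are illustrative.

```python
def density_inversions(sigma, threshold):
    """Return (upper, lower) index pairs where density decreases with depth
    by more than threshold. sigma is ordered from the surface downward."""
    return [
        (i, i + 1)
        for i in range(len(sigma) - 1)
        if sigma[i] - sigma[i + 1] > threshold
    ]


sigma_ctd = [28.90, 29.00, 28.95, 29.10]  # made-up values
print(density_inversions(sigma_ctd, threshold=0.03))
```

The function reports the pair of levels rather than a single point because, as the slide illustrates, the automatic test cannot tell which of the two values is actually wrong; that decision is left to the visual check.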

Slide 16. Real-Time QC (IOC/CEC Manual and Guides #22, Operational Oceanography)

Slide 17. ARGO Real-Time QC on vertical profiles
Based on the Global Temperature and Salinity Profile Project (GTSPP) of IOC/IODE, the automatic QC tests are:
- Platform identification: checks whether the float ID corresponds to the correct WMO number.
- Impossible date test: checks whether the observation date and time from the float are sensible.
- Impossible location test: checks whether the observation latitude and longitude from the float are sensible.
- Position on land test: requires the observation latitude and longitude from the float to be located in an ocean.
- Impossible speed test: checks the position and time of the floats.
- Global range test: applies a gross filter on observed values for temperature and salinity.
- Regional range test: checks for extreme regional values.
- Pressure increasing test: checks for monotonically increasing pressure.
- Spike test: checks for large differences between adjacent values.
- Gradient test: fails when the gradient between vertically adjacent measurements is too steep.
- Digit rollover test: checks whether the temperature and salinity values exceed the float's storage capacity.
- Stuck value test: checks for all measurements of temperature or salinity in a profile being identical.
- Density inversion: densities are compared at consecutive levels in a profile, in both directions, i.e. from top to bottom and from bottom to top.
- Grey list (7 items): stops the real-time dissemination of measurements from a sensor that is not working correctly.
- Gross salinity or temperature sensor drift: detects a sudden and important sensor drift.
- Frozen profile test: detects a float that reproduces the same profile (with very small deviations) over and over again.
- Deepest pressure test: checks that the profile has no pressures higher than DEEPEST_PRESSURE plus 10%.
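Two of the tests in this list are simple enough to sketch. The slide only says that the location must be "sensible"; the latitude and longitude bounds below are an assumption of this sketch, as are the function names. `deepest_pressure` stands for the float's configured DEEPEST_PRESSURE value.

```python
def location_is_possible(lat, lon):
    """Impossible location test, assuming valid ranges of
    -90..90 for latitude and -180..180 for longitude."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def deepest_pressure_ok(pressures, deepest_pressure):
    """Deepest pressure test: no pressure may exceed DEEPEST_PRESSURE plus 10%."""
    limit = deepest_pressure * 1.10
    return all(p <= limit for p in pressures)


print(location_is_possible(35.5, 25.1))
print(deepest_pressure_ok([5.0, 1000.0, 2150.0], deepest_pressure=2000.0))
print(deepest_pressure_ok([5.0, 2300.0], deepest_pressure=2000.0))
```

In an operational chain each test would set a quality flag on the affected values rather than return a boolean; the boolean form is used here only to keep the example short.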

Slide 18. CORIOLIS Real-Time QC on Time Series
Automatic quality control tests:
- test 1: Platform Identification
- test 2: Impossible Date Test
- test 3: Impossible Location Test
- test 4: Position on Land Test
- test 5: Impossible Speed Test
- test 6: Global Range Test
- test 7: Regional Global Parameter Test for Red Sea and Mediterranean Sea
- test 8: Spike Test
- test 10: Comparison with climatology
The Delayed-Mode QC in the Coriolis Data Centre for profiles and time series consists of visual QC, objective analysis and residual analysis (to correct sensor drift and offsets).

Slide 19. ARGO/CORIOLIS quality control flag scale

Slide 20. QC and processing of historical data (World Ocean Data Centre, NODC Ocean Climate Laboratory)

Slide 21. Quality Controls - Overview
The QC procedures in the WDC are summarized in three major parts:
1. Checks of the observed level data
For the construction of the climatology (processing):
2. Interpolation to standard levels
3. Standard level data checks

Slide 22. Quality Controls - Overview
1. Checks of the observed level data
- Format conversion
- Position/date/time check
- Assignment of cruise and cast numbers
- Speed check
- Duplicate profile/cruise checks
- Range checks
- Depth inversion and depth duplication checks
- Large temperature inversion and gradient tests: these limit the maximum allowable temperature increase with depth (inversion, 0.3 °C per m) and decrease with depth (excessive gradient, 0.7 °C per m)
- Observed level density inversion checks
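The large temperature inversion and gradient tests can be sketched with the two limits quoted above. The flag codes 2 (inversion) and 3 (gradient) follow the WOD depth flags listed on the flag definition slide; attaching the flag to the deeper level of each pair, and skipping pairs with non-increasing depth, are assumptions of this sketch.

```python
def temperature_gradient_flags(depths, temps, max_inversion=0.3, max_gradient=0.7):
    """Flag each level: 0 accepted, 2 large inversion, 3 excessive gradient.

    Rates are in degrees C per metre between consecutive levels; the flag
    is attached to the deeper level of the pair (an assumption here).
    """
    flags = [0] * len(depths)
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        if dz <= 0:
            continue  # depth inversions/duplicates belong to a separate check
        rate = (temps[i] - temps[i - 1]) / dz
        if rate > max_inversion:
            flags[i] = 2   # temperature increases too fast with depth
        elif rate < -max_gradient:
            flags[i] = 3   # temperature decreases too fast with depth
    return flags


depths = [0.0, 10.0, 11.0, 12.0, 13.0]
temps = [20.0, 19.5, 20.0, 20.0, 19.0]
print(temperature_gradient_flags(depths, temps))
```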

Slide 23. Regional parameterization of the world ocean in WOD05 (plus vertical parameterization)

Slide 24. Quality Controls - Overview
2. Interpolation to standard levels
- Modified Reiniger-Ross scheme (Reiniger and Ross, 1968): produces fewer spurious features in regions with large vertical gradients than a 3-point Lagrangian interpolation.
3. Standard level data checks
- Density inversion checks (Fofonoff et al., 1983)
- Standard deviation checks: a series of statistical tests based on the mean, standard deviation and number of observations in a 5-degree square box, for coastal, near-coastal and open ocean data.
- Objective analysis
- Post objective analysis subjective checks: to detect unrealistic "bullseye" features, mostly in data-sparse areas.

Slide 25. Definition of WOD Quality Flags
(1) FLAGS FOR ENTIRE PROFILE (AS A FUNCTION OF PARAMETER)
0 - accepted profile
1 - failed annual standard deviation check
2 - two or more density inversions (Levitus, 1982 criteria)
3 - flagged cruise
4 - failed seasonal standard deviation check
5 - failed monthly standard deviation check
6 - flag 1 and flag 4
7 - flag 1 and flag 5
8 - flag 4 and flag 5
9 - flag 1 and flag 4 and flag 5
(2) FLAGS ON INDIVIDUAL OBSERVATIONS
(a) Depth Flags
0 - accepted value
1 - error in recorded depth (same or less than previous depth)
2 - temperature inversion of magnitude > 0.3 °C per meter
3 - temperature gradient of magnitude > 0.7 °C per meter
4 - temperature gradient and inversion
(b) Observed Level Flags
0 - accepted value
1 - range outlier (outside of broad range check)
2 - density inversion
3 - failed range check and density inversion check
(3) Standard Level Flags
0 - accepted value
1 - bullseye marker
2 - density inversion
3 - failed annual standard deviation check
4 - failed seasonal standard deviation check
5 - failed monthly standard deviation check
6 - failed annual and seasonal standard deviation check
7 - failed annual and monthly standard deviation check
8 - failed seasonal and monthly standard deviation check
9 - failed annual, seasonal and monthly standard deviation check
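The standard level flags 3 to 9 encode every combination of the annual, seasonal and monthly standard deviation checks. A lookup table makes the encoding explicit; the function name is hypothetical and the sketch covers only the standard deviation outcomes, not the bullseye or density inversion flags.

```python
# Keys are (failed_annual, failed_seasonal, failed_monthly).
STANDARD_LEVEL_SD_FLAGS = {
    (False, False, False): 0,  # accepted value
    (True, False, False): 3,   # annual
    (False, True, False): 4,   # seasonal
    (False, False, True): 5,   # monthly
    (True, True, False): 6,    # annual and seasonal
    (True, False, True): 7,    # annual and monthly
    (False, True, True): 8,    # seasonal and monthly
    (True, True, True): 9,     # annual, seasonal and monthly
}


def standard_level_sd_flag(failed_annual, failed_seasonal, failed_monthly):
    """Return the WOD standard level flag for a combination of failed checks."""
    return STANDARD_LEVEL_SD_FLAGS[(failed_annual, failed_seasonal, failed_monthly)]


print(standard_level_sd_flag(True, False, True))
```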

Slide 26. References
- Argo quality control manual, V2.1, 2005
- Coriolis Data Centre, In-situ data quality control, V1.3, 2005
- Data Type guidelines, ICES Working Group of Marine Data Management (12 data types)
- GTSPP Real-Time Quality Control Manual, 1990 (IOC Manuals and Guides #22)
- "Medar-Medatlas protocol, Part I: Exchange format and quality checks for observed profiles", V3, 2001
- SCOOP User Manual, V4.2, 2000
- "Quality Control of Sea Level Observations", ESEAS-RI, V1.0, 2006
- Quality Control Processing of Historical Oceanographic Temperature, Salinity, and Oxygen Data. Timothy Boyer and Sydney Levitus, National Oceanographic Data Centre, Ocean Climate Laboratory
- UNESCO/IOC/IODE and MAST, 1993, Manual and Guides #26
- World Ocean Database 2005 Documentation. Ed. Sydney Levitus. NODC Internal Report 18, U.S. Government Printing Office, Washington, D.C., 163 pp
- In-situ data quality control procedures, COR-DO/DTI-RAP/04-047, V1.3
- Gosud data management, Real-time QC, go-um-03-01, V1.0