Quality control checks description. First Data Management Training Workshop, 12-17 February 2007, Oostende, Belgium.


1 Quality control checks description
First Data Management Training Workshop, 12-17 February 2007, Oostende, Belgium
Sissy IONA

2 Outline
- Objectives of the Quality Control
- Requirements for Data Validation
- Delayed-Mode QC (IOC/CEC Manual and Guides #26, MEDAR/MEDATLAS Protocol)
- Real-Time QC (IOC/CEC Manual and Guides #22, Operational Oceanography)
- QC and processing of historical data (World Ocean Data Centre, NODC Ocean Climate Laboratory)
- References

3 Objectives of the Quality Control
“to ensure the data consistency within a single dataset and within a collection of data sets and to ensure that the quality and the errors of the data are apparent to the user, who has sufficient information to assess its suitability for a task” (IOC/CEC Manual and Guides #26)

4 Requirements for Data Validation
The procedures that ensure the quality of the data and the metadata are:
- Instrumentation checks and calibrations.
- Full documentation of the field measurements (location, duration of measurements, methods of deployment, sampling schemes, etc.).
- Processing and validation by the source laboratories according to international standards and methods.
- Transmission of the validated data to the National Data Centres for further quality control, standardization, documentation and permanent archiving in a central database system for further use.

5 Delayed-Mode QC (IOC/CEC Manual and Guides #26, MEDAR/MEDATLAS Protocol)

6 QC Procedures - Overview
The QC procedures for oceanographic data management according to IOC, ICES and EU recommendations include automatic and visual controls on the data and their metadata.
Data measured with the same instrument and coming from the same cruise are organized in the same file, transcoded to the same exchange format and then subjected to a series of quality tests:
1. check of the format
2. check of the cruise and station metadata
3. check of the data points
The results of the automatic control are added as QC flags to each data value. Validation or correction is made manually to the QC flags and NOT to the data. In case of uncertainty, the data originator is contacted.

7 MEDATLAS Quality Flag values (based on the GTSPP Flag Scale definition)
0: No QC
1: Correct value
2: Out of statistics but not obviously wrong
3: Doubtful value
4: Bad value
5: Modified value (only for the location, date, bottom depth)
9: Missing value
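The flag scale, together with the rule that corrections are applied to the QC flags and not to the data, can be sketched as a small data structure. This is only an illustration: the names `MedatlasFlag`, `FlaggedValue` and `reflag`, and the sample salinity value, are hypothetical and do not come from the MEDATLAS protocol or any QC software.

```python
from dataclasses import dataclass
from enum import IntEnum


class MedatlasFlag(IntEnum):
    """MEDATLAS quality flag values as listed on the slide."""
    NO_QC = 0
    CORRECT = 1
    OUT_OF_STATISTICS = 2   # out of statistics but not obviously wrong
    DOUBTFUL = 3
    BAD = 4
    MODIFIED = 5            # only for the location, date, bottom depth
    MISSING = 9


@dataclass
class FlaggedValue:
    """A measurement paired with its QC flag (hypothetical container)."""
    value: float
    flag: MedatlasFlag = MedatlasFlag.NO_QC

    def reflag(self, new_flag):
        """Change the QC flag only; the measured value is left untouched."""
        self.flag = MedatlasFlag(new_flag)


# Illustrative usage with a made-up value.
sample = FlaggedValue(38.52)
sample.reflag(3)   # an operator marks the value as doubtful
```

Keeping the value immutable in practice and editing only the flag means the original observation can always be recovered after a manual validation.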

8 QC Software
Several software packages have been developed in the framework of European and international projects to perform QC.
Within SeaDataNet, the Ocean Data View software (Schlitzer, R., http://odv.awi.de, 2006) is used for QC by participants who are not equipped with the appropriate facilities.

9 QC checks description
1. Format check
- This check detects anomalies in the format, such as wrong ship codes or names, wrong parameter names or units, and incomplete information.
- No further control should be made before the correction and validation of the exchange format.

10 QC checks description
2. Cruise and station metadata checks
For vertical profiles (CTD, bottles, bathythermographs, etc.):
- duplicate entries: cruises, or stations within a cruise, detected using a space-time radius (e.g., for duplicate cruises: 1 mile, 15 min, or 1 day if the time is unknown)
- date: reasonable date; station date within the begin and end dates of the cruise
- ship velocity between two consecutive stations (e.g., a speed > 15 knots means a wrong station date or a wrong station location)
- location/shoreline: position on land
- bottom sounding: out of the regional scale, compared with the reference surroundings
For time series from fixed moorings (current meters/profilers, sea level, sediment traps, etc.):
- sensor depth checks: less than the bottom depth
- series duration checks: consistency with the start and end dates of the dataset
- duplicate mooring checks
- land position checks
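The ship velocity check can be illustrated with a short sketch. It assumes stations are given as `(datetime, latitude, longitude)` tuples and uses the haversine great-circle distance on a spherical Earth; the function names and the tuple layout are hypothetical, and only the 15 knot limit comes from the slide.

```python
import math
from datetime import datetime

# Mean Earth radius converted from km to nautical miles (1 nm = 1.852 km).
EARTH_RADIUS_NM = 6371.0 / 1.852


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def speed_knots(station_a, station_b):
    """Mean speed between two stations given as (datetime, lat, lon)."""
    hours = abs((station_b[0] - station_a[0]).total_seconds()) / 3600.0
    dist = distance_nm(station_a[1], station_a[2], station_b[1], station_b[2])
    if hours == 0:
        # Same time stamp: no movement is fine, any movement is impossible.
        return 0.0 if dist == 0 else math.inf
    return dist / hours


def speed_is_suspect(station_a, station_b, max_knots=15.0):
    """True when the implied ship speed exceeds the limit (15 knots on the slide)."""
    return speed_knots(station_a, station_b) > max_knots


# Illustrative usage: one degree of latitude (about 60 nm) covered in one hour.
a = (datetime(2007, 2, 12, 0, 0), 35.0, 25.0)
b = (datetime(2007, 2, 12, 1, 0), 36.0, 25.0)
suspect = speed_is_suspect(a, b)
```

A suspect speed does not say which of the two stations is wrong, so the date and the position of both stations still need to be inspected.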

11 QC checks description
3. Data points: main checks
- presence of at least two parameters: vertical/time reference + measurement
- pressure/time must be monotonically increasing
- the profile/time series must not be constant (sensor jammed)
- broad range checks: check for extreme regional values by comparison with the minimum and maximum values for the region. The broad range check is performed before the narrow range check.
- data points below the bottom depth
- spike detection: usually requires visual inspection. For time series, a filter is applied first to remove the effect of tides and internal waves.
- narrow range check: comparison with pre-existing climatological statistics. Time series are compared with internal statistics.
- density inversion test (potential density anomaly; Fofonoff and Millard, 1983; Millero and Poisson, 1981)
- Redfield ratio for nutrients: ratio of the oxygen, nitrate and alkalinity (carbonates) concentrations over the phosphate concentration (172, 16 and 122)
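Several of the simpler data point checks reduce to a few lines each. The sketch below is a minimal illustration, assuming profiles are plain Python lists ordered from the first to the last level and that "monotonically increasing" means strictly increasing; the function names and the regional limits in the usage lines are hypothetical.

```python
def is_strictly_increasing(levels):
    """Pressure (or time) must increase from one level to the next."""
    return all(later > earlier for earlier, later in zip(levels, levels[1:]))


def is_constant(values):
    """A profile or series with identical values suggests a jammed sensor."""
    return len(values) > 1 and len(set(values)) == 1


def broad_range_outliers(values, vmin, vmax):
    """Indices of values outside the regional minimum and maximum."""
    return [i for i, v in enumerate(values) if not (vmin <= v <= vmax)]


def below_bottom(depths, bottom_depth):
    """Indices of data points reported deeper than the bottom depth."""
    return [i for i, d in enumerate(depths) if d > bottom_depth]


# Illustrative usage with made-up numbers.
pressures = [5.0, 10.0, 20.0, 50.0, 100.0]
temperatures = [18.2, 17.9, 16.5, 45.0, 14.1]
ok_order = is_strictly_increasing(pressures)
jammed = is_constant(temperatures)
outliers = broad_range_outliers(temperatures, vmin=10.0, vmax=30.0)
too_deep = below_bottom(pressures, bottom_depth=80.0)
```

Each function returns indices rather than modifying the data, so the results can be written into the QC flags.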

12 Regional parameterization in MEDAR/MEDATLAS II (plus depth parameterization)

13 Spike detection
The test is sensitive to the vertical/time resolution. It requires at least 3 consecutive good/acceptable values.
Algorithm to detect spikes taking into account the difference in values (for regularly spaced data like CTD):
$$\left| V_2 - \frac{V_3 + V_1}{2} \right| - \frac{\left| V_1 - V_3 \right|}{2} > \text{THRESHOLD VALUE}$$
For irregularly spaced values (like bottle data), a better algorithm to detect spikes, taking into account the difference in gradients, is:
$$\left| \, \left| \frac{V_2 - V_1}{P_2 - P_1} - \frac{V_3 - V_1}{P_3 - P_1} \right| - \left| \frac{V_3 - V_1}{P_3 - P_1} \right| \, \right| > \text{THRESHOLD VALUE}$$
Here V1, V2, V3 are three consecutive values and P1, P2, P3 the corresponding pressures (or times).
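The two spike criteria can be written directly as functions of three consecutive values. This is a sketch under stated assumptions: the function names are hypothetical, the thresholds in the usage lines are made up for illustration, and all input values are assumed to be already flagged good or acceptable.

```python
def spike_regular(v1, v2, v3):
    """Test value for regularly spaced data: |V2-(V3+V1)/2| - |V1-V3|/2."""
    return abs(v2 - (v3 + v1) / 2.0) - abs(v1 - v3) / 2.0


def spike_irregular(v1, v2, v3, p1, p2, p3):
    """Test value for irregularly spaced data, based on gradients."""
    g21 = (v2 - v1) / (p2 - p1)
    g31 = (v3 - v1) / (p3 - p1)
    return abs(abs(g21 - g31) - abs(g31))


def find_spikes(values, threshold, pressures=None):
    """Indices of the middle points whose test value exceeds the threshold.

    With pressures=None the regular-spacing formula is used,
    otherwise the gradient-based formula.
    """
    spikes = []
    for i in range(1, len(values) - 1):
        v1, v2, v3 = values[i - 1], values[i], values[i + 1]
        if pressures is None:
            test_value = spike_regular(v1, v2, v3)
        else:
            p1, p2, p3 = pressures[i - 1], pressures[i], pressures[i + 1]
            test_value = spike_irregular(v1, v2, v3, p1, p2, p3)
        if test_value > threshold:
            spikes.append(i)
    return spikes


# Illustrative usage: one obvious spike at index 2.
temps = [15.0, 15.1, 19.0, 15.2, 15.3]
regular_hits = find_spikes(temps, threshold=2.0)
irregular_hits = find_spikes(temps, threshold=0.1, pressures=[0.0, 10.0, 30.0, 60.0, 100.0])
```

The end points of a profile are never tested, since each test needs a neighbour on both sides; this is one reason the slide stresses visual inspection.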

14 Standard deviation tests with respect to the climatological statistics
Reference climatologies used in the Medar/Medatlas II Project:
- MEDATLAS, 1997, for temperature and salinity, averaged on 1x1 degree squares
- LEVITUS 1998, for nutrients
The automatic comparison is made by linearly interpolating the references at the level of the observation. Data points are detected as outliers if they differ from the references by more than:
- 5 x standard deviation over the shelf (depth < 200 m)
- 4 x standard deviation in the slope and straits regions (200 m < depth < 400 m)
- 3 x standard deviation in the deep sea (depth > 400 m)
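The comparison with the climatology can be sketched as follows. Several points are assumptions rather than part of the protocol: the depth used to choose the multiplier is taken to be the bottom depth, the boundary values of exactly 200 m and 400 m (which the slide leaves open) are assigned to the stricter class, references are clamped outside their depth range, and all names and numbers in the usage lines are hypothetical.

```python
from bisect import bisect_left


def interp(x, xs, ys):
    """Linear interpolation of ys at x; xs must be increasing. Clamps at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)          # first index with xs[j] >= x
    weight = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + weight * (ys[j] - ys[j - 1])


def sigma_multiplier(bottom_depth_m):
    """Number of standard deviations tolerated, by region type."""
    if bottom_depth_m < 200.0:
        return 5.0                  # shelf
    if bottom_depth_m < 400.0:
        return 4.0                  # slope and straits
    return 3.0                      # deep sea


def is_outlier(depth, value, ref_depths, ref_means, ref_stds, bottom_depth_m):
    """True when the value differs from the reference by more than k x std."""
    mean = interp(depth, ref_depths, ref_means)
    std = interp(depth, ref_depths, ref_stds)
    return abs(value - mean) > sigma_multiplier(bottom_depth_m) * std


# Illustrative reference profile with two levels (made-up numbers).
ref_depths = [0.0, 100.0]
ref_means = [20.0, 16.0]
ref_stds = [1.0, 0.5]
shelf_outlier = is_outlier(50.0, 22.0, ref_depths, ref_means, ref_stds, bottom_depth_m=150.0)
deep_outlier = is_outlier(50.0, 20.5, ref_depths, ref_means, ref_stds, bottom_depth_m=1000.0)
```

The same observation can pass over the shelf and fail in the deep sea, because the tolerated spread shrinks from 5 to 3 standard deviations.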

15 Density inversion
The importance of the visual check: example of a density inversion due to a temperature increase with depth.
Threshold value = 0.03 for CTD, 0.05 for near surface and for bottle data.
[Figure: profile with two levels z1 and z2. Annotations: "Wrong value detected automatically"; "Wrong value detected automatically, but it is a correct value; the previous one is manually corrected".]
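A minimal sketch of the automatic part of the density inversion test is given below. It assumes the potential density anomaly has already been computed for each level and is ordered from the surface downwards; the dictionary keys, the function name and the sample values are hypothetical, and the slide does not state the unit of the thresholds.

```python
# Thresholds quoted on the slide; the unit is not given there.
INVERSION_THRESHOLD = {"ctd": 0.03, "near_surface": 0.05, "bottle": 0.05}


def density_inversions(sigma, threshold):
    """Pairs of adjacent level indices (upper, lower) where density
    decreases with depth by more than the threshold."""
    return [
        (i, i + 1)
        for i in range(len(sigma) - 1)
        if sigma[i] - sigma[i + 1] > threshold
    ]


# Illustrative usage with made-up density anomalies.
sigma = [28.90, 29.00, 28.95, 29.05, 29.10]
pairs = density_inversions(sigma, INVERSION_THRESHOLD["ctd"])
```

The function returns both levels of each inverted pair instead of flagging one of them, because, as the slide's example shows, the automatic test cannot tell which of the two values is wrong; that decision is left to the visual check.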

16 Real-Time QC (IOC/CEC Manual and Guides #22, Operational Oceanography)

17 ARGO Real-Time QC on vertical profiles
Based on the Global Temperature and Salinity Profile Project (GTSPP) of IOC/IODE, the automatic QC tests are:
- Platform identification: checks whether the float's ID corresponds to the correct WMO number.
- Impossible date test: checks whether the observation date and time from the float are sensible.
- Impossible location test: checks whether the observation latitude and longitude from the float are sensible.
- Position on land test: the observation latitude and longitude from the float must be located in an ocean.
- Impossible speed test: checks the position and time of the floats.
- Global range test: applies a gross filter on observed values for temperature and salinity.
- Regional range test: checks for extreme regional values.
- Pressure increasing test: checks for monotonically increasing pressure.
- Spike test: checks for large differences between adjacent values.
- Gradient test: fails when the difference between vertically adjacent measurements is too steep.
- Digit rollover test: checks whether the temperature and salinity values exceed the float's storage capacity.
- Stuck value test: checks for all measurements of temperature or salinity in a profile being identical.
- Density inversion: densities are compared at consecutive levels in a profile, in both directions, i.e. from top to bottom and from bottom to top.
- Grey list (7 items): stops the real-time dissemination of measurements from a sensor that is not working correctly.
- Gross salinity or temperature sensor drift: detects a sudden and important sensor drift.
- Frozen profile test: detects a float that reproduces the same profile (with very small deviations) over and over again.
- Deepest pressure test: the profile must have pressures not higher than DEEPEST_PRESSURE plus 10%.
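Two of these tests are simple enough to sketch in a few lines: the stuck value test and the deepest pressure test. The code below is an illustration only and is not taken from the Argo quality control manual; the function names, the way `deepest_pressure` is passed in, and the sample numbers are assumptions.

```python
def stuck_value(values):
    """True when every measurement in the profile is identical."""
    return len(values) > 1 and all(v == values[0] for v in values)


def too_deep_levels(pressures, deepest_pressure, tolerance=0.10):
    """Indices of levels whose pressure exceeds DEEPEST_PRESSURE plus 10%."""
    limit = deepest_pressure * (1.0 + tolerance)
    return [i for i, p in enumerate(pressures) if p > limit]


# Illustrative usage with made-up values (pressures in decibars).
salinity = [35.0, 35.0, 35.0, 35.0]
pressures = [10.0, 1000.0, 2150.0, 2250.0]
is_stuck = stuck_value(salinity)
deep_hits = too_deep_levels(pressures, deepest_pressure=2000.0)
```

With a configured deepest pressure of 2000, the limit is 2200, so only the last level is reported.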

18 CORIOLIS Real-Time QC on Time Series
Automatic quality control tests:
test 1: Platform Identification
test 2: Impossible Date Test
test 3: Impossible Location Test
test 4: Position on Land Test
test 5: Impossible Speed Test
test 6: Global Range Test
test 7: Regional Global Parameter Test for Red Sea and Mediterranean Sea
test 8: Spike Test
test 10: Comparison with climatology
The Delayed-Mode QC in the Coriolis Data Centre for profiles and time series consists of visual QC, objective analysis and residual analysis (to correct sensor drift and offsets).

19 ARGO/CORIOLIS quality control flag scale

20 QC and processing of historical data (World Ocean Data Centre, NODC Ocean Climate Laboratory)

21 Quality Controls - Overview
The QC procedures in the WDC are summarized in three major parts:
1. Check of the observed level data
For the construction of the climatology (processing):
2. Interpolation to standard levels
3. Standard level data checks

22 Quality Controls - Overview
1. Checks of the observed level data
- Format conversion
- Position/date/time check
- Assignment of cruise and cast numbers
- Speed check
- Duplicate profile/cruise checks
- Range checks
- Depth inversion and depth duplication checks
- Large temperature inversion and gradient tests: to quantify the maximum allowable temperature increase with depth (inversion, 0.3 °C per m) and decrease with depth (excessive gradient, 0.7 °C per m)
- Observed level density inversion checks
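The depth and temperature checks on observed levels can be sketched as a single pass over a profile. This is a simplified illustration, not the WOD processing code: each level is compared only with the level directly above it, the label is attached to the deeper level of each pair, and the function name and label strings are hypothetical. Only the two limits come from the slide.

```python
MAX_INVERSION = 0.3   # degC per metre: largest allowed increase with depth
MAX_GRADIENT = 0.7    # degC per metre: largest allowed decrease with depth


def check_levels(depths, temps):
    """Label each level by comparing it with the level above it.

    Labels: "ok", "depth_error" (same or shallower depth than the previous
    level), "inversion" or "gradient".
    """
    labels = ["ok"] if depths else []
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        if dz <= 0:
            labels.append("depth_error")
            continue
        rate = (temps[i] - temps[i - 1]) / dz   # positive: warming with depth
        if rate > MAX_INVERSION:
            labels.append("inversion")
        elif rate < -MAX_GRADIENT:
            labels.append("gradient")
        else:
            labels.append("ok")
    return labels


# Illustrative usage with made-up values (depths in metres, temperatures in degC).
depths = [0.0, 10.0, 10.0, 20.0, 30.0]
temps = [20.0, 19.5, 19.4, 25.0, 10.0]
labels = check_levels(depths, temps)
```

A fuller implementation would probably skip levels with a depth error before computing the temperature rates, which this sketch does not do.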

23 Regional parameterization of the world ocean in WOD05 (plus vertical parameterization)

24 Quality Controls - Overview
2. Interpolation to standard levels
- Modified Reiniger-Ross scheme (Reiniger and Ross, 1968): fewer spurious features in regions with large vertical gradients than a 3-point Lagrangian interpolation.
3. Standard level data checks
- Density inversion checks (Fofonoff et al., 1983)
- Standard deviation checks: a series of statistical analysis tests based on the mean, standard deviation and number of observations in a 5-degree square box, for coastal, near-coastal and open ocean data.
- Objective analysis
- Post objective analysis subjective checks: to detect unrealistic "bullseye" features, mostly in data-sparse areas.

25 Definition of WOD Quality Flags
(1) Flags for entire profile (as a function of parameter)
0 - accepted profile
1 - failed annual standard deviation check
2 - two or more density inversions (Levitus, 1982 criteria)
3 - flagged cruise
4 - failed seasonal standard deviation check
5 - failed monthly standard deviation check
6 - flag 1 and flag 4
7 - flag 1 and flag 5
8 - flag 4 and flag 5
9 - flag 1 and flag 4 and flag 5
(2) Flags on individual observations
(a) Depth flags
0 - accepted value
1 - error in recorded depth (same or less than previous depth)
2 - temperature inversion of magnitude > 0.3 °C/meter
3 - temperature gradient of magnitude > 0.7 °C/meter
4 - temperature gradient and inversion
(b) Observed level flags
0 - accepted value
1 - range outlier (outside of broad range check)
2 - density inversion
3 - failed range check and density inversion check
(3) Standard level flags
0 - accepted value
1 - bullseye marker
2 - density inversion
3 - failed annual standard deviation check
4 - failed seasonal standard deviation check
5 - failed monthly standard deviation check
6 - failed annual and seasonal standard deviation check
7 - failed annual and monthly standard deviation check
8 - failed seasonal and monthly standard deviation check
9 - failed annual, seasonal and monthly standard deviation check
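The way the annual, seasonal and monthly standard deviation results combine into a single flag can be written as a lookup table. The sketch below covers only the standard deviation flags from the definitions above (profile flags 2 and 3 and standard level flags 1 and 2 are set by other checks); the dictionary and function names are hypothetical.

```python
# Keys are (failed_annual, failed_seasonal, failed_monthly).
PROFILE_FLAG = {
    (False, False, False): 0,
    (True, False, False): 1,
    (False, True, False): 4,
    (False, False, True): 5,
    (True, True, False): 6,
    (True, False, True): 7,
    (False, True, True): 8,
    (True, True, True): 9,
}

STANDARD_LEVEL_FLAG = {
    (False, False, False): 0,
    (True, False, False): 3,
    (False, True, False): 4,
    (False, False, True): 5,
    (True, True, False): 6,
    (True, False, True): 7,
    (False, True, True): 8,
    (True, True, True): 9,
}


def profile_flag(annual, seasonal, monthly):
    """Whole-profile flag from the three standard deviation check results."""
    return PROFILE_FLAG[(bool(annual), bool(seasonal), bool(monthly))]


def standard_level_flag(annual, seasonal, monthly):
    """Standard-level flag from the three standard deviation check results."""
    return STANDARD_LEVEL_FLAG[(bool(annual), bool(seasonal), bool(monthly))]


# Illustrative usage: a value that failed the annual and monthly checks.
level_flag = standard_level_flag(annual=True, seasonal=False, monthly=True)
whole_profile = profile_flag(annual=True, seasonal=True, monthly=False)
```

Note that the two scales share the codes 4 to 9 but differ for an annual failure alone, which is 1 for a whole profile and 3 for a standard level.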

26 References
- Argo quality control manual, V2.1, 2005
- Coriolis Data Centre, In-situ data quality control, V1.3, 2005
- Data Type guidelines, ICES Working Group of Marine Data Management (12 data types)
- GTSPP Real-Time Quality Control Manual, 1990 (IOC Manuals and Guides #22)
- "Medar-Medatlas protocol, Part I: Exchange format and quality checks for observed profiles", V3, 2001
- "Quality Control of Sea Level Observations", ESEAS-RI, V1.0, 2006
- SCOOP User Manual, V4.2, 2000
- Quality Control Processing of Historical Oceanographic Temperature, Salinity, and Oxygen Data. Timothy Boyer and Sydney Levitus, 1994. National Oceanographic Data Centre, Ocean Climate Laboratory
- UNESCO/IOC/IODE and MAST, 1993, Manual and Guides #26
- World Ocean Database 2005 Documentation. Ed. Sydney Levitus. NODC Internal Report 18, U.S. Government Printing Office, Washington, D.C., 163 pp
- In-situ data quality control procedures, COR-DO/DTI-RAP/04-047, V1.3
- Gosud data management, Real-time QC, go-um-03-01, V1.0

