WP1 - Data collection and metadata compilation in sea regions Sissy Iona (HCMR/HNODC) EMODNET Chemistry 2 - 4th Steering Committee, 2-3 December 2014, Amsterdam, The Netherlands
Overview of last month's activities
Creation of metadata-enriched ODV collections: regional data set of 28 October 2014
Data aggregations (not complete: conversions missing)
QC (zero values, N:P) (not complete)
No DIVA runs
Findings -1: mismatches
Total: 87024 ODV txt files imported in ODV 4.6.3 Linux 64-bit (Oct 2014)
4393 files with mismatches in local_cdi_ids between ODV txt and CDIs
30465 files with mismatches in edmo_codes between ODV txt and CDIs
15 files with both mismatches
26 edmo_codes in CDIs (csv files)
36 edmo_codes in ODV txt files
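A minimal Python sketch of the kind of comparison that produces counts like those above. It assumes, hypothetically, that the identifiers of each ODV txt file and of its CDI record have already been read into dictionaries keyed by file name. The function name, the dictionary layout and the sample values are illustrative only and are not part of the ODV or CDI tooling.

```python
def find_mismatches(odv_records, cdi_records):
    """Compare identifiers from ODV txt files against their CDI records.

    Both arguments map a file name to a (local_cdi_id, edmo_code) tuple
    (an assumed layout). Returns three sets of file names: files with a
    local_cdi_id mismatch, files with an edmo_code mismatch, and files
    with both.
    """
    id_mismatch, edmo_mismatch = set(), set()
    for name, (odv_id, odv_edmo) in odv_records.items():
        if name not in cdi_records:
            continue  # no CDI counterpart: not handled in this sketch
        cdi_id, cdi_edmo = cdi_records[name]
        if odv_id != cdi_id:
            id_mismatch.add(name)
        if odv_edmo != cdi_edmo:
            edmo_mismatch.add(name)
    return id_mismatch, edmo_mismatch, id_mismatch & edmo_mismatch


# Hypothetical sample values, for illustration only.
odv = {"a.txt": ("ID1", 100), "b.txt": ("ID2", 200), "c.txt": ("ID3", 300)}
cdi = {"a.txt": ("ID1", 100), "b.txt": ("ID2x", 200), "c.txt": ("ID3x", 301)}
ids, edmos, both = find_mismatches(odv, cdi)
print(len(ids), len(edmos), len(both))
```

Counting the distinct edmo_codes on each side (26 versus 36 in the findings above) would be a separate step, for example by collecting the second tuple element of each dictionary into a set.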
Findings -3: invalid code (?)
During import in ODV:
Error: Invalid parameter code 'SDN:P01::PRESPS02' of primary variable 'PRES (DECIBAR=10000 PASCALS)' detected
Action: PRESPS02 changed to PRESPR01
Findings -4: semantic descriptions
During import in ODV: same label and code, different unit
Warning: Duplicate variable name 'DISSOLVED OXYGEN' found in line 9. None of the 'DISSOLVED OXYGEN' variables will be imported.
Error: Header Line. 'DISSOLVED OXYGEN [umol/l]' is not in the semantic header. Cannot import this variable.
Error: Header Line. 'DISSOLVED OXYGEN [ml/l]' is not in the semantic header. Cannot import this variable.
Same findings for other parameters, e.g. PARTICULATE ORGANIC NITROGEN
Findings -4: semantic descriptions
Action: change the user label in the semantic header (and in the column header)
Findings -5: semantic descriptions
During import in ODV: mismatch of labels in semantic and column headers
Error: Header Line. 'NITRATE _NO3-N_ CONTENT [umol/kg]' is not in the semantic header. Cannot import this variable.
Error: Semantic header entry 'NITRATE _NO3-N_ CONTENT ' not found in column header line
Slide callouts on the two labels: name with 6 spaces, name with 7 spaces
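A label mismatch that differs only in the number of spaces is hard to spot by eye. The following Python sketch shows one way to flag such cases. It assumes the semantic-header labels and the column-header labels have already been extracted into lists, with units stripped from the column labels; the function name and the sample labels are hypothetical.

```python
def whitespace_only_mismatches(semantic_labels, column_labels):
    """Return (semantic, column) pairs that differ only in whitespace."""

    def normalize(label):
        # Collapse every run of whitespace to one space and trim the ends.
        return " ".join(label.split())

    column_by_norm = {normalize(c): c for c in column_labels}
    pairs = []
    for s in semantic_labels:
        c = column_by_norm.get(normalize(s))
        if c is not None and c != s:
            pairs.append((s, c))
    return pairs


# Hypothetical labels: the first differs by one extra space.
semantic = ["NITRATE  CONTENT", "PHOSPHATE"]
columns = ["NITRATE CONTENT", "PHOSPHATE"]
for s, c in whitespace_only_mismatches(semantic, columns):
    print(repr(s), "vs", repr(c))
```

Printing with `repr` keeps the extra spaces visible in the output.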
Findings -7: format errors
During import in ODV: wrong time series format
Error: Invalid parameter code 'SDN:P01::YEARXXXX' of primary variable 'YEAR (yyyy)' detected
Findings -7: format errors Action: format correction
Findings -8: Warnings
Warning: Label of primary variable differs from standard: Expected 'time_ISO8601' but found 'time_ISO8601 [ISO8601]'
Warning: Header Line. Incorrect meta-variable label: expected 'EDMO_code' - found 'EDMO_CODE'
Warning: Header Line. Incorrect meta-variable label: expected 'Bot. Depth [m]' - found 'Bot.Depth [m]'
Warning: Unexpected empty line (line 2).
Warning: Unexpected empty line (line 16).
Warning: Unexpected end of SDN semantic header in line 17.
Import to ODV V126.96.36.199 Linux 64-bit (unofficial version because current regional data are not corrected)
Substitution procedure of zero values with respective LOQ/2 (1/3)
It should be applied after the typical QC procedure (selection of QC flags 0, 1, 2, 6; data aggregation; search for out-of-range data; broad range checks; default values treatment; etc.).
It should also be applied for each institute/NODC separately.
A macro file that contains the expression and the LOQ/2 value must be created.
The procedure described below in four steps refers to the simple case of substituting zero values of one parameter.
In ODV 4.6.3:
STEP 1: Create the macro file: Tools > Macro Editor > create new.
Complete the fields (bold and underlined on the slide): ‘Label’, ‘Units’ and ‘Digits’; ‘Comments’ (optional); ‘Input Variables’.
For ‘Input Variables’, write a name in the ‘New’ field and then click << to move the variable to the ‘Defined’ field.
‘Expression in Postfix Notation’: #1 0.000 <= #1 x.xxx + #1 IFTE
Explanation of the notation: if the input variable #1 is less than or equal to 0.000, the result is #1 + x.xxx (where x.xxx is the respective LOQ/2 value); otherwise the result is #1 unchanged.
Save the macro file (e.g. ForZeroChanges.mac).
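The logic of the postfix expression can be sketched outside ODV in a few lines of Python. This is only an illustration of what '#1 0.000 <= #1 x.xxx + #1 IFTE' computes, not the ODV macro itself; the function name, the handling of missing values as None and the sample LOQ/2 value are assumptions.

```python
def substitute_zero(value, half_loq):
    """Mirror the macro expression '#1 0.000 <= #1 x.xxx + #1 IFTE'.

    value    -- the input variable #1 (None stands for a missing value)
    half_loq -- the LOQ/2 value written as x.xxx in the macro
    """
    if value is None:
        return None  # assumption: missing values stay missing
    if value <= 0.0:
        return value + half_loq  # a zero becomes exactly LOQ/2
    return value  # positive values pass through unchanged


half_loq = 0.025  # hypothetical LOQ/2, for illustration only
print([substitute_zero(v, half_loq) for v in (0.0, 0.4, None)])
```

Because the expression adds LOQ/2 rather than assigning it, a negative input would be shifted by LOQ/2 instead of being set to LOQ/2. This is one reason the substitution belongs after the range checks listed above.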
Substitution procedure of zero values with respective LOQ/2 (2/3)
STEP 2: Identify the EDMO_Codes of zero-value sample data in the initial aggregated collection.
Selection Criteria > Availability (for the parameter of interest).
Export > Station data > ODV Spreadsheet.
Select variables: the primary variable and the variable you are working with.
In Data Filter: Apply sample range and quality filters > Range: select the working variable; Acceptable range fields: 0 - 0; OK.
Close the initial aggregated collection.
Re-import the exported txt file in ODV and export the metadata.
The EDMO codes of all data sets that contain zero values are in the exported metadata.odv file.
Note: when pressing F4 for data statistics in the zero-values collection, mean=0 and std=0 but minimum=-0.001 and maximum=0.001 (parameters with 3 digits) (bug?).
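The outcome of STEP 2 is a list of EDMO codes whose data contain zero values for the working variable. A rough Python equivalent, assuming the samples have been exported and parsed into dictionaries, is sketched below; the record layout, the key names and the sample values are hypothetical.

```python
def edmo_codes_with_zeros(samples, variable):
    """Return the sorted EDMO codes that have a zero value for `variable`.

    samples  -- list of dicts, each with an 'EDMO_code' key and one key
                per measured variable (an assumed layout)
    variable -- name of the variable being checked
    """
    return sorted({s["EDMO_code"] for s in samples if s.get(variable) == 0})


# Hypothetical samples, for illustration only.
samples = [
    {"EDMO_code": 100, "phosphate": 0.0},
    {"EDMO_code": 100, "phosphate": 0.3},
    {"EDMO_code": 200, "phosphate": 0.2},
    {"EDMO_code": 300, "phosphate": 0.0},
    {"EDMO_code": 400},  # variable not measured
]
print(edmo_codes_with_zeros(samples, "phosphate"))
```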
Substitution procedure of zero values with respective LOQ/2 (3/3)
STEP 3: Substitution of zero values per EDMO_Code per parameter.
Load the initial aggregated collection.
Station Selection Criteria > Metadata > EDMO Code Range: type one EDMO_Code in both fields.
Station Selection Criteria > Availability (for the variable you are working with).
Export > Station data > ODV Spreadsheet. Select variables (the primary variable and the variable you are working with). Export all data.
The data with the EDMO_code you want to work with are now in the new data set.
Close the aggregated collection and import the data exported in the previous stage in ODV.
View: 1 scatter plot. Right-click on the sample data fields: Derived Variables > Expression, Derivatives, Integrals > Macro File > Add.
Choose the macro file of STEP 1 and identify (actually assign) the ‘Input Variable’ (see STEP 1): press Find and click on the variable you are working with. Click OK.
The derived variable (STEP 1) is inserted in the variables. All zero values are replaced with LOQ/2.
Export > Station data: choose the primary variable and the derived variable ONLY.
STEP 4: Import the above data set to the initial aggregated collection and select ‘Replace all’ for the stations.
The N:P ratio QC should be applied after all other QC checks, e.g. flag changes, default values, substitution of zero values with LOQ/2, etc.
All 3 parameters of DIN = NH4 + NO2 + NO3 must have values for the expression to be evaluated. If one of them is missing, the expression returns no value.
Export the water body phosphate, water body nitrite, water body nitrate and water body ammonium in separate ODV txt files.
Create a new collection with the parameters: depth, water body phosphate, water body nitrite, water body nitrate and water body ammonium.
Insert the ODV files in the new collection, selecting merge data in order to avoid replacing stations.
Insert again, selecting add/replace, and when asked to replace select NO.
Derived Variables > Expression, Derivatives, Integrals > Expression: #1 #2 + #3 + #4 / (that is, (#1 + #2 + #3) / #4, with #1 to #3 the three DIN parameters and #4 the phosphate).
If one of the parameters used in the expression is missing, the N:P ratio value will not be calculated.
Apply the N:P ratio QC for each area separately by using Selection Criteria > Define Polygon; Export Data (N:P ratio).
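The N:P expression and its behaviour for missing inputs can be sketched in Python as follows. This is an illustration under stated assumptions, not the ODV derived variable itself: the function name is hypothetical, missing values are represented as None, and returning None for a zero phosphate value is an added assumption (after the LOQ/2 substitution no zero should remain).

```python
def np_ratio(nitrite, nitrate, ammonium, phosphate):
    """Compute N:P = (NO2 + NO3 + NH4) / PO4, as in '#1 #2 + #3 + #4 /'.

    Returns None when any input is missing, so that no ratio is
    calculated from an incomplete DIN sum.
    """
    values = (nitrite, nitrate, ammonium, phosphate)
    if any(v is None for v in values):
        return None
    if phosphate == 0:
        return None  # assumption: ratio left undefined for zero phosphate
    return (nitrite + nitrate + ammonium) / phosphate


# Hypothetical concentrations in the same unit, for illustration only.
print(np_ratio(0.5, 14.5, 1.0, 1.0))
print(np_ratio(0.5, None, 1.0, 1.0))
```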