1 Validation of WStatR-Data
Jürgen Gonser, ARGUS, Berlin
ESTP – Training on waste statistics, 24th/25th April 2012

2 What is Data Validation?
Data validation shall ensure that the final (published) data meet a number of quality characteristics, in particular accuracy, coherence and comparability.
Data validation encompasses:
- establishing a set of checking / validation rules;
- detecting outliers or potential errors;
- communicating the detected problems to the “actor” in the best position to investigate the anomaly.

3 Structure of Presentation
- Overview of validation process
- Types of validation checks
- Validation checks for waste generation
- Validation checks for waste treatment
- Compilation of clarification requests
- Outlook

4 Validation Process for WStatR-Data
- Technical check during data upload: completeness, consistency of totals. Result: evaluation report.
- Quick evaluation (within 2 months): internal coherence, development over time at a very aggregate level. Result: clarification request, followed by country replies.
- Validation: analysis of time series, cross-country comparison, cross-checks with other data. Result: clarification request, followed by country replies.
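The technical check at upload (completeness, consistency of totals) could look roughly like the sketch below. It is only an illustration: the function name `technical_check`, the cell names, the figures and the rounding tolerance are all assumptions and are not taken from the actual upload tool.

```python
def technical_check(cells, reported_total, required, tolerance=0.5):
    """Return (missing cells, whether the components add up to the reported total)."""
    # Completeness: every required cell must carry a value.
    missing = [name for name in required if cells.get(name) is None]
    # Consistency of totals: sum of the reported components vs. the reported total.
    computed = sum(v for v in cells.values() if v is not None)
    consistent = abs(computed - reported_total) <= tolerance
    return missing, consistent

# Hypothetical upload: three waste categories, one of them left empty.
cells = {"category_a": 120.0, "category_b": 30.0, "category_c": None}
missing, consistent = technical_check(
    cells,
    reported_total=150.0,
    required=["category_a", "category_b", "category_c"],
)
print(missing, consistent)
```

Here the empty cell is reported as missing, while the total is still consistent with the components that were delivered.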

5 Validation Checks
Automatic checks:
- based on established checking rules
- using defined thresholds
- implemented in an MS Access database
“Visual” checks, needed for:
- selection of relevant potential errors
- assessment of SDI indicators (non-mineral waste, hazardous waste)
- aspects not covered by automatic checks (e.g. waste related view on the data, …)
Cross-checks with other waste data

6 Waste Generation: Validation Checks
Checks used for the validation of waste generation data:
- Comparison with previous year
- Comparison across countries
- Ranking of waste categories by activities
- Cross-checks with other data sets (WSR, ELV, WEEE)

7 Waste Generation: Comparison over time

8 Waste Generation: Cross-Country Comparison
Comparison with other countries and identification of assumed outliers are done on the basis of the interquartile range.
Interquartile range (IQR):
- common measure of statistical variation
- difference between the 25% quartile (Q1) and the 75% quartile (Q3)
Outliers = values that lie more than 1.5 interquartile ranges above the upper quartile or below the lower quartile.
Advantage: Validation thresholds reflect sector-specific waste intensity and variation of waste generation across countries.
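The interquartile-range rule above can be sketched in a few lines of Python. This is a minimal illustration, not the Access implementation used in the validation: the function name `iqr_outliers`, the country codes and the indicator values are hypothetical. The quartiles come from the standard library's `statistics.quantiles` with its "inclusive" method, which is only one of several common quartile definitions.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return (lower fence, upper fence, outliers) for a dict country -> value."""
    q1, _, q3 = quantiles(list(values.values()), n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # A value is an assumed outlier if it lies outside the two fences.
    outliers = {c: v for c, v in values.items() if v < lower or v > upper}
    return lower, upper, outliers

# Hypothetical indicator values for one sector (e.g. kg of waste per unit of output).
sector_values = {"AA": 10, "BB": 12, "CC": 11, "DD": 13, "EE": 12, "FF": 40}
lower, upper, outliers = iqr_outliers(sector_values)
print(lower, upper, outliers)
```

Because the fences are derived from the reported values of each sector, a sector with a wide spread across countries automatically gets wider thresholds than a homogeneous one.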

9 Waste Generation: Cross-Country Comparison
Box-Whisker-Plot:
- 50% of all values lie between Q1 and Q3 (within the IQR)
- The upper whisker represents the highest value still within 1.5 IQR of Q3
- The lower whisker represents the lowest value still within 1.5 IQR of Q1
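As a small sketch of how the whisker ends follow from the data, the function below returns the lowest and highest observed values that still lie within 1.5 IQR of the quartiles. The name `whisker_ends` and the sample values are hypothetical, and the quartile method ("inclusive") is an assumption, since plotting tools differ in how they compute quartiles.

```python
from statistics import quantiles

def whisker_ends(values, k=1.5):
    """Return (lower whisker, upper whisker) of a box-whisker plot."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at actual observations, not at the fences themselves.
    lower_whisker = min(v for v in data if v >= q1 - k * iqr)
    upper_whisker = max(v for v in data if v <= q3 + k * iqr)
    return lower_whisker, upper_whisker

sample = [10, 11, 12, 12, 13, 40]  # hypothetical values; 40 lies beyond the upper whisker
ends = whisker_ends(sample)
print(ends)
```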

10 Waste Generation: Cross-Country Comparison

11 Waste Generation: Cross-Country Comparison
Hazardous waste total / Gross value added: Distribution by sectors
[Figure not reproduced; sectors labelled in the chart: petroleum industry, metal industry, chemical industry]

12 Generation: Ranking of waste categories
Waste categories are ranked according to the generated amounts:
- for each sector and
- for each country
Result: Sector-specific waste generation profiles
The profile shows which waste categories:
- are usually most important in the sector;
- are uncommon for the sector (potential errors)
Sector profiles are compiled separately for hazardous and for non-hazardous waste (in 1000 tonnes and in kg/inhabitant)
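A ranking of this kind might be sketched as follows. The per-country ranking follows the description above. The way the rankings are condensed into a sector profile (counting how often a category is among the top positions) is an assumption made for illustration, as are the function names, country codes, category labels and amounts.

```python
from collections import Counter

def rank_categories(amounts):
    """Rank the waste categories of one country and sector, largest amount first."""
    return sorted(amounts, key=amounts.get, reverse=True)

def sector_profile(reports, top_n=2):
    """Count in how many countries each category is among the top_n (assumed summary)."""
    counts = Counter()
    for amounts in reports.values():
        counts.update(rank_categories(amounts)[:top_n])
    return counts

# Hypothetical amounts for one sector, in kg per inhabitant.
reports = {
    "AA": {"mineral waste": 900, "wood waste": 40, "metallic waste": 60},
    "BB": {"mineral waste": 700, "wood waste": 20, "metallic waste": 50},
    "CC": {"mineral waste": 30, "wood waste": 500, "metallic waste": 45},
}
profile = sector_profile(reports)
print(profile.most_common())
```

In this made-up example, wood waste ranks first only in country "CC", which is the sort of unusual pattern a reviewer would then look at more closely.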

13 Generation: Ranking of waste categories
Non-hazardous waste reported by all countries in NACE F, by waste category (kg per inhabitant)

14 Waste Treatment: Validation Checks
Checks used for the validation of waste treatment data:
- Comparison with previous year
- Comparison with generated amounts
- Comparison with treatment capacities (for incineration only)
- Cross-checks with other data sets (packaging waste)

15 Waste Treatment: Comparison over Time
Indicator: Amount treated as a share of the amount treated in the previous year
Comparison is carried out for:
- the treated total of all treatment categories;
- the treated total of each of the 5 treatment categories.
Thresholds:
- lower threshold: 80% compared to previous year
- upper threshold: 120% compared to previous year
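The comparison over time reduces to a ratio test against the two thresholds. The sketch below uses the 80% and 120% thresholds from the slide; the function name, the placeholder category names and the amounts are hypothetical, and the handling of a zero value in the previous year is an assumption.

```python
def compare_with_previous_year(current, previous, lower=0.80, upper=1.20):
    """Classify the year-on-year ratio of treated amounts against the thresholds."""
    if previous == 0:
        return "undefined"  # no ratio can be formed; assumed to need a manual look
    ratio = current / previous
    if ratio < lower:
        return "below"
    if ratio > upper:
        return "above"
    return "ok"

# Hypothetical treated amounts (1000 tonnes) per treatment category.
previous_year = {"category_1": 100.0, "category_2": 400.0, "category_3": 50.0}
current_year = {"category_1": 70.0, "category_2": 410.0, "category_3": 65.0}

flags = {
    name: compare_with_previous_year(current_year[name], previous_year[name])
    for name in previous_year
}
# The treated total of all categories is checked in the same way.
flags["total"] = compare_with_previous_year(
    sum(current_year.values()), sum(previous_year.values())
)
print(flags)
```

The example also shows why the test is run per category as well as on the total: two opposite changes can cancel out in the total and would otherwise go unnoticed.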

16 Waste Treatment: Comparison with Generation
Assumption: Treated total similar to generated total
Two approaches:
1. Share of treated total compared to generated total
   - for total waste (nhaz and haz)
   - for total hazardous waste
2. Same as 1. but corrected for imports and exports with data on waste shipments (Basel)
   - possible for hazardous waste only
Thresholds:
- lower threshold: 80% of generated amount
- upper threshold: 115% of generated amount
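Both approaches can be expressed with one small function. The sketch assumes that the correction in the second approach means comparing the treated amount with the waste available for treatment in the country, i.e. generated plus imports minus exports. That reading, the function names and all figures are assumptions for illustration; only the 80% and 115% thresholds come from the slide.

```python
def treated_share(treated, generated, imports=0.0, exports=0.0):
    """Treated amount as a share of the waste available for treatment."""
    # Assumed correction: imports add to, exports reduce, the amount to be treated.
    available = generated + imports - exports
    return treated / available

def outside_thresholds(share, lower=0.80, upper=1.15):
    """True if the share violates the lower or the upper threshold."""
    return share < lower or share > upper

# Hypothetical hazardous waste figures in 1000 tonnes.
uncorrected = treated_share(treated=1300.0, generated=1000.0)
corrected = treated_share(treated=1300.0, generated=1000.0, imports=300.0, exports=50.0)
print(outside_thresholds(uncorrected), outside_thresholds(corrected))
```

In this made-up case the uncorrected share exceeds the upper threshold, while the corrected share does not, so the apparent anomaly is explained by net imports.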

17 Waste Treatment: Comparison with Capacity
Assumption: Treated total equal to or lower than available capacity
Test applied to total amount (nhaz and haz) treated by:
- Incineration with energy recovery
- Incineration without energy recovery
Threshold:
- upper threshold: waste treated amounts to 100% of the reported treatment capacity
- lower threshold: not applied
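The capacity test has only an upper threshold, so a sketch of it is short. The function name and the amounts and capacities below are hypothetical.

```python
def exceeds_capacity(treated, capacity):
    """True if the amount treated is above 100% of the reported capacity."""
    return treated > capacity

# Hypothetical amounts and capacities in 1000 tonnes per year.
incineration = {
    "with energy recovery": {"treated": 950.0, "capacity": 1000.0},
    "without energy recovery": {"treated": 260.0, "capacity": 200.0},
}
flagged = [
    name
    for name, data in incineration.items()
    if exceeds_capacity(data["treated"], data["capacity"])
]
print(flagged)
```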

18 Clarification Requests
Steps leading to the requests for clarification:
1. List of identified potential errors (Result: 1874 “errors”)
2. Elimination of insignificant “errors” (Result: 1206 “errors”)
3. Search for explanations for identified “errors” (Result: 1022 “errors”) in:
   - Quality reports
   - Results of quick evaluation
   - Results of previous validation
4. Requests for clarification (1022 “errors” summarised in 476 questions)

19 Clarification Requests

20 Outlook
Changes from WStatR revision have to be incorporated into validation:
- Enhances the possibilities for validation (e.g. comparison of generation and treatment by selected waste categories, ...)
Envisaged improvements:
- More focus on tests related to waste categories
- Tests for validation of time series shall be improved
- More attention to low values, missing values, zero-values

21 Outlook
After this presentation you can anticipate some of the requests for clarification you will most likely receive.
We are looking forward to finding the respective explanations in the quality reports!

