Presentation on theme: "Better data quality through global data and metadata sharing"— Presentation transcript:
1 Better data quality through global data and metadata sharing. Agne Bikauskaite and Håkan Linden, Eurostat. European Conference on Quality in Official Statistics (Q2014), Vienna, 3-5 June 2014
2 Outline: Context; A data sharing model; The necessary preconditions; Implementing Eurostat's data sharing strategy; Conclusions and outlook
3 Context. General objectives: Reduce reporting burden on NSIs; More efficient use of resources in International Organisations (IOs); Ensure high quality and consistency of official statistics; Improve global data exchange and dissemination
4 A data sharing model. European statistics: From national to Eurostat. [Diagram labels: Eurostat; Data Validation; EU Member State (four boxes)]
5 A data sharing model. Eurostat as international hub for European statistics. [Diagram labels: EU countries; OECD countries (non-EU countries only); Other countries (non-OECD countries only); Eurostat - ECB; OECD; IMF, UN, WB, ILO, BIS, other IOs; User]
6 The necessary preconditions: Internationally agreed technical and statistical standards; Internationally agreed data structures; Maintenance agreements; Internationally agreed data validation; Streamlined data exchange processes
7 Statistical Data and Metadata Exchange (SDMX). It consists of technical and statistical standards, guidelines, an IT service infrastructure and IT tools. SDMX provides: technical/statistical standards; new exchange modes (hubs); clear rules and responsibilities. SDMX: ISO IS 17369
8 SDMX describes the data and metadata exchange. [Diagram labels: Provision Agreement; Organisation scheme; SDMX Registry; maintainer; Concept Schemes; Code lists; DSDs; Concepts]
9 Describing the data exchange. [Diagram labels: Who?; When?; Who?; How?; Where?; What?; What?]
10 Content-Oriented Guidelines: Cross-domain concepts and code lists; Statistical subject-matter domains; Metadata common vocabulary; Recommendations to harmonise implementations. [Diagram labels: Organisation 1; Organisation 2; Organisation 3; interoperability]
11 Implementing Eurostat's data sharing strategy: Standardisation of structural metadata. Code lists describe dimensions in data tables, giving a meaning to the data. Code lists are based on: official statistical classifications such as NACE, NUTS, ISCO, etc.; the ESS and SDMX Content-Oriented Guidelines; domain-specific codifications. A standard code list (SCL) is a code list that has already been harmonised. Standard code lists should be used all along the statistical business process: data design, collection, aggregation, dissemination, exchange, archiving.
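As a rough illustration of how code lists give meaning to the dimensions of a data table, the sketch below decodes one record. The dimension names, codes and labels are illustrative assumptions, not extracts from an official ESS or SDMX code list.

```python
# Illustrative code lists (assumed for this example, not official extracts).
# Each dimension of a data table points to exactly one code list.
CODE_LISTS = {
    "FREQ": {"A": "Annual", "Q": "Quarterly", "M": "Monthly"},
    "SEX": {"F": "Female", "M": "Male", "T": "Total"},
}


def decode(observation):
    """Replace each dimension code with its label from the matching code list."""
    return {dim: CODE_LISTS[dim][code] for dim, code in observation.items()}


# The same code "M" is resolved differently depending on the dimension.
print(decode({"FREQ": "M", "SEX": "M"}))
```

The bare code "M" is ambiguous on its own; only the code list attached to the dimension fixes its meaning, which is why shared, harmonised code lists matter when data moves between organisations.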
12 Implementing Eurostat's data sharing strategy: Recommendations for the SCL creation. Recommended rules for ESS and SDMX, with comments: Input: official information; Coding: A-Z _ (ESS), A-Z _ (SDMX), in SDMX "–" (dash) is not allowed (to avoid confusion with the operator "minus"); Codes starting with a letter (with some exceptions); Meaningful coding (less homogeneity in coding in SDMX, due to the involvement of several different partners); Aggregates are possible; To be used all along the statistical business process; May be referenced by several statistical concepts; Based on clear guidelines; Maintenance agency (ESS: Eurostat Unit B5; SDMX: Statistical Working Group (SWG)); Versioning system (in future registries); Generic concept (in SDMX there is a special CL for generic codes; in ESS generic codes are implemented in each SCL when needed)
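The coding rules above can be sketched as a small checker. This is a minimal sketch under stated assumptions: the slide lists "A-Z" and "_" as allowed characters, and the acceptance of digits here is an assumption added for illustration. The slide also mentions exceptions to the "starts with a letter" rule, which this sketch does not model. The function name `check_code` is hypothetical.

```python
import re

# Assumed pattern: an uppercase letter first, then uppercase letters,
# digits (an assumption, not stated on the slide) or underscores.
CODE_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def check_code(code):
    """Return a list of rule violations for one candidate code (empty if none)."""
    problems = []
    if "-" in code:
        # The slide notes that the dash could be confused with the minus operator.
        problems.append("dash is not allowed")
    if not re.match(r"[A-Z]", code):
        problems.append("must start with a letter A-Z")
    if not problems and not CODE_PATTERN.fullmatch(code):
        problems.append("only A-Z, 0-9 and _ are allowed")
    return problems


for candidate in ["EU28", "NON-EU", "_T"]:
    print(candidate, check_code(candidate))
```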
13 Implementing Eurostat's data sharing strategy: SDMX standards into ESS structural metadata. To improve data quality in terms of comparability and clarity, it is necessary: to use identical SCLs in the ESS and in SDMX; to transpose the SDMX guidelines into the ESS code lists; to adapt the ESS standard codes into the SDMX DSDs
14 Implementing Eurostat's data sharing strategy: Overview of the ESS SCLs. 504 ESS CLs; 194 ESS SCLs released in Ramon; 12 fully SDMX compliant; 110 SDMX compliant (except generic codes)
15 Implementing Eurostat's data sharing strategy: Standardisation of Reference Metadata. ESMS: Euro SDMX Metadata Structure; ESQRS: ESS Standard for Quality Reports Structure; EPMS: Eurostat Process Metadata Structure
16 Implementing Eurostat's Reference metadata sharing strategy: WASTE (end of life vehicles, packaging, electronic waste); WINE; FARM STRUCTURE; MIP STATISTICS; HICP/Compliance monitoring; EHIS (Education, health and social protection); R&D (CIS 2012); Annual crops; PRAG; ESAW; AES (Education, Science and Culture); LCI (Labour Cost Index); INFOSOC (Information Society); BUSINESS REGISTER; HICP; LFS-Q, LFS-A; EU-SILC; FATS; STS (Short Term Statistics); WASTE; AEI (Pesticides); EDUCAT; JVC (Job Vacancy Stats); PRODCOM; EXTERNAL TRADE (3rd countries); COSAEA; URBANREG; R&D; TOURISM; PERMANENT CROPS; CENSUS; HOUSING PRICES HPS. Over 30 Eurostat domains are in various phases of ESS Reference metadata standardisation. This concerns about 35% of all eligible Eurostat processes.
17 Implementing Eurostat's data sharing strategy: The Eurostat established methodology
18 Implementing Eurostat's data sharing strategy in ESS
19 Implementing Eurostat's data sharing strategy: Development of the technical infrastructure. Key components: SDMX Registries (the Euro-SDMX Registry, the Global SDMX Registry); SDMX Reference Infrastructure (SDMX-RI)
20 Implementing Eurostat's data sharing strategy: What is the Euro-SDMX Registry (SER)? Eurostat's implementation of the SDMX Registry specifications as published by the SDMX initiative (sdmx.org). Based on SDMX 2.1 (as published in April 2011). Also capable of importing and exporting SDMX 2.0 artefacts. Allows browsing, searching, editing and subscribing to artefacts. Advanced access control mechanism for distributed maintenance of artefacts, also controlling their visibility.
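To make "searching artefacts" more concrete, the sketch below builds a structure query URL following the path pattern of the SDMX 2.1 RESTful web services specification (resource / agencyID / resourceID / version, plus the `detail` and `references` parameters). Whether and where a given registry exposes this interface is an assumption here: the base URL is a placeholder, the helper `structure_query_url` is hypothetical, and the agency and code list identifiers are only illustrative.

```python
from urllib.parse import quote, urlencode

# Placeholder entry point (assumption); replace with a real registry endpoint.
BASE_URL = "https://registry.example.org/rest"


def structure_query_url(resource, agency="all", resource_id="all",
                        version="latest", detail="full", references="none"):
    """Build an SDMX 2.1 REST style structure query URL."""
    path = "/".join(quote(part, safe="") for part in
                    (resource, agency, resource_id, version))
    query = urlencode({"detail": detail, "references": references})
    return f"{BASE_URL}/{path}?{query}"


# Illustrative identifiers: a code list with ID CL_FREQ maintained by agency SDMX.
print(structure_query_url("codelist", "SDMX", "CL_FREQ"))
```

The function only assembles the URL and sends no request, so it can be tried without network access.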
21 Home page. Access to the content of the Registry: advanced search; text search; by type. Most recent items
22 Conclusions: International data co-operation improves the production of accurate, comparable and coherent statistics; SDMX promotes an incremental movement toward the data and metadata sharing model; The increasing use of SDMX-based statistical standards improves the quality of the underlying statistical processes; The SDMX technical standards pave the way for simplified exchange and dissemination processes, also helping to improve timeliness and accessibility; Statistical integration needs to go hand-in-hand with technical integration and standardisation.
23 Outlook: Much more global data and metadata sharing in the years to come; Common data validation and processing procedures are required (from structural validation to content information validation); Better metadata-driven statistics production systems: the use of standards throughout the processes in combination with common metadata registries; Better harmonised international reference metadata frameworks and templates; Broadening the scope of SDMX (versioning of codes, disabling of dimensions, other formats like CSV, flat files etc.); Interoperability between information models (GSIM, SDMX, DDI etc.).
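The "structural validation" end of the validation spectrum mentioned in the outlook can be sketched as a check of one record against a simplified structure definition. This is a minimal sketch, assuming a structure reduced to dimension names and allowed codes; the `Dimension` class, the function name and the sample codes are hypothetical and far simpler than a real SDMX DSD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """Simplified stand-in for a dimension bound to a code list."""
    name: str
    allowed_codes: frozenset


def validate_structure(record, dimensions):
    """Return structural errors: missing dimensions, unknown codes, extra keys."""
    errors = []
    for dim in dimensions:
        if dim.name not in record:
            errors.append(f"missing dimension {dim.name}")
        elif record[dim.name] not in dim.allowed_codes:
            errors.append(f"{dim.name}: unknown code {record[dim.name]!r}")
    known = {dim.name for dim in dimensions}
    for key in record:
        if key not in known:
            errors.append(f"unexpected dimension {key}")
    return errors


# Illustrative structure with two dimensions.
STRUCTURE = [
    Dimension("FREQ", frozenset({"A", "Q"})),
    Dimension("GEO", frozenset({"AT", "BE"})),
]

print(validate_structure({"FREQ": "A", "GEO": "AT"}, STRUCTURE))
print(validate_structure({"FREQ": "X"}, STRUCTURE))
```

Content validation (plausibility of the values themselves) would sit on top of such a check and is not covered here.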