SDMX Information Model

Presentation transcript:

SDMX Information Model, Apr 2017. This project aims to modernize our statistical system. Rafik Mahjoubi (Rafik.MAHJOUBI@oecd.org), Kamel ABDELLAOUI (abdellaoui.kamel@ins.tn)

SDMX components: SDMX Information Model, Content Oriented Guidelines, IT infrastructure for exchange and sharing.

SDMX is not just a new data transmission format. It consists of three components.

An Information Model, from which all the formats (for data and metadata, for EDI and XML) are derived, and which covers both data and metadata:
- Structural metadata identify and describe the data. They must always be associated with the data, otherwise it becomes impossible to identify, retrieve and browse the data.
- Reference or explanatory metadata describe the contents and the quality of the statistical data. They normally include "conceptual" metadata, describing the concepts used and their practical implementation; "methodological" metadata, describing the methods used to generate the data (e.g. sampling, collection methods, editing processes); and "quality" metadata, describing the different quality dimensions of the resulting statistics (e.g. timeliness, accuracy).

An architecture for an efficient exchange of data and metadata. In addition to the normal "push" transmission of data files (where the agency that has the data sends it as a file to the agency that needs it), SDMX provides guidelines and tools to support the "pull" method of data sharing, where the collecting organisation retrieves the data from the provider's website. The data may be made available for download in an SDMX-conformant file, or they may be retrieved from a database in response to an SDMX-conformant query, via a web service running on the provider's server. In both cases, the data are made available to any organisation requiring them, in formats which ensure that the data are consistently described by appropriate metadata, whose meaning is common to all parties in the exchange.

Content Oriented Guidelines (COG): these are recommendations for categorising and describing data. The present set of guidelines consists of:
- the Metadata Common Vocabulary (MCV) of terms used for describing statistics and their compilation processes (regardless of subject-matter domain) by national statistical authorities and international organisations;
- the Cross-Domain Concepts, which provide common descriptors for concepts used in DSDs for different statistical domains;
- the Statistical Subject-Matter Domains, which provide a list of statistical domains based on the UNECE Classification of International Statistical Activities.

Organisations are free to make use of whichever of these elements of SDMX are most appropriate in a given case.
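To make the "pull" method more concrete, here is a minimal Python sketch of how a collecting organisation might build an SDMX-conformant data query. It assumes the provider exposes a web service that follows the usual SDMX RESTful URL pattern (`/data/{dataflow}/{key}` with `startPeriod` and `endPeriod` parameters). The base URL, the dataflow identifier `TOURISM` and the order of the dimensions in the key are hypothetical and do not come from the slides.

```python
from urllib.parse import urlencode


def build_data_query(base_url, dataflow, key_parts, start_period=None, end_period=None):
    """Build a data query URL in the style of an SDMX RESTful web service.

    key_parts holds one value per dimension, in the order fixed by the DSD;
    an empty string leaves that dimension unconstrained (wildcard).
    """
    key = ".".join(key_parts)
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical provider and dataflow; the key order (frequency, country,
# indicator, activity) is an assumption made for this sketch only.
url = build_data_query(
    "https://stats.example.org/sdmx",
    "TOURISM",
    ["A", "FR", "", "B010"],
    start_period="2004",
    end_period="2007",
)
print(url)
```

The collecting organisation would then fetch this URL with any HTTP client and parse the returned SDMX message.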

SDMX Data Structure Definition building blocks
- Dimensions: concepts that identify the observation value
- Attributes: concepts that add additional metadata about the observation value
- Measure: the concept that is the observation value
- Representation: any of these may be coded, text, date/time, number, etc.
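The building blocks above can be sketched as a small Python data model. This is only an illustration, not an SDMX library API: the class names are invented for the sketch, and the representation chosen for each concept is an assumption, since the slide only says that any component may be coded, text, date/time or number.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    DIMENSION = "dimension"   # identifies the observation value
    ATTRIBUTE = "attribute"   # adds metadata about the observation value
    MEASURE = "measure"       # is the observation value


class Representation(Enum):
    CODED = "coded"
    TEXT = "text"
    DATE_TIME = "date/time"
    NUMBER = "number"


@dataclass(frozen=True)
class Component:
    concept: str
    role: Role
    representation: Representation


# Representations below are assumptions chosen for illustration.
components = [
    Component("COUNTRY", Role.DIMENSION, Representation.CODED),
    Component("TIME", Role.DIMENSION, Representation.DATE_TIME),
    Component("OBS_VALUE", Role.MEASURE, Representation.NUMBER),
    Component("OBS_STATUS", Role.ATTRIBUTE, Representation.CODED),
]


def concepts_with_role(items, role):
    """Return the concept names that play the given role."""
    return [c.concept for c in items if c.role is role]


dimension_names = concepts_with_role(components, Role.DIMENSION)
print(dimension_names)
```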

From “Data Set” to “Data Structure Definition”

Different ways to represent data: statistical data cube, time series slice, cross-sectional slice.

[Figure: statistical data cube with axes Country (FR, IT, ES, AT), Time (2004 to 2007) and Tourism activity (A100, B010, B020), showing a time series slice and a cross-sectional slice for 2006, both with Tourism activity fixed at B010.]

The statistical table has three dimensions: Country, Time and Tourism activity. Let's consider only one tourism activity: B010. If we fix the country (in this case France), we obtain one time series with a value for each time period (in this case for each year). The values are the observations and correspond to the number of tourist campsites in France over time. Now, if we fix the time (in this case the year 2006), we obtain one cross-section with a value for each country. The values are the observations and correspond to the number of tourist campsites in 2006 across the countries.
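The two ways of slicing the cube can be sketched in Python as follows. The cube is modelled as a dictionary keyed by (country, year, tourism activity). The observation values are generated placeholders, not the figures from the slide, and the function names are invented for this sketch.

```python
countries = ["FR", "IT", "ES", "AT"]
years = [2004, 2005, 2006, 2007]
activities = ["A100", "B010", "B020"]

# Placeholder observations: NOT real tourism statistics.
cube = {}
for ci, country in enumerate(countries):
    for yi, year in enumerate(years):
        for ai, activity in enumerate(activities):
            cube[(country, year, activity)] = 100 * ci + 10 * yi + ai


def time_series(data, country, activity):
    """Fix country and activity: one value per year."""
    return {
        year: value
        for (c, year, a), value in sorted(data.items())
        if c == country and a == activity
    }


def cross_section(data, year, activity):
    """Fix year and activity: one value per country."""
    return {
        country: value
        for (country, y, a), value in sorted(data.items())
        if y == year and a == activity
    }


fr_series = time_series(cube, "FR", "B010")
section_2006 = cross_section(cube, 2006, "B010")
print(fr_series)
print(section_2006)
```

Both slices read the same observations; only the dimension that is allowed to vary differs.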

Statistical data cube. [Figure: cube with axes Tourism activity, Time and Country; example coordinates: Tourism activity B020, Country ES, Time 2004.]

From a statistical table to a Data Structure Definition: Tourism example. Concepts: FREQUENCY, TOURISM_ACTIVITY, UNIT, TOURISM_INDICATOR, TIME, COUNTRY, OBS_VALUE, OBS_STATUS (flags shown in the table: e, p).

From a statistical table to a Data Structure Definition: Tourism example. Concept scheme.

From a statistical table to a Data Structure Definition: Tourism example. Codelists: do we need codes for concept values?

From a statistical table to a Data Structure Definition: Tourism example. Codelists: CL_TOUR_INDICATOR, CL_UNIT, CL_TOUR_ACTIVITY, CL_AREA.

From a statistical table to a Data Structure Definition: Tourism example. Codelists: each code list is uniquely defined by an ID, a maintenance agency and a version. The name can be provided in several languages.
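As a rough illustration of this identification rule, the Python sketch below keys a small in-memory registry on the (agency, ID, version) triple and stores the name per language. The agency `EXAMPLE_AGENCY`, the version numbers, the names and the `register` helper are all hypothetical; only the code list ID `CL_AREA` comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodelistRef:
    """The triple that uniquely identifies a code list."""
    agency: str
    id: str
    version: str


@dataclass
class Codelist:
    ref: CodelistRef
    names: dict = field(default_factory=dict)  # language code -> name


registry = {}


def register(codelist):
    """Store a code list, refusing a second one with the same triple."""
    if codelist.ref in registry:
        raise ValueError(f"duplicate code list: {codelist.ref}")
    registry[codelist.ref] = codelist


# Hypothetical agency, versions and names, for illustration only.
register(Codelist(CodelistRef("EXAMPLE_AGENCY", "CL_AREA", "1.0"), {"en": "Area", "fr": "Zone"}))
register(Codelist(CodelistRef("EXAMPLE_AGENCY", "CL_AREA", "2.0"), {"en": "Area"}))

print(len(registry))
```

Two versions of the same code list coexist because the version is part of the identity.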

From a statistical table to a Data Structure Definition: Tourism example. ConceptScheme and Codelists.

Tourism example: DSD elements and references. [Diagram: the DSD contains Dimensions, Groups, Measures and Attributes; its components carry references to concepts in the Concept Scheme and references to code lists.]

From a statistical table to a Data Structure Definition: Tourism example. Data Structure Definition: the concepts FREQUENCY, TOURISM_ACTIVITY, UNIT, TOURISM_INDICATOR, TIME, COUNTRY, OBS_VALUE and OBS_STATUS (flags shown in the table: e, p) are grouped into DIMENSIONS, ATTRIBUTES and MEASURES.

[Table: the tourism table annotated with the concepts FREQUENCY, TOURISM_ACTIVITY, TOURISM_INDICATOR, TIME, COUNTRY, UNIT, OBS_VALUE and OBS_STATUS.]

Until now we have defined the statistical concepts that identify the information contained in our statistical table. But to have a full picture of the table we need to assign roles to our concepts. Indeed, if we think of this table as a slice of a statistical cube, each observation has coordinates that identify it uniquely.

In our example, Frequency, Time Period, Country, Tourism Indicator and Tourism Activity are required to identify the statistics: they act as dimensions. The observation value, the actual figure, is what we call the measure. The "Observation status" and the "Unit" further explain the figures: they act as attributes.

We can see that the attributes are displayed at different levels of the table. This depends on whether the attribute explains only a single figure, a group of figures or the whole table. In our example the "Unit" explains the whole table: it is attached at dataset level. The "Observation status" explains only single figures: it is attached at observation level. In SDMX it is also possible to attach attributes to a group of observations. This is not the case in our example.

The result of our small exercise is a formal definition of the corresponding Data Structure Definition, also called DSD, with all the structural metadata of the table.
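The role assignment described above can be sketched in Python as a plain dictionary plus a helper that splits one table cell into its key, its value and its observation-level attributes. The order of the dimensions, the code values (`A`, `IND_X`, `NUMBER`) and the observation value are assumptions made for this sketch; the concept names and the attachment levels follow the text.

```python
DSD = {
    # Dimension order is an assumption; the slides do not fix one.
    "dimensions": ["FREQUENCY", "COUNTRY", "TOURISM_INDICATOR", "TOURISM_ACTIVITY", "TIME"],
    "measure": "OBS_VALUE",
    # Attribute -> attachment level, as described in the text.
    "attributes": {"UNIT": "dataset", "OBS_STATUS": "observation"},
}

# Dataset-level attributes are stated once for the whole table.
dataset_attributes = {"UNIT": "NUMBER"}  # hypothetical code


def split_record(record, dsd):
    """Split one observation into (key, value, observation-level attributes)."""
    key = tuple(record[d] for d in dsd["dimensions"])
    value = record[dsd["measure"]]
    obs_attrs = {
        name: record[name]
        for name, level in dsd["attributes"].items()
        if level == "observation" and name in record
    }
    return key, value, obs_attrs


# Hypothetical observation; the value is a placeholder.
record = {
    "FREQUENCY": "A",
    "COUNTRY": "FR",
    "TOURISM_INDICATOR": "IND_X",
    "TOURISM_ACTIVITY": "B010",
    "TIME": "2006",
    "OBS_VALUE": 100,
    "OBS_STATUS": "p",
}

key, value, obs_attrs = split_record(record, DSD)
print(key, value, obs_attrs, dataset_attributes)
```

The key tuple is what makes an observation unique within the cube; attributes never take part in it.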

From a statistical table to a Data Structure Definition: Tourism example. SDMX Data Structure Definition.

Like all SDMX objects, the DSD has an ID, plus a name and a description that can be given in several languages. We will see later that the DSD is a maintainable object which can have many versions. The ID and the name are mandatory in the SDMX information model; the description is optional. The name should be provided in at least one language, generally English.

In the DSD, we can also locally define a format and a code list for the statistical concepts, to be used later for validation purposes, but this is also optional. The SDMX information model is flexible enough to define local representations to be used only in a specific DSD that shares a common concept scheme with other DSDs.
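A minimal Python sketch of these rules, under stated assumptions: the ID and at least one name are required, the description is optional, and a local representation declared in the DSD takes precedence over the one attached to the shared concept scheme. The helper names, the DSD identifier `TOURISM_DSD` and the code list `CL_AREA_LOCAL` are hypothetical.

```python
def make_dsd(dsd_id, names, description=None, local_representations=None):
    """Create a DSD record; the ID and at least one name are mandatory."""
    if not dsd_id:
        raise ValueError("a DSD needs an ID")
    if not names:
        raise ValueError("a DSD needs a name in at least one language")
    return {
        "id": dsd_id,
        "names": dict(names),            # language code -> name
        "description": description,      # optional
        "local_representations": dict(local_representations or {}),
    }


def representation_for(dsd, concept, concept_scheme):
    """Prefer the DSD's local representation, else the concept scheme's."""
    local = dsd["local_representations"]
    if concept in local:
        return local[concept]
    return concept_scheme[concept]


# Shared concept scheme: concept -> default code list.
concept_scheme = {"COUNTRY": "CL_AREA", "UNIT": "CL_UNIT"}

# Hypothetical DSD that restricts COUNTRY to a local code list.
tourism_dsd = make_dsd(
    "TOURISM_DSD",
    {"en": "Tourism"},
    local_representations={"COUNTRY": "CL_AREA_LOCAL"},
)

country_repr = representation_for(tourism_dsd, "COUNTRY", concept_scheme)
unit_repr = representation_for(tourism_dsd, "UNIT", concept_scheme)
print(country_repr, unit_repr)
```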

Tourism DSD: summary (DIMENSIONS, MEASURES, ATTRIBUTES). This table summarizes the DSD information for a cross-sectional representation of the Tourism table.

Examples of concrete classes

Example of flat Codelist. Each code list is uniquely defined by an ID, a maintenance agency and a version. The name and description can be provided in several languages. Each item is defined by an ID; its name and description can also be provided in several languages.

Example of flat codelist with simple hierarchy
-BE2
-BE3
 |-BE31
 |-BE32
 |  |-BE321
 |  |-BE322
 |  |-BE323
 |  |-BE324
 |-BE33
 |-BE34
 |-BE35
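One way to picture a flat code list that carries a simple hierarchy is to store every code at the same level and give each one an optional parent. The Python sketch below does this for the codes in the slide. The parent links are inferred from the structure of the codes (for example BE321 under BE32) and are an assumption, as are the helper names.

```python
# Flat list of codes; each code optionally points to a parent code.
parent_of = {
    "BE2": None,
    "BE3": None,
    "BE31": "BE3",
    "BE32": "BE3",
    "BE33": "BE3",
    "BE34": "BE3",
    "BE35": "BE3",
    "BE321": "BE32",
    "BE322": "BE32",
    "BE323": "BE32",
    "BE324": "BE32",
}


def children(code):
    """Return the codes whose parent is `code` (None means top level)."""
    return sorted(c for c, parent in parent_of.items() if parent == code)


def render(code=None, depth=0):
    """Return the hierarchy as indented lines, depth first."""
    lines = []
    for child in children(code):
        lines.append("  " * depth + child)
        lines.extend(render(child, depth + 1))
    return lines


tree_lines = render()
print("\n".join(tree_lines))
```

The list itself stays flat; the hierarchy is only a view computed from the parent links.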

Example of Category Scheme
+ [DEMO] Demography
+ [STS] Short term Statistics
 |-[PROD]: Production

Example of Dataflow
+ Demography
+ Short term Statistics
 + Production
  - Indices Of Manufacturing….