SDMX Information Model


1 SDMX Information Model
Apr 2017. This project aims to modernize our statistical system. Rafik Mahjoubi, Kamel ABDELLAOUI

2 SDMX components
SDMX Information Model; Content Oriented Guidelines; IT Infrastructure for exchange and sharing.

SDMX is not just a new data transmission format. It consists of:

an Information Model from which all the formats (for data and metadata, for EDI and XML) are derived and which covers both data and metadata. Structural metadata identify and describe the data. They must always be associated with the data; otherwise it becomes impossible to identify, retrieve and browse the data. Reference or explanatory metadata describe the contents and the quality of the statistical data. They normally include "conceptual" metadata, describing the concepts used and their practical implementation; "methodological" metadata, describing the methods used to generate the data (e.g. sampling, collection methods, editing processes); and "quality" metadata, describing the different quality dimensions of the resulting statistics (e.g. timeliness, accuracy).

an architecture for an efficient exchange of data and metadata. In addition to the normal "push" transmission of data files (where the agency that has the data sends it as a file to the agency that needs it), SDMX provides guidelines and tools to support the "pull" method of data sharing, where the collecting organisation retrieves the data from the provider's website. The data may be made available for download in an SDMX-conformant file, or they may be retrieved from a database in response to an SDMX-conformant query, via a web service running on the provider's server. In both cases, the data are made available to any organisation requiring them, in formats which ensure that the data are consistently described by appropriate metadata whose meaning is common to all parties in the exchange.

Content Oriented Guidelines (COG): recommendations for categorising and describing data. The present set of guidelines consists of the Metadata Common Vocabulary (MCV) of terms used for describing statistics and their compilation processes (regardless of subject-matter domain) by national statistical authorities and international organisations; the Cross-Domain Concepts, which provide common descriptors for concepts used in DSDs for different statistical domains; and the Statistical Subject-Matter Domains, which provide a list of statistical domains based on the UNECE Classification of International Statistical Activities.

Organisations are free to make use of whichever of these elements of SDMX are most appropriate in a given case.
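To make the "pull" method above more concrete, here is a minimal sketch of how a collecting organisation might build an SDMX-style data query for a provider's web service. It assumes the URL layout commonly used by SDMX REST services (a `data` resource, a dot-separated key with one position per dimension, and `startPeriod`/`endPeriod` parameters). The endpoint, the dataflow ID `TOURISM`, the dimension order and the code `A` are hypothetical, and `build_data_url` is an illustrative helper, not part of any library.

```python
from urllib.parse import urlencode


def build_data_url(base_url, dataflow, key_parts, start_period=None, end_period=None):
    """Build a data query URL in the style of an SDMX REST service.

    key_parts holds one value per dimension, in DSD order; an empty
    string leaves that dimension unconstrained.
    """
    key = ".".join(key_parts)
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    params = {}
    if start_period is not None:
        params["startPeriod"] = start_period
    if end_period is not None:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical endpoint, dataflow ID and dimension order
# (FREQUENCY.TOURISM_ACTIVITY.TOURISM_INDICATOR.COUNTRY).
url = build_data_url(
    "https://example.org/sdmx/rest",
    "TOURISM",
    ["A", "B010", "", "FR"],
    start_period="2004",
    end_period="2007",
)
print(url)
```

The empty third position leaves the tourism indicator unconstrained, so the provider would return every indicator for that activity and country.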

3 SDMX Data Structure Definition building blocks
Dimensions: concepts that identify the observation value
Attributes: concepts that add additional metadata about the observation value
Measure: the concept that is the observation value
Representation: any of these may be coded, text, date/time, number, etc.
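The building blocks above can be sketched as a small data model. This is only an illustration: the class and field names are invented for this example and are not the SDMX class names, and pairing `COUNTRY` with `CL_AREA` is an assumption based on the code list names used in the tourism example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    DIMENSION = "dimension"  # identifies the observation value
    ATTRIBUTE = "attribute"  # adds metadata about the observation value
    MEASURE = "measure"      # is the observation value


class Representation(Enum):
    CODED = "coded"
    TEXT = "text"
    DATETIME = "date/time"
    NUMBER = "number"


@dataclass(frozen=True)
class Component:
    concept_id: str
    role: Role
    representation: Representation
    codelist_id: Optional[str] = None  # only meaningful for coded components

    def __post_init__(self):
        if self.representation is Representation.CODED and self.codelist_id is None:
            raise ValueError(f"{self.concept_id}: a coded component needs a code list")


# Assumed pairing of concept and code list, for illustration only.
country = Component("COUNTRY", Role.DIMENSION, Representation.CODED, "CL_AREA")
obs_value = Component("OBS_VALUE", Role.MEASURE, Representation.NUMBER)
```

Any component, whatever its role, carries a representation; only coded components point at a code list.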

4 From “Data Set” to “Data Structure Definition”

5 Different ways to represent data
[Figure: statistical data cube with the axes Country (FR, IT, ES, AT), Time (2004 to 2007) and Tourism activity (A100, B010, B020), showing a time series slice and a cross-sectional slice. Tourism activity is fixed at B010; the cross-section is for 2006.]

The statistical table has three dimensions: Country, Time and Tourism activity. Let's consider only one tourism activity: B010.

If we fix the country (in this case France), we obtain one time series with a value for each time period (in this case for each year). The values are the observations and correspond to the number of tourist campsites in France over time.

Now, if we fix the time (in this case the year 2006), we obtain one cross-section with a value for each country. The values are the observations and correspond to the number of tourist campsites in 2006 across the countries.
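The two slices described above can be sketched with a dictionary keyed by the three dimensions. The values below are placeholders chosen for readability, not the figures from the slide, and the helper names are illustrative.

```python
# Keys are (country, time, tourism_activity); values are placeholders,
# not the figures from the slide.
cube = {
    ("FR", 2005, "B010"): 10,
    ("FR", 2006, "B010"): 11,
    ("IT", 2006, "B010"): 20,
    ("ES", 2006, "B010"): 30,
    ("FR", 2006, "B020"): 99,
}


def time_series(cube, country, activity):
    """Fix country and activity; return {time: value} ordered by time."""
    return {t: v for (c, t, a), v in sorted(cube.items())
            if c == country and a == activity}


def cross_section(cube, time, activity):
    """Fix time and activity; return {country: value} ordered by country."""
    return {c: v for (c, t, a), v in sorted(cube.items())
            if t == time and a == activity}


print(time_series(cube, "FR", "B010"))    # {2005: 10, 2006: 11}
print(cross_section(cube, 2006, "B010"))  # {'ES': 30, 'FR': 11, 'IT': 20}
```

Both functions read the same cube; they differ only in which dimension is left free.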

6 Statistical data cube
[Figure: cube with the axes Tourism activity, Time and Country; example coordinates B020, ES, 2004.]

7 From a statistical table to a Data Structure Definition: Tourism example
Concepts: FREQUENCY, TOURISM_ACTIVITY, UNIT, TOURISM_INDICATOR, TIME, COUNTRY, OBS_VALUE, OBS_STATUS (flags e, p)

8 From a statistical table to a Data Structure Definition: Tourism example
ConceptScheme

9 From a statistical table to a Data Structure Definition: Tourism example
Codelists: do we need codes for concept values?

10 From a statistical table to a Data Structure Definition: Tourism example
Codelists: CL_TOUR_INDICATOR, CL_UNIT, CL_TOUR_ACTIVITY, CL_AREA

11 From a statistical table to a Data Structure Definition: Tourism example
Codelists: each code list is uniquely identified by an ID, a maintenance agency and a version. The name can be provided in several languages.

12 From a statistical table to a Data Structure Definition: Tourism example
ConceptScheme and Codelists

13 Tourism example: DSD elements and references
[Diagram: the DSD contains Dimensions, Groups, Measures and Attributes. These components hold references to concepts in the Concept Scheme and references to code lists.]

14 From a statistical table to a Data Structure Definition: Tourism example
Data Structure Definition
DIMENSIONS: FREQUENCY, TOURISM_ACTIVITY, TOURISM_INDICATOR, TIME, COUNTRY
MEASURES: OBS_VALUE
ATTRIBUTES: UNIT, OBS_STATUS (flags e, p)

15 [Figure: the tourism table annotated with the concepts FREQUENCY, TOURISM_ACTIVITY, UNIT, TOURISM_INDICATOR, TIME, COUNTRY, OBS_VALUE and OBS_STATUS (flags E, P).]
So far we have defined the statistical concepts that identify the information contained in our statistical table. To have a full picture of the table, we also need to assign roles to these concepts. If we think of this table as a slice of a statistical cube, each observation has coordinates that identify it uniquely. In our example, Frequency, Time Period, Country, Tourism Indicator and Tourism Activity are required to identify the statistics: they act as dimensions. The observation value, the actual figure, is what we call the measure. The "Observation status" and the "Unit" further explain the figures: they act as attributes.

Attributes are displayed at different levels of the table, depending on whether the attribute explains a single figure, a group of figures or the whole table. In our example the "Unit" explains the whole table: it is attached at dataset level. The "Observation status" explains only single figures: it is attached at observation level. In SDMX it is also possible to attach attributes to a group of observations; this is not the case in our example.

The result of our small exercise is a formal Data Structure Definition (DSD) containing all the structural metadata of the table.
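The role assignment described above can be written down as a small structure and used to check observations. This is a sketch under stated assumptions: the dictionary layout and the helpers `validate_observation` and `observation_key` are invented for illustration, and the code values in the sample observation (`A`, `X`, `p`) are hypothetical.

```python
DSD = {
    "dimensions": ["FREQUENCY", "TOURISM_ACTIVITY", "TOURISM_INDICATOR", "COUNTRY", "TIME"],
    "measure": "OBS_VALUE",
    # attribute -> attachment level
    "attributes": {"UNIT": "dataset", "OBS_STATUS": "observation"},
}


def validate_observation(obs, dsd=DSD):
    """Return a list of problems; an empty list means the observation fits the DSD."""
    problems = [f"missing dimension {d}" for d in dsd["dimensions"] if d not in obs]
    if dsd["measure"] not in obs:
        problems.append(f"missing measure {dsd['measure']}")
    for attr, level in dsd["attributes"].items():
        if attr in obs and level != "observation":
            problems.append(f"{attr} is attached at {level} level, not per observation")
    return problems


def observation_key(obs, dsd=DSD):
    """The dimension values are the coordinates that identify an observation."""
    return tuple(obs[d] for d in dsd["dimensions"])


# Hypothetical code values, for illustration only.
obs = {
    "FREQUENCY": "A", "TOURISM_ACTIVITY": "B010", "TOURISM_INDICATOR": "X",
    "COUNTRY": "FR", "TIME": "2006", "OBS_VALUE": 1.0, "OBS_STATUS": "p",
}
print(validate_observation(obs))  # []
print(observation_key(obs))       # ('A', 'B010', 'X', 'FR', '2006')
```

Because the unit is attached at dataset level, supplying it on a single observation is reported as a problem, whereas the observation status is accepted per figure.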

16 From a statistical table to a Data Structure Definition: Tourism example
SDMX Data Structure Definition

Like all SDMX objects, the DSD has an ID, a name and a description, which can be given in several languages. We will see later that the DSD is a maintainable object which can have many versions. The ID and the name are mandatory in the SDMX information model; the description is optional. The name should be provided in at least one language, generally English.

In the DSD we can also locally define a format and a code list for the statistical concepts, to be used later for validation purposes. This is also optional. The SDMX information model is flexible enough to define local representations that are used only in a specific DSD which shares a common Concept scheme with other DSDs.
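The rules above (mandatory ID and name, optional description, optional local representation) can be sketched as follows. The class layout, the default version `1.0`, the ID `DSD_TOURISM` and the local code list `CL_UNIT_TOURISM` are all hypothetical, and the sketch assumes that a locally defined code list takes precedence over the one associated with the shared concept.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataStructureDefinition:
    id: str                                                      # mandatory
    names: Dict[str, str]                                        # mandatory, language -> name
    descriptions: Dict[str, str] = field(default_factory=dict)   # optional
    version: str = "1.0"                                         # hypothetical default
    # concept ID -> code list ID used only in this DSD
    local_codelists: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        if not self.id:
            raise ValueError("a DSD needs an ID")
        if not self.names:
            raise ValueError("a DSD needs a name in at least one language")

    def codelist_for(self, concept_id: str,
                     concept_scheme: Dict[str, Optional[str]]) -> Optional[str]:
        """Prefer the local code list; otherwise use the concept scheme's."""
        return self.local_codelists.get(concept_id, concept_scheme.get(concept_id))


# Shared concept scheme: concept ID -> default code list (assumed pairing).
concept_scheme = {"UNIT": "CL_UNIT", "COUNTRY": "CL_AREA"}
dsd = DataStructureDefinition(
    "DSD_TOURISM",
    {"en": "Tourism", "fr": "Tourisme"},
    local_codelists={"UNIT": "CL_UNIT_TOURISM"},
)
print(dsd.codelist_for("UNIT", concept_scheme))     # CL_UNIT_TOURISM
print(dsd.codelist_for("COUNTRY", concept_scheme))  # CL_AREA
```

Other DSDs that share the same concept scheme are unaffected by the local override.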

17 Tourism DSD: summary
[Table with the columns DIMENSIONS, MEASURES and ATTRIBUTES.]
This table summarizes the DSD information for a cross-sectional representation of the Tourism table.

18 Examples of concrete classes

19 Example of flat Codelist
Each code list is uniquely identified by an ID, a maintenance agency and a version. Its name and description can be provided in several languages. Each item is identified by an ID; its name and description can also be provided in several languages.
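A flat code list of this kind can be sketched as two small classes. The class and method names are invented for illustration; the agency `EXAMPLE_AGENCY` and the version `1.0` are hypothetical, while `CL_AREA` and the country codes come from the tourism example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Code:
    id: str
    names: Dict[str, str] = field(default_factory=dict)         # language -> name
    descriptions: Dict[str, str] = field(default_factory=dict)  # language -> description


@dataclass
class Codelist:
    id: str
    agency_id: str
    version: str
    names: Dict[str, str] = field(default_factory=dict)
    codes: List[Code] = field(default_factory=list)

    @property
    def key(self) -> Tuple[str, str, str]:
        """ID, maintenance agency and version together identify the code list."""
        return (self.agency_id, self.id, self.version)

    def label(self, code_id: str, lang: str = "en") -> str:
        """Return the code's name in the requested language, or the code ID if absent."""
        for code in self.codes:
            if code.id == code_id:
                return code.names.get(lang, code.id)
        raise KeyError(code_id)


cl_area = Codelist(
    "CL_AREA", "EXAMPLE_AGENCY", "1.0",
    names={"en": "Area"},
    codes=[
        Code("FR", {"en": "France", "fr": "France"}),
        Code("IT", {"en": "Italy", "fr": "Italie"}),
        Code("ES", {"en": "Spain", "fr": "Espagne"}),
        Code("AT", {"en": "Austria", "fr": "Autriche"}),
    ],
)
print(cl_area.key)                # ('EXAMPLE_AGENCY', 'CL_AREA', '1.0')
print(cl_area.label("ES", "fr"))  # Espagne
```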

20 Example of flat codelist with simple hierarchy
-BE2
-BE3
 |-BE31
 |-BE32
 |  |-BE321
 |  |-BE322
 |  |-BE323
 |  |-BE324
 |-BE33
 |-BE34
 |-BE35
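A simple hierarchy in an otherwise flat code list can be modelled by giving each code a reference to its parent. The sketch below uses a subset of the codes above; the parent links are an assumption inferred from the code prefixes (for example, BE321 under BE32), and the helper names are illustrative.

```python
# code -> parent code (None for top-level codes).
# Parent links are inferred from the code prefixes, as an assumption.
PARENT = {
    "BE2": None, "BE3": None,
    "BE31": "BE3", "BE32": "BE3",
    "BE321": "BE32", "BE322": "BE32",
    "BE33": "BE3",
}


def children(parent):
    """Codes whose parent is the given code, in code list order."""
    return [code for code, p in PARENT.items() if p == parent]


def render(parent=None, depth=0):
    """Return the hierarchy as indented lines, depth-first."""
    lines = []
    for code in children(parent):
        lines.append("  " * depth + code)
        lines.extend(render(code, depth + 1))
    return lines


def ancestors(code):
    """Walk the parent links up to the top level."""
    chain = []
    while PARENT[code] is not None:
        code = PARENT[code]
        chain.append(code)
    return chain


print("\n".join(render()))
print(ancestors("BE321"))  # ['BE32', 'BE3']
```

The list itself stays flat; the tree is derived entirely from the parent references.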

21 Example of Category Scheme
+ [DEMO] Demography
+ [STS] Short term Statistics
 |-[PROD]: Production

22 Example of Dataflow
+ Demography
+ Short term Statistics
+ Production
- Indices Of Manufacturing….

