SDMX IT Tools SDMX Converter


1 SDMX IT Tools SDMX Converter
Jean-Francois LEBLANC Christian SEBASTIAN 11-13 May 2015 Eurostat, Unit B3 – IT solutions for statistical production

2 Table of Contents
Objectives
What is the Converter: Design/Implementation, Minimum requirements, Supported formats
Interfaces: GUI, CLI, Common API, Web Services

3 Table of Contents SDMX Converter vs SDMX-RI Where is the Converter
Hands-on exercise

4 1. Objectives Illustrate how to use the Converter
List all the supported formats Explain its interfaces Clarify when it is recommended State its limitations

5 2. What is the Converter
The SDMX Converter is a Java application that converts between all the following formats*:
SDMX 2.0: Generic, Compact, Utility and Cross-sectional
SDMX 2.1: Generic Data, Generic TS, Structure-specific, Structure-specific TS
GESMES: TS, 2.1, DSIS
CSV, FLR
MESSAGE GROUP (special SDMX 2.0 format)
DSPL
Excel
*Limitations may apply. Please check the User Manual.
The SDMX Converter has been developed in the context of the SODI project. It is a Java tool that converts between various dataset types based on SDMX-ML DSDs. The types between which the Converter can convert are:
SDMX-ML formats (to convert from/to the Cross-Sectional type, the DSD must contain Cross-Sectional information and a TimeDimension)
GESMES/TS (aka SDMX-EDI) (cannot be converted from/to the SDMX-ML Cross-Sectional format)
GESMES/2.1 and GESMES/DSIS
CSV, FLR, EXCEL, DSPL (supports a mapping mechanism and a parametric delimiter for CSV. Converting to CSV/FLR from other formats may result in loss of attributes attached at a higher level than the observation.)
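The caveat about CSV/FLR output can be illustrated with a toy example. The sketch below assumes a hypothetical flat writer that emits one row per observation, containing only the series key and the observation-level fields. The `flatten` function, the sample data, and the attribute names (`UNIT_MULT`, `OBS_STATUS`) are illustrative and are not taken from the Converter; its real behaviour depends on the mapping configuration.

```python
# A single time series in a simplified, hypothetical in-memory form.
series = {
    "key": {"FREQ": "A", "REF_AREA": "DE"},
    "attributes": {"UNIT_MULT": "6"},  # attached at series level
    "observations": [
        {"TIME_PERIOD": "2013", "OBS_VALUE": "1.0", "OBS_STATUS": "A"},
        {"TIME_PERIOD": "2014", "OBS_VALUE": "1.1", "OBS_STATUS": "P"},
    ],
}


def flatten(series, delimiter=";"):
    """Emit one delimited row per observation (hypothetical flat writer).

    Only the series key and the observation-level fields are written;
    series-level attributes have no column to go to.
    """
    rows = []
    for obs in series["observations"]:
        fields = list(series["key"].values()) + list(obs.values())
        rows.append(delimiter.join(fields))
    return rows


rows = flatten(series)
```

In this sketch the series-level `UNIT_MULT` value does not appear in any row, which is the kind of loss the slide warns about.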

6 2.1 Design/Implementation
Steps to perform a conversion:
Reading the input message: parsing the message and populating the data model of the tool (based on the SDMX information model)
Reading the DSD: the DSD is retrieved from the Registry in order to complete the conversion; alternatively, the DSD can be loaded from files, so no connection is needed
Writing the converted message: the data model is used to write the output message in the target format
<pagebreak>
The conversion process comprises two main activities: reading an input data message and writing out the converted data message. There are specific modules that read and write datasets, i.e. SDMX-ML (Compact, Generic, Utility, Cross-Sectional), GESMES (TS, 2.1, DSIS) and flat files (CSV, FLR). The information of the dataset to convert is stored in classes that are based on the SDMX Information Model v2.0. These classes play the role of an intermediate format between readers and writers. The Data Structure Definition related to the converted datasets is needed to perform a conversion. If the DSD is not provided manually, the SDMX Converter can retrieve it from the Registry.
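The read, model, write flow described above can be sketched as follows. This is a minimal illustration in Python, not the Converter's actual Java code: the `Observation` and `Dataset` classes, the `read_csv` and `write_key_value` functions, and the toy DSD (a plain list of dimension names) are all hypothetical stand-ins.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Observation:
    key: dict   # dimension name -> value, e.g. {"FREQ": "A", "TIME": "2014"}
    value: str  # the observed value, kept as text


@dataclass
class Dataset:
    # Intermediate model shared by every reader and writer (hypothetical).
    observations: list = field(default_factory=list)


def read_csv(text, dsd_dimensions, delimiter=";"):
    """Reader: parse delimited text into the intermediate model.

    The toy 'DSD' is just the ordered list of dimension names; the last
    column of each row is taken as the observation value.
    """
    dataset = Dataset()
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue
        key = dict(zip(dsd_dimensions, row[:-1]))
        dataset.observations.append(Observation(key=key, value=row[-1]))
    return dataset


def write_key_value(dataset, dsd_dimensions):
    """Writer: serialise the intermediate model into a toy target format."""
    lines = []
    for obs in dataset.observations:
        key = ".".join(obs.key[d] for d in dsd_dimensions)
        lines.append(f"{key}={obs.value}")
    return "\n".join(lines)


def convert(text, dsd_dimensions, reader, writer):
    # The reader populates the model using the DSD; the writer then
    # produces the output from the model, never from the raw input.
    return writer(reader(text, dsd_dimensions), dsd_dimensions)
```

Because every reader produces a `Dataset` and every writer consumes one, supporting a new format in this sketch means adding one reader and one writer rather than one converter per pair of formats.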

7 2.2 Minimum requirements
Input file
Output file (complete path)
Format for input and output files
Specify DSD: DSD file, reference to a DSD file in the Registry, or reference to a Dataflow file in the Registry
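One way to read the list above is as a validation rule: a conversion request needs an input file, an output file, both formats, and at least one way of locating the DSD. The sketch below expresses that rule in Python. `ConversionRequest` and its field names are hypothetical and are not part of the Converter; how the tool itself reports missing values is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversionRequest:
    # All names are hypothetical; they mirror the slide, not the tool's API.
    input_file: str
    output_file: str
    input_format: str
    output_format: str
    dsd_file: Optional[str] = None               # DSD loaded from a local file
    registry_dsd_ref: Optional[str] = None       # reference to a DSD in the Registry
    registry_dataflow_ref: Optional[str] = None  # reference to a dataflow in the Registry

    def missing_fields(self):
        """Return the names of the minimum requirements that are not met."""
        missing = [
            name
            for name in ("input_file", "output_file", "input_format", "output_format")
            if not getattr(self, name)
        ]
        dsd_sources = (self.dsd_file, self.registry_dsd_ref, self.registry_dataflow_ref)
        if not any(dsd_sources):
            missing.append("dsd")
        return missing
```

A request with all four file and format values plus a `dsd_file` returns an empty list; leaving out the output path and every DSD source returns `["output_file", "dsd"]`.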

8 2.3 Supported formats
SDMX 2.0: GENERIC SDMX, COMPACT SDMX, UTILITY SDMX, CROSS-SECTIONAL SDMX, MESSAGE GROUP
SDMX 2.1: GENERIC DATA 2.1, GENERIC TS DATA 2.1, STRUCTURE SPECIFIC DATA 2.1, STRUCTURE SPECIFIC TS DATA 2.1
GESMES: GESMES TS, GESMES 2.1, GESMES DSIS
OTHERS: CSV, FLR, DSPL, EXCEL
Limitations may apply. Please check the User Manual.

9 3. Interfaces User interface Command line Web service API

10 3.1 GUI (Converter 5.1.0)
1. Selection of the input/output files and their format
2.a Select the DSD in the local drive
2. If the local DSD includes multiple versions, we can specify the one desired
2.b Identify a DSD to download from the SDMX Registry (configuration required)
2.c Identify a dataflow linked to the DSD to download from the SDMX Registry (configuration required)
3. Excel parameter file
3. SDMX header (.prop file) Only for flat and Excel files
CSV parameters 4. Mapping 5. CSV quotation
6. SDMX (output) validation XML parameters for SDMX output formats
There are some mandatory fields. For example, the DSD is mandatory, so it must be present whether it is loaded from a file or from the Registry. A message will appear if mandatory fields are missing.

11 3.1.1 Example 1. Input and output files 2. Format
3. DSD file or reference

12 3.1.2 Header

13 3.1.3 Change mapping

14 3.1.4 Transcoding

15 3.2 CLI
Converter [Options] InputFile OutputFile InputFormat OutputFormat
Converter: Windows OS: converter.bat; Unix OS: converter.sh
Options: -reg, -dsd_file, -dsd_id, -dsd_agency, -dsd_version, -df, -df_id, -df_version, -df_agency, -header_file, -date_format, -level, -mapping_file, -ordered_input, -trans_file, -delimiter, -header_row
For further information, check the User Manual, page 84.

16 3.2.1 Example in Windows
converter.bat -dsd_file "C:\ProjectsSharp\myOutputs\First_NA_MAIN_DSD.xml" -header_file "C:\ProjectsSharp\myOutputs\Input_DatasHeader.prop" -header_row DISREGARD_COLUMN_HEADER -delimiter ; "C:\ProjectsSharp\myOutputs\Input_Datas.csv" "C:\ProjectsSharp\myOutputs\CLI_Compact.xml" CSV COMPACT_SDMX
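When the conversion is driven from a script, the same command can be assembled as an argument list, which avoids quoting problems with paths and with the `;` delimiter. The sketch below is in Python; the `build_command` helper is hypothetical, while the flags, paths, and format names are copied from the example above.

```python
def build_command(dsd_file, header_file, input_file, output_file,
                  input_format, output_format, delimiter=";",
                  header_row="DISREGARD_COLUMN_HEADER",
                  launcher="converter.bat"):
    """Assemble: Converter [Options] InputFile OutputFile InputFormat OutputFormat."""
    options = [
        "-dsd_file", dsd_file,
        "-header_file", header_file,
        "-header_row", header_row,
        "-delimiter", delimiter,
    ]
    # Options come first, followed by the four positional arguments.
    return [launcher, *options, input_file, output_file, input_format, output_format]


command = build_command(
    dsd_file=r"C:\ProjectsSharp\myOutputs\First_NA_MAIN_DSD.xml",
    header_file=r"C:\ProjectsSharp\myOutputs\Input_DatasHeader.prop",
    input_file=r"C:\ProjectsSharp\myOutputs\Input_Datas.csv",
    output_file=r"C:\ProjectsSharp\myOutputs\CLI_Compact.xml",
    input_format="CSV",
    output_format="COMPACT_SDMX",
)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `converter.bat` is installed and reachable; on Unix, `launcher="converter.sh"` would be the corresponding choice.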

17 3.3 Common API <pagebreak>
The conversion process comprises two main activities: reading an input data message and writing out the converted data message. In previous releases of this tool, the first activity resulted in a populated data model based on the SDMX v2.0 information model. The second activity used that populated data model to write the output message in the required target format. In other words, the data model was used as intermediate storage for the parsed data. That solution had a significant advantage: the data model served as a common ‘format’ which all readers wrote to and all writers read from. The number of combinations of source and target formats to implement was thereby significantly reduced; only the conversion from each format to the data model ‘format’, and from that back to each format, needed to be implemented.
That same solution had an apparent drawback: the whole data model had to be held in system memory, which might not be enough for large datasets, even on systems with very large amounts of available memory. To tackle that obstacle, a redesign of the conversion process was needed. In this version of the conversion tool, readers and writers are ‘plugged’ together, in the sense that all writers implement a ‘Writer’ interface and all readers are capable of making calls to that interface. While the data model is still used as an intermediary, only chunks of that model are stored in system memory at any time. Each chunk accounts for only a portion of the complete data message: a populated time series (including all its observations), a sibling group (attribute data only), a dataset (attribute data only, not including its groups and time series), or a message header. As soon as a chunk is completely populated, it is sent to the ‘plugged’ writer by calling the appropriate method of the Writer interface, and is then removed from system memory.
The previous version follows a Data Storage Model, while the latest one follows a Streaming Model. Nevertheless, mostly for backwards compatibility, all readers and writers still provide a method implementing the previous solution of using a completely populated data model, which of course carries the aforementioned disadvantage.
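The streaming design can be sketched with a small writer interface. This is an illustration in Python rather than the Converter's Java API: the `Writer` methods, the `CountingWriter` class, the `stream_rows` reader, and the chunk shapes are hypothetical names chosen to mirror the description above, and only two chunk types (header and complete series) are modelled.

```python
from abc import ABC, abstractmethod
from itertools import groupby


class Writer(ABC):
    """Hypothetical counterpart of the 'Writer' interface that readers call."""

    @abstractmethod
    def write_header(self, header): ...

    @abstractmethod
    def write_series(self, series_key, observations): ...

    @abstractmethod
    def close(self): ...


class CountingWriter(Writer):
    """Toy writer: records what it receives instead of producing SDMX-ML."""

    def __init__(self):
        self.header = None
        self.series_sizes = {}
        self.closed = False

    def write_header(self, header):
        self.header = header

    def write_series(self, series_key, observations):
        self.series_sizes[series_key] = len(observations)

    def close(self):
        self.closed = True


def stream_rows(rows, writer, header):
    """Reader side: push one complete series at a time to the plugged writer.

    `rows` is any iterable of (series_key, period, value) tuples that is
    already ordered by series_key, so only one series is held at a time.
    """
    writer.write_header(header)
    for series_key, group in groupby(rows, key=lambda row: row[0]):
        observations = [(period, value) for _, period, value in group]
        writer.write_series(series_key, observations)
        # Nothing here keeps a reference to `observations`, so each chunk
        # can be freed before the next one is read.
    writer.close()
```

Under these assumptions, memory use is bounded by the largest single series rather than by the whole message, at the cost of requiring the input to arrive grouped by series.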

18 3.4 Web Service
A web service also exists for the Converter
Based on Java
Can be installed on Tomcat Server or WebLogic
A Test Client is provided to test the Web Service conversions (only available for Windows)

19 3.4.1 Wsdl

20 3.4.2 Web Service client

21 4. SDMX Converter vs SDMX-RI
SDMX Converter | SDMX-RI
Standalone application | Needs to be installed on a server
File repository | Connected to dissemination DB
Generates SDMX files from input files | Generates SDMX files from customized SDMX queries

22 5. Where to find the SDMX Converter
You can download the latest version of the SDMX Converter on CIRCABC.
Available packages:
SDMX Converter Platform Independent
SDMX Converter Web Service
SDMX Converter installer for Windows 32-bit
SDMX Converter Documentation

23 6. Hands-on exercise Conversion using the Common API in JAVA.
Eclipse Java EE (or equivalent) SDMXSource code ( Input file and DSD

24 SDMX Converter

