X-DIS project: final report


Item 5.2 of the Agenda
X-DIS project: final report
Marco Pellegrino, Unit B5, Statistical Information Technologies
21-22 October 2009

X-DIS project
- X-DIS = XML for Data Interoperability in Statistics
- « Project of common interest » in the framework of IDABC
- Born in 2005 (1st Global Implementation Plan: 2006-2007)
- 2008-2009: new production phase
- Last projects running until end-2010
- Time for a pre-final assessment
21-22 October 2009, IT Directors Group

Project report: achievements (1)
Work area A.1: SDMX
- OSS applications supporting SDMX standards (Registry, Data Structure Wizard, Converter, …)
- SDMX implementation strategy (registry operational, development of Data Structure Definitions and technical artefacts, capacity-building, training)
- Re-usable components for implementing SDMX in Member States (2009-2010)

Project report: achievements (2)
Work area A.2: SODI (SDMX Open Data Interchange)
- Generic applications, data sharing model
- IT infrastructure working at Eurostat, supporting SDMX-based data sharing (re-usable for different domains, e.g. for the Census Hub)
- New SODI with a different focus (pull mode, SDMX-ML, data and metadata, new list of domains)

Project report: achievements (3)
Work area A.3: Sectoral networks
- XBRL pilot project task-force completed: tested feasibility of exploiting XBRL reporting for statistics; potential to reduce response burden. Open-source tools on OSOR.
- Review of other sectoral XML standards: other than XBRL, no candidates for further work with real potential to lower response burden.
- MEETS programme in 2010.

Project report: achievements (4)
- B.1: Visualisation: “Business Cycle Clock” for PEEIs (OSS from CIRCA) plus tools for tables and graphics.
- B.2: SDMX web services for the download of Eurostat’s data; toolkit for reference metadata; study on e-services.
- B.3: Large Datasets: analysis, proof-of-concept tool for the dissemination of large datasets; Census Hub pilot to be integrated within the dissemination environment.
- C.1: OSS: “OSS and Statistics” on CIRCA, then OSOR; Guidelines on OSS.

Plans 2009–2010
- SDMX: concentrating on the implementation of SDMX to support Member States and on the improvement of the registry-based SDMX architecture
- SODI: improvement of tools
- Large Datasets: Census Hub, Euro Groups Register (largely re-using X-DIS tools and SODI infrastructure)
- Sectoral standards: MEETS programme
- eServices to retrieve SDMX data from Eurostat’s web site
- OSS: further developments taking place on OSOR

Among the plans for 2009-2010, I would like to stress one point: yesterday, several countries highlighted the need to use common principles and standards (SDMX was mentioned by several of them) to design shared tools and services. This has been the core agenda for X-DIS during all these years. SDMX standards and IT architecture, together with statistical guidelines, are being used for a series of implementations involving ESS countries: European Census Hub (later), SODI, Euro-Groups Register (EGR).

These experiences have highlighted some important points:
- Building the SDMX architecture, from a data producer's point of view, requires the analysis of several factors and the development of complex software modules.
- The exchange of know-how and software between NSIs, encouraged by Eurostat (see Census Hub), has in some cases allowed a much quicker development of the IT infrastructure.
- National data are stored in Member States' repositories, and are described differently from how they would be in the SDMX DSD defined at international level.
- The start-up phase is crucial, because expert knowledge of SDMX standards, XML and related technologies (e.g. Web Services) is not easily available.

SDMX benefits: the NSI perspective
- SDMX can reduce the reporting burden to national, European and international institutions
- The use of SDMX can improve harmonisation, standardisation and integration processes inside an NSI
- International “community” to share experiences and software. Open Source culture
- Eurostat, upon request, provides technical advice to NSIs interested in starting SDMX projects (missions, training)
- Eurostat designed a reference architecture for NSIs and is developing building blocks through its implementations

SDMX can reduce the reporting burden, but standards are not enough without tools, technical advice, exchange of experiences and software, and clear architectures.

A new, on-going action on “SDMX implementation and support in MS” includes the analysis of some national architectures, with particular attention to solutions which are already shared in the statistical community (e.g. PC-AXIS), and an inventory of existing software developed at national level. It also includes proposals on how to integrate this software into the SDMX reference architecture and with the components already developed by Member States which participate in SDMX projects. This action was presented with good success at a technical workshop held in Madrid (22-23 September): documentation is available from CIRCA. This project is also going to provide useful material for the SDMX ESSnet.

Within this framework, the specifications for the so-called “SDMX reference architecture”, together with the definition of single components and the interfaces between building blocks, are being made available on CIRCA.

As you know, there are different possible architectures for data exchange...

This is a “simplified” UML picture of the SDMX reference architecture for Member States (MSs). It represents the synthesis of several experiences worldwide and should be considered not a strict specification but rather a guide or “best practice”. The objective is to describe a generalised architecture that MSs interested in starting SDMX projects can use partially or as a whole.
- Dissemination database: the final data warehouse kept by each NSI for data that can be published to potential Data Consumers.
- Mapping Assistant: module responsible for creating the mappings between an SDMX Data Structure Definition (DSD) and a DB schema (dissemination database) or a set of dissemination data files. It maps the DB schema from the database to the SDMX DSD (“SDMX Structure File” artefact).
- Mapping Store: module responsible for keeping the mappings between the SDMX format and the native format (a file or a DB schema).
- SDMX Structure File: the SDMX-ML DSD required by the “Mapping Assistant” module in order to map its components (i.e. Dimensions, Attributes, Measures) to the dissemination DB columns and tables.
- Data Retriever: module responsible for querying the dissemination database and getting the respective recordset.
- Data Loader: module responsible for loading new data from the NSI’s production environment/database to the dissemination environment/database and for updating the “RSS Generator” module.
- SDMX-ML Data Generator: module responsible, upon receiving the recordset and the respective mappings from the “Data Retriever”, for generating an SDMX-ML Dataset message.
- Web Service Provider: module responsible for exposing the Dataset through a Web Service interface that provides SDMX-ML messages.
- RSS Generator: module responsible for generating a feed entry when new data arrive from the “Data Loader”.
- SDMX Query Parser: module responsible for getting the request from the “Web Service Provider” and populating the internal data model, i.e. the SDMX data model.
(Slide source: Eurostat Unit B5 – Statistical Information Technologies, STNE 24th Meeting, 16-17 June 2009)
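As a rough illustration of how the Mapping Store, Data Retriever and SDMX-ML Data Generator could fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the table, the column names, the mapping contents, the sample values and the function names are hypothetical, and the XML it emits is a simplified stand-in, not a valid SDMX-ML message.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical "Mapping Store" content: DSD component id -> dissemination DB column.
MAPPING = {
    "FREQ": "freq",
    "REF_AREA": "country",
    "TIME_PERIOD": "period",
    "OBS_VALUE": "value",
}


def build_demo_db():
    """Create a tiny in-memory dissemination database (made-up schema and values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_table (freq TEXT, country TEXT, period TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO demo_table VALUES (?, ?, ?, ?)",
        [("A", "IT", "2008", 10.0), ("A", "LU", "2008", 20.5)],
    )
    return conn


def retrieve(conn, table, mapping):
    """Data Retriever: read the mapped columns, return rows keyed by DSD component."""
    columns = ", ".join(mapping.values())
    cursor = conn.execute(f"SELECT {columns} FROM {table} ORDER BY rowid")
    return [dict(zip(mapping.keys(), row)) for row in cursor]


def generate_dataset(rows):
    """Data Generator: serialise rows as simplified XML (illustrative, not SDMX-ML)."""
    dataset = ET.Element("DataSet")
    for row in rows:
        ET.SubElement(dataset, "Obs", {key: str(val) for key, val in row.items()})
    return ET.tostring(dataset, encoding="unicode")
```

Calling `generate_dataset(retrieve(build_demo_db(), "demo_table", MAPPING))` yields one `Obs` element per database row, with attributes named after the DSD components rather than the native column names. Keeping the mapping as data, separate from the retrieval code, mirrors the role the slide gives to the Mapping Store.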

Data Repository (Warehousing) Architecture
[Diagram: data flow between the NSI and Eurostat, with PUSH and PULL paths. Labels: Pull Requestor, eDAMIS, Data Input, SDMX Registry, intermediate storage, verification / conversion to SDMX, received data in SDMX-ML, Loader, register, Warehouse, Eurobase, query, Dissemination, XSL.]

In SDMX, we can have an architecture based on data warehousing, for which we can distinguish a « push » or a « pull » mode.

In the push mode, the data provider takes action to send the data to the organisation collecting the data. This can take place using different means, such as e-mail or file transfer. These are the traditional modes of data collection, carried out by international organisations for many years. Once the file is received, an application in the recipient's systems processes it and uploads the data into a database. The chain in the receiving organisation can be fully automatic, ensuring the best quality of data exchange.

In the pull approach, the data provider makes the data available to users either for download as an SDMX-conformant file, or as the result of a query to a web service linked to a database on the provider's side. More precisely, the organisation that consumes statistical data can, for example, subscribe to an RSS feed and receive in real time the latest links to available data. In another scenario, the consumer organisation's system sends an SDMX-ML query file to the data provider's web service and gets the requested data file. Finally, the data provider can also deliver statistical data files to a shared place accessible to authorised organisations for download.

Note that in both cases, the data are made available to any organisation requiring them, in formats which ensure that data are consistently described by appropriate metadata, whose meaning is common to all parties in the exchange.

The Single Entry Point allows both push and pull methods: eDAMIS is now able to recognise and deliver SDMX-ML files. An SDMX-ML module for representing validation rules was developed and is used by the eDAMIS validation engine.

The pull approach involves the following steps:
Step 1: when new data are available, the NSI should either create an SDMX-ML file containing the new data, or do nothing if the NSI web service (WS) builds SDMX-ML messages upon request.
Step 2: the NSI should add a new feed entry, including an SDMX-ML Query message describing the new dataset, to the NSI feed.
The Pull Requestor reads the new feed entry and either retrieves the SDMX-ML file from the specified URL, if the file resides at a URL, or uses the Query message included in the feed to query the NSI WS, if the data are prepared by the WS.
The Pull Requestor then forwards the SDMX-ML dataset to the rest of the modules within the Eurostat production environment.
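The Pull Requestor's decision between the two retrieval paths can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `FeedEntry` fields and the `fetch_url` / `query_ws` callables are hypothetical names introduced here, and the real feed format and web service interface are not specified in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FeedEntry:
    """One entry of the NSI feed (field names are hypothetical)."""
    dataset_id: str
    file_url: Optional[str] = None        # set when the NSI pre-built an SDMX-ML file
    query_message: Optional[str] = None   # SDMX-ML Query describing the new dataset


def pull(entry: FeedEntry,
         fetch_url: Callable[[str], str],
         query_ws: Callable[[str], str]) -> str:
    """Return the SDMX-ML dataset for one feed entry.

    fetch_url downloads a ready-made file; query_ws sends the Query message
    to the NSI web service. Both are injected so the sketch needs no network.
    """
    if entry.file_url is not None:
        # Case 1: the data reside at a URL, so download the file.
        return fetch_url(entry.file_url)
    if entry.query_message is not None:
        # Case 2: the web service builds the data on request.
        return query_ws(entry.query_message)
    raise ValueError(f"feed entry {entry.dataset_id!r} has neither URL nor query")


def forward(dataset: str, downstream: list) -> None:
    """Hand the dataset to the rest of the production chain (here: a plain list)."""
    downstream.append(dataset)
```

Passing the transport functions in as parameters keeps the branching logic testable on its own; a real deployment would substitute an HTTP download and a web service client for the two callables.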

SDMX Architecture (Hub mode)
SDMX also supports the “Data Hub” concept/architecture, where users obtain data from a central hub which itself automatically assembles the required dataset by querying other data sources. Data providers can notify the hub of new sets of data and corresponding structural metadata (measures, dimensions, code lists, etc.) and make data available directly from their systems through queries. Data users can browse the hub to define a dataset of interest via the above structural metadata and retrieve the desired dataset.

From the data management point of view, the hub is also based on specific datasets, but these, contrary to the database-driven architecture, are not kept locally at the central hub system. The agreed hypercubes are fetched directly from the data producers' databases when a user requests them. SDMX formats and architecture are used.

In the data-sharing context, both architectures reduce the burden of transferring data to multiple counterparties, if a group of partners agree on providing access to their data according to standard processes, formats and technologies.

The following process operates:
1. A user identifies a dataset through the web interface of the central hub using the structural metadata, and requests it.
2. The central hub translates the user request into one or more queries and sends them to the related data providers’ systems.
3. Data providers’ systems process the queries and send the results to the central hub in a standard format.
4. The central hub puts together the results returned by all the data providers’ systems concerned and presents them in a human-readable format.
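The hub process described above amounts to a fan-out of queries followed by a merge of the answers. The Python sketch below illustrates that shape only; the request layout, the provider registry keyed by area code, and the function names are assumptions made for the example, and providers are modelled as plain callables rather than SDMX web services.

```python
from typing import Callable, Dict, List

Observation = Dict[str, str]
Provider = Callable[[Dict[str, str]], List[Observation]]


def translate(request: Dict) -> List[Dict[str, str]]:
    """Turn one user request into one query per requested area."""
    return [{"dataset": request["dataset"], "area": area}
            for area in request["areas"]]


def hub_fetch(request: Dict, providers: Dict[str, Provider]) -> List[Observation]:
    """Send each query to the matching provider and merge the results.

    Nothing is stored at the hub: every call fetches from the providers again.
    """
    merged: List[Observation] = []
    for query in translate(request):
        provider = providers.get(query["area"])
        if provider is None:
            continue  # no registered provider for this area
        merged.extend(provider(query))
    return merged


def render(observations: List[Observation]) -> str:
    """Present the merged result in a simple human-readable form."""
    return "\n".join(f'{o["area"]}  {o["period"]}  {o["value"]}'
                     for o in observations)
```

In this sketch the queries are sent one after another; a production hub would more plausibly query providers concurrently and handle timeouts, which is outside what the slides describe.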

Issues
- Costs and benefits of the implementation
- Comparison of achievements with the expected results
- Sustainability for Eurostat and Member States
- Good progress reached in creating and fine-tuning SDMX standards, tools and reference architecture
- Emphasis on the integration of information systems: SDMX at the core of the harmonisation of the statistical business process.

For more information (all addresses @ec.europa.eu):
Bengt-Åke.Lindblad, Francesco.Rizzo, Krassimir.Ivanov, Marco.Pellegrino, Michel.Henrard
Unit B5, Section « Standardisation and advanced IT for statistics »