5 June 2013 SDMX Technical Working Group Luxembourg WP Item 6 The Expressions Language of Banca d’Italia (EXL)

Presentation transcript:

5 June 2013 SDMX Technical Working Group Luxembourg
WP Item 6 The Expressions Language of Banca d’Italia (EXL)

History
Mid-nineties: Banca d’Italia designed a language for validations and calculations
2009: A new version of EXL was released as part of the new Infostat software platform, containing the operators needed for validation and basic calculation
Ongoing: progressive upgrade of EXL to support data compilation, for example:
– Operators for time series manipulation
– Operators for data analysis
– Operators’ syntax upgrade

Basic example of validation rule
Two collected data sets:
C1: Loans - Date, Entity, Sector, Amount
C2: Loans - Date, Entity, Geo_Area, Amount
Check rule: C1 and C2 should be equal when aggregated on their common dimensions (to within a small amount)
EXPRESSIONS:
C3 = get ( C1, keep (DATE, ENTITY, AMOUNT), sum (AMOUNT))
C4 = get ( C2, keep (DATE, ENTITY, AMOUNT), sum (AMOUNT))
C5 = check ( C3 - C4 <= given_threshold )
EXL operators:
Check () special operator for checks
- subtract the multidimensional data
<= comparison operator
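The rule above can be imitated outside EXL to see what it computes. The following is a minimal Python sketch, not EXL itself: the helper names `get` and `check`, the sample rows and the threshold value are all assumptions made for illustration. It follows the slide literally, so the comparison is one-sided (`C3 - C4 <= threshold`) and is evaluated only on keys present in both operands.

```python
from collections import defaultdict

def get(rows, keep, sum_measure):
    """Project rows onto the kept dimensions and sum the measure (hypothetical helper)."""
    dims = [d for d in keep if d != sum_measure]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[sum_measure]
    return dict(totals)

def check(left, right, threshold):
    """Evaluate left - right <= threshold for every key the two operands share."""
    return {k: (left[k] - right[k]) <= threshold for k in left.keys() & right.keys()}

# Hypothetical sample data: loans by sector (C1) and by geographical area (C2).
C1 = [
    {"DATE": "2013-03", "ENTITY": "B1", "SECTOR": "S1", "AMOUNT": 60.0},
    {"DATE": "2013-03", "ENTITY": "B1", "SECTOR": "S2", "AMOUNT": 40.0},
    {"DATE": "2013-03", "ENTITY": "B2", "SECTOR": "S1", "AMOUNT": 75.0},
]
C2 = [
    {"DATE": "2013-03", "ENTITY": "B1", "GEO_AREA": "IT", "AMOUNT": 70.0},
    {"DATE": "2013-03", "ENTITY": "B1", "GEO_AREA": "FR", "AMOUNT": 30.0},
    {"DATE": "2013-03", "ENTITY": "B2", "GEO_AREA": "IT", "AMOUNT": 50.0},
]

C3 = get(C1, keep=("DATE", "ENTITY", "AMOUNT"), sum_measure="AMOUNT")
C4 = get(C2, keep=("DATE", "ENTITY", "AMOUNT"), sum_measure="AMOUNT")
C5 = check(C3, C4, threshold=1.0)  # threshold value is an assumption

for key, passed in sorted(C5.items()):
    print(key, "OK" if passed else "FAILED")
```

In this sample, entity B1 reports the same total in both breakdowns and passes, while B2 differs by 25 and fails.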

Example of sum
Two collected data sets:
C1 (Current Accounts): Date, Entity, Sector, Amount
C2 (Mortgages): Date, Entity, Geo_Area, Amount
The desired result is Loans (= Current Accounts + Mortgages):
C5 (Loans): Date, Entity, Amount
EXPRESSIONS:
C3 = get ( C1, keep (DATE, ENTITY, AMOUNT), sum (AMOUNT))
C4 = get ( C2, keep (DATE, ENTITY, AMOUNT), sum (AMOUNT))
C5 = C3 + C4
EXL operators:
Get () read the specified data
Keep () keep the specified dimensions
Sum () sum the specified measure (if quantitative)
+ sum the multidimensional data
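As a rough illustration of what `+` means on multidimensional data, the Python sketch below adds two aggregated data sets key by key. It is not EXL: `aggregate` and `add_cubes` are hypothetical helpers, the figures are invented, and the choice to sum only keys present in both operands is an assumption, since the slides do not say how EXL treats keys missing from one side.

```python
from collections import defaultdict

def aggregate(rows, dims, measure):
    """Sum `measure` over all rows sharing the same values for `dims`."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)

def add_cubes(left, right):
    """Key-wise sum of two data sets; keys missing on either side are dropped (assumption)."""
    return {k: left[k] + right[k] for k in left.keys() & right.keys()}

# Hypothetical sample data.
current_accounts = [
    {"DATE": "2013-03", "ENTITY": "B1", "SECTOR": "S1", "AMOUNT": 10.0},
    {"DATE": "2013-03", "ENTITY": "B1", "SECTOR": "S2", "AMOUNT": 5.0},
    {"DATE": "2013-03", "ENTITY": "B2", "SECTOR": "S1", "AMOUNT": 20.0},
]
mortgages = [
    {"DATE": "2013-03", "ENTITY": "B1", "GEO_AREA": "IT", "AMOUNT": 100.0},
    {"DATE": "2013-03", "ENTITY": "B2", "GEO_AREA": "IT", "AMOUNT": 200.0},
    {"DATE": "2013-03", "ENTITY": "B2", "GEO_AREA": "FR", "AMOUNT": 50.0},
]

C3 = aggregate(current_accounts, ("DATE", "ENTITY"), "AMOUNT")
C4 = aggregate(mortgages, ("DATE", "ENTITY"), "AMOUNT")
loans = add_cubes(C3, C4)  # plays the role of C5 = C3 + C4
print(loans)
```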


Validation
Formal (Structural)
– Assurance that the formal structure of the data observations matches the Data Structure Definition, in terms of concepts, their roles and their admissible values; formal validation is not treated as a calculation and is not defined through an expression.
Of the Information Content (Plausibility)
– Assurance that the data content gives correct information about the real world (as far as possible); to this end, it is possible to use a-priori information about the real world and the possible redundancies in the data (e.g. integrity rules, coherence rules, plausibility rules); this kind of validation is normally performed through calculations.
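To make the formal (structural) kind concrete, here is a small Python sketch that checks one observation against a toy Data Structure Definition. The `DSD` layout, the concept names and the code lists are invented for illustration and do not reproduce the SDMX structure format or any Banca d’Italia tool.

```python
# Hypothetical DSD: each concept has a role and, optionally, a set of admissible values.
DSD = {
    "DATE": {"role": "dimension", "values": None},
    "ENTITY": {"role": "dimension", "values": {"B1", "B2"}},
    "SECTOR": {"role": "dimension", "values": {"S1", "S2"}},
    "AMOUNT": {"role": "measure", "values": None},
}

def structural_errors(observation, dsd):
    """Return a list of structural problems; an empty list means the observation conforms."""
    errors = [f"missing concept: {c}" for c in dsd if c not in observation]
    for concept, value in observation.items():
        if concept not in dsd:
            errors.append(f"unknown concept: {concept}")
            continue
        allowed = dsd[concept]["values"]
        if allowed is not None and value not in allowed:
            errors.append(f"inadmissible value for {concept}: {value}")
        if dsd[concept]["role"] == "measure" and not isinstance(value, (int, float)):
            errors.append(f"non-numeric measure {concept}: {value!r}")
    return errors

good = {"DATE": "2013-03", "ENTITY": "B1", "SECTOR": "S1", "AMOUNT": 60.0}
bad = {"DATE": "2013-03", "ENTITY": "B9", "SECTOR": "S1"}
print(structural_errors(good, DSD))
print(structural_errors(bad, DSD))
```

Plausibility rules, by contrast, compare data with other data, which is why they are naturally written as calculations.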

Validations as calculations
Use of the same language as the calculations
Validations possible in any phase of the process
Results of the validations, like any other data:
– are defined and stored
– can be queried and disseminated
– can be further processed

SDMX Compliance
SDMX versions 2.0 and 2.1 already envisaged the introduction of a standard language for validations and calculations.
Package 13 of SDMX 2.1 (Transformations and Expressions) is a generic model for tracking the validation and calculation of data, derived from the CWM (Common Warehouse Metamodel), an OMG (Object Management Group) standard.
However, this model is not operational in itself, because it requires a language in which to specify the validation and calculation expressions.
EXL is designed in accordance with SDMX package 13 (Transformations and Expressions).

SDMX IM – Package 13

Transformations: internal view
Einstein equation: E = MC² → E = M*(C**2)
Expression node: Result: E
Operator node: * with operands M (reference node) and the operator node **
Operator node: ** with operands C (reference node) and 2 (constant node)
Node types: constant node, reference nodes, operator nodes, expression nodes (0..*)
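The internal view can be pictured as a small tree of typed nodes. The Python sketch below builds that tree for E = M*(C**2) and walks it twice, once to evaluate and once to print the user-facing text. The class names mirror the node types named on the slide, but their fields and the `evaluate` and `render` helpers are assumptions for illustration, not the SDMX package 13 classes.

```python
import operator
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Constant:
    value: float

@dataclass(frozen=True)
class Reference:
    name: str  # refers to an artefact of the model, e.g. a data set

@dataclass(frozen=True)
class Operator:
    symbol: str
    left: "Node"
    right: "Node"

Node = Union[Constant, Reference, Operator]

@dataclass(frozen=True)
class Expression:
    result: str
    root: Node

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "**": operator.pow}

def evaluate(node, env):
    """Evaluate a node tree, resolving references from `env`."""
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Reference):
        return env[node.name]
    return OPS[node.symbol](evaluate(node.left, env), evaluate(node.right, env))

def render(node):
    """Rebuild the user-view text, parenthesising nested operator nodes."""
    if isinstance(node, Constant):
        return str(node.value)
    if isinstance(node, Reference):
        return node.name
    parts = [f"({render(c)})" if isinstance(c, Operator) else render(c)
             for c in (node.left, node.right)]
    return f"{parts[0]}{node.symbol}{parts[1]}"

einstein = Expression("E", Operator("*", Reference("M"),
                                    Operator("**", Reference("C"), Constant(2))))
print(f"{einstein.result} = {render(einstein.root)}")
print(evaluate(einstein.root, {"M": 2, "C": 3}))
```

Because an operator node can itself be an operand, nesting to any depth needs no extra node type.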

Transformations: user view
Einstein equation: E = MC² → E = M*(C**2)
Expression: E = M*(C**2)
Result: E
Operands: M, C, 2

Notes on Transformations
The operands may be:
– Artefacts of the model (e.g. Statistical Data)
– Constants
– Operator nodes
The property of “Closure”:
– The result is an artefact of the model (e.g. Statistical Data)
– The result may be an operand of other calculations

Graph of the calculations
[Figure: graph of nodes labelled C and T, grouped under External Institutions, Banks & OFI’s reports, C.C.R., Supervision models, Economic research models, Statistical bulletin and Statistical products.]
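One practical consequence of arranging calculations in a graph is that an execution order can be derived from the dependencies. The Python sketch below assumes that C nodes are data sets and T nodes are transformations; the three transformations and their inputs and outputs are hypothetical, since the edges of the original figure are not recoverable from the transcript.

```python
from graphlib import TopologicalSorter

# Hypothetical transformations: name -> (input data sets, output data set).
transformations = {
    "T3": (["C3", "C4"], "C5"),
    "T1": (["C1"], "C3"),
    "T2": (["C2"], "C4"),
}

def execution_order(transformations):
    """Order transformations so that each one runs after the producers of its inputs."""
    producer = {out: name for name, (_, out) in transformations.items()}
    graph = {
        name: {producer[c] for c in inputs if c in producer}
        for name, (inputs, _) in transformations.items()
    }
    return list(TopologicalSorter(graph).static_order())

order = execution_order(transformations)
print(order)
```

`TopologicalSorter` raises `CycleError` if the definitions are circular, which is one way to detect an inconsistent set of expressions before running anything.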

Software Tools
– Dictionary: a database containing all the definitions
– Warehouse: the set of data archives containing the data, logically unique but physically heterogeneous and distributed
– Tool for the administration of the metadata (create, modify, etc.), including the expressions for calculations and controls (this package is built in-house)
– Tool for validating the syntax and consistency of the expressions and for translating them into the language of the calculation tools (based on the open-source ANTLR, under the control of software built in-house)
– Execution of the expressions: the calculation engine of the software platform, based on a software layer developed in-house that interfaces with and controls the calculation software, which can vary; currently the open-source R, SQL, and some software built in-house and optimized for specific purposes are used

Allowed Data
EXL is applied to any kind of data of interest in the Bank of Italy statistical environment, such as:
– Dimensional data, including as particular cases time series and cross sections
– Questionnaires
– Registers
The Bank of Italy is gradually extending the use of EXL to the whole statistical information system to support its industrial processing.

End to End processing
Use case: production of the information for the ECB concerning the balance sheet of the monetary and financial institutions (MFI) sector
Collect: data on securities from MFIs on a security-by-security basis; securities register data
Check: structural and integrity checks
Process:
1. Data are integrated with information relevant to the collected security codes
2. Missing observations are estimated
3. Data are aggregated
Disseminate: dataflow to ECB
Phases shown in the diagram: Design, Build, Collect, Check, Process, Disseminate

Some other characteristics
Formal: expressed in Backus-Naur form
Deals with historicity:
– Takes into account the time validity of the artefacts
– Allows defining changes of the algorithm with reference to time
May deal with:
– Mono- and multi-measure data
– Data attributes having a definable behaviour
– Operands having different dimensionality
– Subsets of dimensional cubes
– Implicit / explicit zeros
Allows:
– Persistent and non-persistent results
– Expressions as operands of other expressions
– Invocation of external routines

Operators used in the validation (1)
– Data retrieval / storage (get, put)
– Projection (drop, keep …)
– Filter (=, <, <=, >, >=, <>, like, between …)
– Aggregation (sum, avg, min, max, first, last …)
– Other manipulators of the data structure (rename, calc)
– Join (merge)
– Algebraic and string manipulation (+, -, *, /)
– Comparison (=, <, <=, >, >=, <>)

Operators used in the validation (2)
– Logical (and, or, not)
– Tailored for validation:
  – Check of a generic condition
  – Existence and referential integrity check
  – Completeness check
  – Imbalance
  – Error severity level
– Conditional execution (case)
– Currency conversion
– Date-time (year, month, day, time shift)
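Two of the validation-specific checks listed above can be sketched in a few lines of Python. The slides give neither the EXL operator names nor their exact semantics, so the function names, the reading of each check and the sample data below are all assumptions: referential integrity is taken to mean "every reported code exists in a reference list", and completeness to mean "every expected reporter has sent data".

```python
def referential_integrity(rows, dimension, code_list):
    """Return the rows whose value for `dimension` is absent from the reference code list."""
    return [row for row in rows if row[dimension] not in code_list]

def completeness(rows, dimension, expected):
    """Return the expected codes for which no row was reported."""
    reported = {row[dimension] for row in rows}
    return sorted(set(expected) - reported)

# Hypothetical report and reference lists.
report = [
    {"ENTITY": "B1", "SECTOR": "S1", "AMOUNT": 60.0},
    {"ENTITY": "B2", "SECTOR": "S9", "AMOUNT": 75.0},
]
valid_sectors = {"S1", "S2"}
expected_entities = ["B1", "B2", "B3"]

orphans = referential_integrity(report, "SECTOR", valid_sectors)
missing = completeness(report, "ENTITY", expected_entities)
print(orphans)
print(missing)
```

Both functions return data rather than a pass/fail flag, in line with the idea that validation results are stored and processed like any other data.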

WP Item 6 Expressions and Calculations