CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML - TAXATION METADATA AS AN EXAMPLE Ottawa, 16-18 May 2005.

Presentation transcript:

CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML - TAXATION METADATA AS AN EXAMPLE Ottawa, May 2005 Heikki Rouhuvirta, Statistical Methodology R&D

Contents
- Background
- The Challenge
- Primary Questions
- Test Case – Finnish Taxation
- Data Semantics of Register Data
- Taxation Metadata Definition
- Some Results
- The Future
- Some Practical Steps on the Way

Background
- Present state of compilation of administrative data as the challenge
- CoSSI as the methodological framework for data semantics of registers
- Codacmos as the organizational base for concept testing

Present state of compilation of administrative data
[Diagram; recoverable labels only]
- Administrative data source: operational systems or data warehouses; data source (e.g. RDB); data gathering (e.g. SQL); transmission file (sequential/flat file)
- Transfer: data communication network (internet, WAN); FTP (+ VPN); physical media (CD-ROM, magnetic tape)
- Destination NSI: data store (sequential/flat file); data extraction/transformation/loading with tailor-made programs or ETL products (e.g. Informatica, Oracle); relational DB; statistical register
- Data combining: statistical register data; survey data; statistical application; statistical information
- Statistician; Handbook of Taxation etc.

CoSSI
Common Structure of Statistical Information (CoSSI):
- covers different ways of organizing statistical data (statistical data matrix and statistical table)
- includes a model for defining the content information of statistics
- includes a model for defining the methodology used in statistics (e.g. measuring and classification)
- manages the complexity of statistical information (e.g. nested variable structures)
- includes definitions for all types of statistical information: data, metadata for files, statistical metadata, quality declarations, charts
- main objective: to organise statistical data so that they also contain statistical metadata (describing both the structure and the logic of statistical metadata at the same time)
Definition descriptions available on the web at:
Statistical metadata, see also on the web at:

Codacmos
Cluster of Data Collection Integration & Metadata Systems for Official Statistics
EU Project (IST )
Consortium: Italian National Statistical Institute, Statistics Finland, University of Edinburgh, National Statistical Service of Greece, DESAN Research Solutions, Statistical Division of Municipality of Milan, The Finnish Tax Administration, University of Patras, Institute of Informatics and Statistics, University of Athens, National Social Security Institute, Tietokarhu Ltd, Statistics Norway
TAXATION METADATA partners: Statistics Finland, The Finnish Tax Administration and Tietokarhu Ltd

The Challenge
How can the present process, in which the description of administrative data is mostly read from the authorities' administrative handbooks, be transformed so that the content description of the data is usable and present both for statistics producers in the production process and for users of statistics in the distribution of statistical information?

Primary Questions
- What are the metadata of administrative data?
- How should the metadata that specify the interpretation and use of administrative data collections and register data be processed?
- How can the original data description (e.g. concept definitions of register fields) be combined with the variable descriptions and measurement information of statistics?
- Can accumulating interpretive metadata be "transported" through the processing of information and, if so, how?

Test Case – Finnish Taxation
Finnish taxation on the web at:

Taxation: Types and Sources of income

Income tax deductions

Data Semantics of Register Data
- Modelling methodology: the starting point is to distinguish between the substance concept model and the information model by which the concepts are described
- Information organizing method: any that does not lose information
- Technology: any, without restrictions
- Result: Taxation metadata definition (taxmeta.dtd)

Basic Substance Concept
- Tax type: e.g. personal taxation
- Type of income: e.g. earned income, capital income
- Income: e.g. salary, pension
- Type of tax deduction
- Deduction
[Diagram labels: A), B)]

Description Information
- Internal instruction: instruction on a specific income and deduction area
- Law: reference to a section of law
- Law case: reference to a law case
- Formula: how the tax is calculated
Concepts shown: Income (e.g. salary, pension); Deduction
[Diagram labels: 1), 2), 3)]
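
As an illustration of how the substance concepts and their description information might be held together in one XML structure, here is a minimal Python sketch. It is not the taxmeta.dtd vocabulary: the element and attribute names (`taxtype`, `incometype`, `income`, `instruction`, `law`, `lawcase`, `formula`) are hypothetical, the texts are placeholders, and attaching all four description items to a single income concept is an assumption made for the example.

```python
import xml.etree.ElementTree as ET


def build_income_concept() -> ET.Element:
    """Build a hypothetical concept tree: tax type > type of income > income."""
    # Hypothetical element names; NOT taken from taxmeta.dtd.
    tax_type = ET.Element("taxtype", {"name": "Personal taxation"})
    income_type = ET.SubElement(tax_type, "incometype", {"name": "Earned income"})
    income = ET.SubElement(income_type, "income", {"name": "Salary"})

    # Description information attached to the concept it describes
    # (placeholder content only).
    ET.SubElement(income, "instruction").text = "placeholder: internal instruction"
    ET.SubElement(income, "law", {"ref": "placeholder: section of law"})
    ET.SubElement(income, "lawcase", {"ref": "placeholder: law case"})
    ET.SubElement(income, "formula").text = "placeholder: how the tax is calculated"
    return tax_type


root = build_income_concept()
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping the description elements as children of the concept they describe means the explanation travels with the concept wherever the XML fragment is copied.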

Taxation Metadata Definition (taxmeta.dtd)
Available on the web at:

Taxation Metadata - Logical Concept Model (I)

Taxation Metadata - Logical Concept Model (II)

… result from register standpoint
Demonstration Report is available on the web at: demoreport_on_taxation_metadata_codacmos_2004.pdf

Taxation register view
[Screenshot; recoverable labels only]
- Structure view
- Metadata view
- Taxpayer's tax register record
- Metadata
- Tax type code used in the register
- Value in euro
- Plain-language code (derived or column name)
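
The register view pairs each coded field of a taxpayer's record with its metadata. A minimal Python sketch of that pairing follows. The codes (`T001`, `T002`, `T999`), the values and the definitions are invented for illustration and do not come from the Finnish tax register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    code: str        # tax type code used in the register
    plain_name: str  # plain-language code (derived or column name)
    definition: str  # concept description carried with the data


# Hypothetical codes and descriptions, for illustration only.
METADATA = {
    "T001": FieldMetadata("T001", "Salary", "placeholder definition"),
    "T002": FieldMetadata("T002", "Pension", "placeholder definition"),
}


def metadata_view(record: dict[str, float]) -> list[tuple[str, float, str]]:
    """Return (plain-language name, value in euro, code) for each register field."""
    rows = []
    for code, value in record.items():
        meta = METADATA.get(code)
        # Fall back to the raw code when no metadata is registered for it.
        name = meta.plain_name if meta else code
        rows.append((name, value, code))
    return rows


# A made-up taxpayer record keyed by tax type code.
taxpayer_record = {"T001": 32000.0, "T002": 0.0, "T999": 150.0}
view = metadata_view(taxpayer_record)
for name, value, code in view:
    print(f"{name:10s} {value:10.2f} EUR  ({code})")
```

The fallback to the raw code makes gaps in the metadata visible in the output instead of silently dropping the field.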

… and result from statistics standpoint

Income distribution statistics – statistical metadata

Income distribution statistics – taxation register metadata (I)
[Figure labels: statistical metadata, register metadata]

Income distribution statistics – taxation register metadata (II)
[Figure labels: statistical metadata, register metadata]

The Future
Could it be ...
- integrated register metadata
- a genuinely metadata-driven statistical production process
- rich metadata present and available in all production stages, including editing as well as the transformation of register concepts into statistical concepts
- metadata that accumulates as the process advances, without losing old metadata
- rich metadata also available to users during the dissemination of statistical information
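
The idea that metadata accumulates as the process advances without losing old metadata can be pictured as an append-only history attached to each variable. The Python sketch below is one possible reading under that assumption. The class names, stage names and texts are hypothetical and are not part of CoSSI or taxmeta.dtd.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataEntry:
    stage: str  # production stage that added the entry
    text: str   # the description added at that stage


@dataclass
class Variable:
    name: str
    history: list[MetadataEntry] = field(default_factory=list)

    def annotate(self, stage: str, text: str) -> None:
        """Add metadata; earlier entries are never modified or removed."""
        self.history.append(MetadataEntry(stage, text))

    def derive(self, new_name: str, stage: str, text: str) -> "Variable":
        """Form a new variable that inherits a copy of all earlier metadata."""
        derived = Variable(new_name, list(self.history))
        derived.annotate(stage, text)
        return derived


# Hypothetical walk through three production stages.
reg_var = Variable("salary")
reg_var.annotate("register", "placeholder: register concept definition")
reg_var.annotate("editing", "placeholder: checks applied to the values")
stat_var = reg_var.derive(
    "earned_income", "conceptual formation", "placeholder: statistical concept"
)

for entry in stat_var.history:
    print(entry.stage, "->", entry.text)
```

Because `derive` copies the history, the statistical variable still carries the register-level description, while the register variable itself is left unchanged.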

XML based metadata-driven statistical production
[Diagram; recoverable labels only]
- Sources: RDB; XMLDB; Handbook of Register; Register Metadata (xml); Questionnaires (xml)
- Data gathering: collection routines; transaction based data storage; units based data report with meta; 1° aggregation; data transmission
- Data organisation: units and variable based data organisation; data combining; collected data matrix based on CoSSI (rows: statistical units a_1 … a_i … a_n; columns: variables x_1 … x_j … x_p; cells x_ij)
- XML based production system: combined data; statistical metadata based on CoSSI; data editing (checked values); conceptual formation (new variables, new metadata)
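
The collected data matrix in the diagram has statistical units as rows and variables as columns, with metadata attached to the variables. A minimal Python sketch of such a structure is given below. It does not reproduce the CoSSI definition: the class, the field names and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataMatrix:
    units: list[str]           # statistical units a_1 ... a_n (rows)
    variables: list[str]       # variables x_1 ... x_p (columns)
    values: list[list[float]]  # values[i][j] holds x_ij
    metadata: dict[str, str]   # variable name -> description

    def __post_init__(self) -> None:
        # Reject matrices whose shape does not match the declared axes.
        if len(self.values) != len(self.units):
            raise ValueError("one row per statistical unit is required")
        if any(len(row) != len(self.variables) for row in self.values):
            raise ValueError("one value per variable is required in every row")

    def value(self, unit: str, variable: str) -> float:
        """Look up x_ij by unit and variable name."""
        i = self.units.index(unit)
        j = self.variables.index(variable)
        return self.values[i][j]

    def column(self, variable: str) -> list[float]:
        """Return all values of one variable, in unit order."""
        j = self.variables.index(variable)
        return [row[j] for row in self.values]


# Made-up units, variables and values.
matrix = DataMatrix(
    units=["a1", "a2"],
    variables=["salary", "pension"],
    values=[[32000.0, 0.0], [0.0, 14000.0]],
    metadata={"salary": "placeholder description", "pension": "placeholder description"},
)
print(matrix.value("a2", "pension"), matrix.column("salary"))
```

Holding the variable descriptions in the same object as the values is one simple way to keep data and metadata together as the matrix moves through later stages.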

Some Practical Steps on the Way
- Plan to apply this scheme to the metadata of other registers (e.g. population register)
- Integration of the structured statistical metadata system with statistical software packages (e.g. SAS, SuperStar) for simultaneous use