CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML - TAXATION METADATA AS AN EXAMPLE Ottawa, 16-18 May 2005.

1 CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML - TAXATION METADATA AS AN EXAMPLE
Ottawa, 16-18 May 2005
Heikki Rouhuvirta, Statistical Methodology R&D, heikki.rouhuvirta@stat.fi

2 Contents (slide footer: 2.5.2005, Heikki Rouhuvirta)
- Background
- The Challenge
- Primary Questions
- Test Case – Finnish Taxation
- Data Semantics of Register Data
- Taxation Metadata Definition
- Some Results
- The Future
- Some Practical Steps on the Way

3 Background
- Present state of compilation of administrative data as the challenge
- CoSSI as the methodological framework for data semantics of registers
- Codacmos as the organizational base for concept testing

4 Present state of compilation of administrative data
[Diagram "Data Combining". Labels, in approximate flow order: Administrative Data Source (operational systems or data warehouses, e.g. RDB); data gathering (e.g. SQL); transmission file (sequential/flat file); data communication network (internet, WAN; FTP + VPN) or physical media (CD-ROM, magnetic tape); Destination NSI; tailor-made programs or ETL products (e.g. Informatica, Oracle) for data extraction/transformation/loading; data stores (flat file, relational DB); Statistical Register Data; Survey Data; statistician; Statistical Application; Statistical Information; Handbook of Taxation etc.]

5 CoSSI
Common Structure of Statistical Information – CoSSI
- covers different ways of statistical data organization (statistical data matrix and statistical table)
- includes a model to define contentual information in statistics
- includes a model to define the methodology used in statistics (e.g. measuring and classification)
- manages the complexity of statistical information (e.g. nested variables structure)
- includes definitions for all types of statistical information: data, metadata for files, statistical metadata, quality declarations, charts
- the main objective was to organise statistical data so that they also contain statistical metadata (describing both the structure and logic of statistical metadata at the same time)
Definition descriptions available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/cossi_definition_descriptions_v_09_2003.pdf
On statistical metadata, see also: http://www.stat.fi/org/tut/dthemes/papers/alternative_approach_to_metadata_codacmos_2004.pdf

6 Codacmos
Cluster of Data Collection Integration & Metadata Systems for Official Statistics
EU Project 2003-2004 (IST-2001-38636)
Consortium: Italian National Statistical Institute, Statistics Finland, University Of Edinburgh, National Statistical Service of Greece, DESAN Research Solutions, Statistical Division Of Municipality Of Milan, The Finnish Tax Administration, University Of Patras, Institute Of Informatics And Statistics, University Of Athens, National Social Security Institute, Tietokarhu Ltd, Statistics Norway
http://www.codacmos.eu.org
TAXATION METADATA partners: Statistics Finland, The Finnish Tax Administration and Tietokarhu Ltd

7 The Challenge
At present, the description of administrative data can mostly be read only from the authorities' administrative handbooks. How can this process be transformed so that the contentual description of the data is present and usable both for statistics producers in the production process and for users of statistics when statistical information is distributed?

8 Primary Questions
- What are the metadata of administrative data?
- How should the metadata that specify the interpretation and use of administrative data collections and register data be processed?
- How can the original data description (e.g. concept definitions of register fields) be combined with the variable descriptions and measurement information of statistics?
- Can accumulating interpretive metadata be "transported" through the processing of information and, if so, how?

9 Test Case – Finnish Taxation (Finnish taxation on the web at: http://www.vero.fi)

10 Taxation: Types and Sources of income

11 Income tax deductions

12 Data Semantics of Register Data
- Modelling methodology: the starting point is to distinguish between the substance concept model and the information model with which the concepts are described
- Information organizing method: any that does not lose information
- Technology: any, without restrictions
- Result: Taxation metadata definition (taxmeta.dtd)

13 Basic Substance Concept
- Tax type: e.g. personal taxation
- Type of income: e.g. earned income, capital income
- Income: e.g. salary, pension
- Type of tax deduction
- Deduction
(figure labels: A, B)

14 Description Information
- Internal instruction: instruction on a specific income and deduction area
- Law: reference to a section of law
- Law case: reference to a law case
- Formula: how the tax is calculated
Described concepts: Income (e.g. salary, pension), Deduction
(figure labels: 1, 2, 3)
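The distinction between substance concepts (slide 13) and the description information attached to them (slide 14) could be sketched roughly as follows. This is only an illustration: the class names, the element names and the nesting tax type > type of income > income are assumptions made for the example and are not taken from taxmeta.dtd, and the description texts are placeholders.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Description:
    """Information model: one piece of description attached to a concept."""
    kind: str  # hypothetical element name: internal_instruction, law, law_case, formula
    text: str


@dataclass
class Concept:
    """Substance concept model: a concept at some level of the hierarchy."""
    level: str  # hypothetical element name: tax_type, type_of_income, income, ...
    name: str
    descriptions: list = field(default_factory=list)
    children: list = field(default_factory=list)


def to_xml(concept):
    """Serialise a concept, its descriptions and its sub-concepts as XML."""
    elem = ET.Element(concept.level, {"name": concept.name})
    for d in concept.descriptions:
        ET.SubElement(elem, d.kind).text = d.text
    for child in concept.children:
        elem.append(to_xml(child))
    return elem


# Placeholder content following the examples named on the slides.
salary = Concept(
    "income", "salary",
    descriptions=[
        Description("law", "reference to a section of law (placeholder)"),
        Description("formula", "how the tax is calculated (placeholder)"),
    ],
)
earned = Concept("type_of_income", "earned income", children=[salary])
personal = Concept("tax_type", "personal taxation", children=[earned])

root = to_xml(personal)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping the two classes separate means a description (law, law case, formula, internal instruction) can be added to or removed from a concept without touching the concept hierarchy itself.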

15 Taxation Metadata Definition (taxmeta.dtd)
Available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/taxmeta_dtd_v_01.txt

16 Taxation Metadata - Logical Concept Model (I)

17 Taxation Metadata - Logical Concept Model (II)

18 … result from register standpoint
Demonstration Report is available on the web at: http://www.stat.fi/org/tut/dthemes/papers/demoreport_on_taxation_metadata_codacmos_2004.pdf

19 Taxation register view
[Screenshot: taxpayer's tax register record shown in a structure view and a metadata view. Callouts: tax type code used in the register; value in euro; plain-language code (derived or column name); metadata.]
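The two views of a taxpayer's register record on slide 19 can be approximated with a small lookup. This is a sketch under stated assumptions: the tax type codes ("T001" etc.), the field names and the definition texts are invented for illustration and do not come from the Finnish tax register or from taxmeta.dtd.

```python
# Hypothetical register metadata keyed by tax type code (placeholder values).
REGISTER_METADATA = {
    "T001": {"plain_name": "salary", "definition": "placeholder definition"},
    "T002": {"plain_name": "pension", "definition": "placeholder definition"},
}


def structure_view(record):
    """Structure view: only the code used in the register and the value in euro."""
    return [{"code": code, "value_eur": value} for code, value in record]


def metadata_view(record, metadata):
    """Metadata view: each value joined with its plain-language name and definition."""
    rows = []
    for code, value in record:
        meta = metadata.get(code, {})  # unknown codes keep their value, without metadata
        rows.append({
            "code": code,
            "value_eur": value,
            "plain_name": meta.get("plain_name"),
            "definition": meta.get("definition"),
        })
    return rows


# A made-up taxpayer record: (tax type code, value in euro).
taxpayer_record = [("T001", 25000.0), ("T002", 1200.0), ("T999", 10.0)]

view = metadata_view(taxpayer_record, REGISTER_METADATA)
for row in view:
    print(row)
```

A code with no metadata entry is kept in the view with empty description fields, so that no register data is dropped when the description is missing.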

20 … and result from statistics standpoint

21 Income distribution statistics – statistical metadata

22 Income distribution statistics – taxation register metadata (I)
[Figure callouts: statistical metadata; register metadata]

23 Income distribution statistics – taxation register metadata (II)
[Figure callouts: statistical metadata; register metadata]

24 The Future
Could it be …
- integrated register metadata
- a genuinely metadata-driven statistical production process
- rich metadata present and available in all production stages, including editing and the transformation of register concepts into statistical concepts
- metadata that accumulates as the process advances, without losing old metadata
- rich metadata also available to users during the dissemination of statistical information
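The idea on slide 24 that metadata accumulates as the process advances without losing old metadata can be illustrated with an append-only metadata trail. This is a minimal sketch, not the production system described in the talk: the class and function names, the editing rule and the variable names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    stage: str  # production stage that added the entry
    note: str   # what the stage says about the variable


@dataclass(frozen=True)
class Variable:
    name: str
    values: tuple
    metadata: tuple = ()  # append-only trail of MetadataEntry objects


def apply_stage(var, stage, note, transform=None, new_name=None):
    """Return a new Variable; earlier metadata is carried along, never replaced."""
    values = tuple(transform(v) for v in var.values) if transform else var.values
    return Variable(
        name=new_name or var.name,
        values=values,
        metadata=var.metadata + (MetadataEntry(stage, note),),
    )


# Register variable with its original register description (placeholder text).
register_var = Variable(
    "salary",
    (25000.0, -5.0, 31000.0),
    (MetadataEntry("register", "register concept definition (placeholder)"),),
)

# Editing stage: an illustrative rule, not one taken from the source.
edited = apply_stage(
    register_var, "editing", "negative values set to 0 (illustrative rule)",
    transform=lambda v: max(v, 0.0),
)

# Conceptual formation: the register concept becomes a statistical variable.
statistical = apply_stage(
    edited, "conceptual formation",
    "mapped to a statistical concept (illustrative)", new_name="wage_income",
)

for entry in statistical.metadata:
    print(entry.stage, "-", entry.note)
```

Because each stage returns a new immutable object, the register-level description is still attached to the final statistical variable and could be shown to users at dissemination.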

25 XML based metadata-driven statistical production
[Diagram. Labels, in approximate flow order: RDB and XML DB; handbook of register; register metadata (XML); questionnaires (XML); collection routines; data gathering; transaction-based data storage; units-based data report with metadata; first aggregation; data transmission; units- and variable-based data organisation; data combining; collected data matrix based on CoSSI (an n × p matrix of values x_ij, rows = statistical units, columns = variables); statistical metadata based on CoSSI; combined data; XML-based production system; data editing (checked values, conceptual formation, new variables, new metadata).]

26 Some Practical Steps on the Way
- Plan to apply this scheme to the metadata of other registers (e.g. population register)
- Integration of the structured statistical metadata system with statistical software packages (e.g. SAS, SuperStar) for simultaneous use

