1 USING THE METADATA IN STATISTICAL PROCESSING CYCLE – THE PRODUCTION TOOLS PERSPECTIVE Matjaž Jug, Pavle Kozjek, Tomaž Špeh Statistical Office of the Republic of Slovenia

2 Overview
- Current statistical production cycle in SORS
- Using the metadata in Blaise applications
- The role of metadata in automatic editing system in SAS
- Metadata connected with the data in Oracle data warehouse
- Lessons learnt
- Questions

3 Current statistical production cycle
- Entry and micro editing (Blaise)
- Macro and statistical editing (SAS)
- Storing and analysis (Oracle)
- Dissemination (PC-Axis)
- Central metadata stores (Klasje & Metis)

4 Using the metadata in Blaise applications
- Generation of (high-speed) data-entry applications using Gentry (usable by non-IT personnel)
- Metadata-based transformations between different data structures (EXTRA-FAT, FAT, THIN)

5 Gentry – tool for generation of the Blaise data-entry application
- Questionnaire structure and layout (name, blocks, tables, routing etc.)
- Field characteristics (length, data type, constants, other parameters)
[Screenshot labels: field characteristics, data type]
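The idea of driving data entry from field characteristics can be sketched in a few lines of Python. This is only an illustration, not Gentry's or Blaise's actual format: the `FieldSpec` class, the type names "int" and "str", and the two example fields are all assumptions made for the sketch.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical field metadata: name, data type and maximum length."""
    name: str
    dtype: str   # "int" or "str" (illustrative type names)
    length: int  # maximum number of characters accepted at entry


def check_entry(spec: FieldSpec, raw: str) -> list[str]:
    """Return the problems found in one keyed-in value."""
    problems = []
    if len(raw) > spec.length:
        problems.append(f"{spec.name}: longer than {spec.length} characters")
    if spec.dtype == "int" and not re.fullmatch(r"-?[0-9]+", raw):
        problems.append(f"{spec.name}: not an integer")
    return problems


# Made-up metadata for two fields of one questionnaire block.
BLOCK = [FieldSpec("unit_id", "int", 8), FieldSpec("unit_name", "str", 30)]


def check_record(record: dict[str, str]) -> list[str]:
    """Check every field described in the metadata against the record."""
    problems = []
    for spec in BLOCK:
        problems.extend(check_entry(spec, record.get(spec.name, "")))
    return problems
```

Because the checks read everything from `BLOCK`, adding a field or changing a length is a metadata change rather than a code change, which is the point the slide makes about generated applications.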

6 Gentry – example of generated application
[Screenshot labels: section header; data entry for table 12]

7 Transformations
- All data for one unit (provider) in one row (EXTRA FAT): suitable for micro editing
- Classification and continuous variables in the columns (FAT): suitable for analysis
- Classification variables in the columns and continuous variables in the rows (THIN)
- Metadata-based transformation in Blaise
- Metadata-based transformation in SAS
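A minimal sketch of a metadata-based FAT to THIN transformation (and back), written in Python rather than Blaise or SAS. The `METADATA` dictionary, the column names and the `variable`/`value` output columns are assumptions for illustration; treating the unit identifier as a key column alongside the classifications is also an assumption.

```python
# Hypothetical metadata: which FAT columns are classification (key) columns
# and which are continuous variables. In practice this would come from a
# central metadata store rather than being written in the code.
METADATA = {
    "classification": ["unit_id", "activity", "region"],
    "continuous": ["turnover", "employees"],
}


def fat_to_thin(fat_rows, metadata):
    """Unpivot FAT rows: one output row per (unit, continuous variable)."""
    thin_rows = []
    for row in fat_rows:
        keys = {c: row[c] for c in metadata["classification"]}
        for var in metadata["continuous"]:
            thin_rows.append({**keys, "variable": var, "value": row[var]})
    return thin_rows


def thin_to_fat(thin_rows, metadata):
    """Pivot THIN rows back into one FAT row per classification combination."""
    fat = {}
    for row in thin_rows:
        key = tuple(row[c] for c in metadata["classification"])
        target = fat.setdefault(
            key, {c: row[c] for c in metadata["classification"]}
        )
        target[row["variable"]] = row["value"]
    return list(fat.values())
```

Neither function names a survey variable directly, so the same code can serve any survey whose columns are described in the metadata.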

8 The role of metadata in automatic editing system in SAS
- General system for automated editing
- Process metadata

9 The role of metadata in automatic editing system in SAS
- In order to be general, the tool must be able to:
  - recognize the data which are due to be subjected to editing and/or imputation;
  - recognize which editing method should be applied;
  - recognize with what parameters the method should be applied
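The three requirements above (which data, which method, which parameters) can be pictured as a rule table that a generic routine reads at run time. The sketch below is in Python, not SAS, and is not the SORS system: the `EDIT_RULES` layout, the method keys "zero" and "mean", and the `decimals` parameter are assumptions. The two methods correspond to "method of zero values" and "mean values imputation" from the imputation code list.

```python
# Hypothetical process metadata: one rule per variable to be edited.
EDIT_RULES = [
    {"variable": "employees", "method": "zero", "params": {}},
    {"variable": "turnover", "method": "mean", "params": {"decimals": 1}},
]


def impute_zero(values, params):
    """Method of zero values: replace missing entries with 0."""
    return [0 if v is None else v for v in values]


def impute_mean(values, params):
    """Mean values imputation: replace missing entries with the observed mean."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to base a mean on
    mean = round(sum(observed) / len(observed), params.get("decimals", 2))
    return [mean if v is None else v for v in values]


METHODS = {"zero": impute_zero, "mean": impute_mean}


def run_editing(columns, rules):
    """Apply each rule to its variable; other variables pass through unchanged."""
    edited = dict(columns)
    for rule in rules:
        method = METHODS[rule["method"]]
        edited[rule["variable"]] = method(columns[rule["variable"]], rule["params"])
    return edited
```

For example, `run_editing({"employees": [3, None, 5], "turnover": [10.0, None, 20.0]}, EDIT_RULES)` fills the missing `employees` value with 0 and the missing `turnover` value with the mean of the observed values.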

10 Process indicators – level 1
- Mode of data collection
  - 1 data provided directly by reporting unit
  - 2 data from administrative source
  - 3 data computed from original values
  - 4 imputed data – imputation of non-response
  - 5 imputed data – imputation due to invalid values detected through the editing process
  - 6 data missing because the unit is not eligible for the item (logical skip)

11 Process indicators – level 2
- Data status
  - 1 original value
  - 2 corrected value

12 Process indicators – level 3
- Method of data correction
  - 11 correction after telephone contact
  - 12 data reported at a later stage

13 Process indicators – level 3
- Reporting methods
  - 11 reporting by mail questionnaire
  - 12 computer assisted telephone interview (CATI)
  - 13 telephone interview without computer assistance
  - 14 paper assisted personal interview (PAPI)
  - 15 computer assisted personal interview (CAPI)
  - 16 paper assisted self interviewing
  - 17 computer assisted self interviewing
  - 18 web reporting

14 Process indicators – level 3
- Imputation methods
  - 10 method of zero values
  - 11 logical imputation
  - 12 historical data imputation
  - 13 mean values imputation
  - 14 nearest neighbour imputation
  - 15 hot-deck imputation
  - 16 cold-deck imputation
  - 17 regression imputation
  - 18 method of the most frequent value
  - 19 estimation of annual value based on infra-annual data
  - 21 stochastic hot-deck (random donor)
  - 22 regression imputation with random residuals
  - 23 multiple imputation

15 Process indicators examples - xy.zz
- 11.15 means:
  - 1 - data provided directly by reporting unit
  - 11 - original value
  - 11.15 - computer assisted personal interview (CAPI)
- 42.19 means:
  - 4 - imputed data – imputation of non-response
  - 42 - corrected value
  - 42.19 - estimation of annual value based on infra-annual data
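The xy.zz reading can be checked with a small decoder. The code lists are copied from the level 1 to level 3 slides. The slides do not state how the level 3 list is chosen (codes such as 11 appear in all three lists), so the selection rule in `decode` is an assumption: imputed data (x = 4 or 5) use the imputation methods, other corrected values (y = 2) use the correction methods, and everything else uses the reporting methods. That rule is consistent with both examples on the slide.

```python
LEVEL1 = {
    1: "data provided directly by reporting unit",
    2: "data from administrative source",
    3: "data computed from original values",
    4: "imputed data: imputation of non-response",
    5: "imputed data: imputation due to invalid values",
    6: "data missing because the unit is not eligible for the item",
}
LEVEL2 = {1: "original value", 2: "corrected value"}
CORRECTION = {
    11: "correction after telephone contact",
    12: "data reported at a later stage",
}
REPORTING = {
    11: "reporting by mail questionnaire",
    12: "computer assisted telephone interview (CATI)",
    13: "telephone interview without computer assistance",
    14: "paper assisted personal interview (PAPI)",
    15: "computer assisted personal interview (CAPI)",
    16: "paper assisted self interviewing",
    17: "computer assisted self interviewing",
    18: "web reporting",
}
IMPUTATION = {
    10: "method of zero values",
    11: "logical imputation",
    12: "historical data imputation",
    13: "mean values imputation",
    14: "nearest neighbour imputation",
    15: "hot-deck imputation",
    16: "cold-deck imputation",
    17: "regression imputation",
    18: "method of the most frequent value",
    19: "estimation of annual value based on infra-annual data",
    21: "stochastic hot-deck (random donor)",
    22: "regression imputation with random residuals",
    23: "multiple imputation",
}


def decode(code: str):
    """Split an 'xy.zz' indicator into its three level descriptions."""
    head, tail = code.split(".")
    x, y, zz = int(head[0]), int(head[1]), int(tail)
    # Assumed rule for picking the level 3 code list (not stated on the slides).
    if x in (4, 5):
        level3 = IMPUTATION
    elif y == 2:
        level3 = CORRECTION
    else:
        level3 = REPORTING
    return LEVEL1[x], LEVEL2[y], level3[zz]
```

With these tables, `decode("11.15")` and `decode("42.19")` return the same three descriptions as the two worked examples above.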

16 Statistical process
[Diagram labels: key responders, other units; tools: SAS, Blaise, Oracle]

17 Metadata connected with the data in Oracle data warehouse
- On-line access to:
  - Historical data
  - Data from different phases (not only final data)
  - Data for multiple surveys (not only data marts)
  - Statistical (variables & classifications) and process (time stamps, status indicators...) metadata connected with the data
- ...accessible for third-party tools
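A rough sketch of what "process metadata connected with the data" can look like in a THIN fact table. It uses Python's built-in `sqlite3` as a stand-in for Oracle; the table names, column names, phase labels and sample rows are all invented for illustration and are not the SORS warehouse design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_variable (
    variable_id INTEGER PRIMARY KEY,
    name        TEXT
);
CREATE TABLE fact_thin (
    unit_id          INTEGER,
    variable_id      INTEGER REFERENCES dim_variable(variable_id),
    phase            TEXT,  -- hypothetical labels: 'raw', 'edited'
    value            REAL,
    status_indicator TEXT,  -- process indicator in xy.zz form
    loaded_at        TEXT   -- time stamp
);
""")
conn.execute("INSERT INTO dim_variable VALUES (1, 'turnover')")
# Made-up sample rows: the same value stored at two processing phases.
conn.executemany(
    "INSERT INTO fact_thin VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 1, "raw", 240.0, "11.11", "2006-03-01"),
        (100, 1, "edited", 250.0, "12.11", "2006-03-15"),
    ],
)


def history(unit_id, variable):
    """Return every stored phase of one value together with its indicator."""
    return conn.execute(
        """SELECT f.phase, f.value, f.status_indicator
           FROM fact_thin f JOIN dim_variable v USING (variable_id)
           WHERE f.unit_id = ? AND v.name = ?
           ORDER BY f.loaded_at""",
        (unit_id, variable),
    ).fetchall()
```

Keeping the status indicator and time stamp on the same row as the value means any SQL-capable tool can see not only the final figure but also how and when it was produced.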

18 Conceptual star schema for SBS
[Diagram label: THIN table design]

19

20 Lessons learnt
- The role of central repositories for metadata
  - Natural source of conceptual metadata
  - Metadata have to be exact, complete and consistent
  - Process metadata should be connected with the data
- Harmonisation of metadata concepts
  - Local metadata vs. global metadata
  - A cultural change is needed
- Technical considerations
  - The possibilities for metadata exchange and system integration are good (XML, SQL)

21 Questions

