Metadata driven application for data processing – from local toward global solution Rudi Seljak Statistical Office of the Republic of Slovenia.


1 Metadata driven application for data processing – from local toward global solution Rudi Seljak Statistical Office of the Republic of Slovenia

2 Summary of presentation
Introduction
Current generic application – main characteristics
Development of global solution
Changes in the statistical process
Conclusions

3 Introduction
Statistical data processing:
– A demanding, time-consuming and very expensive task
– Constant pressure for budget cuts
Rationalisation of the statistical process:
– Take advantage of rapid IT development
– Move from domain-oriented to process-oriented production
– Replace stove-pipe IT solutions with general applications
Statistical Office of the Republic of Slovenia (SURS):
– SURS began systematic development of generic solutions 6 years ago
– Prototype solutions for several parts of the process were developed
– These solutions have already been used for several large surveys (e.g. the 2010 Agriculture Census and the 2011 Population Census)
– The prototype generic solutions are now being upgraded to a more global solution

4 Generalised solutions – main characteristics
Small, generic solutions for small parts of the statistical process, called building blocks:
– Enable easy and flexible linking of the inputs and outputs of the individual components into the whole statistical process
– Can be plugged into different databases in different environments (e.g. ORACLE, SAS), provided the input database meets a few basic conditions
– Are designed as fully metadata driven (MDD) systems: one program code → the parameters for executing the processing for a concrete survey are provided through special metadata tables
– The process metadata can be provided in different environments (SAS, MS Access, ORACLE) → the metadata must follow strict rules on its structure (tables and variables)
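The "one program code, parameters from metadata tables" idea can be sketched in a few lines. This is an illustration in Python, not the SAS code used at SURS; the column names (`survey`, `variable`, `operation`), the survey codes and the two operations are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical metadata layout: every row must carry exactly these columns.
REQUIRED_COLUMNS = {"survey", "variable", "operation"}
OPERATIONS = {"sum": sum, "mean": mean}


def check_metadata(rows):
    """Reject metadata rows that do not follow the agreed structure."""
    for i, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            raise ValueError(f"metadata row {i} lacks columns: {sorted(missing)}")
        if row["operation"] not in OPERATIONS:
            raise ValueError(f"metadata row {i}: unknown operation {row['operation']!r}")


def run_block(survey, microdata, metadata):
    """One generic program: everything survey-specific comes from the metadata."""
    check_metadata(metadata)
    result = {}
    for row in metadata:
        if row["survey"] != survey:
            continue
        values = [record[row["variable"]] for record in microdata]
        result[(row["variable"], row["operation"])] = OPERATIONS[row["operation"]](values)
    return result


# The same code serves two surveys; only the metadata rows differ.
metadata = [
    {"survey": "AGRI", "variable": "area", "operation": "sum"},
    {"survey": "POP", "variable": "age", "operation": "mean"},
]
agri = [{"area": 10}, {"area": 25}]
pop = [{"age": 30}, {"age": 40}]
print(run_block("AGRI", agri, metadata))
print(run_block("POP", pop, metadata))
```

The structure check runs before any processing, which mirrors the requirement that the metadata follow strict rules: a malformed row stops the run instead of silently producing wrong output.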

5 Building blocks - functioning
[Diagram. Labels: Different microdata databases; General SAS program; Ad-hoc program (twice); Building block; Different databases of process metadata]

6 Linking building blocks into the process
[Diagram. Labels: Microdata; Building block 1; Building block 2; …; Building block n; Ad-hoc program (three times); Transformed data (three times)]
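One way to read this slide is as a chain in which each component receives the data produced by the previous one, with ad-hoc programs slotted in between the building blocks. A minimal Python sketch of that reading follows; the three step functions and the `turnover` variable are hypothetical stand-ins, not actual SURS components.

```python
from functools import reduce


def run_process(microdata, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda data, step: step(data), steps, microdata)


# Hypothetical stand-ins for real components.
def validate(records):
    """Building block 1: drop records that fail a logical control."""
    return [r for r in records if r["turnover"] >= 0]


def rescale(records):
    """Ad-hoc program: a survey-specific transformation between blocks."""
    return [{**r, "turnover": r["turnover"] / 1000} for r in records]


def aggregate(records):
    """Building block n: sum the variable over all remaining records."""
    return {"turnover": sum(r["turnover"] for r in records)}


microdata = [{"turnover": 2000}, {"turnover": -5}, {"turnover": 3000}]
result = run_process(microdata, [validate, rescale, aggregate])
print(result)
```

Because every step only needs to accept the previous step's output, a block or an ad-hoc program can be added, removed or reordered by editing the list of steps.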

7 Process metadata
The system is to a very large extent based on the process metadata:
– Processing rules which enable the general program to be adjusted for different surveys
At the moment the process metadata are entered directly into an MS Access database:
– High probability of syntax errors
– Users must be thoroughly instructed in order to fill in the metadata correctly
Table    Variable  Condition   Corr_rule     Step
TABLE1   X         X/Y >1000   Round(X/100)  1
TABLE1   Z         Z NE X      X             2
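To show how such rules could drive a deterministic correction, here is a small Python sketch. It assumes the two example rows read as: for variable X, if X/Y > 1000 then set X to Round(X/100) (step 1); for variable Z, if Z NE X then set Z to X (step 2). The translation of `NE` and `Round` into Python and the use of `eval` are simplifications for illustration; the real system uses SAS macros, and `eval` should not be used on untrusted metadata.

```python
import re

# The two example rules, under the reading stated above.
RULES = [
    {"table": "TABLE1", "variable": "X", "condition": "X/Y > 1000",
     "corr_rule": "Round(X/100)", "step": 1},
    {"table": "TABLE1", "variable": "Z", "condition": "Z NE X",
     "corr_rule": "X", "step": 2},
]


def to_python(expr):
    """Translate the two SAS-style tokens used in the example into Python."""
    expr = re.sub(r"\bNE\b", "!=", expr)
    return re.sub(r"\bRound\b", "round", expr)


def evaluate(expr, record):
    """Evaluate an expression with the record's variables as the only names."""
    return eval(to_python(expr), {"__builtins__": {}, "round": round}, dict(record))


def apply_corrections(record, rules, table):
    """Apply the rules for one table in step order; later steps see earlier fixes."""
    corrected = dict(record)
    for rule in sorted(rules, key=lambda r: r["step"]):
        if rule["table"] == table and evaluate(rule["condition"], corrected):
            corrected[rule["variable"]] = evaluate(rule["corr_rule"], corrected)
    return corrected


record = {"X": 250000, "Y": 2, "Z": 7}
print(apply_corrections(record, RULES, "TABLE1"))
```

In this run step 1 changes X from 250000 to 2500, and step 2 then sets Z to the corrected X, which is why the step order matters. A typo in a condition would only surface when the rule is evaluated, which illustrates the "high probability of syntax errors" noted on the slide. Python's `round` may differ from the SAS function on exact halves, and a zero Y is not handled here.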

8 Building blocks
The basic tools of the whole system are the building blocks, each of which covers a particular processing phase.
Each building block is a SAS macro that operates on the basis of the process metadata.
So far, building blocks have been created for the following phases:
– Data validation (logical controls)
– Deterministic corrections
– Data imputations
– Standard error estimation
– Aggregation
– Tabulation
– Calculation of quality indicators
– Disclosure control (testing phase)

9 Building a global solution
The developed system is a very open and flexible tool. However, some re-integration is needed to increase its functionality:
– To move the process metadata into the ORACLE environment
– To create a single, unique database of process metadata, where the process metadata for all surveys are stored and maintained
– To develop graphical interfaces for user-friendly management of the process metadata
– To link the system with the metadata repository
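A single database of process metadata for all surveys could look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in for ORACLE; the table name, the column names, the `phase` values and the survey codes are all assumptions made for the example.

```python
import sqlite3


def create_store():
    """Create one table that holds the process metadata of every survey."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE process_metadata (
               survey         TEXT NOT NULL,
               phase          TEXT NOT NULL,
               table_name     TEXT NOT NULL,
               variable       TEXT NOT NULL,
               rule_condition TEXT NOT NULL,
               corr_rule      TEXT,
               step           INTEGER NOT NULL,
               PRIMARY KEY (survey, phase, table_name, variable, step)
           )"""
    )
    return conn


def add_rule(conn, **rule):
    """Insert one rule; the primary key rejects accidental duplicates."""
    conn.execute(
        "INSERT INTO process_metadata VALUES "
        "(:survey, :phase, :table_name, :variable, :rule_condition, :corr_rule, :step)",
        rule,
    )


def rules_for(conn, survey, phase):
    """Return the rules of one survey and phase, in execution order."""
    cursor = conn.execute(
        "SELECT table_name, variable, rule_condition, corr_rule, step "
        "FROM process_metadata WHERE survey = ? AND phase = ? ORDER BY step",
        (survey, phase),
    )
    return cursor.fetchall()


conn = create_store()
add_rule(conn, survey="AGRI", phase="correction", table_name="TABLE1",
         variable="Z", rule_condition="Z NE X", corr_rule="X", step=2)
add_rule(conn, survey="AGRI", phase="correction", table_name="TABLE1",
         variable="X", rule_condition="X/Y > 1000", corr_rule="Round(X/100)", step=1)
add_rule(conn, survey="POP", phase="validation", table_name="PERSONS",
         variable="AGE", rule_condition="AGE < 0", corr_rule=None, step=1)
print(rules_for(conn, "AGRI", "correction"))
```

Keeping the survey as a column rather than as a separate database per survey is what lets one generic program, and one management interface, serve every survey.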

10 The new system
[Diagram. Labels: Different microdata databases; General SAS program; Ad-hoc program (twice); Database of processing metadata; Metadata repository; Application for metadata management; Data on tables and variables]

11 Application for metadata management Deterministic corrections

12 Application for metadata management Execution of the particular process step

13 New application and statistical process
The generic MDD application changes how data processing is implemented at a general level:
– An essentially different distribution of work between IT specialists, general methodologists and IT experts
– A change in the role of subject-matter statisticians → changed expectations of their skills and capabilities
– The work organisation of the IT Department and the General Methodology Department will have to change from domain oriented to process oriented
– A different approach from IT and methodology experts will be needed: they must be capable of thinking and operating at a much more general level
A survey is just one realisation of the general statistical process.

14 Conclusions
SURS developments in recent years: flexible, metadata driven generic solutions for different phases of data processing.
The very open system will be replaced with a more integrated and centralised system.
Main goal: transition from stove-pipe oriented production to more integrated processing systems.
Two main challenges:
– To build generic IT solutions which would "cover" the wide diversity of statistical surveys
– To change the very "domain oriented state of mind" among the employees

15 Thank you for your attention

