UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing 24-26 April 2017 The Hague,


1 UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing, 24-26 April 2017, The Hague, Netherlands. Theme (III): Shared software tools and CSPA services – Demonstrations and implementation experiences.

2 Introduction
Modernisation of data editing processes includes the use of generic data editing functions and methods, to be re-used in many E&I applications.
Consequently, there is a need for generalized software tools that enable the implementation of general E&I systems and replace ad hoc solutions that are costly to maintain.
Papers in this session present the development and implementation experiences of generic software modules.
Most of these tools are developed for national purposes but are shareable in principle.
Opportunity to share expertise and development costs.
Some papers specifically discuss the possibilities for international collaboration in the development and sharing of tools.

Modernisation of data editing processes includes the use of generic data editing functions and methods, to be re-used in many E&I applications. Frameworks such as GSBPM or GSDEMs advocate this approach. Consequently, there is a need for generalized software tools that enable the implementation of general E&I systems. These general systems are meant to replace ad hoc solutions that are costly to maintain and adapt. Papers in this session present the development and implementation experiences of such generic software modules. Most of these tools are developed for national purposes, but they are shareable in principle. Sharing between NSIs is clearly an opportunity to share the cost and expertise needed to develop high-quality tools. Some papers discuss further possibilities for international collaboration in the development and sharing of tools. Hopefully this session will stimulate such cooperation.

3 Presentations in this theme
Software implementation of optimization-based selective editing techniques - Spain. About the implementation (in R) of the selective editing methods developed by Spain, applied to a number of short-term business statistics.
Future Development of Statistics Canada's E&I system Banff - Canada. Discusses the future of Banff, collaboration with other NSIs and the relation to the common frameworks CSPA and GSDEMs.
Automated E&I for Surveys of Multinational Enterprises, a Banff Implementation - USA. Replacing manual editing by automatic editing implemented with Banff.

There are six presentations in this theme, covering all the different data editing functions. The first is a presentation from Spain about the software implementation of the new selective editing methodology developed in Spain, which is already applied to a number of short-term business statistics. Then we have two presentations on Banff, a comprehensive data editing system used by many NSIs. The first, by Statistics Canada, discusses the future of Banff as well as the possibilities for collaboration with other NSIs and the relations with CSPA and GSDEMs. The second describes the development of an application of automatic editing to a survey of multinational enterprises.

4 Presentations in this theme
The role of rules based editing for maintaining quality in the drive for efficiency - UK. About the implementation of rules for validation and selection services that should be widely applicable: business, social, census.
General tool for macro-editing at SURS - Slovenia. Metadata-driven generalized tool for macro-editing, applied to all surveys, implementing a variety of methods.
CSPA with R - Netherlands. Experiences with the implementation of the editing functions "rule-checking" and "error localization" as CSPA services.

The fourth presentation is from the UK and is about the use of rules for verification and selection in different domains of application: business, social and census statistics. Then we have a presentation on a macro-editing tool that is used in Slovenia. It provides functionality for several outlier detection methods. Last but not least, we invited someone from our office to talk about his experiences with implementing two editing functions as CSPA services using R.

5 Enjoy the presentations!

6 E. Esteban et al.: Software implementation of optimization-based selective editing techniques at Statistics Spain
Combined use of a representation of data (or statistical data model) with a representation of statistical treatments
Data model: data = (identifier, value)
Statistical treatment = a function or method applied to an object with a set of parameters
Very generic implementation
Simple production design, usable by non-expert programmers; programming details are left to the experts developing methods and objects
Simplifies development and maintenance
Metadata are easy to follow thanks to the parameter objects with which each method is applied
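
The combination of a data model and parameterized treatments described on this slide can be illustrated with a minimal sketch. It is written in Python purely for illustration (the Spanish implementation is in R), and every name in it (`Datum`, `ThresholdParams`, `flag_above_threshold`, the unit identifiers and the threshold) is a hypothetical stand-in, not part of the authors' software. The point is only that the parameter object passed to the treatment doubles as a record of how the method was applied.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Data model from the slide: each datum is an (identifier, value) pair.
Datum = Tuple[str, float]


@dataclass(frozen=True)
class ThresholdParams:
    """Hypothetical parameter object: values above `upper` are flagged."""
    upper: float


def flag_above_threshold(data: List[Datum], params: ThresholdParams) -> List[str]:
    """Toy 'statistical treatment': a function applied to data with parameters.

    Returns the identifiers whose value exceeds params.upper.
    """
    return [ident for ident, value in data if value > params.upper]


# Illustrative data; identifiers and values are made up.
data = [("unit_1", 120.0), ("unit_2", 95.0), ("unit_3", 410.0)]
params = ThresholdParams(upper=300.0)

flagged = flag_above_threshold(data, params)
print(flagged)  # identifiers selected for review
print(params)   # the parameter object serves as metadata for this run
```

Because the treatment receives all its settings through one object, a production user only chooses the method and fills in the parameters, while the method's internals stay with the experts who wrote it.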

7 S. Thomas: Future development of Statistics Canada's Edit and Imputation system Banff
General system for E&I developed in SAS: specification and simplification of edits, and selection of units
Implementation of various imputation methods (deterministic, donor...) as well as outlier treatment methods
Procedures may be used as independent tools or with the Banff processor, which generates programs based on specified metadata
System with multiple users: inside Statistics Canada and in other National Statistical Institutes or agencies
System under review for improvements in  : need for a strategy to collect users' needs and to develop partnerships with other NSIs to help implement Banff's development
Need to overcome practical and administrative difficulties
Limits generated by the closed-source nature of Banff and the associated validation strategy and user assistance policy, versus the strategies associated with open-source tools such as R

8 M. Xu: Automated data editing and imputation for surveys of multinational enterprises – a Banff implementation
Adaptation of Banff to BEA's survey on multinational enterprises
Three kinds of edits Banff cannot deal with: soft edits, non-linear edits, edits on categorical variables
Where possible, edits are modified to make them linear
A first treatment step edits categorical variables and applies the edits that cannot be linearized
Iterations over Banff's edit verification, error localization and imputation methods, with a specific set of edits for each observation
Comparison of Banff auto-editing and complete manual editing on two surveys
Auto-editing edits fewer units,
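
As an illustration of what making an edit linear can mean, the sketch below rewrites a ratio edit as a linear inequality and checks a record against a small set of linear edits. It is plain Python and does not use or imitate Banff's SAS procedures; the variable names, the 0.5 ratio bound and the edit names are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# A linear edit is (coefficients, bound) and means: sum(coef * value) <= bound.
LinearEdit = Tuple[Dict[str, float], float]


def failed_edits(record: Dict[str, float], edits: Dict[str, LinearEdit]) -> List[str]:
    """Return the names of the linear edits that the record violates."""
    failed = []
    for name, (coefs, bound) in edits.items():
        total = sum(coef * record[var] for var, coef in coefs.items())
        if total > bound:
            failed.append(name)
    return failed


# Hypothetical edits. The ratio edit profit / turnover <= 0.5 (with turnover > 0)
# is non-linear as written, but equivalent to the linear edit
# profit - 0.5 * turnover <= 0.
# The balance edit wages + other_costs == total_costs is expressed as two
# inequalities, one in each direction.
edits = {
    "profit_ratio": ({"profit": 1.0, "turnover": -0.5}, 0.0),
    "balance_upper": ({"wages": 1.0, "other_costs": 1.0, "total_costs": -1.0}, 0.0),
    "balance_lower": ({"wages": -1.0, "other_costs": -1.0, "total_costs": 1.0}, 0.0),
}

record = {
    "profit": 80.0,
    "turnover": 100.0,
    "wages": 30.0,
    "other_costs": 20.0,
    "total_costs": 50.0,
}

violations = failed_edits(record, edits)
print(violations)
```

Only the check step is sketched here; locating which variables to change and imputing new values are separate steps that this example does not attempt.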

9 K. Moore: The role of rule-based editing for maintaining quality in the drive for efficiency
Objective: to create a generic data editing tool common to all ONS surveys
First step: creation of the CORA platform for business surveys
Process to create a Rule-Based Editing (RBE) system
E&I checks and methods stored as part of the statistical methods library
Metadata associated with each dataset are used as parameters by the RBE methods: rules to be applied, actions to be taken when edits fail
Information on applied edits stored in a validation rule audit service
Validation rules classified according to the ESSnet ValiDat handbook, distinguishing rules by whether they concern the same file / dataset / source / domain / statistical provider / date
Tests on multiple ONS outputs
Enables an integrated and simpler process to maintain the rule set, with a special team devoted to it
To assess the rules' efficiency across different sources
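
The idea of rules carried as metadata, each with an action to take on failure and an audit trail of what was applied, might be sketched as follows. This is a generic Python illustration under stated assumptions, not the ONS system: the `Rule` structure, the rule names, the action labels "flag" and "impute", and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule metadata: a name, a check, and an action on failure."""
    name: str
    check: Callable[[Dict[str, float]], bool]  # returns True when the record passes
    action: str                                # e.g. "flag" or "impute"


def apply_rules(records: Dict[str, Dict[str, float]], rules: List[Rule]) -> List[Dict[str, str]]:
    """Run every rule on every record and return an audit log of the failures."""
    audit = []
    for record_id, record in records.items():
        for rule in rules:
            if not rule.check(record):
                audit.append({"record": record_id, "rule": rule.name, "action": rule.action})
    return audit


# Illustrative rule set and data.
rules = [
    Rule("non_negative_turnover", lambda r: r["turnover"] >= 0, "flag"),
    Rule("employees_present", lambda r: r["employees"] > 0, "impute"),
]
records = {
    "A": {"turnover": 100.0, "employees": 5},
    "B": {"turnover": -10.0, "employees": 0},
}

audit = apply_rules(records, rules)
for entry in audit:
    print(entry)
```

Keeping the rules outside the processing code is what allows one rule set to be maintained centrally and its failure rates compared across sources.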

10 N. Jevšnik, Z. Klun, R. Seljak: The general tool for macroediting at SURS
Statistical data processing handled through an integrated production system (SOP)
Based on generic SAS programs
Based on metadata, structured in a single central metadata base
With a user-friendly graphical interface
E&I and computation of estimates already implemented
New step: integration of macro-editing
Three services: validation, outlier detection, graphical analyses
Working with a single dataset of all aggregates (macro database), enabling comparisons between periods and estimates
Great flexibility in the choice of methods and graphical representations
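
A period-to-period comparison of aggregates, as used in macro-editing, could look like the sketch below. The slides do not say which outlier detection methods the SURS tool implements, so the method here (flagging period-on-period ratios that lie far from the median ratio, measured in median absolute deviations) is an assumption chosen for illustration, as are the function name, the cutoff `k` and the sample aggregates.

```python
from statistics import median
from typing import Dict, List


def flag_aggregates(previous: Dict[str, float], current: Dict[str, float], k: float = 3.0) -> List[str]:
    """Flag aggregates whose period-on-period ratio is unusual.

    A ratio is unusual when it lies more than k median absolute deviations
    (MAD) from the median ratio. Returns the flagged keys in sorted order.
    """
    ratios = {
        key: current[key] / previous[key]
        for key in current
        if key in previous and previous[key] != 0
    }
    if not ratios:
        return []
    centre = median(ratios.values())
    mad = median(abs(ratio - centre) for ratio in ratios.values())
    if mad == 0:
        # No spread to measure against; this simple method flags nothing.
        return []
    return sorted(key for key, ratio in ratios.items() if abs(ratio - centre) / mad > k)


# Illustrative aggregates for two periods; the values are made up.
previous = {"A": 100.0, "B": 200.0, "C": 150.0, "D": 80.0, "E": 120.0}
current = {"A": 104.0, "B": 206.0, "C": 153.0, "D": 160.0, "E": 126.0}

suspicious = flag_aggregates(previous, current)
print(suspicious)
```

In this example aggregate D doubles while the others grow by a few percent, so it is the one singled out for review.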

11 Discussion points
How to establish fruitful forms of international cooperation?
How can CSPA and GSBPM help?
Generalized vs ad hoc methods
Are there generalized tools missing?
How to implement flexible tools?
Open source vs commercial software (e.g. R vs SAS)
Which theoretical paradigms (e.g. Fellegi-Holt, optimization) can we use?
(…)

