UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing 24-26 April 2017 The Hague,

Presentation transcript:

UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing 24-26 April 2017 The Hague, Netherlands Theme (III): Shared software tools and CSPA services – Demonstrations and implementation experiences.

Introduction
Modernisation of data editing processes includes the use of generic data editing functions and methods that can be re-used in many E&I applications.
Consequently, there is a need for generalized software tools that enable the implementation of general E&I systems and replace ad hoc solutions that are costly to maintain.
Papers in this session present the development and implementation experiences of generic software modules.
Most of these tools are developed for national purposes but are shareable in principle.
This is an opportunity to share expertise and development costs.
Some papers specifically discuss the possibilities for international collaboration in the development and sharing of tools.

Modernisation of data editing processes includes the use of generic data editing functions and methods that can be re-used in many E&I applications. Frameworks such as GSBPM or GSDEMs advocate this approach. Consequently, there is a need for generalized software tools that enable the implementation of general E&I systems. These general systems are meant to replace ad hoc solutions that are costly to maintain and adapt. Papers in this session present the development and implementation experiences of such generic software modules. Most of these tools are developed for national purposes, but they are shareable in principle. Sharing between NSIs is clearly an opportunity to share the cost and expertise needed to develop high-quality tools. Some papers discuss the further possibilities for international collaboration in the development and sharing of tools. Hopefully this session will stimulate such cooperation.

Presentations in this theme
Software implementation of optimization-based selective editing techniques - Spain. About the implementation (in R) of the selective editing methods developed by Spain, applied to a number of short-term business statistics.
Future development of Statistics Canada's E&I system Banff - Canada. Discusses the future of Banff, collaboration with other NSIs and the relation to the common frameworks CSPA and GSDEMs.
Automated E&I for surveys of multinational enterprises, a Banff implementation - USA. Replacing manual editing by automatic editing implemented with Banff.

There are six presentations in this theme, covering all the different data editing functions. The first is a presentation from Spain about the software implementation of the new selective editing methodology developed in Spain, which is already applied to a number of short-term business statistics. Then we have two presentations on Banff, a comprehensive data editing system used by many NSIs. The first, by Statistics Canada, discusses the future of Banff as well as the possibilities for collaboration with other NSIs and the relations with CSPA and GSDEMs. The second describes the development of an application of automatic editing to a survey of multinational enterprises.

Presentations in this theme
The role of rules-based editing for maintaining quality in the drive for efficiency - UK. About the implementation of rules for validation and selection services that should be widely applicable: business, social, census.
General tool for macro-editing at SURS - Slovenia. Metadata-driven generalized tool for macro-editing, applied to all surveys, implementing a variety of methods.
CSPA with R - Netherlands. Experiences with the implementation of the editing functions "rule-checking" and "error localization" as CSPA services.

The fourth presentation is from the UK and is about the use of rules for verification and selection in different domains of application: business, social and census statistics. Then we have a presentation on a macro-editing tool that is used in Slovenia. It provides functionality for several outlier detection methods. Last but not least, we invited someone from our office to talk about his experiences with implementing two editing functions as CSPA services using R.

Enjoy the presentations!

E. Esteban et al.: Software implementation of optimization-based selective editing techniques at Statistics Spain
Combined use of a representation of data (a statistical data model) with a representation of statistical treatments.
Data model: data = (identifier, value).
Statistical treatment = a function or method applied to an object with a set of parameters.
Very generic implementation.
Simple production design, usable by non-expert programmers.
Programming details are left to the experts developing methods and objects.
Simplifies development and servicing.
Metadata are easy to follow thanks to the parameter objects with which each method is applied.
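The pairing of a data model with parameterised treatments can be illustrated with a minimal Python sketch. It is not the Statistics Spain implementation (which the session describes as written in R). The names `Datum`, `Treatment` and `cap_values`, the identifier format and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Datum:
    """One element of the data model: data = (identifier, value)."""
    identifier: str  # hypothetical format: "<unit>/<variable>"
    value: float


@dataclass(frozen=True)
class Treatment:
    """A statistical treatment: a method plus the parameter object it runs with."""
    name: str
    method: Callable[..., List[Datum]]
    params: Dict[str, float]

    def apply(self, data: List[Datum]) -> List[Datum]:
        # The parameters travel with the treatment, so they double as metadata.
        return self.method(data, **self.params)


def cap_values(data: List[Datum], upper: float) -> List[Datum]:
    """Illustrative method only: cap every value at `upper`."""
    return [Datum(d.identifier, min(d.value, upper)) for d in data]


data = [Datum("u1/turnover", 120.0), Datum("u2/turnover", 9000.0)]
treatment = Treatment("cap", cap_values, {"upper": 1000.0})
result = treatment.apply(data)
print([(d.identifier, d.value) for d in result])
```

A production user in this sketch only assembles `Treatment` objects, while the body of a method such as `cap_values` would be left to the method developers.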

S. Thomas: Future development of Statistics Canada's Edit and Imputation system Banff
General system for E&I developed in SAS: specification and simplification of edits, and selection of units.
Implementation of various imputation methods (deterministic, donor...) as well as outlier treatment methods.
Procedures may be used as independent tools or with the Banff processor, which generates programs based on specified metadata.
System with multiple users: inside Statistics Canada and in other national statistical institutes or agencies.
System under review for improvements in 2018-2020: need for a strategy to collect users' needs and to develop partnerships with other NSIs to help implement Banff's development.
Need to overcome practical and administrative difficulties.
Limits generated by the closed nature of Banff and the associated validation strategy and user assistance policy, versus the strategies associated with open-source tools such as R.

M. Xu: Automated data editing and imputation for surveys of multinational enterprises - a Banff implementation
Adaptation of Banff to BEA's survey on multinational enterprises.
Three kinds of treatment Banff cannot deal with: soft edits, non-linear edits, and edits on categorical variables.
Where possible, edits are modified to make them linear.
A first treatment step edits categorical variables and applies the edits that cannot be linearized.
Iterations over Banff's edit verification, error localization and imputation methods, with a specific set of edits for each observation.
Comparison of Banff auto-editing and complete manual editing on two surveys.
Auto-editing edits fewer units,
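As a rough illustration of what making an edit linear can mean, the Python sketch below rewrites a ratio edit as a linear inequality and checks a record against it. This is not Banff code (Banff is a SAS-based system). The variable names, the 0.5 ratio and the representation of an edit as coefficients plus a bound are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# A linear edit is stored as (coefficients, bound) and means:
#   sum(coefficient * value) <= bound
LinearEdit = Tuple[Dict[str, float], float]


def failed_edits(record: Dict[str, float], edits: List[LinearEdit]) -> List[int]:
    """Return the positions of the edits that the record violates."""
    failed = []
    for position, (coefficients, bound) in enumerate(edits):
        total = sum(c * record[name] for name, c in coefficients.items())
        if total > bound + 1e-9:  # small tolerance for floating-point noise
            failed.append(position)
    return failed


# Hypothetical ratio edit: profit / sales <= 0.5.
# For sales > 0 this is equivalent to the linear edit profit - 0.5 * sales <= 0.
edits: List[LinearEdit] = [
    ({"profit": 1.0, "sales": -0.5}, 0.0),
    ({"sales": -1.0}, 0.0),  # sales >= 0, written as -sales <= 0
]

suspicious = {"profit": 80.0, "sales": 100.0}
plausible = {"profit": 40.0, "sales": 100.0}
print(failed_edits(suspicious, edits))  # [0]
print(failed_edits(plausible, edits))   # []
```

The rewrite is only valid where the denominator is positive, which is one reason some edits cannot be linearized and need a separate treatment step.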

K. Moore: The role of rule-based editing for maintaining quality in the drive for efficiency
Objective: to create a generic data editing tool common to all ONS surveys.
First step: creation of the CORA platform for business surveys.
Process to create a Rule-Based Editing (RBE) system.
E&I checks and methods are stored as part of the statistical methods library.
Metadata associated with each dataset are used as parameters by the RBE methods: the rules to be applied and the actions to be taken when edits fail.
Information on applied edits is stored in a validation rule audit service.
Validation rules are classified according to the ESSnet ValiDat handbook, distinguishing rules by whether they concern the same file / dataset / source / domain / statistical provider / date.
Tests on multiple ONS outputs.
Enables an integrated and simpler process to maintain the rule set, with a special team devoted to it.
Allows the rules' efficiency to be assessed across different sources.
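The idea of rules and failure actions held as metadata, with every evaluation written to an audit record, might be sketched in Python as follows. This is an illustration under assumptions, not the ONS system: the `Rule` structure, the rule identifiers, the action names "reject" and "flag", and the example records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, float]


@dataclass
class Rule:
    rule_id: str
    check: Callable[[Record], bool]  # returns True when the record passes
    action: str                      # what to do on failure (hypothetical names)


def run_rules(records: List[Record], rules: List[Rule]) -> List[Dict[str, object]]:
    """Apply every rule to every record and log each evaluation."""
    audit: List[Dict[str, object]] = []
    for index, record in enumerate(records):
        for rule in rules:
            passed = rule.check(record)
            action: Optional[str] = None if passed else rule.action
            audit.append(
                {"record": index, "rule": rule.rule_id, "passed": passed, "action": action}
            )
    return audit


# Rule set held as data, so it can be maintained separately from the engine.
rules = [
    Rule("R1", lambda r: r["turnover"] >= 0, "reject"),
    Rule("R2", lambda r: r["turnover"] <= 0 or r["employees"] > 0, "flag"),
]
records = [
    {"turnover": 100.0, "employees": 5.0},
    {"turnover": -20.0, "employees": 0.0},
]

audit = run_rules(records, rules)
failures = [entry for entry in audit if not entry["passed"]]
print(failures)
```

Because the log keeps passes as well as failures, it could also be used to judge how often each rule fires across different sources.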

N. Jevšnik, Z. Klun, R. Seljak: The general tool for macro-editing at SURS
Statistical data processing is handled through an integrated production system (SOP).
Based on generic SAS programs.
Based on metadata, structured in a single central metadata base.
With a user-friendly graphical interface.
E&I and the computation of estimates are already implemented.
New step: integration of macro-editing.
Three services: validation, outlier detection, graphical analyses.
Works with a single dataset of all aggregates (macro database), enabling comparisons between periods and estimates.
Great flexibility in the choice of methods and graphical representations.
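To make the comparison of aggregates between periods concrete, here is a small Python sketch that flags aggregates whose period-to-period ratio is far from the typical ratio. The median/MAD rule, the threshold of 3 and the figures are assumptions chosen for illustration. The slide does not say which outlier detection methods the SURS tool offers, and that tool is built on SAS, not Python.

```python
from statistics import median
from typing import Dict, List


def flag_outliers(
    previous: Dict[str, float], current: Dict[str, float], threshold: float = 3.0
) -> List[str]:
    """Flag aggregates whose growth ratio deviates strongly from the median ratio."""
    ratios = {
        key: current[key] / previous[key]
        for key in current
        if key in previous and previous[key] != 0
    }
    centre = median(ratios.values())
    # Median absolute deviation: a scale estimate that outliers barely affect.
    mad = median(abs(r - centre) for r in ratios.values())
    if mad == 0:
        # Most ratios are identical, so any different ratio stands out.
        return [key for key, r in ratios.items() if r != centre]
    return [key for key, r in ratios.items() if abs(r - centre) / mad > threshold]


# Hypothetical aggregates for five domains in two consecutive periods.
previous = {"A": 100.0, "B": 200.0, "C": 150.0, "D": 80.0, "E": 120.0}
current = {"A": 104.0, "B": 206.0, "C": 153.0, "D": 160.0, "E": 126.0}

print(flag_outliers(previous, current))  # ['D']
```

Domain D doubles while the others grow by a few percent, so it is the only aggregate flagged for review.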

Discussion points
How to establish fruitful forms of international cooperation? How can CSPA and GSBPM help?
Generalized vs ad hoc methods.
Are there generalized tools missing?
How to implement flexible tools?
Open source vs commercial software (e.g. R vs SAS).
Which theoretical paradigms (e.g. Fellegi-Holt, optimization) can we use?
(…)