1 Editing Administrative Data and Combined Data Sources Introduction.

Presentation transcript:

1 Editing Administrative Data and Combined Data Sources Introduction

2 Sub-topic: Use of administrative data for business surveys and economic data The papers focus on methods for pre-processing, editing and imputation that yield high-quality administrative data, both to support survey data and to be incorporated into statistical data. Administrative data is used as a direct statistical source in business surveys and economic censuses by replacing survey data of smaller units, thus reducing costs and response burden. Administrative data also supports the processing of survey data through error localization, imputation models, selective editing techniques and the setting of thresholds.

3 Use of administrative data for business surveys and economic data Relevant papers for sub-topic: WP2 - Use of Administrative Data in Statistics Canada’s Annual Survey of Manufactures, Canada; WP4 - Use and Editing of Administrative Data in the Business Indicators Unit, New Zealand; WP5 - Detecting Outliers in Price Quotes for the Canadian Consumer Price Index, Canada; WP6 - Imputation of External Trade Data in Denmark, Denmark; WP9 - The Use of Administrative Data in the Annual Survey of Retail, Wholesale and Services, United States

4 Sub-topic: Combining Data Sources Combining multiple administrative data sources may replace the need to carry out some surveys. Target variables can be obtained by direct replacement or modeled using administrative data as covariates. Administrative data supports other statistical processes related to edit and imputation and enhances the dimensions of quality with respect to accuracy, coherence, consistency and completeness. Relevant papers for sub-topic: WP7 - Evaluation of Editing and Imputation Supported by Administrative Records, Israel; WP8 - Editing and Imputation for the Creation of a Linked Micro File from Base Registers and Other Administrative Data, Norway

5 Sub-topic: Other Processes Supporting Edit and Imputation Statistical data are becoming more dependent on combining administrative data with survey data. There is a need to expand processes beyond conventional and traditional methods of data collection. High-quality, unambiguous metadata about the administrative data must be fully integrated with the survey data and communicated through every step of the processing operation, and especially to end users. Relevant paper for sub-topic: WP3 - Conceptual Modeling of Administrative Register Information and XML - Taxation Metadata as an Example, Finland

6 Editing Administrative Data and Combined Data Sources Enjoy the Presentations!

7 Editing Administrative Data and Combined Data Sources Summary of Papers

8 Use of administrative data for business surveys and economic data All the papers focus on the use of administrative data for enhancing and improving economic statistical data: Reduction of costs and response burden; Improving edit and imputation processes by using administrative data for error localization and imputation models; Setting thresholds and benchmarks for selective editing techniques. Examples were shown on the use of tax data and trade data with an emphasis on the need for direct pre-processing and edit and imputation procedures to define timely and accurate target variables that are needed for survey processing.

9 Use of administrative data for business surveys and economic data Other main points: Quality assessment presented in the papers: (a) indicators for evaluating definitions, consistency, correlation and distributions between survey variables and administrative data; (b) assessment of the effect of edit and imputation procedures on final point estimates and their variance. The importance of understanding the needs of users in order to produce “fit for use” data through selective editing techniques, as opposed to “perfect” data through full editing. Outlier detection as a form of selective editing that takes into account the skewed distributions of economic data.
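
As an illustration of outlier detection that allows for skewed economic data, the sketch below flags values using a robust z-score (median and median absolute deviation) computed on the log scale. This is a generic textbook approach, not necessarily the method used in any of the papers; the function name, the cutoff of 3.5 and the sample turnover values are all assumptions made for the example.

```python
import math
from statistics import median


def flag_outliers_log_mad(values, cutoff=3.5):
    """Flag outliers among strictly positive values.

    Values are log-transformed first so that a right-skewed
    distribution (typical of turnover or price data) becomes
    more symmetric, then a robust z-score is computed from the
    median and the median absolute deviation (MAD).
    """
    logs = [math.log(v) for v in values]
    med = median(logs)
    mad = median([abs(x - med) for x in logs])
    if mad == 0:
        # No spread in the data: nothing can be flagged.
        return [False] * len(values)
    # 0.6745 rescales the MAD so that the score is comparable
    # to a z-score under a normal distribution.
    return [abs(0.6745 * (x - med) / mad) > cutoff for x in logs]


# Hypothetical turnover values with one suspicious record.
turnover = [120, 95, 150, 110, 130, 105, 98000]
flags = flag_outliers_log_mad(turnover)
```

Only the last record is flagged here. The cutoff controls how many units are sent for review, so in practice it would be tuned to the data and the available editing resources.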

10 Use of administrative data for business surveys and economic data Other main points: Imputation models in the papers included the use of historical data, ratio imputation and nearest neighbor donor imputation, as well as imputation on both a micro and macro level. One example was the imputation for statistical units reporting bi-monthly or half-yearly on tax data to obtain timely monthly data. All papers emphasize the importance of quality and error checks on the final outputs based on the combined administrative and survey data sources.
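
Two of the ideas on this slide can be sketched in a few lines. The first function performs simple ratio imputation, using an administrative variable as the auxiliary; the second splits a bi-monthly total into monthly values in proportion to a reference series, for example the totals of units that report monthly. Both are minimal sketches under assumed data structures; the function names and figures are hypothetical and do not reproduce the procedures in the papers.

```python
def ratio_impute(survey, admin):
    """Impute missing survey values as ratio * administrative value.

    survey: dict mapping unit id -> reported value, or None if missing
    admin:  dict mapping unit id -> auxiliary administrative value
    The ratio is estimated from units that have both values.
    """
    respondents = [u for u, y in survey.items() if y is not None]
    ratio = sum(survey[u] for u in respondents) / sum(admin[u] for u in respondents)
    return {u: (y if y is not None else ratio * admin[u]) for u, y in survey.items()}


def split_period_total(total, monthly_reference):
    """Allocate a multi-month total to months using reference shares.

    monthly_reference: totals for the same months taken from units
    that do report monthly (an assumption of this sketch).
    """
    ref_sum = sum(monthly_reference)
    return [total * m / ref_sum for m in monthly_reference]


# Hypothetical example: unit C did not respond to the survey.
survey = {"A": 200.0, "B": 300.0, "C": None}
admin = {"A": 100.0, "B": 150.0, "C": 80.0}
imputed = ratio_impute(survey, admin)

# Hypothetical example: a bi-monthly tax total of 300 split over two months.
monthly = split_period_total(300.0, [400.0, 600.0])
```

With these inputs the estimated ratio is 2.0, so unit C is imputed as 160.0, and the bi-monthly total is split 120.0 and 180.0. The split preserves the reported total, which keeps the monthly series consistent with the tax data.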

11 Use of administrative data for business surveys and economic data Specific topics from the papers: At Statistics Canada, an Economic Census was developed based on high-quality administrative data, the Business Register and survey data. At Statistics New Zealand and the Census Bureau a comprehensive program is being carried out to incorporate more administrative data into the survey processes of economic data by replacing survey data of smaller units and moving towards selective editing techniques.

12 Use of administrative data for business surveys and economic data Specific topics from the papers: Both Statistics New Zealand and Statistics Denmark discuss edit and imputation processes specifically for trade statistics, where administrative data is fundamental to the imputation of missing and erroneous data. Statistics Canada presents different methods for outlier detection as a special case of selective editing techniques. Both Statistics Canada and the Census Bureau assess the quality of outputs based on administrative data by the impact on the efficiency of the final point estimates.

13 Combining Data Sources The emphasis of the papers is on linking multiple high-quality administrative data sources to model and impute target variables for social surveys. The more sources are linked together, the higher the risk of errors through conflicting values of variables. Each data source must be assessed for its completeness and accuracy to avoid introducing new errors into the statistical data. Administrative data improves the quality of statistical data through error localization, imputation models, outlier detection and selective editing techniques. It also reduces the need for edit and imputation. Boundaries between edit and imputation are constantly moving due to the use of multiple sources of data. Administrative data supports the detection and correction of errors. It also provides a source of data as a reference file for imputation.

14 Combining Data Sources Other main points: Administrative data supports both the error detection and error correction processes: (a) by supplementing survey data and allowing for better model specification for imputation, either by adding covariates or by actually replacing missing or erroneous data; (b) for use as a reference file to confirm erroneous values of variables and reasons for failed edit checks; (c) for quality assurance, to identify errors resulting from either the data collection phase or the data processing phase. Prior knowledge and understanding of the data in a multi-source data collection is essential for the selection and integration of the data sources.
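
The sketch below shows one simple way to combine values for the same variable from several linked sources: take the value from the most trusted source that has one, and flag the unit when the sources disagree so that it can be reviewed. It assumes that units have already been linked on a common identifier and that a fixed priority order is acceptable; the source names, the priority order and the tolerance are hypothetical choices for illustration.

```python
def combine_sources(sources, priority, tolerance=0.0):
    """Combine one variable from several linked sources.

    sources:   dict mapping source name -> {unit id: value or None}
    priority:  source names ordered from most to least trusted
    tolerance: largest difference between sources that is not
               treated as a conflict
    Returns {unit id: {"value", "source", "conflict"}}.
    """
    units = set().union(*(s.keys() for s in sources.values()))
    result = {}
    for unit in sorted(units):
        available = [
            (name, sources[name][unit])
            for name in priority
            if sources[name].get(unit) is not None
        ]
        if not available:
            result[unit] = {"value": None, "source": None, "conflict": False}
            continue
        values = [v for _, v in available]
        # A conflict means the sources disagree by more than the tolerance.
        conflict = (max(values) - min(values)) > tolerance
        name, value = available[0]
        result[unit] = {"value": value, "source": name, "conflict": conflict}
    return result


# Hypothetical employment counts from two administrative sources.
tax = {"u1": 10, "u2": 20}
register = {"u1": 10, "u2": 25, "u3": 7}
combined = combine_sources(
    {"tax": tax, "register": register}, priority=["tax", "register"]
)
```

Here unit u2 is flagged as a conflict, while u3 is filled from the register because the tax source has no record for it. A fixed priority order is only one option; the quality of each source could instead be assessed per variable or per unit.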

15 Combining Data Sources Specific topics from the papers: At Statistics Norway, multiple administrative data sources are linked to obtain employment characteristics. Electronic data capture has a large impact on the development of integrated and coherent statistical systems. The papers demonstrate methods for identifying units, timeliness of the variables, definitions and classifications in order to merge multiple administrative sources and develop imputation models for target variables not present in the data sources. CBS Israel has wide experience working with multiple sources of administrative data and their use in both the editing stage and the imputation stage, as well as in supporting other statistical processing.

16 Other Processes Supporting Edit and Imputation Many survey processes are based on traditional methods of collecting survey data. With more use of multiple data sources, statistical processing has to encompass all of the statistical data, both survey and administrative data. The edit and imputation processes and their validation provide important metadata, which can later be used to explain movements in the series to users. Other statistical processes supported by the edit and imputation processes are record linkage, coding and the imputation of new variables, as well as quality assessment of the final outputs.

17 Other Processes Supporting Edit and Imputation Other main points: The need to understand and interpret register data through a uniform reference frame and in a standard format is vital to both producers and users of the statistical data. Quality dimensions are enhanced by the use of administrative data with respect to coherence, consistency, comparability, completeness and accuracy. Imputation for new variables is supported by administrative data by providing better models, more covariates and definitions of weighting classes or the direct replacement by administrative data.

18 Other Processes Supporting Edit and Imputation Specific topics from the paper: Owners of administrative registers often do not hold information about the data in electronic format. The challenge for NSIs is to translate this information about the data into structured metadata. Statistics Finland uses the Common Structure of Statistical Information (CSOSI) method, and gives an example of the system when applied to personal taxation data and to the administrative information describing it. When registers are used in the survey process, producers of statistics must ensure that users gain a good understanding of the content so that they can interpret it accurately.

19 Editing Administrative Data and Combined Data Sources Points for Discussion

20 Use of administrative data for business surveys and economic data Points for discussion: How can differences in definitions, classifications and timeliness of variables in administrative data be reconciled with survey data without introducing new bias into the data? Can we automatically assume that administrative data has higher quality than survey data? How should thresholds be set below which administrative data should not be used at all? Can administrative data directly replace survey data? Quality measures in the papers focused on the efficiency of point estimates. Are there other quality measures that measure the impact of using administrative data in survey processes, in particular at a micro level?

21 Use of administrative data for business surveys and economic data Points for discussion: How can edit rules be managed and updated to take into account dynamic and constantly changing administrative sources? Selective editing thresholds described in the papers were determined by budget constraints. Can we incorporate historical data, external knowledge and the influence on the final estimates into the setting of thresholds? Selective editing techniques for administrative data target larger statistical units; however, smaller units are typically the ones used for replacing survey data. Is there a way to efficiently edit smaller units through selective editing techniques?
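
One common way to bring historical data and the influence on the final estimate into threshold setting is a score function: each unit's score is its weighted deviation from an anticipated value (for example last period's value or an administrative value), relative to the estimated total. The sketch below illustrates that idea only; it is not the procedure described in the papers, and the field names, weights and threshold are hypothetical.

```python
def selective_editing_scores(units, estimated_total):
    """Score each unit by its potential impact on the estimated total.

    units: list of dicts with keys "id", "weight", "reported" and
           "anticipated" (e.g. a historical or administrative value)
    """
    return {
        u["id"]: u["weight"] * abs(u["reported"] - u["anticipated"]) / estimated_total
        for u in units
    }


def units_to_review(scores, threshold):
    """Return ids whose score exceeds the threshold, largest first."""
    selected = [uid for uid, s in scores.items() if s > threshold]
    return sorted(selected, key=lambda uid: -scores[uid])


# Hypothetical units: "b" is small but carries a large weight.
units = [
    {"id": "a", "weight": 1, "reported": 1000, "anticipated": 950},
    {"id": "b", "weight": 20, "reported": 40, "anticipated": 10},
    {"id": "c", "weight": 5, "reported": 100, "anticipated": 98},
]
scores = selective_editing_scores(units, estimated_total=10000)
review = units_to_review(scores, threshold=0.004)
```

Because the score includes the weight, a small unit with a large weight (unit b) can rank above a large unit, which is one possible way to bring smaller units into selective editing. The threshold could then be chosen from the expected effect on the estimate rather than from the budget alone.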

22 Use of administrative data for business surveys and economic data Points for discussion: Outlier detection methodology is proposed as a selective editing technique, but it does not necessarily target the most influential units. Can the methodologies be combined, and how should thresholds be determined? Can selective editing techniques be carried out for multivariate editing? How can we measure the impact of multiple influential variables and set thresholds in this framework? How can better imputation models, which make more use of historical data and multiple data sources, be developed for administrative data as opposed to survey data? For example, can units reporting monthly be used to impute units reporting bi-monthly or half-yearly?

23 Combining Data Sources Points for discussion: Can we develop a mechanism to influence the methods of data collection from suppliers of administrative data in terms of content and format to ensure more generic pre-processing and edit and imputation processes? The integration of multiple data sources can result in introducing new errors. How should the quality of the variables be assessed in a multi-source data collection, in particular when having to choose between values for the same variable? The quality of administrative data can vary widely. When we consider combining data sources, should they all be of a similar quality?

24 Other Processes Supporting Edit and Imputation Points for discussion: Is there a mechanism by which we can influence the suppliers of administrative registers to collect and maintain metadata in a machine-readable format? How can we best integrate content information about administrative registers into the metadata describing the overall statistical processing operation, in particular when the data come in different formats?

25 Editing Administrative Data and Combined Data Sources Conclusions and Future Research

26 Underlying theme in all of the papers: The use of administrative data for survey processing and in particular for supporting efficient edit and imputation processes based on error localization techniques, imputation modeling and selective editing techniques, increases the quality of the statistical data and reduces response burden and costs. There is a clear need for standardization/harmonization of definitions and concepts to facilitate the use of multiple sources of administrative data within the survey process.

27 Future Research: The development of generic modules for editing and imputation of administrative data is particularly challenging since data collection methods and formats vary greatly depending on the source. More research needs to go into the development of common portals and electronic data collection which will have a direct effect on methods used for editing and imputation. Better modeling techniques, edit and imputation processes and quality indicators are needed to assess and correct administrative data prior to its use in statistical processing as well as to increase the quality of the final product.

28 Future Research: Further development of a time series methodology approach for error localization and imputation of administrative data which usually have rich historical data. Administrative data is diverse and may include both numerical and categorical data. The edit and imputation modules have to be able to handle both types of data. Better methods for setting selective editing thresholds for administrative data based on the influence of the variable as well as the development of a multi-variate framework.

29 Editing Administrative Data and Combined Data Sources Thank you for your attention! Natalie and Heather