Determination of Administrative Data Quality: Recent results and new developments. Piet J.H. Daas, Saskia J.L. Ossen, and Martijn Tennekes, Statistics Netherlands.


1 Determination of Administrative Data Quality: Recent results and new developments
Piet J.H. Daas, Saskia J.L. Ossen, and Martijn Tennekes
Statistics Netherlands
May 6, 2010, Helsinki, Finland

2 Overview
- Introduction
- View on quality
- Framework developed for admin. data sources
  - Construction and composition
- Application (first part)
  - Checklist and results
- New developments
  - Ideas and future work
  - BLUE-ETS

3 Introduction
- Statistics Netherlands is increasing its use of data (sources) collected and maintained by others
  - To decrease response burden and costs
- As a result, Statistics Netherlands:
  - Becomes more dependent on administrative data sources
  - Must be able to monitor the quality of those data sources
    - What is ‘quality’ in this context?

4 View on quality
- Statistics Netherlands defines quality of administrative data sources as: “Usability for the production of statistics”
- Differs from ‘quality’ as used by the data source keeper
  - The keeper often does not have statistical use in mind
  - So the quality report of the data source keeper (if available) can’t be used
- And it is quality of the input!

5 Framework developed
- No standard framework available for input quality of administrative data sources
- Quality of administrative data is only occasionally addressed in the literature
  - Majority of studies on quality and statistics focus on:
    - output quality
    - quality of survey data
- Framework for the determination of the quality of administrative data sources based on:
  - Statistics Netherlands’ experiences and ideas
  - Results published by others

6 Framework overview (1)
- Many quality indicators were identified
  - In total 57!
- Many dimensions were identified
  - In total 19
- How to combine and structure these indicators?
  - Distinguish different views on quality
  - Alternative name is Hyperdimensions
- 3 Hyperdimensions were required to combine all quality indicators into a single framework!!
  - First step towards a structured approach

7 Framework overview (2)
- Three high-level views on the input quality of administrative data sources
  - 3 hyperdimensions

8 Framework overview (2)
- Three high-level views on the input quality of administrative data sources
  - 3 hyperdimensions
- 3 different high-level views on quality

9 The three hyperdimensions
- SOURCE:
  - Focus on data source as a whole
  - Delivery related aspects
  - and some other things
- METADATA:
  - Focuses on the (availability of the) information required to understand and use the data in the data source
- DATA:
  - Technical checks
  - Accuracy related issues

10 Determine Source and Metadata quality
- With a checklist
  - Used for both Source and Metadata
- Tested on 8 administrative data sources
  - Took on average about 2 hours per data source
- Results expressed at the dimensional level
  - 5 dimensions for Source, 4 for Metadata

11 Checklist results (1) - Source
[Table of scores per data source and dimension; not included in the transcript]
Legend: +, good; o, reasonable; -, poor; ?, unclear
- IPA: Insurance Policy records Administration
- 1FigHE: coordinated register for Higher Education
- SFR: Student Finance Register
- 1FigSGE: coordinated register for Secondary General Education
- CWI: register of Centre for Work and Income
- NCP: National Car Pass register
- ERR: Exam Results Register
- MBA: Dutch Municipal Base Administration

12 Checklist results (2) - Metadata
[Table of scores per data source and dimension; not included in the transcript]
Legend: +, good; o, reasonable; -, poor; ?, unclear
- IPA: Insurance Policy records Administration
- 1FigHE: coordinated register for Higher Education
- SFR: Student Finance Register
- 1FigSGE: coordinated register for Secondary General Education
- CWI: register of Centre for Work and Income
- NCP: National Car Pass register
- ERR: Exam Results Register
- MBA: Dutch Municipal Base Administration

13 Overall conclusions
- Data sources
  - CWI is the only negatively scoring data source
    - Tempted to recommend not using it!
    - Result of delivery issues and vague definitions
    - However, it is the only administrative data source that contains educational data on the non-student part of the population!
    - Solve the weaknesses!!
  - Other data sources
    - Quite OK (there are always some things you can improve)
    - Data processing by data source keeper needs attention
- Checklist
  - Good way to assist the user, quite fast
  - Quality information on a basic but essential level
  - Not all information is commonly known!

14 What about the Data hyperdimension?
- How to study data quality?
  - A draft list of indicators is available
    - 10 dimensions and 26 indicators
  - A structured approach needs to be developed!
    1. Data inspection should be efficient
    2. Assist user with scripts/software (where possible)
  - A checklist?

15 Overview of data quality approach

16 Data: Technical checks
- Very basic
  - For RAW data
  - Should be easy and quick
  - No other info required!
- Examples
  - File size
  - Number of (unique) units / records received
  - Metadata compliance (standard for XML-files)
  - Visual checks (Data fingerprinting)
    - 2 examples
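
The technical checks on slide 16 could be scripted roughly as follows. This is a minimal sketch, not the scripts used at Statistics Netherlands: it assumes the raw delivery is a comma-separated text with a header row, and the function name `technical_checks`, the column names and the sample records are all hypothetical. It reports file size, number of records, number of unique units, and empty cells per column (a crude input for a missing-data plot).

```python
import csv
import io


def technical_checks(raw_text, id_column):
    """Run basic checks on a raw delimited delivery; no other source is needed."""
    reader = csv.DictReader(io.StringIO(raw_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # Count empty cells per column as a crude input for a missing-data plot.
    missing = {
        col: sum(1 for row in rows if not (row.get(col) or "").strip())
        for col in columns
    }
    return {
        "size_bytes": len(raw_text.encode("utf-8")),
        "records": len(rows),
        "unique_units": len({row[id_column] for row in rows}),
        "missing_cells": missing,
    }


# Hypothetical three-record delivery in which unit 2 appears twice.
sample = "unit_id,birth_year,municipality\n1,1980,Utrecht\n2,,Amsterdam\n2,1975,\n"
checks = technical_checks(sample, "unit_id")
print(checks["records"], checks["unique_units"], checks["missing_cells"])
```

A gap between the record count and the unique-unit count is the kind of quick signal these checks are meant to give before any linking to other sources takes place.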

17 Technical checks: Visualization examples
- Missing data
- ‘Data fingerprinting’

18 Data: Accuracy related indicators
- First true indicators in the process
  - Information from other data sources is required
- Examples of indicators for units
  - Over coverage indicator
    - Units in source not belonging to NSI-population
  - Under coverage indicators
    - Missing units
    - NSI-population units not in source
    - Selectivity
    - Representativity of units in data source compared to NSI-population (RISQ-project)
  - Linkability indicators
    - Correctly linked units, incorrectly linked units, and selectivity of linked units
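
As an illustration of the unit-level indicators on slide 18, the sketch below computes over coverage, under coverage and a simple linked share from two sets of unit identifiers. The choice of denominators (source size for over coverage and linking, population size for under coverage) is an assumption made for this example, not a definition taken from the presentation, and the function name and identifiers are hypothetical. Selectivity and correctness of links need more information than identifiers alone and are not covered.

```python
def coverage_indicators(source_ids, population_ids):
    """Compare unit identifiers in a source with those in the NSI population."""
    source = set(source_ids)
    population = set(population_ids)
    over = source - population      # units in source not belonging to the population
    under = population - source     # population units not in source
    linked = source & population    # units found in both
    return {
        # Assumed denominators: source size and population size respectively.
        "over_coverage": len(over) / len(source) if source else 0.0,
        "under_coverage": len(under) / len(population) if population else 0.0,
        "linked_share": len(linked) / len(source) if source else 0.0,
    }


# Hypothetical identifiers: units 1 and 2 are outside the population,
# units 6, 7 and 8 are missing from the source.
indicators = coverage_indicators([1, 2, 3, 4, 5], [3, 4, 5, 6, 7, 8])
print(indicators)
```

Because these indicators compare the source with the NSI population, they can only be computed once a population frame from another data source is available, which matches the slide's remark that information from other data sources is required.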

19 Data: Output related indicators
- Report data quality on an aggregated level
  - Quality of the output!
  - Need to link input quality to output quality
- Examples of indicators:
  - Precision of estimates of core variables
  - Selectivity of core variable totals

20 How to report data quality?
- ‘Quality Report Card’
  - Paper / computerized version
  - Place where all results are combined and orderly presented
- Which indicators always?
  - Is there a basic/minimum set?
  - Hierarchy of quality indicators
- Which indicators can be automatically determined?
  - Create standardized scripts
  - Create a software prototype
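
One way to picture a computerized ‘Quality Report Card’ is a small structure that collects scores per hyperdimension and prints them in order. The sketch below is only an assumption about what such a prototype might look like: the class `ReportCard`, the example indicator names and the example scores are hypothetical, and it reuses the +, o, -, ? legend from the checklist slides for all hyperdimensions, which the presentation does not state.

```python
from dataclasses import dataclass, field

# Score symbols as used in the checklist legend.
SCORES = {"+": "good", "o": "reasonable", "-": "poor", "?": "unclear"}


@dataclass
class ReportCard:
    source_name: str
    # Maps hyperdimension -> {indicator name: score symbol}.
    results: dict = field(default_factory=dict)

    def add(self, hyperdimension, indicator, score):
        """Record one result; reject symbols outside the legend."""
        if score not in SCORES:
            raise ValueError(f"unknown score {score!r}")
        self.results.setdefault(hyperdimension, {})[indicator] = score

    def render(self):
        """Return all results as plain text, grouped by hyperdimension."""
        lines = [f"Quality Report Card: {self.source_name}"]
        for hyperdimension, indicators in self.results.items():
            lines.append(hyperdimension)
            for name, score in indicators.items():
                lines.append(f"  {score} {name} ({SCORES[score]})")
        return "\n".join(lines)


# Hypothetical entries for an unnamed example register.
card = ReportCard("Example register")
card.add("Source", "Delivery", "-")
card.add("Data", "Over coverage", "o")
print(card.render())
```

Keeping the results grouped by hyperdimension leaves room for the hierarchy of indicators and the basic/minimum set that the slide raises as open questions.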

21 Future plans
- Fully focus on Data hyperdimension
  - Is a lot of work!
- Study this in a European context
  - BLUE-Enterprise and Trade Statistics project
    - 7th Framework program
    - From till
    - One of the topics is the study of admin. data quality
    - This topic is studied jointly by the NSIs of: Netherlands, Italy, Norway, Slovakia, Sweden

22 Thank you for your attention!
- More details in the Q2010-paper
- Checklist can be obtained
  - From the Statistics Netherlands website
  - By mailing and requesting a copy