Quality framework for the evaluation of administrative data (to be used for statistics) Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven.


1 Quality framework for the evaluation of administrative data (to be used for statistics) Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven Statistics Netherlands

2 Overview
• Reason for work
• View on quality
• Starting point
• Combined results
• The quality framework
• Application
• Future work

3 Reason for work
• Statistics Netherlands increases the use of data (sources) collected and maintained by others
  – To decrease response burden and costs
• As a result:
  – More dependent on administrative data sources
  – Must be able to monitor the quality of such data sources
    – How?

4 View on quality
• Statistics Netherlands definition of the quality of administrative data sources:
  – “Usability for the production of statistics”
• Differs from quality as used by the data source maintainer
  – Often does not have statistical use in mind
  – Can’t use the quality report of the data source maintainer (if available)

5 Starting point
• Stat. Netherlands paper of Daas & Fonville
  – Register seminar 2007, Helsinki, Stat. Finland
  – Hands-on approach, limited scope (Dutch view)
• Should include experiences of others
  – Papers and books that studied the quality of administrative data sources and registers
  – Excluded papers that focused only on the quality of surveys
  – Ended up with quite a limited list of important papers:
    – Book by the Wallgrens (S)
    – Eurostat paper on quality of administrative data
    – Work performed at ONS and by Thomas (UK)
    – UNECE paper of Nordic countries

6 Combined results
• Conclusions:
  – A general level of mutuality
    – The papers identified many similar quality aspects (quality indicators)
  – None of the ‘views’ on quality were exactly alike
  – How to combine all these views?
  – Something higher than a dimension was needed
  – Karr et al. (2006)* used the term hyperdimension to distinguish different views on quality
  – Combine all quality aspects identified in all studies and new aspects in a single framework!
* Karr et al. (2006) Stat. Methodol. 3, pp. 137-173

7 Quality framework
• Framework has 4 hyperdimensions
  – Four views on the quality of the external data source
• The hyperdimensions identified are:
  – Source → Data source as a whole
  – Metadata → Conceptual metadata of data in source
  – Data → Facts (values) in data source
  – Process → Processing-related quality aspects

8 Quality framework levels
• Levels distinguished:

9 1) Source hyperdimension
• Here the data source is viewed as a file delivered by the data source maintainer to the NSI
• Dimensions (5): Supplier, Relevance, Privacy and security, Delivery, and Procedures

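10 Source hyperdimension

| Hyperdimension | Dimension            | Indicator       | Measurement method                                    |
|----------------|----------------------|-----------------|-------------------------------------------------------|
| Source         | Supplier             | Contact         | Name, contact information                             |
|                | Relevance            | Adm. burden     | Effect of use on adm. burden of NSI (time and money)  |
|                | Privacy and security | Legal provision | Check if Personal Data Protection act applies         |
|                | Delivery             | Costs           | Costs of use for NSI                                  |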

11 2) Metadata hyperdimension
• Focuses on the conceptual metadata quality aspects of the data source
• Other metadata aspects (such as process metadata) are not included
• Dimensions (4): Clarity, Comparability, Unique keys, and Data treatment by data source maintainer

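12 Metadata hyperdimension

| Hyperdimension | Dimension                                | Indicator                   | Measurement method                                |
|----------------|------------------------------------------|-----------------------------|---------------------------------------------------|
| Metadata       | Clarity                                  | Population definition       | Description of the population used in data source |
|                | Unique keys                              | Identification keys present | Presence of unique keys (which)                   |
|                | Data treatment by data source maintainer | Checks                      | Variable value checks performed                   |
|                |                                          | Modifications               | Familiarity with data modifications               |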

13 3) Data hyperdimension
• Aspects related to data in the data source
  – All aspects are accuracy related
• Actively being discussed at our office
  – Future changes are very likely
• Dimensions (9): Over coverage, Under coverage, Linkability, Unit non-response, Item non-response, Measurement, Processing, Precision, and Sensitivity
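As an illustration only, the hyperdimensions and the dimensions named on the slides can be written down as a nested mapping, which makes it easy to check the stated counts (5, 4 and 9). The names `FRAMEWORK` and `dimension_counts` are assumptions made for this sketch and are not part of the Statistics Netherlands work; the Process entry is left empty because its dimensions are not given in the presentation.

```python
# Hypothetical representation of the framework; dimension names are taken from the slides.
FRAMEWORK = {
    "Source": [
        "Supplier", "Relevance", "Privacy and security", "Delivery", "Procedures",
    ],
    "Metadata": [
        "Clarity", "Comparability", "Unique keys",
        "Data treatment by data source maintainer",
    ],
    "Data": [
        "Over coverage", "Under coverage", "Linkability", "Unit non-response",
        "Item non-response", "Measurement", "Processing", "Precision", "Sensitivity",
    ],
    "Process": [],  # dimensions not defined in the presentation (future work)
}


def dimension_counts(framework):
    """Return the number of dimensions listed under each hyperdimension."""
    return {name: len(dims) for name, dims in framework.items()}


print(dimension_counts(FRAMEWORK))
```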

14 Data hyperdimension

| Hyperdimension | Dimension     | Indicator            | Measurement method                                      |
|----------------|---------------|----------------------|---------------------------------------------------------|
| Data           | Over coverage | Non-pop. units       | Percentage of units not belonging to population of NSI  |
|                | Linkability   | Linkable units       | Percentage of units linked                              |
|                | Measurement   | Incompatible records | Fraction of fields with violated edit rules             |
|                | Processing    | Adjustment           | Fraction of fields adjusted                             |
|                |               | Imputation           | Fraction of fields imputed                              |

R-index: Representative index; RMSE: Root Mean Square Error; MSE: Mean Square Error
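A minimal sketch of how three of the indicators in this table might be computed, assuming units are identified by simple IDs and that violated, adjusted, or imputed fields are available as boolean flags. The function names and the example values are hypothetical; the presentation itself does not specify the calculation methods.

```python
def over_coverage_pct(source_ids, population_ids):
    """Percentage of source units that do not belong to the NSI population."""
    source = set(source_ids)
    if not source:
        return 0.0
    return 100.0 * len(source - set(population_ids)) / len(source)


def linkable_pct(source_ids, reference_ids):
    """Percentage of source units that can be linked to a reference set of IDs."""
    source = set(source_ids)
    if not source:
        return 0.0
    return 100.0 * len(source & set(reference_ids)) / len(source)


def fraction_flagged(flags):
    """Fraction of fields whose flag is True (e.g. violated, adjusted, imputed)."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


# Made-up example data, for illustration only.
source_ids = [1, 2, 3, 4, 5]
population_ids = [1, 2, 3, 4, 6, 7]
reference_ids = [2, 3, 4, 5, 9]

print(over_coverage_pct(source_ids, population_ids))   # unit 5 is outside the population
print(linkable_pct(source_ids, reference_ids))         # units 2, 3, 4, 5 link
print(fraction_flagged([True, False, False, True]))    # two of four fields flagged
```

The same `fraction_flagged` helper serves the Measurement, Adjustment, and Imputation rows, since each is described as a fraction of fields meeting some condition.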

15 4) Process hyperdimension
• Focuses on the processing of the data source
  – by the data source maintainer
  – by the NSI
• Not discussed here, future work
• Framework was developed without specifically focusing on process-related quality aspects
  – Main focus was product related

16 Scope of the framework
• Developed for administrative data
  – Registers and other secondary data sources
• It could also be used for surveys when:
  – the data is collected by an organization other than Statistics Netherlands
• Do not use the whole framework for statistical data sources
  – Too much, use only parts of it

17 Application of the framework
• How to apply?
  – Source and Metadata hyperdimension
    – Checklists have been developed
  – Data hyperdimension
    – Methods of calculation have been proposed
    – Currently looking at a practical means to apply these
  – Process hyperdimension
    – Under investigation
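The checklists themselves are not shown in the presentation. As a rough sketch of what a checklist entry for the Source and Metadata hyperdimensions could look like, the example below records each indicator with its measurement method and an evaluator's answer. The `ChecklistItem` class, the `open_items` helper, and the sample answer are all hypothetical; only the dimension, indicator, and method texts come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChecklistItem:
    hyperdimension: str
    dimension: str
    indicator: str
    method: str
    answer: Optional[str] = None  # filled in by the evaluator; None means still open


def open_items(items):
    """Return the checklist items that have not been answered yet."""
    return [item for item in items if item.answer is None]


checklist = [
    ChecklistItem("Source", "Supplier", "Contact", "Name, contact information",
                  answer="J. Example, example@supplier.invalid"),  # made-up answer
    ChecklistItem("Source", "Delivery", "Costs", "Costs of use for NSI"),
    ChecklistItem("Metadata", "Data treatment by data source maintainer", "Checks",
                  "Variable value checks performed"),
]

for item in open_items(checklist):
    print(f"{item.hyperdimension} / {item.dimension} / {item.indicator}: {item.method}")
```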

18 Future work
• Evaluate various administrative data sources
  – Is the framework generally applicable to all external data sources?
• Thoroughly test Source and Metadata checklists
  – Feedback on usability by users
• Calculation methods for Data
  – Determine the quality indicators for various sources
• Study how to efficiently evaluate Data
  – E.g. scripts or computer program
• Study the quality aspects in the Process hyperdimension

19 Questions?

