Quality of administrative data


1 Quality of administrative data
ESTP Training Course “Quality Management and survey Quality Measurement”
Rome, 24 – 27 September 2013
Giovanna Brancato, Senior Researcher, Head of unit “Auditing, Quality and Harmonisation”, Istat
Antonia Boggia, Researcher, Unit “Auditing, Quality and Harmonisation”, Istat

2 Outline
1. Definitions and nature of administrative data
2. Reasons for using administrative data in statistical production
3. Main uses of administrative data in statistical production
4. Quality in statistics produced from administrative data

3 Why administrative data? / Growing interest in…
ESS Statistics Code of Practice
Principle 10. Cost effectiveness
Indicator 10.3: Proactive efforts are made to improve the statistical potential of administrative data and to limit recourse to direct surveys.
Principle 9. Non-excessive burden on respondents
Indicator 9.4: Administrative sources are used whenever possible to avoid duplicating requests for information.
Principle 2. Mandate for data collection
Indicator 2.2: The statistical authorities are allowed by law to use administrative data for statistical purposes.

4 Why administrative data?
ESS Statistics Code of Practice
Principle 8. Appropriate statistical procedures
Indicator 8.1: When European Statistics are based on administrative data, the definitions and concepts used for administrative purposes are a good approximation to those required for statistical purposes.
Indicator 8.7: Statistical authorities are involved in the design of administrative data in order to make administrative data more suitable for statistical purposes.
Indicator 8.8: Agreements are made with owners of administrative data which set out their shared commitment to the use of these data for statistical purposes.
Indicator 8.9: The statistical authorities co-operate with owners of administrative data in assuring data quality.
Principle 8 on Appropriate Statistical Procedures includes many indicators related to the procedure when using administrative data, here only the one that states

5 1. Definitions and nature of administrative data
Many definitions available for: administrative record, data, register, source
Administrative data is the set of units and data derived from an administrative source
An administrative source is the organisational unit responsible for implementing an administrative regulation (or group of regulations) for which the corresponding register of units and the transactions are viewed as a source of statistical data (OECD Glossary of Statistical Terms)

6 1. Definitions and nature of administrative data
Administrative versus Statistical data
- The administrative source, rather than the NSI, is in charge of the data
- Data are collected for administrative rather than statistical purposes
- Interest is in the single objects (person, business, …) rather than in statistical populations

7 1. Definitions and nature of administrative data
Main purposes of administrative data (Brackstone, 1987)
- Regulation of flow of goods and people across borders
- Legal requirements related to particular events
- Administration of benefits or obligations
- Administration of public institutions
- Government regulation of industry
- Provision of utilities

8 1. Definitions and nature of administrative data
Some examples
- Tax data
  - Personal income tax
  - Value Added Tax (VAT)
  - Business / profits tax
- Social security data
- Health / education records
- Published business accounts
- Internal accounting data
- …..
Data from private businesses are also becoming increasingly available: utility companies, telephone directories, …

9 1. Nature of administrative data
A model for statistical registers by object type and subject field (Stats Sweden is not responsible for all of them) Wallgren A., Wallgren B. (2007)

10 2. Reasons for using administrative data in statistical production
- Cost reduction
- Reduced burden on respondents
- Technological opportunities
- Possible improvements in terms of
  - Relevance (small area estimation, …)
  - Accuracy (coverage, no sampling error, no missing data)
  - Timeliness
- Improved public image of the statistical authority
- Use encouraged by the CoP

11 3. Main uses of administrative data in statistical production
To build a statistical register to be used for
- Direct statistical tabulations
- Sample selection and use of auxiliary information in the estimation process
To completely or partially replace survey data with respect to sub-population(s) or subset of variables, or to support the survey process
- Post stratification
- Survey estimation
- Imputation, ….
Survey quality evaluation
- Validation before publishing results
- Coherence indicators
- Studies on quality (nonrespondent characteristics, ...)
The first corresponds to an objective in terms of microdata, the second macro.

12 3. Main uses of administrative data in statistical production
Whatever the use of administrative data in the statistical production process, the administrative register is rarely usable as it is; in general, transformations are necessary. Wallgren & Wallgren (2007) summarise the steps usually needed to pass from an administrative register to a statistical register.

13 3. Main uses of administrative data in statistical production
Administrative Register
Processing steps
- Variable harmonisation
- Unit harmonisation
- Check of basic data (editing)
- Handling of missing objects and of missing values
- Linking, matching, joint processing
- Processing of time references
- Creating derived objects
- Creating derived variables
Statistical Register
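As a rough illustration of three of the steps listed above (linking on a common key, processing of time references, creating a derived variable), here is a minimal Python sketch. The records, the field names `pid`, `birth_year` and `income`, the reference year 2013 and the helper `link_registers` are all hypothetical and chosen only for the example; they are not taken from Wallgren & Wallgren (2007).

```python
def link_registers(base, other, key):
    """Exact-key linkage of two registers (hypothetical helper).

    Returns the linked records and the key values of base objects
    for which no match was found in the other register.
    """
    other_by_key = {rec[key]: rec for rec in other}
    linked, unlinked = [], []
    for rec in base:
        match = other_by_key.get(rec[key])
        if match is None:
            unlinked.append(rec[key])          # nonlinking object
        else:
            linked.append({**match, **rec})    # joint record
    return linked, unlinked


# Hypothetical population register and income register.
persons = [
    {"pid": 1, "birth_year": 1950},
    {"pid": 2, "birth_year": 1990},
    {"pid": 3, "birth_year": 2005},
]
incomes = [
    {"pid": 1, "income": 18000},
    {"pid": 2, "income": 32000},
]

linked, unlinked = link_registers(persons, incomes, "pid")

# Derived variable: age at an assumed reference year.
REFERENCE_YEAR = 2013
for rec in linked:
    rec["age"] = REFERENCE_YEAR - rec["birth_year"]

link_rate = len(linked) / len(persons)
print(linked, unlinked, round(link_rate, 2))
```

In practice linkage is often probabilistic rather than exact, and the list of nonlinking objects is itself a quality indicator worth documenting.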

14 3. Some examples of processes using administrative sources
- Marriages
- Separation of spouses, Marriage dissolution and termination of marriage civil effects (divorces)
- OROS Survey (Employment, earnings and social security contributions) based on Inps records (dm 10 forms)
- European Union Statistics on Income and Living Conditions (EU-SILC)

15 4. Quality in statistics produced from administrative data
- Output quality: quality of administrative data-based survey vs. quality of the register (one objective vs. many objectives)
- Sources of errors in surveys vs. administrative data-based surveys
- Input and Through-put quality
[Author's note: I do not develop the part on "one objective" at all; perhaps it could almost become input quality?]

16 4. Quality in statistics produced from administrative data
The quality of an administrative data-based survey (output quality) is defined according to the dimensions of the EU quality vector:
- Relevance
- Accuracy
- Timeliness & Punctuality
- Accessibility & Clarity
- Comparability
- Coherence

17 4. Quality in statistics produced from administrative data
Concerning accuracy and the main sources of errors:
Errors / Sample survey / Statistics based on admin data
Sampling
Non Sampling
- specification
- coverage
- unit nonresponse ?
- item nonresponse
- measurement
- processing

18 4. Quality in statistics produced from administrative data
Specification errors
Concepts and definitions underlying the object of the administrative register may not match the statistical concepts

19 4. Quality in statistics produced from administrative data
Coverage errors
- coverage with respect to the administrative population, i.e. those units (individuals, businesses) to whom the administrative function applies
- coverage with respect to the statistical population, i.e. those units (individuals, businesses) representing the target population
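Coverage with respect to the statistical population can be quantified when some reference list of target units is available. The Python sketch below assumes such a list exists (for instance from a coverage check); the function name `coverage_rates` and the unit identifiers are hypothetical.

```python
def coverage_rates(register_ids, target_ids):
    """Over- and undercoverage of a register against a reference list.

    Assumes a list of target-population identifiers is available,
    which in practice is rarely known exactly.
    """
    register, target = set(register_ids), set(target_ids)
    over = register - target    # in the register, not in the target population
    under = target - register   # in the target population, missing from the register
    return {
        "overcoverage_rate": len(over) / len(register) if register else 0.0,
        "undercoverage_rate": len(under) / len(target) if target else 0.0,
    }


# Hypothetical identifiers: unit A is out of scope, units E and F are missing.
rates = coverage_rates(["A", "B", "C", "D"], ["B", "C", "D", "E", "F"])
print(rates)
```

Here overcoverage is expressed as a share of the register and undercoverage as a share of the target population; other denominators are possible and should be stated when reporting.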

20 4. Quality in statistics produced from administrative data
Nonresponse errors
- unit nonresponse: technically it does not apply (units not included in the register represent undercoverage); however, some records may have missing values on the most important variable, in which case these units are treated as nonresponses
- item nonresponse: missing data are rarely observed in the variables of interest for the administrative regulation, but may strongly affect the variables that are not of interest to the administrative data owner
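The contrast between variables that matter to the data owner and those that do not can be checked with a simple per-variable missing-value rate. This is a minimal Python sketch on made-up records; the variable names (`income` as the variable of administrative interest, `education` as a secondary one) and the convention that `None` and the empty string count as missing are assumptions for the example.

```python
def item_nonresponse_rates(records, variables, missing=(None, "")):
    """Share of records with a missing value, for each variable."""
    n = len(records)
    rates = {}
    for var in variables:
        n_missing = sum(1 for rec in records if rec.get(var) in missing)
        rates[var] = n_missing / n if n else 0.0
    return rates


# Hypothetical tax records: income is needed by the administration,
# education is collected but not used for the administrative purpose.
records = [
    {"tax_id": "1", "income": 100, "education": None},
    {"tax_id": "2", "income": 200, "education": "secondary"},
    {"tax_id": "3", "income": 150, "education": ""},
    {"tax_id": "4", "income": None, "education": None},
]

nonresponse = item_nonresponse_rates(records, ["income", "education"])
print(nonresponse)
```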

21 4. Quality in statistics produced from administrative data
Measurement/response errors
- in statistical surveys they are due to: respondent, interviewer, questionnaire, data collection mode
- in administrative data they may depend on: intentional errors (false declaration), lack of accuracy in collecting data for variables not of interest, drawbacks in the administrative systems used to capture and maintain the data
- related to the specification error on the variables (systematic nature)

22 4. Quality in statistics produced from administrative data
Processing errors
- administrative processing: difficult to track, need for accurate documentation from the provider of the register
- statistical processing

23 4. Quality in statistics produced from administrative data
Input quality
Quality of the sources. It refers not only to the quality of the data contained in the administrative file, but also to the documentation of the metadata and to the conditions in which the administrative data are obtained
Hyperdimensions, dimensions and quality indicators*
- Source
- Metadata
- Data
* Checklist for the quality evaluation of the Administrative Data Sources (Stats. Netherlands, 2009)

24 4. Quality in statistics produced from administrative data
Hyperdimension: Source
Dimensions (indicators)
- Supplier (contact, purpose)
- Relevance (usefulness, use, information demand, response burden)
- Privacy and security (legal provision, confidentiality, security)
- Delivery (cost, arrangements, punctuality, format, selection)
- Procedures (data collection, planned changes, feedback, fall-back scenario)

25 4. Quality in statistics produced from administrative data
Hyperdimension: Metadata
Dimensions (indicators)
- Clarity (definition of units and variables, time dimension, changes)
- Comparability (of units, variables, time references)
- Unique keys (identification keys, unique combination of variables)
- Data treatment (checks, modifications)

26 4. Quality in statistics produced from administrative data
Hyperdimension: Data
Dimensions (indicators)
- Technical checks (readability, metadata compliance)
- Overcoverage (non-population units)
- Undercoverage (missing units, selectivity, …)
- Linkability (linkable units, mismatches, …)
- Unit nonresponse (units without data, selectivity, …)
- Item nonresponse (missing values, selectivity, …)
- Measurement (external checks, incompatible records, …)
- Processing (adjustments, imputation, outliers, …)
- Precision (standard errors, …)
- Sensitivity (missing values, selectivity, …)

27 4. Quality in statistics produced from administrative data
Input quality
Indicators*
Dimensions
- Technical checks
- Accuracy
- Completeness
- Time related dimension
- Integrability
* BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

28 4. Quality in statistics produced from administrative data
Source: BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

29 4. Quality in statistics produced from administrative data
Source: BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

30 4. Quality in statistics produced from administrative data
Source: BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

31 4. Quality in statistics produced from administrative data
Source: BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

32 4. Quality in statistics produced from administrative data
Source: BLUE-Enterprise and Trade Statistics (Blue-Ets, Work Package 4)

33 4. Quality in statistics produced from administrative data
Administrative Register
Through-put quality
- Contacts with suppliers
- Checking received data
- Causes and extent of missing values and objects
- Causes and extent of nonlinking objects
- Register maintenance surveys
- Evaluate accuracy of objects and variables
- Investigation and documentation of inconsistencies
Statistical Register

34 4. Quality in statistics produced from administrative data
Remark
Care should be taken when analysing the relationship among variables obtained by record linkage of two or more registers. Scheuren and Winkler (1993, 1997) pointed out that regression analysis involving variables coming from two different files linked together can be seriously biased by false links, i.e. paired records that do not refer to the same unit. They suggest a way to reduce the bias. The work by Lahiri and Larsen (2004) goes in the same direction.
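The effect of false links on a regression can be illustrated with a small simulation. The Python sketch below is only an illustration of the bias, not the adjustment proposed by Scheuren and Winkler or by Lahiri and Larsen. It assumes a simple linear model and that false links pair a record with the outcome of a unit chosen completely at random; the sample size, true slope, false-link rate and seed are arbitrary choices.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, beta=2.0, false_link_rate=0.2, seed=1):
    """Compare the slope on correctly linked data with the slope
    obtained when a share of the links are false (assumed random)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]             # variable from file A
    y = [beta * xi + rng.gauss(0, 0.5) for xi in x]     # variable from file B
    y_linked = list(y)
    for i in range(n):
        if rng.random() < false_link_rate:
            # False link: y is taken from a randomly chosen unit.
            y_linked[i] = y[rng.randrange(n)]
    return ols_slope(x, y), ols_slope(x, y_linked)


true_slope, linked_slope = simulate()
print(round(true_slope, 2), round(linked_slope, 2))
```

Under these assumptions the slope estimated on the linked file is pulled towards zero, by a factor of roughly one minus the false-link rate; with non-random linkage errors the direction and size of the bias may differ.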

35 Main References
Blue-Ets, BLUE-Enterprise and Trade Statistics
Brackstone G.J. (1987). Issues in the Use of Administrative records for Statistical purposes. Survey Methodology, 1987, 13, 1,
Lahiri P., Larsen M. (2004). Regression analysis with linked data. Dept. of Statistics Preprint 04-9, Iowa State University
Scheuren and Winkler (1993). Regression Analysis of Data Files that Are Computer Matched – Part I. Survey Methodology, 19, 1, pp
Scheuren and Winkler (1997). Regression Analysis of Data Files that Are Computer Matched – Part II. Survey Methodology, 23, pp
Statistics Finland (2004). Use of Registers and Administrative Data Sources for Statistical Purposes. Best Practice of Statistics Finland
Statistics Sweden (2001). The future development of the Swedish register system. R&D Report
UNECE (2011). Using Administrative and Secondary Sources for Official Statistics: A Handbook of Principles and Practices
Wallgren A., Wallgren B. (2007). Register-based Statistics. Administrative data for statistical purposes. John Wiley & Sons, The Atrium, Southern Gate, Chichester

