UNECE Seminar on New Frontiers for Statistical Data Collection, Geneva


1 UNECE Seminar on New Frontiers for Statistical Data Collection, Geneva
A quality monitoring system for statistics based on administrative data
Manuela Lenk
Statistics Austria, Registers, Classifications and Methods Division
31st Oct.– 2nd Nov. 2012

2 Register-based census in Austria
First register-based census in Austria: 2011
  Full census, no sampling
Census topics
  Population census, housing census, census of enterprises and their local units of employment
Data availability
  On municipality level
  Geo-codes
  Statistical databases
  Interactive maps

3 Quality assessment of the census
Application of a quality framework
  The framework is independent of data processing, so it can be applied to other statistical projects
  Data processes can be evaluated without influencing them
Three stages of quality evaluation
  Raw data: registers provided by the data holders
  Central Database (CDB): combined information from the registers; data are merged by a unique key
  Final Data Pool (FDP): final data including imputations

4 Quality framework - Overview

5 Quality assessment on register level I
Calculation of quality indicators
  Each attribute in each register gets a quality indicator between 0 and 1
  The quality calculation is based on three so-called hyperdimensions (HD)
HD Documentation
  Focuses on factors that may predetermine data quality
  Realized by a questionnaire that is filled out together with the data authority
  Questions are weighted by their impact on data quality
  Quality indicator = obtained score / maximum obtainable score
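The ratio for HD Documentation can be illustrated with a minimal Python sketch. The slide only states that questions are weighted and that the indicator is the obtained score divided by the maximum obtainable score; the function name, the (weight, score) representation and the example values below are assumptions for illustration, not the questionnaire actually used.

```python
def documentation_quality(answers):
    """Return obtained score / maximum obtainable score.

    `answers` is a list of (weight, score) pairs, where `score` lies in
    [0, 1] and `weight` reflects the question's impact on data quality.
    This representation is a hypothetical one chosen for the sketch.
    """
    max_score = sum(weight for weight, _ in answers)
    if max_score <= 0:
        raise ValueError("at least one question needs a positive weight")
    obtained = sum(weight * score for weight, score in answers)
    return obtained / max_score


# Hypothetical questionnaire: three questions with weights 3, 2 and 1.
example_answers = [(3.0, 1.0), (2.0, 0.5), (1.0, 0.0)]
hd_documentation = documentation_quality(example_answers)
print(round(hd_documentation, 3))
```

With these made-up answers the obtained score is 4 out of a possible 6, so the indicator stays between 0 and 1 as the framework requires.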

6 Quality assessment on register level II
HD Pre-processing
  Detection of formal errors such as missing primary keys, out-of-range values and item non-response
  Usable records = total records - erroneous records
  Quality indicator = usable records / total number of records
HD External Source
  The accuracy of the data is checked
  Comparison with existing representative surveys
  Quality indicator = number of consistent values / total number of linked records
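Both ratios are simple enough to compute directly. The following Python sketch shows them under the definitions given on the slide; the function names and the record counts are hypothetical.

```python
def preprocessing_quality(total_records, erroneous_records):
    """HD Pre-processing: usable records / total number of records."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    usable_records = total_records - erroneous_records
    return usable_records / total_records


def external_source_quality(linked_records, consistent_values):
    """HD External Source: consistent values / total number of linked records."""
    if linked_records <= 0:
        raise ValueError("linked_records must be positive")
    return consistent_values / linked_records


# Hypothetical counts for one attribute of one register.
hd_pre = preprocessing_quality(total_records=1000, erroneous_records=50)
hd_ext = external_source_quality(linked_records=200, consistent_values=190)
print(hd_pre, hd_ext)
```

Here "linked records" are the records that could be matched between the register and the comparison survey, and a value counts as consistent when both sources agree.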

7 Quality framework - Overview

8 Quality assessment of the CDB and FDP
Unique attributes
  Attribute exists in only one register and is transferred directly to the CDB (e.g. highest level of education)
Multiple attributes
  Attribute exists in more than one register and is combined in the CDB using certain decision rules (e.g. demographic attributes)
Derived attributes
  Attribute is created from other attributes (e.g. type of commuter)
[Diagram labels: SEX_Reg1, SEX_Reg2, SEX_Reg3 -> Multiple Attribute; Attrib1, Attrib2 -> Derived Attribute]

9 Quality assessment of unique attributes
The highest level of education (EDU) is delivered by a single register. Its quality indicator is derived from the three hyperdimensions. Missing values (with quality = 0) remain and decrease the quality indicator in the CDB. After the missing values have been imputed, we assess the quality indicator of the attribute EDU in the Final Data Pool.
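The effect of missing values on the CDB indicator can be sketched in Python. Two assumptions are made here that the slides do not confirm: the three hyperdimensions are combined by an unweighted mean, and the CDB indicator is the mean of the record-level qualities, with missing values counted as 0. All numbers are invented.

```python
def register_quality(hd_doc, hd_pre, hd_ext):
    """Combine the three hyperdimensions (assumption: unweighted mean)."""
    return (hd_doc + hd_pre + hd_ext) / 3


def cdb_quality(record_qualities):
    """Mean quality over all records; missing values enter with quality 0."""
    if not record_qualities:
        raise ValueError("no records given")
    return sum(record_qualities) / len(record_qualities)


# Hypothetical register-level quality of EDU.
q_edu_register = register_quality(0.90, 0.96, 0.84)

# Ten hypothetical persons, two of them with a missing EDU value.
records = [q_edu_register] * 8 + [0.0] * 2
q_edu_cdb = cdb_quality(records)
print(round(q_edu_register, 2), round(q_edu_cdb, 2))
```

Under these assumptions a 20% share of missing values lowers the indicator from 0.90 at register level to 0.72 in the CDB.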

10 Quality assessment of multiple attributes
SEX is available in two registers. The attribute is evaluated in both data sources using the three hyperdimensions. Does the information differ between the two data sources? If so, which register should we trust? Dempster-Shafer theory takes uncertainty, consistency and conflict into account.
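To make the idea concrete, the following Python sketch applies the textbook form of Dempster's rule of combination to two registers. It is an illustration under assumptions, not necessarily the exact formulation used for the census: each register assigns its quality indicator as belief mass to the value it reports and the remainder to the whole frame (uncertainty). The helper names and the quality values 0.9 and 0.8 are hypothetical.

```python
from itertools import product

FRAME = frozenset({"male", "female"})


def mass_from_register(value, quality):
    """Mass function: `quality` on the reported value, the rest on the frame."""
    return {frozenset({value}): quality, FRAME: 1.0 - quality}


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Returns the normalized combined masses and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # incompatible statements
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}, conflict


# Both registers agree: the combined belief exceeds either single quality.
agree, k_agree = combine(mass_from_register("female", 0.9),
                         mass_from_register("female", 0.8))

# Registers disagree: the higher-quality register dominates, with lower belief.
differ, k_differ = combine(mass_from_register("female", 0.9),
                           mass_from_register("male", 0.8))

print(round(agree[frozenset({"female"})], 2), round(k_agree, 2))
print(round(differ[frozenset({"female"})], 3), round(k_differ, 2))
```

When the registers agree, the belief in "female" rises to 0.98; when they disagree, the value from the better register still wins, but with a belief of only about 0.643, which reflects the conflict between the sources.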

11 Quality assessment of derived attributes
There is no information on current activity status (CAS) or commuters (COM) in the raw data. We derive CAS from two other attributes in two data sources. We obtain the required information for COM from the already derived attribute CAS. Thus, the quality indicator of both attributes is equal. Imputations are applied to CAS. The imputed values are transferred to the COM attribute by the same derivation process already used in the CDB.

12 Usability of the results
Raw data
  Which register delivers a certain attribute with the highest quality indicator?
  Is there a register with below-average quality for all delivered attributes?
  Is the quality indicator of a certain attribute worse than in the last delivery?
Central Database (CDB)
  Does the use of multiple data sources improve data quality?
  Comparison with prior censuses (plausibility checks)
Final Data Pool (FDP)
  Comparison of attributes for further improvement
  Comparison of census generations over time

13 Further Information
Austrian Journal of Statistics, Volume 39 (2010), Number 4
Statistica Neerlandica, Volume 66 (2012), Issue 1
ESSnet on Data Integration 2011, Madrid
ISI World Statistics Congress, STS50 - Methods and quality of administrative data used in a census, 2011, Dublin
NTTS Conference 2011, Brussels
UNECE/Eurostat Expert Group Meeting on Register-Based Censuses 2010, The Hague
European Conference on Quality in Official Statistics 2010, Helsinki
European Conference on Quality in Official Statistics, June 2012

14 A quality monitoring system for statistics based on administrative data
Please address queries to: Manuela Lenk, Register based census
Contact information: Guglgasse 13, 1110 Vienna
phone: +43 (1)
fax: +43 (1)

