Work Session on Statistical Data Editing, Paris, France, April 2014. Topic (i): Selective editing / macro editing. Experiences from Selective Editing at Statistics Sweden, by Anders Norberg, Karin Lindgren and Can Tongur, Statistics Sweden
Purpose of selective editing (SE): reduce the cost of manual editing at the national statistical institute (NSI) without a substantial loss of precision in the estimates; reduce the response burden for enterprises
History from a Swedish perspective: Granquist, L. and Kovar, J.G. (1997), "Editing of Survey Data: How Much is Enough?"; Foreign Trade with Goods (2005); case studies of the potential use of SE (2007); SELEKT comprises a well-structured set of open-coded SAS macros and programs
SE in eleven surveys that had large spending on micro editing
1. Foreign Trade with Goods (Intrastat)
2. Commodity Flow Survey
3. International Trade in Services
4. Wage and Salary Structures in the Private Sector
5. STS, Wages and Salaries, Private Sector
6. STS, Employment, Private Sector
7. STS, Business Activity Indicators
8. Rents for Dwellings
9. Revenues and Expenditure Survey for Multi-Dwelling Buildings
10. Energy Use in Manufacturing Industry
11. Consumer Price Index (CPI)
Survey design: annual / short-term; sample / census; one-stage / multi-stage; errors in classification variables; little / voluminous output
IT systems: error lists on paper or in Excel; bespoke interfaces for the editing staff in VB6 or VB.NET, with data stored in SQL databases; Triton, a general production system for the collection and editing of micro data, is under construction at SCB, and there SE is a service
Macro editing: decreased substantially; the work was transferred to micro editing
Quality: edit rules improved along the way; re-contacts made closer to the data delivery; less risk of mishaps when units are prioritised
Editing staff opinions: more efficient; more interesting; less stressful; focus on fewer units; now we know that the manual review is important and that our work really matters
Confidence of the respondents: respondents wonder why only some of the cluster elements (products / employees) were flagged as erroneous when the respondent could see that all elements had the same (systematic) error. No thorough survey has been done to explore respondents' confidence in SE
Resources needed: 300 – 600 hours of work
Resources saved: 10 – 60 per cent cost savings for editing
Maintenance, process data: new Statistical Classification of Economic Activities (NACE 2007); new International Standard Classification of Occupations (ISCO 2014); now considering a five-year evaluation for updating thresholds; sampling under the global threshold to estimate remaining measurement errors
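The idea of sampling under the global threshold can be illustrated with a minimal Python sketch. It is not taken from SELEKT or any Statistics Sweden system. The field names (`score`, `weight`, `reported`, `corrected`), the simple random sample and the expansion estimator are all assumptions made for illustration. A sample of the units that selective editing would leave unreviewed is drawn and reviewed manually, and the errors found are expanded to all unflagged units.

```python
import random


def sample_below_threshold(units, threshold, n, seed=0):
    """Return all units scoring below the threshold and a simple random sample of them."""
    below = [u for u in units if u["score"] < threshold]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    reviewed = rng.sample(below, min(n, len(below)))
    return below, reviewed


def estimate_remaining_error(below, reviewed):
    """Expand the weighted errors found in the reviewed sample to all unflagged units."""
    if not reviewed:
        return 0.0
    expansion = len(below) / len(reviewed)
    found = sum(u["weight"] * (u["reported"] - u["corrected"]) for u in reviewed)
    return expansion * found


# Hypothetical data: in practice "corrected" is known only after manual review.
units = [
    {"id": i, "score": i, "weight": 2.0, "reported": 110.0, "corrected": 100.0}
    for i in range(20)
]
below, reviewed = sample_below_threshold(units, threshold=10, n=5)
estimate = estimate_remaining_error(below, reviewed)
print(len(below), len(reviewed), estimate)
```

Tracked over time, an estimate of this kind is one possible input to the periodic review of thresholds mentioned above.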
Thank you for your attention