Improved Register Data Matching and its Impact on Survey Population Estimates Steve Vale Office for National Statistics, UK.

Presentation transcript:

Improved Register Data Matching and its Impact on Survey Population Estimates Steve Vale Office for National Statistics, UK

Contents
- Background
- Current matching systems
- Enhancements
- Impact on survey populations

Background
- No common business identifier in UK
- Data from different sources matched using name, address and postcode
- Software based around SSAName3
- Limited clerical input for “possible match” category (>10 employment)
- Quality marker (“inquiry stop”) used to indicate probability of duplication and to exclude some enterprises from survey populations

The Project
- Aim to improve the quality of automatic matching
- Reduce the number of units on the register that are not included in survey populations
- Improve certainty about probability of duplication
- Part funded by Eurostat

Matching Process 1
- Name is standardised to form a name key
- Name keys are checked against existing records at decreasing levels of accuracy until possible matches are found
- The names, addresses and postcodes of possible matches are compared, and a score out of 100 is calculated
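The name-key step can be sketched roughly as follows. This is a minimal illustration, not the SSAName3 algorithm: the noise-word list, the function names `make_name_key` and `find_candidates`, and the three "levels of accuracy" (exact key, first two tokens, first token) are all assumptions made for the example.

```python
import re

# Hypothetical noise words; the slides do not list the real standardisation rules.
NOISE_WORDS = {"LTD", "LIMITED", "PLC", "AND", "THE", "CO", "COMPANY"}


def make_name_key(name: str) -> str:
    """Standardise a business name into a simple key (illustrative only)."""
    upper = name.upper().replace("&", " AND ")
    upper = re.sub(r"['’]", "", upper)          # drop apostrophes: SMITH'S -> SMITHS
    upper = re.sub(r"[^A-Z0-9 ]", " ", upper)   # other punctuation becomes a space
    return " ".join(t for t in upper.split() if t not in NOISE_WORDS)


def find_candidates(name: str, register: dict[str, str]) -> list[str]:
    """Return ids of possible matches, loosening the comparison step by step."""
    key = make_name_key(name)
    if not key:
        return []
    keyed = {rid: make_name_key(reg_name) for rid, reg_name in register.items()}

    # Level 1 (assumed): the full name keys are identical.
    exact = [rid for rid, k in keyed.items() if k == key]
    if exact:
        return exact

    # Levels 2 and 3 (assumed): first two tokens agree, then first token only.
    tokens = key.split()
    for n in (2, 1):
        if len(tokens) < n:
            continue
        hits = [rid for rid, k in keyed.items() if k.split()[:n] == tokens[:n]]
        if hits:
            return hits
    return []
```

The search stops at the first level that returns any candidates, which mirrors the "until possible matches are found" wording; scoring the candidates is a separate step.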

Matching Process 2
- If the score is >79 it is considered to be a definite match
- If the score is between 60 and 79 it is considered a possible match, and is reported for clerical checking
- If the score is <60 it is considered a non-match
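A small sketch of the scoring and thresholding logic, under stated assumptions: the thresholds (>79, 60 to 79, <60) come from the slides, but the weights (50/30/20) and the use of `difflib.SequenceMatcher` as the similarity measure are hypothetical, since the slides do not say how the score out of 100 is built.

```python
from difflib import SequenceMatcher

# Hypothetical weights for name, address and postcode; they sum to 100.
WEIGHTS = (50, 30, 20)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()


def score_pair(rec_a: tuple[str, str, str], rec_b: tuple[str, str, str]) -> int:
    """Score two (name, address, postcode) records out of 100."""
    total = sum(w * similarity(x, y) for w, x, y in zip(WEIGHTS, rec_a, rec_b))
    return round(total)


def classify_match(score: int) -> str:
    """Apply the thresholds given in the slides to a score out of 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 79:
        return "definite match"
    if score >= 60:
        return "possible match"   # reported for clerical checking
    return "non-match"
```

For example, `classify_match(79)` falls in the possible-match band and `classify_match(80)` is a definite match.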

Matching Process 3
- Possible matches are checked clerically and linked where appropriate using an on-line system
- Non-matches with >9 employment are checked; if no link is found they are sent a Business Register Survey form
- Samples of definite matches and smaller non-matches are checked periodically
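The follow-up rules can be written as a simple routing function. This is only a sketch: the function name `next_action` and the action labels are made up for illustration, and the employment cut-off for possible matches mentioned under Background is left out for simplicity.

```python
def next_action(category: str, employment: int) -> str:
    """Decide what happens to a record after automatic matching (illustrative)."""
    if category == "possible match":
        return "clerical check"
    if category == "non-match":
        if employment > 9:
            # Larger non-matches are checked; unlinked ones get a survey form.
            return "clerical check, then Business Register Survey form if no link"
        return "periodic sample check"
    if category == "definite match":
        return "periodic sample check"
    raise ValueError(f"unknown category: {category!r}")
```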

Improvements 1
- Re-matching using cleaned addresses
  - Gains from timing
  - Gains from cleaning and standardising addresses
  - Needs extra storage space on the register for cleaned addresses (approx. 3 GB)
  - Address cleaning tool used: Matchcode5 by Capscan

Improvements 2
- Enhancing name keys
  - Standardised creation
  - Inclusion of part of postcode
- Better treatment of compound names
  - E.g. John Smith trading as Smiths Bakery
- More use of data on company registrations to assist matching of corporate units
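The two name-key enhancements might look something like the sketch below. The slides do not say which part of the postcode is used or how compound names are split, so two assumptions are made here: the outward code (everything before the final three characters of a UK postcode) is appended to the key, and names are split on "trading as" or "t/a". All function names are hypothetical.

```python
import re

# Assumed separators for compound names.
TRADING_AS = re.compile(r"\s+(?:trading\s+as|t/a)\s+", re.IGNORECASE)


def split_compound_name(name: str) -> list[str]:
    """Split 'X trading as Y' into its component names."""
    return [part.strip() for part in TRADING_AS.split(name) if part.strip()]


def outward_code(postcode: str) -> str:
    """Return the outward part of a UK postcode, e.g. 'NP10 8XG' -> 'NP10'."""
    compact = postcode.replace(" ", "").upper()
    return compact[:-3] if len(compact) > 3 else compact


def simple_key(name: str) -> str:
    """Upper-case the name and drop punctuation (a stand-in for real standardisation)."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", "", name.upper()).split())


def enhanced_keys(name: str, postcode: str) -> list[str]:
    """Build one key per component name, each suffixed with the outward code."""
    area = outward_code(postcode)
    return [f"{simple_key(part)}|{area}" for part in split_compound_name(name)]
```

With these assumptions, `enhanced_keys("John Smith trading as Smiths Bakery", "NP10 8XG")` yields one key for the proprietor and one for the trading name, so either can be matched against the register.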

Results 1
- Approximately 30% of units outside survey populations will match to units already in those populations
- Less than 5% of the remainder are duplicates of units in the survey populations
- Some units in survey populations found to be duplicates (1%?)

Results 2
- Overall impact:
  - 6% more units in survey populations
  - Maximum of 1.4% increase in employment
  - Timing of change is an issue
  - The risk of duplication will be less than the risk of under-coverage

Conclusions
- Matching rates will be improved by regular re-matching using cleaned addresses.
- Initial matching by name can be improved if part of the postcode is included.
- Improvements to matching increase the certainty that the remaining unmatched units are genuinely single source.
- Desk profiling and clerical matching can reduce duplication still further if targeted at high risk units.

Further information
Any questions?