All the answers? Statistics New Zealand’s Integrated Data Infrastructure Paper by Felibel Zabala, Rodney Jer, Jamas Enright and Allyson Seyb Presented.

Presentation transcript:

All the answers? Statistics New Zealand’s Integrated Data Infrastructure
Paper by Felibel Zabala, Rodney Jer, Jamas Enright and Allyson Seyb
Presented by Felibel Zabala, Sept 2012

Statistics New Zealand’s Integrated Data Infrastructure (IDI)
- Merges data from different suppliers, including Statistics NZ
- Quality varies both within and between the datasets

Statistics New Zealand’s Integrated Data Infrastructure (IDI)
- Linking clean datasets is not easy; linking datasets of variable quality is much harder
- Importance of an effective and efficient editing strategy

Main objective
Present some of the issues with, and solutions for, linked administrative datasets, focusing on one of Statistics NZ’s first integrated datasets, the Linked Employer-Employee Data (LEED)

LEED
- Provides the backbone of the IDI prototype
- Links longitudinal business data from Statistics NZ’s Business Frame to a longitudinal series of payroll tax data from Inland Revenue (IRD)
- Used to produce quarterly statistics that measure labour market dynamics at various levels, e.g. filled jobs, worker flows, and total earnings

LEED payroll data
- Collected from employers for New Zealand’s taxation system through IRD’s Employer Monthly Schedule (EMS)
- Information available from EMS:
  - employer/employee name and IRD number
  - taxable earnings for work performed, taxed at source of income
  - tax deductions (pay-as-you-earn or PAYE, withholding tax, child support payment, student loan indicator amount)
  - start and finish dates of employment

LEED – additional details
- Also includes payments made to beneficiaries by the government
- Contains a subset of the self-employed

LEED – additional details (cont’d)
- Collection unit: the legal entity that files the EMS return
- Statistical unit (the ‘employer’ in LEED): the geographical or physical location of the business

Methods of integration in LEED
Figure 1. Unit record links in LEED
- Linking employer to enterprise
- Linking employer longitudinally
- Linking enterprise and geo longitudinally
- Linking employee longitudinally

Variables edited in LEED
- IRD numbers
- Gross earnings
- Date of birth
- Sex
- Workplace of an employee
- Start and end dates of employment
Editing strategy: do not replace any IRD data unless there is strong evidence it is an error

Variables edited in LEED (cont’d)
- IRD numbers
- Imputation of sex
- Imputation of start and end dates of employment

Variables edited in LEED (cont’d)
- Gross earnings
  - Presence of systematic errors
  - Detection method: ratio edit on PAYE/gross earnings
  - Imputation method
- Date of birth
  - Presence of systematic errors
  - Detection method: edit rules that check an employee’s age against some events
  - Imputation method
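
The two detection methods on this slide can be sketched as follows. This is a minimal illustration, not the LEED implementation: the function names, the ratio bounds, the age limits, and the choice of "start of employment" as the event are all assumptions made for the example.

```python
from datetime import date

# Illustrative bounds only (assumptions); not actual New Zealand tax parameters.
MIN_RATIO, MAX_RATIO = 0.0, 0.5
MIN_AGE, MAX_AGE = 10, 100


def passes_ratio_edit(paye, gross_earnings):
    """Return True when PAYE / gross earnings lies within the assumed bounds."""
    if gross_earnings <= 0:
        return False  # ratio undefined or implausible: flag the record for review
    return MIN_RATIO <= paye / gross_earnings <= MAX_RATIO


def age_on(date_of_birth, event_date):
    """Age in completed years on the date of the event."""
    had_birthday = (event_date.month, event_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return event_date.year - date_of_birth.year - (0 if had_birthday else 1)


def passes_age_edit(date_of_birth, job_start):
    """Return True when the age at the (assumed) event, job start, is plausible."""
    return MIN_AGE <= age_on(date_of_birth, job_start) <= MAX_AGE
```

A record that fails either check is only flagged here; in keeping with the editing strategy of not replacing IRD data without strong evidence, any imputation would be a separate step.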

Variables edited in LEED (cont’d)
Imputation of workplace of an employee
- Uses the transportation method, where the imputed workplace of an employee is the geo that minimises the distance between the employee’s home address and the geo, subject to the constraints that
  - each employee is assigned to a geo, and
  - the total number of employees allocated to a geo equals the number of employees expected from that geo
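
The allocation problem described on this slide can be illustrated with a toy example. The sketch assumes the objective is the total home-to-geo distance over all employees, uses straight-line distance between made-up coordinates, and solves the problem by exhaustive search, which is only feasible for a handful of records. A production system would use a proper transportation or assignment solver. All names are hypothetical.

```python
from itertools import permutations
from math import hypot


def allocate(employees, geos, expected):
    """Assign each employee to one geo, minimising total distance.

    employees: {employee_id: (x, y)} home coordinates
    geos:      {geo_id: (x, y)} workplace coordinates
    expected:  {geo_id: number of employees expected at that geo}
    """
    employee_ids = sorted(employees)
    # Expand each geo into one slot per expected employee.
    slots = [g for g in sorted(geos) for _ in range(expected[g])]
    if len(slots) != len(employee_ids):
        raise ValueError("expected counts must sum to the number of employees")

    def distance(employee_id, geo_id):
        (x1, y1), (x2, y2) = employees[employee_id], geos[geo_id]
        return hypot(x1 - x2, y1 - y2)

    best_cost, best_assignment = float("inf"), None
    # Brute force: try every way of handing the slots to the employees.
    for ordering in permutations(slots):
        cost = sum(distance(e, g) for e, g in zip(employee_ids, ordering))
        if cost < best_cost:
            best_cost, best_assignment = cost, dict(zip(employee_ids, ordering))
    return best_assignment, best_cost


homes = {"a": (0, 0), "b": (1, 0), "c": (10, 0)}
workplaces = {"G1": (0, 0), "G2": (10, 0)}
assignment, total = allocate(homes, workplaces, {"G1": 1, "G2": 2})
```

In this example employee "b" lives closest to G1 but is allocated to G2, because G1 is expected to have only one employee. That is the effect of the second constraint.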

The IDI prototype
Datasets linked to LEED
- Benefit data
- Tertiary education data
  - Administrative tertiary education data and student loans and allowances data
- Statistics NZ’s Household Labour Force Survey (HLFS) and its supplementary surveys

The IDI prototype (cont’d)
Other linked dataset in the IDI
- The Longitudinal Business Database (LBD) prototype
  - includes information on business demographics, financial data, employment, goods exports, government assistance, and management practices

The IDI prototype (cont’d)
Figure 2. Linking in the IDI prototype

Issues in linking in the IDI
- Lack of a common identifier across datasets
- Main variables in the Central Linking Concordance (CLC)
  - IRD numbers, passport numbers, and student ID, where available
- Use of demographic variables as partial identifiers
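
One way to picture linking without a common identifier is a two-pass rule: try the unique identifiers first, and fall back on demographic variables only when none is available. The sketch below is a simplified, deterministic illustration under that assumption. It is not Statistics NZ's linkage procedure, and the field names are hypothetical.

```python
UNIQUE_KEYS = ("ird_number", "passport_number", "student_id")
DEMOGRAPHIC_KEYS = ("name", "sex", "date_of_birth")


def find_link(record, candidates):
    """Return the single candidate that matches the record, or None."""
    # Pass 1: exact match on any unique identifier the record carries.
    for key in UNIQUE_KEYS:
        value = record.get(key)
        if value:
            hits = [c for c in candidates if c.get(key) == value]
            if len(hits) == 1:
                return hits[0]
    # Pass 2: demographic variables as partial identifiers.
    # Accept only when every variable is present, agrees, and the match is unique.
    hits = [
        c
        for c in candidates
        if all(record.get(k) and record.get(k) == c.get(k) for k in DEMOGRAPHIC_KEYS)
    ]
    return hits[0] if len(hits) == 1 else None
```

Requiring a unique match in both passes keeps ambiguous cases unlinked rather than guessing between candidates.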

Issues in linking in the IDI (cont’d)
- Need for standard software for automated data linkage that is robust to data changes
- Timing of receipt of data

Editing strategy in the IDI
- Focus on ensuring high-quality linking variables are used in linking. Examples:
  - Validity rules were used to edit names across data sources
  - Sex and date of birth are reformatted to ensure common coding is used across data sources
- Where inconsistencies occur in records linked from two different data sources, it is important to know which of the two data sources is more reliable
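
Reformatting sex and date of birth to a common coding might look like the sketch below. The source codings and date formats are invented for illustration; the actual codings used by each data supplier are not given in the slides.

```python
from datetime import datetime

# Hypothetical source codings mapped to one common code.
SEX_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "2": "F"}
# Hypothetical date layouts, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def standardise_sex(value):
    """Map a source-specific sex code to 'M' or 'F'; None if unrecognised."""
    return SEX_CODES.get(str(value).strip().lower())


def standardise_date_of_birth(value):
    """Return the date as an ISO 8601 string, or None if no format fits."""
    text = str(value).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Unrecognised values come back as None rather than being guessed, so they can be treated as missing in later editing.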

Editing strategy in the IDI (cont’d)
Process to resolve inconsistencies in personal details
- Most common value present in the datasets should be kept
- Prioritise the data sources to determine the order of retaining their values
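
The two rules can be combined in a small sketch. It assumes the source priority is used to break ties when no single value is most common; the slides do not say how the two rules interact, and the source names here are hypothetical.

```python
from collections import Counter


def resolve(values_by_source, priority):
    """Pick one value for a personal detail reported by several sources.

    values_by_source: {source_name: reported value or None}
    priority:         source names, most reliable first (an assumption)
    """
    present = {s: v for s, v in values_by_source.items() if v is not None}
    if not present:
        return None
    counts = Counter(present.values())
    top = max(counts.values())
    tied = {value for value, n in counts.items() if n == top}
    if len(tied) == 1:
        return tied.pop()  # rule 1: keep the most common value
    for source in priority:  # rule 2: fall back on source priority
        if present.get(source) in tied:
            return present[source]
    return None
```

For example, if two sources agree on a date of birth and a third differs, the agreed value is kept. If two sources disagree one-to-one, the value from the higher-priority source is kept.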

Editing strategy in the IDI (cont’d)
The editing strategy should be able to:
- Edit inconsistencies for the same unit across different sources
- Treat erroneous and missing variables in a record
- Ensure consistency in variables across a record, both within a time period and over time

Next steps
- Build the IDI, with a focus on improving the linking methodology
- Determine standard quality measures for outputs produced using administrative data

Next steps (cont’d)
Redevelopment of LEED and SLA systems
- Investigate the use of geospatial information to improve the employee allocation method
- Review of the editing of gross earnings
- Investigate the use of Banff