Access to microdata in the Netherlands: from a cold war to co-operation projects Eric Schulte Nordholt Senior researcher and project leader of the Census.

Slides:



Advertisements
Similar presentations
Balancing Access and Confidentiality Jenny Telford Australian Bureau of Statistics September 2008.
Advertisements

The Statistics Act and Research Access to Data Paul J Jackson Legal Services ONS.
Conference Programme Introduction to the Samples of Anonymised Records - Keith Spicer, ONS CCSR's role in providing SAR's support - Jo Wathan,
Issues in Designing a Confidentiality Preserving Model Server by Philip M Steel & Arnold Reznek.
Statistical Disclosure Control (SDC) at SURS Andreja Smukavec General Methodology and Standards Sector.
Implementation of the CoP in SLOVENIA Cooperation with data users Genovefa RUŽIĆ Deputy Director-General.
© Statistisches Bundesamt, IIA - Mathematisch Statistische Methoden Summary of Topic ii (Tabular Data Protection) Frequency Tables Magnitude Tables Web.
Eurostat Statistical Disclosure Control. Presented by Peter-Paul de Wolf, Statistics Netherlands (CBS)
Workshop DwB, Lausanne, March 2014 Leo Engberts, accountmanager International Affairs Accreditation Data Service Statistics Netherlands.
In a Virtual Data Centre Protecting Confidentiality COMPUTATIONAL INFORMATICS Christine O’Keefe, Mark Westcott, Adrien Ickowicz, Maree O’Sullivan, CSIRO.
Counting the Dutch, The Future of the Virtual Census in the Netherlands Presentation at the seminar Counting the 7 Billion 24 February 2012 * Geert Bruinooge.
«Doing research with secondary data» Eva Deuchert Bern, November
United Nations Expert Group Meeting on Revising the Principles and Recommendations for Population and Housing Censuses New York, 29 October – 1 November.
Access routes to 2001 UK Census Microdata: Issues and Solutions Jo Wathan SARs support Unit, CCSR University of Manchester, UK
The UK Statistics and Registrations Services Act Tanvi Desai Data Manager LSE Research Laboratory Research Laboratory IASSIST Tampere.
The Dutch Censuses of 1960, 1971 and 2001 Producing public use files in the IPUMS project Wijnand Advokaat Statistics Netherlands Division Social and Spatial.
Settings, Practices and Data Access: Results of a Survey of UK Social Scientists Jo Wathan Centre for Census and Survey Research University of Manchester.
United Nations Expert Group Meeting on Revising the Principles and Recommendations for Population and Housing Censuses New York, 29 October – 1 November.
Session 4. Panel session: How useful is the notion of “circle of trust” concept ? A vision for the future. Maurice Brandt Destatis Germany 2ND EUROPEAN.
Basque Statistics Office Confidentiality Project: Final stages Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality Tarragona, Spain,
Microdata Simulation for Confidentiality of Tax Returns Using Quantile Regression and Hot Deck Jennifer Huckett Iowa State University June 20, 2007.
Research data workflow Practice in Slovenian Social Science Data Archives SERSCIDA WP4 – WORKSHOP Ljubljana September 2013.
Intruder Testing: Demonstrating practical evidence of disclosure protection in 2011 UK Census Keith Spicer, Caroline Tudor and George Cornish 1 Joint UNECE/Eurostat.
Comparing approaches of different (partly) register-based countries Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics.
Confidentiality Issues with “Small Cell” Data Michael C. Samuel, DrPH STD Control Branch California Department of Public Health 2008 National STD Prevention.
Disclosure Avoidance: An Overview Irene Wong ACCOLEDS/DLI Training December 8, 2003.
CES Task Force on Confidentiality and Microdata Tiina Luige UNECE Statistical Division Conference of European Statisticians UN Economic Commission for.
Record matching for census purposes in the Netherlands Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands.
Plans for Access to UK Microdata from 2011 Census Emma White Office for National Statistics 24 May 2012.
Daniel Beckler United States Department of Agriculture National Agricultural Statistics Service Timothy Mulcahy NORC at the University of Chicago Topic.
1 The 2001 Census PUMFS Odyssey Sponsored by HAL and PALS Presented by Chuck Humphrey.
Population census micro data for research: the case of Slovenia Danilo Dolenc Statistical Office of the Republic of Slovenia Ljubljana, First Regional.
Framework of Statistical Information. This is a typology of the categories or classes of statistical information. Remember the relationship between statistics.
Innovations in Data Dissemination Thomas L. Mesenbourg, Jr. Acting Director U.S. Census Bureau United Nations Seminar on Innovations in Official Statistics.
Access to official statistical micro data at the Statistical Office of the Republic of Slovenia and cooperation with the Slovenian Social Science Data.
The Dutch Virtual Census based on registers and already existing surveys Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics.
National design, fieldwork and data harmonization for Labour Force Survey Irena Svetin Statistical Office of the Republic of Slovenia September 2014.
The Dutch Virtual Census of 2001 A New Approach by Combining Different Sources Eric Schulte Nordholt ECE Census meetings Geneva, November 2004.
Methodology used for estimating Census tables based on incomplete information Eric Schulte Nordholt Senior researcher and project leader of the Census.
WP 19 Assessment of Statistical Disclosure Control Methods for the 2001 UK Census Natalie Shlomo University of Southampton Office for National Statistics.
Disclosure Avoidance at Statistics Canada INFO747 Session on Confidentiality Protection April 19, 2007 Jean-Louis Tambay, Statistics Canada
1 IPAM 2010 Privacy Protection from Sampling and Perturbation in Surveys Natalie Shlomo and Chris Skinner Southampton Statistical Sciences Research Institute.
CES Recommendations for 2020 round on census methodology Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands.
Statistical data confidentiality and micro data in Albania
The experience of a National Statistical Institute after a law change: Estonia First Regional Workshop Microdata Access in European Countries ― Cooperation.
The availability of Dutch census microdata Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands Division Social.
Using Targeted Perturbation of Microdata to Protect Against Intelligent Linkage Mark Elliot, University of Manchester Cathie.
Why register-based statistics? Eric Schulte Nordholt Statistics Netherlands Division Social and Spatial Statistics Department Support and Development Section.
The Application for Statistical Processing at SURS Andreja Smukavec, SURS Rudi Seljak, SURS UNECE Statistical Data Confidentiality Work Session Helsinki,
Statistics Netherlands’ modernization programme: the use of administrative data, lessons learned and the way ahead. Geert Bruinooge Assistant Director.
Creating Open Data whilst maintaining confidentiality Philip Lowthian, Caroline Tudor Office for National Statistics 1.
Joint UNECE/Eurostat work session on statistical data confidentiality Manchester, December 2007 Dealing with Confidentiality in Dissemination: The.
Exploring Microsimulation Methodologies for the Estimation of Household Attributes Dimitris Ballas, Graham Clarke, and Ian Turton School of Geography University.
HETUS Pilot Group 8 Privacy procedures and ethical issues Kimberly Fisher, Centre for Time Use Research – co-ordinator External consultant Kai Ludwigs.
Joint UNECE/Eurostat work session on statistical data confidentiality October 2015 Helsinki, Finland Circle of trust Maurice Brandt DESTATIS.
Census 2011 – A Question of Confidentiality Statistical Disclosure control for the 2011 Census Carole Abrahams ONS Methodology BSPS – York, September 2011.
Michel ISNARD Insee – Head of Legal Affairs Access to individual data in France 28/10/2013.
11 Measuring Disclosure Risk and Data Utility for Flexible Table Generators Natalie Shlomo, Laszlo Antal, Mark Elliot University of Manchester
Disclosure scenario and risk assessment: Structure of Earnings Survey
Data Confidentiality and the Common Good.
Census developments in the Netherlands
Statistics Netherlands Division Social and Spatial Statistics
Dissemination Workshop for African countries on the Implementation of International Recommendations for Distributive Trade Statistics May 2008,
TG EHIS January 2012 Item 3.2 of the agenda EHIS wave 1 anonymised data Bart De Norre, Eurostat.
Disclosure Avoidance: An Overview
Telling Canada’s story in numbers Marie-Josée Major
Federal Statistical Office Germany Research Data Centre
The role of metadata in census data dissemination
Item 2.2 Scientific Use Files for the Time Use Survey
Presentation transcript:

Access to microdata in the Netherlands: from a cold war to co-operation projects Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands Division Socio-economic and Spatial Statistics Work Session on Statistical Data Confidentiality in Ottawa October 2013

2 Contents Introduction The release of public use microdata files The release of microdata for researchers On-site access Remote access Co-operation projects Conclusions

3 Introduction (1) Statistical offices have a lot of statistical information Researchers and policy makers want this information Privacy! Statistical Disclosure Control in the Netherlands: publish and release as much detail as possible without disclosing sensitive information that can be attributed to individual respondents

4 Introduction (2) Information from NSIs: tabular data and microdata Monopoly of NSIs Eighties of last century: end of monopoly less microdata available (risk awareness) How to end the cold war between Statistics Netherlands and the academia?

5 The release of public use microdata files (1) For everybody, but severe protection Rules for public use microdata files: 1.Microdata must be at least one year old 2.No direct identifiers or direct regional variables 3.Only 1 kind of indirect regional variables. Values of indirect regional variables sufficiently scattered. Each area should contain at least 200,000 persons in the target population and should consist of municipalities from at least six of the twelve provinces. No dominating municipality in any area. 4.At most 15 indirect identifiers 5.No sensitive variables

6 The release of public use microdata files (2) Rules for public use microdata files (continued): 6.Sampling weights should not provide additional identifying information 7.Rule against spontaneous recognition: at least 200,000 individuals in the population for each category of an identifying variable 8.Another rule against spontaneous recognition: at least 1000 individuals in the population for each category in the crossing of two identifying variables 9.At least 5 households per combination of categories of household variables 10.Records should be in random order Conclusion: public use microdata files are useful for educational purposes and promoting the Census

7 The release of microdata for researchers (1) Only for bonafide researchers (under contract) Rules for microdata for researchers: 1.No direct identifiers 2.Rule against spontaneous recognition: each combination of an extremely identifying variable, a very identifying variable and an identifying variable should occur at least 100 times in the population 3.Extension of this rule: maximum level of detail of some variables (occupation, level of education, branch of economic activity) is determined by the most detailed direct regional variable 4.Each region that can be distinguished in the microdata should contain at least 10,000 inhabitants 5.No direct regional variables in panel data

8 The release of microdata for researchers (2) Identifying variables Direct (formal) identifiers Name, address, citizen service number, … Indirect identifiers, differentiated into Extremely identifying (E) Very identifying (V) Identifying (I) V E I

9 The release of microdata for researchers (3) Examples of identifying variables Extremely identifying: Regional variables (residence, work, …) Very identifying: Sex, nationality + Extremely identifying variables Identifying: Age, occupation, education + Very identifying variables E V I

10 On-site access Researchers work in a secure area of the statistical institute Researchers can apply the standard statistical software packages and also bring their own programmes Researchers and their superiors have to sign that they will not disclose the individual information of respondents

11 Remote access Combination of advantages of on desk (no commuting to the NSI) and on-site (large detail in microdata) Security risks are high, especially with remote execution (no intermediary between the researcher and the statistical institute) Remote access has become very popular in several countries Both on-site access and remote access require output checking

12 Co-operation projects Special contracts with research institutes Three different situations: Only Statistics Netherlands is publishing output Also partners publish output but the protection rules of Statistics Netherlands apply Other partners also publish and have their own rules (respecting the national privacy act)

13 Conclusions Both public use microdata files and microdata for researchers can be produced easily with μ- ARGUS Lots of protection techniques are available, most popular strategy is to first recode the identifying variables, and then to suppress the remaining unsafe combinations Microdata for on-site and remote access may contain all variables except direct identifiers Co-operation projects have given Statistics Netherlands a stronger and more relevant position

14 Time for questions and discussion