Statistical Disclosure Control (SDC) at SURS
Andreja Smukavec
General Methodology and Standards Sector

Why is confidentiality protection needed?
One of the fundamental principles of official statistics is that the information provided by data suppliers is strictly confidential and is used only for statistical purposes.
Legislation places a legal obligation on NSIs to protect data suppliers.
Data suppliers should be able to trust the NSI to preserve the confidentiality of individual information; this trust leads to better quality of the collected data.

National legislation
National Statistics Act
– Data are published in aggregated form.
– Data may be published individually if written consent of reporting units is obtained; data are collected from public data collections; data are published in such a way that the reporting units cannot be directly identified.
– The Office or authorized producers shall transmit individual data to users on the basis of a written application.
Other legislation
– Personal Data Protection Act;
– …

European legislation
European Regulation (EC) No 223/2009
– General definitions;
– Chapter 5 – Statistical Confidentiality
Access to confidential data for scientific purposes
European Statistics Code of Practice
- Principle 5: The confidentiality of the information the data providers provide and its use only for statistical purposes are absolutely guaranteed.

What does SDC cover at SORS?
Tabular data protection
– Publication
– Eurostat and other institutions
– Users' requests
Microdata protection
– Preparation of public-use files and scientific-use files
– Checking rules set up by Eurostat
Output checking

Tabular data protection
Tables – aggregated data
– Magnitude tables: sum of a quantitative variable over respondents, where respondents are grouped by categorical variables.
– Frequency tables: number of respondents, where respondents are grouped by categorical variables.
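The difference between the two table types can be illustrated with a minimal sketch. The records, variable names (region, activity, turnover) and the function `build_tables` are hypothetical and only serve the illustration; they are not taken from SURS data or software.

```python
from collections import defaultdict

# Hypothetical microdata: one (region, activity, turnover) tuple per respondent.
records = [
    ("East", "Retail", 120.0),
    ("East", "Retail", 80.0),
    ("East", "Transport", 300.0),
    ("West", "Retail", 50.0),
    ("West", "Transport", 40.0),
    ("West", "Transport", 60.0),
]


def build_tables(rows):
    """Group respondents by the categorical variables and aggregate.

    Returns (frequency_table, magnitude_table), both keyed by cell.
    """
    frequency = defaultdict(int)    # number of respondents per cell
    magnitude = defaultdict(float)  # sum of the quantitative variable per cell
    for region, activity, turnover in rows:
        cell = (region, activity)
        frequency[cell] += 1
        magnitude[cell] += turnover
    return dict(frequency), dict(magnitude)


frequency_table, magnitude_table = build_tables(records)
```

Both tables share the same cells; only the aggregated quantity differs, which is why the same sensitivity rules can be evaluated cell by cell on either of them.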

Tabular data protection at SURS
Method: cell suppression
- Post-tabular method
- Non-perturbative method (less information available)
- Implemented in Tau-Argus software (CASC project)
- The interval of possible values for each sensitive cell is sufficiently large

Tabular data protection

Cell Suppression
Sensitivity rules – defining unsafe cells
– Threshold: the number of respondents in a cell is below a certain threshold value.
– Concentration rules: one or two respondents are dominant.
– Group disclosure: all respondents in one cell have the same value for a sensitive variable.
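The three sensitivity rules can be sketched as a single per-cell check. This is an illustration under stated assumptions, not the SURS or Tau-Argus implementation: the function name `violated_rules`, the threshold of 3 and the (n, k) = (1, 85%) dominance parameters are hypothetical values chosen for the example.

```python
def violated_rules(contributions, sensitive_values=None, threshold=3, n=1, k=0.85):
    """Return the names of the sensitivity rules that one cell violates.

    contributions    -- each respondent's contribution to the cell total
    sensitive_values -- optional sensitive category of each respondent
    threshold, n, k  -- hypothetical parameters, not SURS's actual values
    """
    reasons = []

    # Threshold rule: too few respondents in the cell.
    if len(contributions) < threshold:
        reasons.append("threshold")

    # Concentration (dominance) rule: the n largest respondents
    # contribute more than a share k of the cell total.
    total = sum(contributions)
    top_n = sum(sorted(contributions, reverse=True)[:n])
    if total > 0 and top_n > k * total:
        reasons.append("concentration")

    # Group disclosure: every respondent has the same sensitive value.
    if sensitive_values and len(set(sensitive_values)) == 1:
        reasons.append("group disclosure")

    return reasons
```

A cell is treated as unsafe (a candidate for primary suppression) as soon as the returned list is non-empty, for example `violated_rules([90, 3, 3, 4])` flags the concentration rule because one respondent holds 90% of the total.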

Cell Suppression
Secondary suppression
- Needed because of the sums (totals) in the tables: a single suppressed cell could otherwise be recalculated from them. The feasibility interval for each unsafe cell has to be wide enough.
- Optimisation problem -> LP solver used (Xpress, CPLEX).
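Why secondary suppression is needed can be shown with a deliberately simplified sketch. It assumes a single table row with a published total and non-negative cell values, and computes the interval an attacker can derive for one suppressed cell. The function `feasibility_interval` is hypothetical; real tables have many linked constraints, which is why an LP solver is used in practice.

```python
def feasibility_interval(row, total, target):
    """Interval an attacker can derive for the suppressed cell at index `target`.

    row    -- cell values, with None marking a suppressed cell
    total  -- the published row total
    Assumes non-negative values and only this one row-sum constraint.
    """
    if row[target] is not None:
        raise ValueError("target cell is not suppressed")
    known = sum(v for v in row if v is not None)
    remainder = total - known
    other_suppressed = sum(
        1 for i, v in enumerate(row) if v is None and i != target
    )
    if other_suppressed == 0:
        # Only one cell is missing: it is recovered exactly by subtraction.
        return (remainder, remainder)
    # The remainder is shared among several suppressed cells.
    return (0, remainder)


# True row is [20, 3, 45, 32] with total 100; the cell with value 3 is unsafe.
only_primary = feasibility_interval([20, None, 45, 32], 100, target=1)
with_secondary = feasibility_interval([None, None, 45, 32], 100, target=1)
```

With primary suppression alone the interval collapses to the true value, while one additional (secondary) suppression widens it to 0 through 23. Whether that is "wide enough" depends on the protection level required for the cell.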

Cell Suppression - Publication

Microdata protection
Microdata are deindividualized pieces of information for individual units (enterprises, persons, households).
– no direct identifiers (ID numbers, tax numbers, name + address…)
Microdata files are available to our researchers in the secure room and via remote access.

Microdata protection
Scientific-use file (SUF)
– Prepared for researchers
– Signed contract
– Usually sent by CD + password, has to be destroyed after usage
– More information (variables) available
– Only unintentional disclosures are protected

Microdata protection
Public-use file (PUF)
– Publicly available or available after registration
– Less information (variables) available
– Not all microdata protection methods are usable (some are too complex for ordinary users)
– Intentional disclosures are protected

Microdata protection
The goal of microdata protection is to produce a safe microdata file, where
– disclosure risk is low;
– analyses done on the safe file give results that are close or equal to the results of analyses done on the original data.

Microdata protection methods used at SURS
Modifying the original microdata file by
– non-perturbative methods: global recoding; top and bottom coding; local suppression (not very usable for PUFs).
– some perturbative methods: microaggregation; rounding.
Software packages: Mu-Argus and R.
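Several of the listed methods can be sketched in a few lines each. The functions below are hypothetical illustrations in plain Python, not the Mu-Argus or R routines used at SURS, and the band width, cap, group size and rounding base are assumed values.

```python
import math


def global_recode_age(age, width=10):
    """Global recoding: replace an exact age by a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def top_code(value, cap):
    """Top coding: values above the cap are reported as the cap."""
    return min(value, cap)


def microaggregate(values, group_size=3):
    """Univariate fixed-size microaggregation.

    Sort the values, form groups of `group_size` neighbours and replace
    each value by its group mean. The last group absorbs any remainder,
    so every group has at least `group_size` members when enough values exist.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start + group_size
        if len(order) - end < group_size:
            end = len(order)
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
        start = end
    return result


def round_to_base(value, base=5):
    """Rounding: replace a non-negative value by the nearest multiple of base."""
    return base * math.floor(value / base + 0.5)
```

For example, `microaggregate([10, 50, 12, 11, 52, 51])` replaces the three small values by their mean 11.0 and the three large values by 51.0. The overall mean of the variable is preserved, while individual values are no longer exact.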

Labour Force Survey - PUF
Prepared for Social Data Archives (DwB project).
We used Eurostat's rules for creating the SUF and then created the PUF by sampling (one third of the original sample).
We did not use local suppression.
The quality of the statistics used as parameters for the sampling is ensured; other statistics should be used with caution.

Output checking
1. Researchers fill out our form after finishing their work.
2. An e-mail is sent to our common address.
3. One of the SDC methodologists checks the output. In case of disclosive data or an incorrectly filled form, the researchers are contacted for additional information or asked to correct the output.
4. After the SDC methodologist approves the dissemination, the output is sent to the researcher by e-mail.

Rules for output checking
Rule-of-thumb model
– Threshold N: all tabular and similar output should be based on at least N units.
– Dominance rule: the analysis should not be done on groups with a dominant unit.
– Maximum and minimum are usually not released (except when they refer to more than one unit).
– The 100th percentile (the maximum) is usually not released.

Thank you for your attention!