© Statistisches Bundesamt, IIA - Mathematisch Statistische Methoden Summary of Topic ii (Tabular Data Protection) Frequency Tables Magnitude Tables Web.

Summary of Topic ii (Tabular Data Protection): Frequency Tables, Magnitude Tables, Web Access

Frequency Tables

C. Dwork et al. (Microsoft Research) propose a new concept for avoiding disclosure risk in frequency data, "differential privacy", together with techniques to ensure it. These techniques add noise to the Fourier coefficients corresponding to a given contingency table.

D. Smith and M. Elliot (University of Manchester) propose a new method for assessing disclosure risk (i.e. the risk of attribute disclosure) for tables of counts: the subtraction-attribution probability (SAP) method.

N. Shlomo (University of Southampton) compares the performance of several techniques for protecting population count tables with respect to disclosure risk and information loss.

Statistics New Zealand uses Random Rounding, a mean cell size rule and a threshold rule for SDC of population count tables. M. Camden et al. calculate utility and safety measures to assess the quality of this SDC concept.

J.J. Salazar (University of La Laguna) explains the advantages and disadvantages of the mathematical models for Controlled (Integer) Rounding vs. (continuous) Tabular Adjustment.

Questions to Dwork et al.: Limitations (number of variables/categories)? Could other SDC methods ensure differential privacy? Is the approach applicable to tabulations from a survey with hundreds of variables?

Question to Smith/Elliot: How to embed the SAP method into an SDC strategy?

Question to Shlomo: Cell suppression with simple imputation shows the least distortion; is cell suppression the best method?

Questions to Salazar: Do integrality problems matter for magnitude tables? Could variable controlled rounding be modelled (efficiently)?
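The random rounding mentioned above for population count tables can be illustrated with a minimal sketch. The slides do not give the rounding base or the exact procedure used by Statistics New Zealand, so the base of 3, the unbiased rounding probabilities, and the function names `random_round` and `round_table` are assumptions chosen for illustration only.

```python
import random


def random_round(count, base=3, rng=random):
    """Unbiased random rounding of a non-negative integer count.

    The result is one of the two multiples of `base` adjacent to `count`.
    Base 3 is an assumption; the slides do not state the base.
    """
    remainder = count % base
    if remainder == 0:
        return count  # already a multiple of the base, left unchanged
    lower = count - remainder
    # Round up with probability remainder/base, so the expected value
    # of the rounded count equals the original count.
    if rng.random() < remainder / base:
        return lower + base
    return lower


def round_table(table, base=3, seed=None):
    """Apply random rounding independently to every cell of a 2-D table."""
    rng = random.Random(seed)
    return [[random_round(c, base, rng) for c in row] for row in table]


# Hypothetical 2x3 frequency table.
table = [[1, 4, 12], [7, 0, 2]]
rounded = round_table(table, base=3, seed=1)
```

Because each cell is rounded independently, rounded interior cells need not add up to rounded marginals. Controlled rounding, as discussed by Salazar, addresses that additivity problem with an optimisation model.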

Magnitude Tables

The US Census Bureau adds noise to the underlying microdata prior to tabulation. The paper by L. Zayatz also addresses other SDC research areas at the USBC, such as synthetic microdata generation (also used to protect frequency tabular data) and a remote microdata analysis system.

L. Cox (US NCHS) compares the properties of two methods for Controlled Tabular Adjustment (CTA), one based on LP technology, the other on iterative proportional fitting.

Using tabular structures of EIA publications and artificial microdata, R. Dandekar empirically compares the performance of various methods for tabular data protection, namely CTA, the USBC's noise method and cell suppression.

P.P. de Wolf (CBS Netherlands) discusses a possible way to describe a simple class of linked tables that is often considered at NSIs.

Web Access

The USDA Economic Research Service has developed web-based data delivery tools for access to farm survey data (M. Morehart, C. Towe).

Question to Zayatz: User reactions?

Question to De Wolf: Any plans for a Linked Tables version of τ-ARGUS HiTaS?

Question to Morehart/Towe: Details on the cell suppression approach within the tool?
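The idea of adding noise to the underlying microdata before tabulation can be sketched as follows. The slides give no details of the Census Bureau's noise distribution, so the multiplicative factors drawn here (a random direction with a magnitude between 5% and 15%), the record layout, and the function names `perturb_microdata` and `tabulate` are all hypothetical and serve only to show the order of operations: perturb first, then aggregate.

```python
import random
from collections import defaultdict


def perturb_microdata(records, low=0.05, high=0.15, seed=None):
    """Multiply each record's value by a random factor near 1.

    Each record is pushed up or down by a relative amount drawn
    uniformly from [low, high]. These bounds are assumptions, not
    the parameters of any agency's production method.
    """
    rng = random.Random(seed)
    noisy = []
    for cell, value in records:
        direction = rng.choice((-1, 1))
        factor = 1 + direction * rng.uniform(low, high)
        noisy.append((cell, value * factor))
    return noisy


def tabulate(records):
    """Sum record values into cell totals of a magnitude table."""
    totals = defaultdict(float)
    for cell, value in records:
        totals[cell] += value
    return dict(totals)


# Hypothetical microdata: (cell label, contribution).
records = [("A", 100.0), ("A", 40.0), ("B", 900.0), ("B", 5.0)]
true_totals = tabulate(records)
noisy_totals = tabulate(perturb_microdata(records, seed=42))
```

In this sketch, cell "B" is dominated by a single contributor, so its published total moves by roughly the noise applied to that contributor, while cells with many similar contributors tend to see the perturbations partly cancel.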

Discussion/Questions to the authors

Dwork et al.: Limitations (number of variables/categories)? Could other SDC methods ensure differential privacy? Applicable to tabulations from a survey with hundreds of variables?
Smith/Elliot: How to embed the SAP method into an SDC strategy?
Shlomo: Cell suppression with simple imputation shows the least distortion; is cell suppression the best method?
Salazar: Do integrality problems matter for magnitude tables? Is variable controlled rounding a realistic option?
Zayatz: User reactions?
De Wolf: Any plans for a Linked Tables version of τ-ARGUS HiTaS?
Morehart/Towe: Details on the cell suppression approach within the tool?