Michelle Simard, Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality, Tarragona, Spain, November 23rd, 2011: Progress on Real Time Remote Access

Presentation transcript:

Progress on Real Time Remote Access
Michelle Simard
Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality
Tarragona, Spain, November 23rd, 2011

19/02/2016 Statistics Canada Statistique Canada

Since 2009
• Developed a prototype offering tabulated counts
• Developed Statistical Disclosure Control (SDC)
• Continued development on different fronts

The Prototype

The Prototype
• Spring 2010
• Tabular (counts) outputs only, SAS only
• Modified PROC FREQ, DATA steps
• Limited to particular household survey data sets
• Confidentiality automated, no manual intervention
• Limited to some Canadian federal departments only
• Ability to query RTRA microdata at any time
• Access from any computer with internet access, using a secure username and password
• No travel to Research Data Centres

The Prototype
• Minimum 4 minutes plus process time
• Maximum 3 hours plus process time
• Notification for outputs with 7-day retention
• Formatted table in HTML or in SAS

Current SDC
Additive and Controlled Rounding (ACROUND)
• Create a rounded additive table close to the original table with a controlled grand total → semi-controlled rounding
• Use an iterative process to improve the semi-controlled result → controlled rounding
• Protects against possible matching of information with the PUMF, with small impact on precision
• Maximum: 5 dimensions
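The slides do not give the ACROUND algorithm itself, so the following is only a minimal one-dimensional sketch of the general idea of controlled rounding: every cell is rounded to a multiple of a base, and the rounded cells still add up to the rounded grand total. The function name `controlled_round`, the rounding base of 5, and the largest-remainder rule used to choose which cells round up are all assumptions made for illustration, not the method used in RTRA.

```python
def controlled_round(counts, base=5):
    """Round non-negative integer counts to multiples of `base` so that
    the rounded cells sum to the rounded grand total (additivity)."""
    total = sum(counts)
    # Controlled grand total: nearest multiple of base (ties round up).
    target = base * ((total + base // 2) // base)

    # Start from every cell rounded down to a multiple of base.
    floors = [base * (c // base) for c in counts]
    residuals = [c - f for c, f in zip(counts, floors)]

    # Number of cells that must be rounded up to reach the target total.
    n_up = (target - sum(floors)) // base

    # Hypothetical selection rule: round up the cells with the largest
    # residuals, which keeps the rounded table close to the original.
    order = sorted(range(len(counts)), key=lambda i: residuals[i], reverse=True)
    rounded = floors[:]
    for i in order[:n_up]:
        rounded[i] += base
    return rounded, target


cells, grand_total = controlled_round([3, 7, 12, 1])
print(cells, grand_total)
```

A real multi-dimensional table needs the margins of every dimension to stay additive at once, which is why the slide describes an iterative improvement step; this sketch handles a single row of cells only.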

Recent Development
Proc Percentile
Release the percentile only if
1. there are at least n1 observations ≥ the percentile value and at least n2 observations ≤ the percentile value
2. it is ≠ the minimum or maximum value
3. the total number of unweighted observations is ≥ m
4. the rounded frequency associated with the percentile (from ACROUND) is ≠ 0
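The four percentile release conditions can be sketched as a single check. This is an illustrative sketch only: the function name `can_release_percentile` and the default thresholds for n1, n2 and m are placeholders, since the slides do not state the actual values, and `rounded_freq` stands for the ACROUND-rounded frequency, which is assumed to be computed elsewhere.

```python
def can_release_percentile(values, percentile_value, rounded_freq,
                           n1=5, n2=5, m=10):
    """Return True only if all four release rules hold.

    values           -- unweighted observations in the domain
    percentile_value -- the percentile being considered for release
    rounded_freq     -- rounded frequency for the domain (from ACROUND)
    n1, n2, m        -- placeholder thresholds (not given in the slides)
    """
    if not values:
        return False

    n_at_or_above = sum(1 for v in values if v >= percentile_value)
    n_at_or_below = sum(1 for v in values if v <= percentile_value)

    return (
        n_at_or_above >= n1                      # rule 1, upper side
        and n_at_or_below >= n2                  # rule 1, lower side
        and percentile_value != min(values)      # rule 2
        and percentile_value != max(values)      # rule 2
        and len(values) >= m                     # rule 3
        and rounded_freq != 0                    # rule 4
    )


observations = list(range(1, 21))
print(can_release_percentile(observations, 10, rounded_freq=20))
print(can_release_percentile(observations, 1, rounded_freq=20))
```

With the placeholder thresholds, the median-like value 10 passes every rule, while the value 1 fails because it is the minimum and has too few observations at or below it.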

Recent Development
Proc Mean
Release the mean only if
1. there are at least n3 observations present in the domain
2. the rounded frequency associated with the mean (from ACROUND) is ≠ 0
For both PROCs, "magnitude" rounding will be applied to the statistics to balance precision and noise
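The mean release rules, followed by a rounding step, might look like the sketch below. The slides do not define "magnitude" rounding; here it is interpreted, purely as an assumption, as rounding to a fixed number of significant digits so that the rounding noise scales with the size of the statistic. The names `can_release_mean` and `round_to_magnitude`, the threshold `n3=10`, and `sig_digits=3` are hypothetical.

```python
import math


def can_release_mean(n_obs, rounded_freq, n3=10):
    """Rules 1 and 2 for the mean; n3 is a placeholder threshold."""
    return n_obs >= n3 and rounded_freq != 0


def round_to_magnitude(x, sig_digits=3):
    """Assumed reading of 'magnitude' rounding: keep `sig_digits`
    significant digits, whatever the order of magnitude of x."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    factor = 10.0 ** (exponent - sig_digits + 1)
    return round(x / factor) * factor


values = [12000.0, 12500.0, 13100.5, 11900.0, 12345.6,
          12001.0, 12999.0, 12750.0, 12250.0, 12610.7]
mean = sum(values) / len(values)
if can_release_mean(len(values), rounded_freq=10):
    print(round_to_magnitude(mean))
else:
    print("X")
```

The suppression symbol "X" follows the convention the slides mention for estimates that cannot be released.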

Challenges and Issues
• Not only balancing confidentiality and precision, but quality measures as well
• Evaluating the risk
• Displaying information (what and how): statistics, Standard Error (SE), Variance, Coefficient of Variation (CV), Confidence Interval (CI), Quality Indicator (QI), weighted counts, unweighted counts, ACROUND outputs?

Quality Measures
• Release estimate with SE and a Quality Indicator (QI)
• If not releasable ==> put 'X's or other symbol
• Otherwise release SE and QI as follows:

Value of CV*          Quality Indicator   Guideline
0 ≤ CV ≤ 0.10         (a)                 very good
0.10 < CV ≤ 0.20      (b)                 acceptable
0.20 < CV ≤ 0.35      (c)                 marginal
CV > 0.35             (d)                 very poor

*Note: CV is calculated from the original non-rounded SE and percentile
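The CV bands above translate directly into a lookup. The sketch below assumes the usual definition of the coefficient of variation as the standard error divided by the absolute value of the estimate; the function names `coefficient_of_variation` and `quality_indicator` are illustrative, not part of RTRA.

```python
def coefficient_of_variation(standard_error, estimate):
    """CV = SE / |estimate|, computed from the non-rounded values."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return standard_error / abs(estimate)


def quality_indicator(cv):
    """Map a CV to the (indicator, guideline) pair from the table."""
    if cv < 0:
        raise ValueError("CV cannot be negative")
    if cv <= 0.10:
        return "a", "very good"
    if cv <= 0.20:
        return "b", "acceptable"
    if cv <= 0.35:
        return "c", "marginal"
    return "d", "very poor"


cv = coefficient_of_variation(standard_error=30.0, estimate=200.0)
print(cv, quality_indicator(cv))
```

Because the bands are closed on the right, a CV of exactly 0.10, 0.20 or 0.35 falls in the better of the two neighbouring categories.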

Next steps
• Used output control techniques rather than input control techniques
• Next step: proportions, ratios, totals, models
• May need input control techniques when going into modeling
• Expansion to the academic community
• Expansion to Censuses, then administrative data
• Streamlining the approval processes
• Developing a "fee" structure and "penalty" processes

THANK YOU
For more information, please contact: Michelle Simard