1 Using Fixed Intervals to Protect Sensitive Cells Instead of Cell Suppression. By Steve Cohen and Bogong Li, U.S. Bureau of Labor Statistics. UNECE/Work Session on Statistical Data Confidentiality, Working Paper 38

2 BLS Quarterly Census of Employment and Wages (QCEW). The QCEW program publishes, every quarter, a census of employment and wages covering 98 percent of employment at the county, MSA, state, and national levels, by 6-digit North American Industry Classification System (NAICS) industry. Content: monthly employment and wages for all 6-digit industries, by county, by ownership, and by size group. Used as a benchmark source for other important surveys, such as the Current Employment Statistics survey and Occupational Employment Statistics, and as an important input for other Federal and State programs. Based upon UI administrative records. BLS protects the identity of cooperating employers; disclosure restrictions apply.

3 Current Publication Format and Suppression rules … an example from total employment

4 Research Goal: Replace primary and secondary suppression with intervals containing the suppressed value, drawn from a fixed set of intervals. Previous efforts: J. J. Salazar, UNECE/Eurostat Work Session on Statistical Data Confidentiality; Fischetti and Salazar, 1999.

5 Proposed Change to Current Publication Format Using Fixed Intervals … pre-defined, fixed intervals replacing nondisclosable cells
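
A minimal sketch of what "publishing in a pre-defined, fixed interval" could look like for a single cell. The boundary values in `BOUNDS` and the function names are hypothetical; the presentation does not state the actual interval set used by BLS.

```python
from bisect import bisect_right

# Hypothetical interval boundaries (assumption, not from the presentation).
# They define the intervals 0-19, 20-99, 100-249, ..., and 5000 or more.
BOUNDS = [0, 20, 100, 250, 500, 1000, 2500, 5000]


def fixed_interval(value, bounds=BOUNDS):
    """Return the (low, high) pre-defined interval that contains value.

    Values are treated as non-negative integers (e.g. employment counts).
    The last interval is open-ended, signalled by high = None.
    """
    if value < bounds[0]:
        raise ValueError("value lies below the lowest interval boundary")
    i = bisect_right(bounds, value) - 1
    low = bounds[i]
    high = bounds[i + 1] - 1 if i + 1 < len(bounds) else None
    return (low, high)


def publish(value, nondisclosable):
    """Publish the exact value, or its fixed interval if the cell is nondisclosable."""
    return fixed_interval(value) if nondisclosable else value


print(publish(137, nondisclosable=True))   # interval instead of a suppression flag
print(publish(137, nondisclosable=False))  # exact value
```

Because the interval set is fixed in advance, the published interval depends only on which pre-defined bin the value falls in, not on how close the value is to either end.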

6 Disclosure Risks Associated with the Fixed Interval (FI) Publication Format. By obtaining ranges or bounds for previously suppressed cells and incorporating them into the additive relationships in the table, outside attackers could improve the precision of their estimates of the primary cells that current CSP methods are intended to protect. A contributor to a cell, or a knowledgeable insider, may subtract its own value from the FI bounds to obtain a narrower estimate of the other contributors in the same cell. For cells with few contributors, small contributors can significantly improve their estimate of the dominant contributor by knowing which end of the FI bound to use. For single-contributor cells, one end of the FI bound may be closer to the actual value than the single respondent is comfortable with.
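
The insider risk can be illustrated with a small worked example. This is a sketch under two assumptions that are not stated in the presentation: contributions are non-negative, and the insider knows its own contribution exactly. The function name and the numbers are hypothetical.

```python
def residual_bounds(fi_low, fi_high, own_value):
    """Bounds a contributor can place on the sum of the OTHER contributors.

    The cell total is published only as the fixed interval [fi_low, fi_high].
    Since contributions are assumed non-negative, the cell total is at least
    own_value, which can raise the effective lower bound of the interval.
    """
    cell_low = max(fi_low, own_value)
    return (cell_low - own_value, fi_high - own_value)


# Cell published as the interval 100-249.
print(residual_bounds(100, 249, 90))   # contributor of 90: interval is shifted
print(residual_bounds(100, 249, 240))  # contributor of 240: interval collapses
```

In the second call the insider's own value nearly fills the interval, so the remaining contributors are pinned to a range only ten units wide even though the published interval spans 150 units.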

7 Our Proposed Selection-Improvement Solution to the Fixed Interval Publication Problem (FIPP). Step 1. Identify primary and secondary cells via a CSP method and publish them in pre-defined FIs. Step 2. Audit the table by applying linearly constrained optimization to identify the primary cells with disclosure risks. Step 3. Select additional protecting cells for the primary cells at risk while minimizing information loss. Step 4. Audit the table once more; exit if all primary cells are protected, otherwise repeat steps 3-4.
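
The four steps can be sketched for the simplest possible case, a single additive row whose total is published. This is an illustration, not the authors' implementation. Assumptions: the interval boundaries in `BOUNDS` are hypothetical; the run starts from the primary cells only, with no CSP step; a primary cell counts as "at risk" when the range an attacker can derive is narrower than a hypothetical `min_width`; and the additional protecting cell is simply the smallest cell still published exactly. With a single sum constraint and box bounds, the linear-programming audit reduces to the interval arithmetic in `audit`; a real table would need an LP solver.

```python
from bisect import bisect_right

BOUNDS = [0, 20, 100, 250, 500, 1000]  # hypothetical fixed-interval boundaries


def fixed_interval(value):
    """Pre-defined interval (low, high) containing value; value must be < BOUNDS[-1]."""
    if not BOUNDS[0] <= value < BOUNDS[-1]:
        raise ValueError("value outside the hypothetical interval set")
    i = bisect_right(BOUNDS, value) - 1
    return (BOUNDS[i], BOUNDS[i + 1] - 1)


def audit(values, hidden):
    """Step 2: tightest range an attacker can derive for each interval-published cell."""
    rest = sum(values[i] for i in hidden)  # = published total minus exact cells
    fi = {i: fixed_interval(values[i]) for i in hidden}
    lo_sum = sum(lo for lo, _ in fi.values())
    hi_sum = sum(hi for _, hi in fi.values())
    return {
        i: (max(lo, rest - (hi_sum - hi)), min(hi, rest - (lo_sum - lo)))
        for i, (lo, hi) in fi.items()
    }


def selection_improvement(values, primary, min_width):
    """Steps 1-4: add protecting cells until every primary cell passes the audit."""
    hidden = set(primary)                                  # step 1 (simplified)
    while True:
        ranges = audit(values, hidden)                     # steps 2 and 4
        at_risk = [i for i in primary
                   if ranges[i][1] - ranges[i][0] < min_width]
        if not at_risk:
            return hidden, ranges
        candidates = [i for i in range(len(values)) if i not in hidden]
        if not candidates:
            raise RuntimeError("no cells left to use as protection")
        hidden.add(min(candidates, key=lambda i: values[i]))  # step 3


row = [150, 12, 300, 45, 700]
hidden, ranges = selection_improvement(row, primary={0}, min_width=50)
print(sorted(hidden), ranges[0])
```

In this example the primary cell alone is fully recoverable from the row total, so the loop adds protecting cells one at a time until the attacker's derived range for the primary cell is at least `min_width` wide.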

8 The “Selection-Improvement” algorithm iterates until the table is fully safe, while minimizing information loss during each iteration.

9 Methods to Select Additional Protecting Cells (PCs). Systematic method: selects the cell with the smallest value among all cells that form additive relationships with two primary cells at risk, and publishes this cell in a pre-defined FI. Single Source Shortest Path (SSSP) method: selects the cells on the “shortest path” connecting all primary cells at risk on the table network, fixing the order of the vertices. Random Selection method: randomly selects a cell that forms an additive relationship with the primary cells at risk. It makes no attempt to minimize information loss and is a last resort when the two methods above fail.
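
One possible reading of the Systematic method for a two-dimensional table is sketched below. It is an assumption that "forms additive relationships with two primary cells at risk" means the candidate shares a row or a column with at least two at-risk primary cells; the function name and the example table are hypothetical.

```python
def systematic_pick(table, at_risk, hidden):
    """Return the (row, col) of the smallest eligible cell, or None if there is none.

    A cell is eligible when it is still published exactly (not in hidden) and
    shares a row or a column with at least two at-risk primary cells.
    """
    candidates = []
    for r, row in enumerate(table):
        for c, value in enumerate(row):
            if (r, c) in hidden:
                continue
            linked = sum(1 for (pr, pc) in at_risk if pr == r or pc == c)
            if linked >= 2:
                candidates.append((value, (r, c)))
    return min(candidates)[1] if candidates else None


table = [
    [150, 12, 300],
    [45, 700, 8],
    [60, 25, 90],
]
at_risk = {(0, 0), (1, 1)}
print(systematic_pick(table, at_risk, hidden=set(at_risk)))
```

When the function returns `None`, no cell meets the criterion, which corresponds to the situation in which a fallback such as the Random Selection method would be needed.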

10 Advantages of Our Selection-Improvement Method: easy implementation; zero disclosure risk; applicable to n-dimensional tables; order of complexity is that of the auditing program used.

11 The Publication Table Used to Evaluate the Selection-Improvement Algorithm. Employment of eight 2-digit NAICS super-sector industries of a U.S. state: Manufacturing, Retail Trade, Transportation, Information, Finance and Insurance, Real Estate and Rental, Professional Services, and Healthcare. The table has 48,250 cells covering 60,845 establishments and employment of 1,166,388. Under the current disclosure protection rule, 60% of publication cells, 14% of total employment, and 7.6% of establishments are completely suppressed (primary and secondary).

12 Tabular Output Comparisons Prior to using Selection-Improvement algorithm (current)

13 Tabular Output Comparisons (cont’d) … After applying Selection-Improvement algorithm (proposed)

14 Additional Protecting Cells Selected for Alternative Selecting Schemes

15 “Information Loss” Compared to Complete Suppression

16 “Selection-Improvement Method Produced Safe Publication Tables with a User’s Gain of Information”. The entire table is safely protected. The number of selection-improvement iterations is between 2 and 5. The increase in employment level in protecting cells is approx. 1%. The increase in the number of establishments in protecting cells is approx. 2.5%. The increase in the number of protecting cells is between 1% and 10% (with the Random Selection method being the least efficient). All cells are published in pre-defined, fixed intervals! Conclusion: A gain in the industrial employment information provided to data users is achieved with a minimal number of additional selection cycles.

17 Limitations and Considerations. The cell selection process is not repeatable: the Random Selection method produces a different set of protecting cells each time. The method applies to tables with multiple dimensions and hierarchies, but modeling their relationships could be complex and cumbersome. No production computer software exists.

18 Contact Information Bogong T. Li Steve Cohen Bureau of Labor Statistics / OSMR 2 Massachusetts Ave. N.E. Washington, DC BLS QCEW program