No Free Lunch: Working Within the Tradeoff Between Quality and Privacy


No Free Lunch: Working Within the Tradeoff Between Quality and Privacy
Matthew Graham
Product Coordination & Quality Assurance Branch
LEHD Program, Center for Economic Studies
U.S. Census Bureau
June 6, 2017

Desiderata
Disclaimer: Any opinions and conclusions expressed herein are those of the authors and do not necessarily represent the views of the U.S. Census Bureau. All results have been reviewed to ensure that no confidential information is disclosed. Additionally, these opinions and conclusions are not representative of other data products or programs within the Census Bureau.

Background
LEHD Program:
- Administrative + census/survey data on firms/jobs/workers
- Research on the labor market, firms, and workers
- From research, data products are developed:
  - Quarterly Workforce Indicators (QWI)
  - LEHD Origin-Destination Employment Statistics (LODES)
  - Job-to-Job Flows (J2J)

LEHD Data Infrastructure
Abbreviations: QCEW = Quarterly Census of Employment and Wages; UI = Unemployment Insurance; OPM = Office of Personnel Management
[Diagram labels: UI* Wage Records; OPM*; Economic Survey Data; Linked National Jobs Data; Firm Data; Jobs Data; Person Data; Business Register; Federal Records; Demographic Census/Survey Data; QCEW*; Ongoing Research; Confidentiality Protection; Public-Use Data Products…]
How the data are used:
- Public-use data products
- Internal research projects
- Research projects conducted through Federal Statistical Research Data Centers (FSRDC)
- Both public-use tables and research work product must be approved for release
Quality: Issues that affect data quality/utility:
- Non-reporting (firms/establishments/jobs/workers)
- Item missingness
- Edit/imputation methods
- Confidentiality protection
Job data cover over 96% of private employment and most state, local, and federal jobs
Data availability: 1990-2016, start year varies by state, rolling end date

Mandates, Generally Speaking
- Publish statistics about people and the economy
- Protect confidentiality of data collected by the Census Bureau (Title 13 of US Code)
  - Individuals
  - Operations of Businesses
- Protect confidentiality of Federal Tax Information (Title 26 of US Code)
- State partner sensitivity
Note: This talk does not distinguish between “privacy” and “confidentiality”. The two terms are used variously in different settings.

Example: LODES/OnTheMap
LODES: Origin-Destination data on jobs
- 130m jobs, annual release
- Geography: Home & Work Census Blocks (11m)
- Firm Characteristics: Ownership (3), NAICS Sector (20), Firm Age (5), Firm Size (5)
- Person/Job Characteristics: Age (3), Earnings (3), Race (6), Ethnicity (2), Sex (2), Education (4)
- Large & Sparse (1.3×10^8 / 1.6×10^20)
- Distributed through map-based, analytical tool: OnTheMap http://onthemap.ces.census.gov/
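
The "Large & Sparse" ratio can be reproduced from the category counts on this slide. The sketch below assumes the full table is the cross product of home block, work block, and every listed characteristic; the variable names are illustrative and not LODES field names.

```python
from math import prod

# Category counts as listed on the slide (names are illustrative only).
home_blocks = 11_000_000
work_blocks = 11_000_000
firm_dims = {"ownership": 3, "naics_sector": 20, "firm_age": 5, "firm_size": 5}
person_dims = {"age": 3, "earnings": 3, "race": 6,
               "ethnicity": 2, "sex": 2, "education": 4}

jobs = 130_000_000  # "130m jobs"

# Assumption: every combination of categories is a possible cell.
cells = (home_blocks * work_blocks
         * prod(firm_dims.values())
         * prod(person_dims.values()))
density = jobs / cells  # average number of jobs per possible cell

print(f"possible cells: {cells:.1e}")
print(f"jobs per cell:  {density:.1e}")
```

Under that assumption the cell count comes to roughly 1.6×10^20, so even 130 million jobs leave almost every cell empty. That sparsity is why small cells need protection.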

Protection Systems
- Exact Employment: Permanent multiplicative noise distortion factor for employers and establishments. Synthetic methods for small cells. (Abowd et al. 2006)
- Residential Location: Synthetic data methods using probabilistic differential privacy. (Machanavajjhala et al. 2008)
- Current Research: Develop provable protection for employers/establishments and attribute composition (Haney et al. 2017)
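
A minimal sketch of what a "permanent multiplicative noise distortion factor" could look like. The slide does not give the noise distribution or its parameters, so the uniform draw, the bounds `a` and `b`, the `secret` key, and the function names here are all placeholders, not the production method. The one property the sketch is meant to show is permanence: the same employer gets the same factor every time.

```python
import hashlib
import random


def distortion_factor(employer_id: str, a: float = 0.05, b: float = 0.15,
                      secret: str = "demo-key") -> float:
    """Return a fixed multiplicative factor for one employer.

    Hypothetical parameters: the factor lies in [1-b, 1-a] or [1+a, 1+b],
    so it is never closer to 1 than `a` and never farther than `b`.
    """
    # Derive a per-employer seed from a secret key so the factor is
    # reproducible across releases but not guessable without the key.
    digest = hashlib.sha256(f"{secret}:{employer_id}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    magnitude = rng.uniform(a, b)   # placeholder distribution
    sign = 1 if rng.random() < 0.5 else -1
    return 1.0 + sign * magnitude


def protect(employer_id: str, true_count: float) -> float:
    """Apply the employer's permanent factor to a true count."""
    return true_count * distortion_factor(employer_id)


# The same employer is distorted identically on every call.
print(protect("employer-001", 250) == protect("employer-001", 250))
```

Because the factor is reused for every statistic about an employer, averaging repeated releases does not wash the noise out. That is the practical point of making it permanent.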

Tradeoffs (Primary)
Privacy vs. Data accuracy
- Simply put, provable privacy sets a “budget” and we must decide how much privacy loss we are willing/able to trade for data accuracy (a dimension of quality).
- Social choice!
- But privacy loss and data quality are (usually) complex constructs…
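
The budget idea can be made concrete with the textbook Laplace mechanism for a single count. This is a standard illustration of differential privacy, not the method the talk attributes to LODES. Noise with scale sensitivity/epsilon is added to the count, so a larger privacy-loss budget (epsilon) buys a smaller expected error. The function names, trial count, and seed are assumptions made for the sketch.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two
    independent exponential draws, each with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean_abs_error(epsilon: float, sensitivity: float = 1.0,
                   trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the average absolute error of a noisy count."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    total = sum(abs(laplace_noise(scale, rng)) for _ in range(trials))
    return total / trials


# Larger epsilon (more privacy loss) gives a smaller average error.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:<5} mean |error| ~ {mean_abs_error(eps):.3f}")
```

In this setup the average error is close to sensitivity/epsilon, which is the tradeoff curve in its simplest form: halving the privacy loss doubles the expected error.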

Tradeoffs (Primary)
[Figure: tradeoff diagram with the labels “Less Accuracy”, “More Quality”, and “Less Privacy”]

Tradeoffs (Secondary)
How is privacy defined in a complex dataset?
- Sometimes laws and policy speak directly
- Sometimes new policy must be made
- E.g. Are we free to trade off employment precision with attribute composition?
How is quality defined in a complex dataset?
- Our quality budget can be directed at different parts of the dataset
- Which users/uses are favored with good quality?
- Public policy choice!
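
One way to picture directing a budget at different parts of the dataset: under sequential composition in pure differential privacy, the epsilons spent on separate releases add up to the total. The sketch below splits a total epsilon between two hypothetical table groups by weight and reports the expected absolute Laplace error for each (sensitivity/epsilon). The group names, weights, and total are invented for illustration; the slides do not specify an allocation rule.

```python
def split_budget(total_epsilon: float, weights: dict) -> dict:
    """Divide a total privacy-loss budget in proportion to the weights."""
    total_weight = sum(weights.values())
    return {name: total_epsilon * w / total_weight
            for name, w in weights.items()}


def expected_abs_error(epsilon: float, sensitivity: float = 1.0) -> float:
    """Mean absolute value of Laplace noise with scale sensitivity/epsilon."""
    return sensitivity / epsilon


# Hypothetical policy choice: favor workplace counts 3-to-1.
weights = {"workplace_counts": 3, "worker_attributes": 1}
shares = split_budget(2.0, weights)
errors = {name: expected_abs_error(eps) for name, eps in shares.items()}

for name in weights:
    print(f"{name}: epsilon={shares[name]:.2f}, "
          f"expected |error|={errors[name]:.2f}")
```

Changing the weights moves accuracy from one group of users to another without changing the total privacy loss, which is why the slide calls the allocation a public policy choice.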

Takeaways
- Start with existing law/policy
- What are the mandates for privacy and data release?
- What is the technology? Is it provable?
- Provable privacy demands clarity, which is a virtue.
- How are the tradeoffs (quality vs. privacy and the components of quality and privacy) to be decided?

Contact/References
LEHD Program: lehd.ces.census.gov
Email: matthew.graham@census.gov
Selected References:
- LEHD Origin-Destination Employment Statistics (LODES) Technical Document. http://lehd.ces.census.gov/data/lodes/LODES7/LODESTechDoc7.2.pdf
- J. M. Abowd, B. E. Stephens, and L. Vilhuber. Confidentiality protection in the Census Bureau’s Quarterly Workforce Indicators. Technical Report TP-2006-02, U.S. Census Bureau, LEHD Program, December 2006.
- A. Machanavajjhala, D. Kifer, J. M. Abowd, J. Gehrke, and L. Vilhuber. Privacy: Theory meets practice on the map. In ICDE, pages 277–286, 2008.
- S. Haney, A. Machanavajjhala, J. M. Abowd, M. Graham, M. Kutzbach, and L. Vilhuber. Utility Cost of Formal Privacy for Releasing National Employer-Employee Statistics. In Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17), pages 1339-1354. ACM, New York, NY, USA, 2017. DOI: https://doi.org/10.1145/3035918.3035940