Statistical disclosure limitation: Balancing data confidentiality and data access.

Access to high-quality data is vital
- Enables evidence-based policy-making
- Informs the general public on local and national concerns
- Advances scientific research
- Trains students in data analysis and decision making

Protecting confidentiality of data is essential
Breach of confidentiality:
- May violate laws (e.g., CIPSEA, HIPAA)
- May undermine broadly held and highly valued ethical principles
- May lead data providers to withhold important information or refuse to participate in research

Example of tension between access and confidentiality
- HIPAA Privacy Rule
  - Safe harbor
  - Statistical standard
  - Limited data sets
- 2008: Delaware Cancer Registry vs. press
  - Public's desire to learn cancer sites
  - State requirement to protect privacy
  - New legislation

Protecting confidentiality while providing access
- De-identification: strip unique identifiers like names, addresses, and tax IDs from shared files.
- Reducing potential for re-identification: seemingly innocuous information may reveal individual identities and information.

Example: Re-identification by matching

De-identification
  Original data:  name a b c d e f g h i j k l
  Name deleted:        a b c d e f g h i j k l

Re-identification
  Shared data:    a b c d e f g h i j k l
  Other data:     a b c d e f m n o p name

Where:
  a = Day, month, year of birth    d = County
  b = Gender                       e = Occupation
  c = State of residence           f = Race
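
The matching step in this example can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the presentation: the records, the names, the `diagnosis` field, and the helper functions `key` and `reidentify` are all hypothetical, and the six quasi-identifiers simply follow the legend a-f above.

```python
# Quasi-identifiers a-f from the slide's legend (field names are assumptions).
QUASI_IDENTIFIERS = ("birth_date", "gender", "state", "county", "occupation", "race")

# Shared file: names deleted, sensitive field kept (hypothetical records).
shared_data = [
    {"birth_date": "1950-03-14", "gender": "F", "state": "DE", "county": "Kent",
     "occupation": "teacher", "race": "white", "diagnosis": "asthma"},
    {"birth_date": "1962-11-02", "gender": "M", "state": "DE", "county": "Sussex",
     "occupation": "farmer", "race": "black", "diagnosis": "diabetes"},
]

# Other file: shares fields a-f and still carries names (hypothetical records).
other_data = [
    {"name": "Pat Example", "birth_date": "1950-03-14", "gender": "F", "state": "DE",
     "county": "Kent", "occupation": "teacher", "race": "white"},
    {"name": "Lee Sample", "birth_date": "1971-07-30", "gender": "M", "state": "DE",
     "county": "Kent", "occupation": "nurse", "race": "asian"},
]


def key(record):
    """Build the matching key from the quasi-identifiers a-f."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def reidentify(shared, other):
    """Return (name, shared_record) pairs whose key matches exactly one named record."""
    names_by_key = {}
    for rec in other:
        names_by_key.setdefault(key(rec), []).append(rec["name"])
    matches = []
    for rec in shared:
        names = names_by_key.get(key(rec), [])
        if len(names) == 1:  # an unambiguous match attaches a name to the record
            matches.append((names[0], rec))
    return matches


matches = reidentify(shared_data, other_data)
for name, rec in matches:
    print(name, "->", rec["diagnosis"])
```

Deleting the name column alone does not prevent the link: any record whose combination of a-f is unique in both files can be matched.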

Better data – opportunities and problems
- Advances in statistical analysis and the collection of more detailed data enable researchers and policy makers to ask refined questions
- Enormous amounts of individual-level data are collected, processed, widely distributed … and linkable.
- Better matching technologies enable linkages

Problems – a closer look
- Personal information is available on the Internet, from private sources, and from government surveys
- Individuals with the right skills and resources could link this personal information to publicly available data:
  - MIT student re-identifies Massachusetts governor
  - NIH scientists express caution in making genetic information available

Statisticians can help find a satisfactory balance
Statisticians:
- Develop ways to identify risk of confidentiality breaches
- Develop methods for providing safe access to confidential data
- Conduct research on providing safe access to emerging, complex data types

Useful data can be shared and protected
General strategies for data protection:
- Modify data content: remove or alter sensitive or identifying values, and provide unrestricted access to modified data (e.g., public use files)
- Control data access: use technology and training to reduce chances of breaches, limit who can access the confidential data, the conditions under which the data can be accessed, and the purposes for which the data can be used

Modified data: General techniques
- Eliminate variables (geography)
- Aggregate sensitive data (age, income)
- Add random variation to numerical data values
- Exchange some values between selected records
- Replace sensitive data with values simulated from statistical models estimated with the original data
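
Each technique in this list can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names, the sample records, the noise scale, the swap fraction, and in particular the normal distribution used as the "statistical model" for synthesis, which is far cruder than models used in practice.

```python
import random
import statistics


def drop_variables(records, names):
    """Eliminate variables (e.g. detailed geography) from every record."""
    return [{k: v for k, v in r.items() if k not in names} for r in records]


def aggregate(value, width):
    """Aggregate a whole number into a band such as '30-39'."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"


def add_noise(values, sd, rng):
    """Add zero-mean Gaussian variation to each numerical value."""
    return [v + rng.gauss(0, sd) for v in values]


def swap_values(records, field, fraction, rng):
    """Exchange `field` between randomly paired records; the input is not modified."""
    out = [dict(r) for r in records]
    n_swap = int(len(out) * fraction) // 2 * 2  # need an even number of records
    chosen = rng.sample(range(len(out)), n_swap)
    for i, j in zip(chosen[0::2], chosen[1::2]):
        out[i][field], out[j][field] = out[j][field], out[i][field]
    return out


def synthesize(values, rng):
    """Replace values with draws from a normal model fitted to the originals."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [rng.gauss(mu, sd) for _ in values]


rng = random.Random(2024)  # fixed seed so the sketch is repeatable
records = [  # hypothetical records
    {"age": 37, "county": "Kent", "income": 41000},
    {"age": 52, "county": "Sussex", "income": 58000},
    {"age": 29, "county": "New Castle", "income": 36000},
    {"age": 64, "county": "Kent", "income": 72000},
]

no_geography = drop_variables(records, {"county"})
age_bands = [aggregate(r["age"], 10) for r in records]
noisy_income = add_noise([r["income"] for r in records], sd=1000, rng=rng)
swapped = swap_values(records, "county", fraction=0.5, rng=rng)
synthetic_income = synthesize([r["income"] for r in records], rng)
```

Swapping leaves the overall distribution of the swapped field unchanged while breaking its link to individual records, which is why it is often considered alongside noise addition.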

Key features of modified data
- Methods can be applied to all or some cases, to varying degrees
- Wider application of methods improves confidentiality protection, but degrades usefulness of data
- Statisticians measure the tradeoffs between disclosure risk and analytic/policy priorities
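
One way to picture the tradeoff is to coarsen a variable by increasing amounts and track a risk measure and a usefulness measure side by side. The sketch below is only an assumption-laden illustration: the simulated records, the choice of "share of records unique on (age band, county)" as the risk measure, and "mean distance from true age to band midpoint" as the information-loss measure are all hypothetical, not measures named in the presentation.

```python
import random
from collections import Counter


def band_start(age, width):
    """Lower edge of the age band of the given width."""
    return (age // width) * width


def unique_fraction(records, width):
    """Risk proxy: share of records that are unique on (age band, county)."""
    keys = [(band_start(r["age"], width), r["county"]) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


def information_loss(records, width):
    """Usefulness proxy: mean distance between true age and its band midpoint."""
    def midpoint(age):
        return band_start(age, width) + (width - 1) / 2
    return sum(abs(r["age"] - midpoint(r["age"])) for r in records) / len(records)


rng = random.Random(0)  # fixed seed; the data are simulated, not real
records = [
    {"age": rng.randint(18, 89), "county": rng.choice(["Kent", "Sussex", "New Castle"])}
    for _ in range(200)
]

# Widths are nested (1 -> 5 -> 10 -> 20), so each step only merges bands.
tradeoff = [
    (w, unique_fraction(records, w), information_loss(records, w))
    for w in (1, 5, 10, 20)
]
for width, risk, loss in tradeoff:
    print(f"band width {width:>2}: unique share {risk:.3f}, mean age error {loss:.2f}")
```

Because the bands are nested, the unique share can only fall or stay level as the width grows, while the age error starts at zero for single-year bands and is positive for the widest ones.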

Restricted access increasingly provided - examples
- Restricted data enclaves (Census, NCHS)
- Remote access systems (NCHS, NORC)
- Licensing (NCES, BLS)
- Online tabulations/analysis (Census, NCHS, NCES)

Key features of restricted access
- Safe projects: Authorized projects, typically with data use agreement
- Safe people: Approved analysts from authorized institutions; trained in confidentiality issues
- Safe sites: Use actively monitored by data custodians
- Safe outputs: Data products subject to statistical and confidentiality review
=> Analysts have use of detailed data but do not own them, which permits manipulations not possible with publicly available data
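
As one illustration of what a "safe outputs" review might check, the sketch below suppresses small cells in a frequency table before release. The threshold of 5, the field names, and the records are assumptions for illustration only; actual review rules vary by agency, and a real review would also consider complementary suppression so that hidden cells cannot be recovered from totals.

```python
from collections import Counter

MIN_CELL = 5  # hypothetical minimum cell count; real thresholds are set by the data custodian


def tabulate(records, row, col):
    """Count records in each (row value, column value) cell."""
    return Counter((r[row], r[col]) for r in records)


def suppress_small_cells(table, min_cell=MIN_CELL):
    """Replace counts below the threshold with None before release."""
    return {cell: (n if n >= min_cell else None) for cell, n in table.items()}


# Hypothetical analyst output: six records in one cell, two in another.
records = (
    [{"county": "Kent", "gender": "F"}] * 6
    + [{"county": "Sussex", "gender": "M"}] * 2
)

table = tabulate(records, "county", "gender")
released = suppress_small_cells(table)
for cell, n in released.items():
    print(cell, "suppressed" if n is None else n)
```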

Summary
- Data access and data confidentiality are intimately connected
- Statisticians play a central role in improving data usefulness while protecting data confidentiality
- Statisticians in government, academia, and industry can provide guidance to policy-makers on key issues related to privacy and confidentiality

Further information
- ASA Statement on Data Access and Personal Privacy
- ASA's Privacy and Confidentiality Committee
- ASA's Privacy, Data Security and Confidentiality Website
- OMB/FCSM Report on Statistical Disclosure Limitation Methodology
- Expanding Access to Research Data: Reconciling Risks and Opportunities

American Statistical Association 732 N. Washington Street Alexandria, Virginia