Confidentiality and the SARs Update on SAR progress, and discussion of the disclosure work done for Scotland. Sam Smith

Update 2001 SARs Newsletter published very recently: more delays. Disclosure control by CAPRI is ongoing. The current estimate is for the Individual data to be with the SARs team in June. In-house access at ONS is available for users with urgent need.

England and Wales For the release of 100% tables, England and Wales and Northern Ireland rounded small cell counts. As a result, it is not possible to match between the SAR and the tables for England, Wales and NI.

Scotland Scotland did not round their 100% tables. As a result, there are counts of 1 in the tables. If any of these individuals is present in the SAR, that record is disclosive.

Background The following work has been carried out in collaboration with the General Register Office for Scotland, by the SARs team at CCSR. At time of writing, I have had no access to disclosive data. There is no geography below Scotland level.

Population Uniques Population Uniques are people who are unique in the population on one or more characteristics, taken in combination. Sample Uniques are people who are unique on one or more characteristics in the sample.
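
To make the two definitions concrete, here is a minimal Python sketch on invented toy records (the field names and values are illustrative, not Census data). It treats a record as unique when no other record shares its combination of values on the chosen variables.

```python
from collections import Counter

def find_uniques(records, variables):
    """Return the records whose combination of values on `variables` occurs exactly once."""
    keys = [tuple(r[v] for v in variables) for r in records]
    counts = Counter(keys)
    return [r for r, k in zip(records, keys) if counts[k] == 1]

# Toy population (hypothetical data).
population = [
    {"id": 1, "age": "16-19", "cars": "None"},
    {"id": 2, "age": "16-19", "cars": "None"},
    {"id": 3, "age": "20-24", "cars": "One"},
    {"id": 4, "age": "20-24", "cars": "Two"},
    {"id": 5, "age": "20-24", "cars": "Two"},
]
# Toy sample drawn from it: records 1, 3 and 4.
sample = [population[0], population[2], population[3]]

population_uniques = find_uniques(population, ["age", "cars"])
sample_uniques = find_uniques(sample, ["age", "cars"])
```

In this toy case all three sampled records are sample uniques, but only record 3 is also a population unique, which is why the two lists have to be compared rather than treated as the same thing.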

Scale There are 62 variables in both the SAR and 100% tables. GROS are interested in tri-variate tables, and only in uniques. We obtained 37,820 tables, covering every combination of three variables.
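
The table count follows from choosing 3 variables out of 62 without regard to order. A quick check in Python:

```python
from math import comb

n_variables = 62
# Number of unordered triples: 62 * 61 * 60 / (3 * 2 * 1)
n_trivariate_tables = comb(n_variables, 3)
print(n_trivariate_tables)  # 37820
```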

Requesting the tables An example request for input to their system was provided by GROS. We then replicated and modified it, producing one request for each table. The tables arrived on 4 CDs, a month later.
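
The replicate-and-modify step could be sketched as below. The request template and the variable names are hypothetical placeholders, since the real GROS request format is not shown in the slides; only the idea of one request per triple of variables comes from the talk.

```python
from itertools import combinations

# Hypothetical template; the real GROS request format is not shown in the slides.
REQUEST_TEMPLATE = "TABLE {number}: {a} BY {b} BY {c}"

def build_requests(variables):
    """Yield one request string per unordered triple of variables."""
    for number, (a, b, c) in enumerate(combinations(variables, 3), start=1):
        yield REQUEST_TEMPLATE.format(number=number, a=a, b=b, c=c)

# Placeholder names standing in for the 62 real variables.
variables = [f"VAR{i:02d}" for i in range(1, 63)]
requests = list(build_requests(variables))
```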

An example table
Space-Time Research 2001 ED Based OSD - Test 1, Table 1: Cars - Number of by Ever worked Indicator and Number of Rooms for Person
                        No code required  No code required  No code required  No code required  No code required
Not applicable
None                    -                 53,323            421,443           232,335           18,719
One                     -                 33,839            577,499           759,187           188,235
Two                     -                 6,104             174,884           499,420           368,657
Three                   -                 772               20,029            83,915            84,619
Four or more            -                 222               4,622             20,353            29,984
Communal establishment  50,
Cars - Number of by Ever worked Indicator and Number of Rooms. Only No Code Required shown for Ever Worked.

A Bigger Example Table Age, Industry, Occupation Add table here

Analysis Custom software was written to parse each table and list the file, variables and value locations of every unique. There are 2.4 million of them.
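
The custom software itself is not shown in the slides. As a rough sketch of the idea, assuming a hypothetical long-format CSV layout (one row per cell, with a final `count` column) rather than the real Space-Time Research output, the parsing step might look like this. The category values are invented.

```python
import csv
import io

def list_uniques(table_name, csv_text):
    """Return (table, variables, values) for every cell whose count is exactly 1.

    Assumes a hypothetical long-format layout: one row per cell, with the
    category of each variable followed by a final 'count' column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    variables = [name for name in reader.fieldnames if name != "count"]
    uniques = []
    for row in reader:
        # Counts may carry thousands separators, e.g. "1,204".
        count = int(row["count"].replace(",", ""))
        if count == 1:
            values = tuple(row[v] for v in variables)
            uniques.append((table_name, tuple(variables), values))
    return uniques

# Invented example table.
example = """age,industry,occupation,count
16-19,Fishing,Manager,1
16-19,Fishing,Clerical,12
20-24,Mining,Manager,1
20-24,Mining,Clerical,"1,204"
"""
uniques = list_uniques("table_00001", example)
```

Recording the table name alongside the variables and values keeps each unique traceable back to its source file, which matches the "file, variables and values" listing described on the slide.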

Implementation Step by Step process. Keep intermediate steps. Keep It Simple.

Target The Scotland Specification is as compatible as possible with the England and Wales specification. Use recodes to reduce the unique count to a level where the uniques can be dealt with on a record-by-record basis.

Simple Suppression of Uniques All records with uniques must be perturbed. Because the sample is 4% of the population, approximately 96% of Uniques are suppressed immediately: those individuals are simply not in the sample. There are also reductions because of differences in the specifications.
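
As a back-of-envelope illustration of the sampling effect, assuming every person has the same 4% chance of being in the sample (a simplifying assumption, not a statement about the actual sample design):

```python
population_unique_cells = 2_400_000  # unique cells found in the tables analysis
sample_percent = 4                   # sample size quoted on this slide

# Each unique cell corresponds to one person, who is in the sample with
# probability 4%, so this is the expected number of unique cells retained.
expected_in_sample = population_unique_cells * sample_percent // 100
expected_removed = population_unique_cells - expected_in_sample

print(expected_in_sample)  # 96000
print(expected_removed)    # 2304000
```

These are counts of unique cells, not of people: one person can be unique in many tables, so the number of distinct records affected would be smaller.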

Recodes Variables were recoded to coarser categories. Some of these recodes were also used to aid the E&W disclosure work, including Age, Hours of Work, Industry and others. At time of writing, Occupation is the only additional recode for Scotland.

Running the recodes The previous slide represents 6 weeks of iterative work. The uniques analysis was rerun after each recode, producing a new list of uniques.
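
The recode-then-recount loop can be illustrated with a small Python sketch. The records and the 5-year banding are invented for illustration; the actual recodes were applied to the tables, not to toy data like this.

```python
from collections import Counter

def count_uniques(records, variables):
    """Count the combinations of values that occur exactly once."""
    cells = Counter(tuple(r[v] for v in variables) for r in records)
    return sum(1 for n in cells.values() if n == 1)

def band_age(age, width):
    """Recode a single year of age to a band such as '20-24'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

# Invented records.
records = [
    {"age": 20, "industry": "Fishing"},
    {"age": 21, "industry": "Fishing"},
    {"age": 23, "industry": "Fishing"},
    {"age": 30, "industry": "Mining"},
    {"age": 34, "industry": "Mining"},
    {"age": 41, "industry": "Mining"},
]

before = count_uniques(records, ["age", "industry"])
recoded = [{**r, "age": band_age(r["age"], 5)} for r in records]
after = count_uniques(recoded, ["age", "industry"])

print(before, after)  # 6 1
```

Coarser categories merge cells, so the unique count can only stay the same or fall; the iterative work lies in finding recodes that cut it enough without discarding too much detail.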

Moving forwards We now have a slightly more restrictive specification for Scotland: Age recoded to between 2 and 5 year bands (for age 16+) (possibly also for EWNI); Occupation in ?? categories; Industry in 15 categories (applied to EWNI); Hours of Work banded (applied to EWNI).

So far… Everything has been done on publicly accessible data. The above process needs to be rerun on the SAR to find Sample Uniques. This requires access to the disclosive microdata.

Future Work The 37,820 tables will be recreated for the records in the sample. The lists of Population Uniques and Sample Uniques will be compared. Where there is a Population Unique in the Sample, it will be flagged.
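
The comparison of the two lists amounts to a set intersection. A minimal sketch, assuming each unique is keyed by its variables and values (the entries below are invented):

```python
# Each unique is keyed by (variables, values); all entries are invented.
population_uniques = {
    (("age", "industry", "occupation"), ("16-19", "Fishing", "Manager")),
    (("age", "industry", "occupation"), ("20-24", "Mining", "Manager")),
    (("age", "cars", "rooms"), ("85+", "Four or more", "One")),
}
sample_uniques = {
    (("age", "industry", "occupation"), ("20-24", "Mining", "Manager")),
    (("age", "industry", "occupation"), ("30-34", "Mining", "Clerical")),
}

# A sample unique that is also a population unique is the case to flag.
flagged = population_uniques & sample_uniques
```

Any population unique whose record was sampled is necessarily a sample unique as well, so intersecting the two lists finds every population unique present in the sample.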

Applying this to the Microdata All the Population Uniques in the Sample will be perturbed by ONS. The method of perturbation will be the same as that used for England, Wales and NI records. This method is likely to involve PRAM. Discussion paper available from the SARs website?
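
For readers unfamiliar with it, the post-randomisation method (PRAM) replaces a categorical value with one drawn at random according to a matrix of transition probabilities. The sketch below shows the general idea only; the categories, probabilities and seed are invented, and nothing here describes the variables or parameters ONS will actually use.

```python
import random

def pram(values, transition, rng):
    """Replace each category by one drawn from its row of the transition matrix."""
    result = []
    for value in values:
        row = transition[value]
        categories = list(row)
        weights = [row[c] for c in categories]
        result.append(rng.choices(categories, weights=weights, k=1)[0])
    return result

# Illustrative probabilities only; each row must sum to 1.
transition = {
    "Manager":  {"Manager": 0.9, "Clerical": 0.1},
    "Clerical": {"Manager": 0.1, "Clerical": 0.9},
}
rng = random.Random(2001)  # fixed seed so the sketch is repeatable
original = ["Manager", "Clerical", "Clerical", "Manager"]
perturbed = pram(original, transition, rng)
```

Because an intruder cannot tell which values were changed, a match on a perturbed variable no longer guarantees that the right individual has been identified.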

The 100% Tables The 37,820 tables requested cost £2,000 - paid for by the SARs project. They will be made available to registered SARs/Census users for use in research.

And Finally…. Slides will be available on the seminars webpage tomorrow. Any questions?