CAPRI CCSR. Analysis of Information Loss: a Case Study From a UK Survey. Mark Elliot and Kingsley Purdam, Confidentiality and Privacy Group (CAPRI), CCSR, University of Manchester.
Outline: Concepts; General Method; Results.
Concepts: Analytical Completeness (effects of recodes); Analytical Validity (effects of perturbations).
General Method: Selected sample of publications. Contact authors. Phase 1: questionnaire. Phase 2: rerun of studies.
Example Recodes: Age recoded from single years to five-year bands. Area removed from the data set but region left in. Ethnicity recoded from 10 to 4 categories: (a) White, (b) Black, (c) Asian, (d) Other.
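The age recode on this slide can be sketched in a few lines. This is only an illustration, not the procedure used in the study: the function name `age_to_five_year_band` and the label format (for example `20-24`) are assumptions introduced here.

```python
def age_to_five_year_band(age: int) -> str:
    """Map a single-year age to a five-year band label such as '20-24'.

    Hypothetical helper: the band boundaries (0-4, 5-9, ...) are an assumption.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // 5) * 5  # lower bound of the band containing this age
    return f"{lower}-{lower + 4}"


# Made-up ages, purely for illustration.
ages = [0, 4, 5, 23, 24, 25, 67]
bands = [age_to_five_year_band(a) for a in ages]
print(bands)
```

After the recode, ages 23 and 24 fall in the same band and can no longer be told apart, which is the kind of loss of analytical completeness the authors were asked about.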
Number of recodes impacting on analyses per author.
Number of recodes severely impacting on analyses per author.
Percentage of authors giving each category of response to whether removing area while retaining region would affect their analyses.
Percentage of authors giving each category of response to whether recoding age into ten-year bands would affect their analyses.
Utility index: Utility = %none + (%moderate + %other)/2. No great claims are made about this index, but it is a useful way of summarising results and can be compared to disclosure risk impact (using DIS: Skinner and Elliot 2003).
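The utility index is simple arithmetic, and a small sketch makes the weighting explicit: "none" responses count in full, "moderate" and "other" responses count half, and any response category not named in the formula contributes nothing. The function name `utility_index` and the example percentages are hypothetical and are not figures from the study.

```python
def utility_index(pct_none: float, pct_moderate: float, pct_other: float) -> float:
    """Utility = %none + (%moderate + %other) / 2, as given on the slide."""
    return pct_none + (pct_moderate + pct_other) / 2


# Hypothetical response percentages for one recode (not taken from the study).
example = utility_index(pct_none=50.0, pct_moderate=20.0, pct_other=10.0)
print(example)  # 65.0
```

On this scale a recode that no author objects to scores 100, and a recode that affects every author in a way not covered by the three categories scores 0.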
ARGUS: Problems and Resolutions. Key variable selection was problematic: not able to use the Elliot and Dale (1999) scenario keys. The individual risk model doesn't work on unweighted data. Not able to block certain missing values from use.
Perturbed File 1: File with suppressions. All two-dimensional tables. Three-dimensional tables under scenarios.
Perturbed File 2: PRAMed file. All variables PRAMed; levels set to maintain univariate distributions.
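PRAM (the Post-Randomisation Method) replaces each categorical value at random according to a transition matrix. The slides do not say how the PRAM levels were set, so the sketch below uses one generic construction that preserves the univariate distribution in expectation: P = (1 - alpha) I + alpha 1 t^T, where t is the observed category distribution, which gives t P = t. The function names, the `alpha` parameter, and the example counts are all assumptions made for illustration, and this is not a description of what ARGUS does internally.

```python
import random
from collections import Counter


def invariant_pram_matrix(values, alpha):
    """Build P = (1 - alpha) * I + alpha * 1 t^T for the observed distribution t.

    Each row sums to 1 and t P = t, so expected marginals are unchanged.
    """
    counts = Counter(values)
    cats = sorted(counts)
    n = len(values)
    t = [counts[c] / n for c in cats]
    matrix = [
        [(1 - alpha) * (1.0 if i == j else 0.0) + alpha * t[j]
         for j in range(len(cats))]
        for i in range(len(cats))
    ]
    return cats, t, matrix


def pram(values, alpha, seed=0):
    """Perturb each value by sampling from its row of the transition matrix."""
    rng = random.Random(seed)
    cats, _, matrix = invariant_pram_matrix(values, alpha)
    index = {c: i for i, c in enumerate(cats)}
    return [rng.choices(cats, weights=matrix[index[v]])[0] for v in values]


# Hypothetical counts for a four-category variable (not data from the survey).
original = ["White"] * 70 + ["Black"] * 10 + ["Asian"] * 15 + ["Other"] * 5
perturbed = pram(original, alpha=0.2)
print(Counter(original))
print(Counter(perturbed))
```

The marginal counts of the perturbed file match the original only in expectation, and individual records change. This is consistent with the later finding that more complex analyses and analyses of subsections of the file are the ones that run into problems.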
Perturbed File 3: Unperturbed! Control file.
Perturbed File 4: (1) PRAMed as file 2. (2) Suppressions: all two-dimensional tables.
Overview of Results: Basic analyses on the whole SAR (cross-tabs, correlations, simple regressions) lead to fairly consistent interpretations. However, there are still some problems. Problems arise for all three perturbed files for more complex analyses and/or those involving subsections of the file (e.g. one geographical area).
Conclusions: The study introduces methods for measuring the utility impact of disclosure control measures. The relationship between utility measures and disclosure risk measures represents the cost-benefit equation of disclosure control.