Published by Matteo Deverell Modified about 1 year ago

1
Eurostat Statistical Disclosure Control

2
Presented by Peter-Paul de Wolf, Statistics Netherlands (CBS)

3
Content
Introduction
What’s the problem?
–Specific for business statistics
Formalising the problem
What to do?
–Methods
–Software
Summary

4
Introduction
General definition of confidential data: data that cannot be published “as is”
»By law (e.g. statistical law)
»Sensitive data (what’s sensitive?)
»Respondent considers it confidential
»…

5
Introduction
Physical protection
–Entrance
–Network
Legal protection
–Oath
Statistical Disclosure Control
–Protection of statistical output

6
What’s the problem?
Statistical output
Microdata
–Not often in case of business data
–Obvious: each record represents a single respondent
Tabular data
–In business data often magnitude tables
–Sometimes frequency tables
–But: aggregated data?!?!?!?

7
What’s the problem (frequency table)
Cell value itself not sensitive:
–All contributions are equal (1)
Spanning variables
–Identifying, e.g. NACE, Region
–Sensitive, e.g. “environmental offence” (illegal dumping of waste, illegal fishing, oil spills, …)

8
Example: number of ship-owners Environmental offence Region Yes No Total … A

9
What’s the problem (frequency table) Example: number of ship-owners Environmental offence Region Yes No Total … B

10
What’s the problem (frequency table) Example: number of ship-owners Environmental offence Region Yes No Total … C

11
What’s the problem (magnitude table) Turnover (10^6 €) of instrument producing companies Region A B C Total Harps Organs Pianos Other Total

12
What’s the problem (magnitude table) Turnover (10^6 €) of instrument producing companies Region A B C Total Harps Organs Pianos Other Total ?

13
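Formalising the problem
Suppose cell (Piano, A) consists of
Company X: 81 × 10^6 €
Company Y: 5 × 10^6 €
Other three: 2 × 10^6 € each
Total: 92 × 10^6 €
Company Y can subtract its own contribution from the total: 92 – 5 = 87, an upper bound on Company X’s turnover that is within 7.4% of the true value (81)!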
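The arithmetic on this slide can be checked with a short sketch. It assumes the attacker is Company Y, who knows the published cell total and its own contribution; the function name is illustrative and does not come from the presentation.

```python
def upper_bound_estimate(total, own):
    """Upper bound a contributor can put on any other single contribution."""
    return total - own

# Contributions to cell (Piano, A), in 10^6 euro, as given on the slide.
contributions = [81, 5, 2, 2, 2]
total = sum(contributions)

# Company Y (contribution 5) estimates Company X (true value 81).
estimate = upper_bound_estimate(total, own=5)
relative_error = (estimate - 81) / 81 * 100

print(total, estimate, round(relative_error, 1))  # 92 87 7.4
```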

14
Formalising the problem
General, objective rules needed
Threshold rule
Dominance rule or (n,k)-rule
p%-rule
p%-rule is favoured over (n,k)-rule and implies minimum of 3 contributors
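A minimal sketch of the three rules, using their usual textbook formulations. The slide gives no parameter values, so the defaults below (minimum count 3, n = 1, k = 75, p = 10) are assumptions chosen for illustration, and contributions are assumed non-negative.

```python
def threshold_rule(contributions, min_count=3):
    """Sensitive if the cell has fewer than min_count contributors."""
    return len(contributions) < min_count

def dominance_rule(contributions, n=1, k=75.0):
    """(n,k)-rule: sensitive if the n largest contributors exceed k% of the total."""
    total = sum(contributions)
    top = sum(sorted(contributions, reverse=True)[:n])
    return total > 0 and top > k / 100 * total

def p_percent_rule(contributions, p=10.0):
    """p%-rule: sensitive if the second-largest contributor can estimate the
    largest one to within p%, i.e. the rest of the cell is less than p% of
    the largest contribution."""
    xs = sorted(contributions, reverse=True)
    if not xs:
        return False
    remainder = sum(xs[2:])  # total minus the two largest contributions
    return remainder < p / 100 * xs[0]

cell = [81, 5, 2, 2, 2]  # cell (Piano, A), in 10^6 euro
print(threshold_rule(cell), dominance_rule(cell), p_percent_rule(cell))
# False True True
```

With one or two contributors the remainder in `p_percent_rule` is zero, so the cell is always flagged; this is why the p%-rule implies a minimum of 3 contributors.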

15
What to do?
Redesign table
–Combine rows/columns
–Define different categories
Rounding
Add noise
Cell suppression
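As an illustration of the rounding option, the sketch below applies plain deterministic rounding to a fixed base. The base of 5 and the function name are assumptions; the slide does not say which rounding variant is meant.

```python
def round_to_base(value, base=5):
    """Round a non-negative integer to the nearest multiple of base."""
    return base * ((value + base // 2) // base)

cells = [1, 1, 1]
rounded_cells = [round_to_base(v) for v in cells]
rounded_total = round_to_base(sum(cells))

print(rounded_cells, rounded_total)  # [0, 0, 0] 5
```

The example also shows a known drawback of rounding each cell independently: the rounded cells no longer add up to the rounded total.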

16
Region A B C D Total Harps Organs Pianos Other Total

17
Cell suppression Region A B C D Total Harps Organs Pianos Other Total X X X

18
Cell suppression Region A B C D Total Harps Organs Pianos Other Total X X X X XX

19
Cell suppression Region A B C D Total Harps Organs Pianos Other Total X X X XX X X X X

20
Cell suppression Region A B C D Total Harps Organs Pianos Other Total X X X XX X X X X
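The growing pattern of X marks on these slides reflects the need for secondary suppression: a suppressed cell that is alone in its row or column can be recomputed from the published totals. The sketch below is a simplified check, assuming all row and column totals are published and suppressed interior cells are given as hypothetical (row, column) index pairs. It only detects exact recovery by subtraction and says nothing about how tightly a remaining suppressed value can be bounded.

```python
def recoverable_cells(suppressed):
    """Return the suppressed cells that can be recomputed by subtraction.

    A cell is recoverable when it is the only hidden cell left in its row
    or column; recovering it may expose further cells, so repeat until
    nothing changes.
    """
    hidden = set(suppressed)
    recovered = set()
    changed = True
    while changed:
        changed = False
        for axis in (0, 1):  # 0: scan rows, 1: scan columns
            for index in {cell[axis] for cell in hidden}:
                line = [cell for cell in hidden if cell[axis] == index]
                if len(line) == 1:
                    hidden.remove(line[0])
                    recovered.add(line[0])
                    changed = True
    return recovered

# A single suppressed cell is fully exposed by its row total.
print(recoverable_cells({(2, 0)}))  # {(2, 0)}

# Four cells forming a rectangle protect each other against subtraction.
print(recoverable_cells({(1, 0), (1, 1), (2, 0), (2, 1)}))  # set()
```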

21
Software
Latest version can be found on
New Open Source version available end 2014

22
Contact/info
Glossary, handbook, project info
–http://neon.vb.cbs.nl/casc
Wiley book
