Database Privacy (ongoing work) Shuchi Chawla, Cynthia Dwork, Adam Smith, Larry Stockmeyer, Hoeteck Wee.


1 Database Privacy (ongoing work) Shuchi Chawla, Cynthia Dwork, Adam Smith, Larry Stockmeyer, Hoeteck Wee

2 You are being watched!
Shuchi Chawla, Carnegie Mellon University
- Databases abound…
  - Population Census
  - Market Research
  - Used for statistical analysis
    - explaining phenomena
    - making predictions
- Prone to malicious use
  - using an individual’s information for marketing, discrimination

3 The Privacy vs. Utility trade-off
- Inherent tension between Privacy and Utility
  - One extreme: no information; complete privacy
  - Other extreme: complete information; no privacy
- We want a middle path:
  - Preserve macroscopic properties
    - statistical/distributional information
    - clustering information
  - “Disguise” individual identifying information

4 What is privacy?
- [Gavison] Protection from being brought to the attention of others
  - inherently valuable
  - attention invites further privacy loss
- Each individual should blend into a sufficiently large crowd

5 Application-oriented approaches
- Statistical approaches
  - Alter the frequency of particular features, while preserving means
  - Alternatively, erase records that reveal too much
  - Do not consider possible privacy breach from combining information from different records
- Query-based approaches
  - Disallow queries that reveal too much
  - Combination of seemingly innocuous queries could reveal individual traits
- Only good for specific applications

6 Towards a general approach
- Allow arbitrary tests and queries
- Preserve macroscopic properties, but not individual records
- Approach: “perturb” individual records appropriately and publish the entire dataset
- Perturbation has to be probabilistic

7 A geometric view
- A first attempt: an oversimplified abstract model
- Simplifying assumption: each attribute is real-valued
- Think metric space
- Real Database (RDB): n unlabeled points in d-dimensional space
- Sanitized Database (SDB): n new points, possibly in a different space

8 The adversary or Isolator
- Using SDB and auxiliary information (AUX), outputs a point q
- q “isolates” a real point x if it is very close to x, but not to many other real points
- No way of obtaining privacy if AUX already reveals too much!
- SDB compromises privacy if the adversary is able to increase his probability of isolating a point considerably by looking at the SDB

9 Isolation – a relative notion
- Tightly clustered points have a smaller radius of isolation
- T-radius of x: distance to its T-th nearest neighbor
- x is isolated if B(q, cδ) contains fewer than T points, where δ is the distance between q and x
- x is “safe” if the distance between x and q is more than T-radius/(c-1)
- c: privacy parameter; constant
[Figure: q and x at distance δ, with radii (c-1)δ and cδ marked]
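
The isolation test on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`t_radius`, `isolates`, `is_safe`) are hypothetical, distances are Euclidean, the ball B(q, cδ) is closed and counts x itself, and the T-radius excludes x from its own neighbor list. It requires Python 3.8+ for `math.dist`.

```python
import math

def t_radius(points, i, T):
    """Distance from points[i] to its T-th nearest neighbor (assumption: x itself excluded)."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return dists[T - 1]

def isolates(q, points, i, c, T):
    """True if q isolates points[i]: B(q, c*delta) holds fewer than T real points."""
    delta = math.dist(q, points[i])          # delta = distance between q and x
    in_ball = sum(1 for p in points if math.dist(q, p) <= c * delta)
    return in_ball < T

def is_safe(q, points, i, c, T):
    """Sufficient condition from the slide: delta > T-radius / (c - 1)."""
    return math.dist(q, points[i]) > t_radius(points, i, T) / (c - 1)

# A tight cluster of four points plus one far-away outlier (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (10.0, 10.0)]

print(isolates((10.1, 10.0), points, 4, c=2.0, T=3))  # guess near the outlier
print(isolates((0.05, 0.05), points, 0, c=2.0, T=3))  # guess inside the cluster
print(is_safe((0.3, 0.0), points, 0, c=2.0, T=3))     # guess far from x relative to its T-radius
```

The "safe" condition is sufficient rather than necessary: if δ exceeds T-radius/(c-1), then the ball of radius (c-1)δ around x already contains x's T nearest neighbors, and that ball lies inside B(q, cδ).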

10 Our contribution
- A precise definition of privacy using T-radii
- A perturbation algorithm, closely linked to the definition of privacy
- Prove that the algorithm preserves privacy under reasonable assumptions
- Working towards showing that macroscopic properties are preserved
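
The slides do not spell out the perturbation algorithm, so the following is only a guess at what "closely linked to the definition of privacy" could look like: each real point receives Gaussian noise whose scale is tied to that point's T-radius, so points in dense regions move little and isolated points move a lot. The function names, the choice of Gaussian noise, and the per-coordinate scale `T-radius / sqrt(d)` are all assumptions made for illustration.

```python
import math
import random

def t_radius(points, i, T):
    """Distance from points[i] to its T-th nearest neighbor (assumption: x itself excluded)."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return dists[T - 1]

def sanitize(points, T, rng):
    """Hypothetical sanitizer: perturb each point with noise scaled to its own T-radius."""
    sdb = []
    for i, x in enumerate(points):
        # Assumed scaling: per-coordinate std dev so the noise norm is roughly the T-radius.
        sigma = t_radius(points, i, T) / math.sqrt(len(x))
        sdb.append(tuple(v + rng.gauss(0.0, sigma) for v in x))
    return sdb

# Made-up real database (RDB): a tight cluster and one outlier.
rdb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (10.0, 10.0)]
sdb = sanitize(rdb, T=3, rng=random.Random(0))
for real, san in zip(rdb, sdb):
    print(real, "->", san)
```

Under this assumed scheme the outlier's T-radius is about 14, so it is displaced far more than the clustered points, which matches the remark on a later slide that outliers get perturbed by a very large amount.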

11 What about the real world?
- Lessons from the abstract model
  - High dimensionality is our friend
- Outliers
  - Our notion of c-isolation deals with them: they get perturbed by a very large amount
  - Existence of outlier may be disclosed
- Put more on this slide…

12 What about Outliers?
- Bill Gates example here
- Reconsider definition of privacy
  - do not want to disclose existence of outlier
  - do not want to disclose anything about outlier
  - do not want to disclose identity of outlier
  - c-isolation falls in the third category

