
1 2016.6.2

2 A hospital has a database of patient records, each record containing a binary value indicating whether or not the patient has cancer. Suppose an adversary is only allowed to use a particular form of query, S(i), which returns the sum of the first i rows of the second column:

patient   has cancer
Amy       0
Tom       1
Jack      1

Differential privacy addresses the question of whether, given the total number of patients with cancer, an adversary can learn if a particular individual has cancer. Suppose the adversary also knows that Jack is in the last row of the database. Does Jack have cancer? Compute S(3) - S(2).
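To make the attack concrete, here is a minimal Python sketch (the records and the query S(i) are taken directly from the example above):

# The hospital table from the slide: (patient, has cancer)
records = [("Amy", 0), ("Tom", 1), ("Jack", 1)]

def S(i):
    """Sum of the 'has cancer' column over the first i rows."""
    return sum(bit for _, bit in records[:i])

# The adversary knows Jack is in the last (third) row, so the
# difference of two allowed queries reveals his bit exactly:
print(S(3) - S(2))  # -> 1: Jack has cancer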

3 Differential Privacy
The differential privacy model is derived from a very simple observation: suppose dataset D contains an individual, for example Alice, and an arbitrary query f (count, sum, average, median, etc.) returns the result f(D). If, after deleting Alice from D, the query still returns f(D), then Alice's information cannot be leaked by the result. Differential privacy aims to maximize the accuracy of queries on datasets while minimizing the chances of identifying their records.
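For reference, the formal guarantee behind this observation (the standard definition, which the slide does not spell out): a randomized mechanism M satisfies ε-differential privacy if for all neighboring datasets D and D' (differing in a single record) and every set of outputs S,

\Pr[M(D) \in S] \le e^{\varepsilon} \cdot \Pr[M(D') \in S]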

4 [Figure: neighboring databases D1 and D2, identical except for a single record, xi vs. xi']

6 k-anonymity and its extended models (l-diversity, t-closeness, …) cannot provide sufficient security. Differential privacy, by contrast, makes no assumptions about the background knowledge an attacker may have.

9  Laplace Mechanism  Gaussian Mechanism (probabilistic)  Exponential Mechanism
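As an illustration, here is a minimal Python sketch of the Laplace mechanism (the function name and example values are assumptions for illustration, not from the slides): a numeric query with L1-sensitivity Δf is answered as its true value plus Lap(Δf/ε) noise.

import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Return true_answer + Lap(sensitivity/epsilon) noise (epsilon-DP)."""
    rng = rng or np.random.default_rng()
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: a counting query has sensitivity 1, since adding or
# removing one person changes the count by at most 1.
noisy_count = laplace_mechanism(true_answer=42, sensitivity=1.0, epsilon=0.5)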

13 A sports event is going to be held, with the item selected from the set {football, volleyball, basketball, tennis}. Participants vote for their preferred item. We now choose an item while ensuring that the entire decision-making process satisfies ε-differential privacy. Taking the number of votes as the utility function, obviously Δu = 1. According to the exponential mechanism, given a privacy budget ε, we can calculate the output probability of each item, as shown in the table:

item        u    ε=0    ε=0.1   ε=1
Football    30   0.25   0.424   0.924
Volleyball  25   0.25   0.330   0.075
Basketball   8   0.25   0.141   1.5e-05
Tennis       2   0.25   0.105   7.7e-07
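The probabilities in the table can be checked directly. A short Python sketch, assuming the standard exponential mechanism where Pr[item] is proportional to exp(ε·u / (2Δu)):

import math

votes = {"Football": 30, "Volleyball": 25, "Basketball": 8, "Tennis": 2}
DELTA_U = 1  # sensitivity of the vote-count utility function

def output_probabilities(epsilon):
    """Pr[item] proportional to exp(epsilon * u / (2 * DELTA_U))."""
    weights = {item: math.exp(epsilon * u / (2 * DELTA_U))
               for item, u in votes.items()}
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}

print(output_probabilities(0.1))  # Football ~0.424, Volleyball ~0.330, ...
print(output_probabilities(1.0))  # Football ~0.924, Tennis ~7.7e-07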

14 Privacy Preserving Data Release (PPDR)  Interactive data release: queries (Query 1 … Query i) are answered one at a time through a DP mechanism sitting in front of the raw data  Non-interactive data release: a purified dataset is published once, from which all query results can be derived

15 Gergely Acs (INRIA, gergely.acs@inria.fr)
Claude Castelluccia (INRIA, claude.castelluccia@inria.fr)

16  Record linkage  Attribute linkage  Table linkage  Probabilistic attack

17  CDR dataset from the French telecom company Orange  1,992,846 users  1,303 towers  989 IRIS cells  10/09/2007 - 17/09/2007 (one week)

18  Aim: release the time series of IRIS cells without leaking privacy, where each entry is the number of individuals at location L in the (t+1)-th hour of the week  Method: publish a sanitized version of the time series of all IRIS cells that satisfies differential privacy

19 For a given privacy level, the magnitude of noise can be substantially reduced by using several optimizations and by customizing the anonymization mechanisms to the public characteristics of datasets and applications.

21  Pre-sampling: select at most l visits per user  Computing the largest covering cells
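A minimal sketch of the pre-sampling step (the function and variable names are mine; the point is that capping each user at l visits bounds how much any single user can influence the counts released later):

import random

def presample(visits_by_user, l, seed=0):
    """Keep at most l visits per user, chosen uniformly at random."""
    rng = random.Random(seed)
    return {user: (list(visits) if len(visits) <= l
                   else rng.sample(visits, l))
            for user, visits in visits_by_user.items()}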

22 Perturbation  Exploit the similarity of geographically close time series → cluster cells  Exploit their periodic nature → add Gaussian noise to the low-frequency DCT components
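A sketch of the perturbation idea (assuming SciPy's dct/idct; the number of retained components k and the noise scale sigma are placeholders, not the paper's calibrated values):

import numpy as np
from scipy.fft import dct, idct

def perturb_series(series, k, sigma, seed=None):
    """Add Gaussian noise to the k low-frequency DCT components of a
    time series and reconstruct it; high frequencies are dropped."""
    rng = np.random.default_rng(seed)
    coeffs = dct(np.asarray(series, dtype=float), norm="ortho")
    coeffs[k:] = 0.0                          # discard high frequencies
    coeffs[:k] += rng.normal(0.0, sigma, k)   # perturb low frequencies
    return idct(coeffs, norm="ortho")

# Example: one week of hourly counts (168 values) for one cluster of cells
week = 100 * np.abs(np.sin(np.linspace(0, 14 * np.pi, 168)))
noisy_week = perturb_series(week, k=20, sigma=5.0)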

