Privacy Preserving Publication of Moving Object Data. Joey Lei, CS295. Francesco Bonchi, Yahoo! Research, Avinguda Diagonal 177, Barcelona, Spain. 6/10/2015.

1 Privacy Preserving Publication of Moving Object Data
Joey Lei, CS295
Francesco Bonchi, Yahoo! Research, Avinguda Diagonal 177, Barcelona, Spain
6/10/2015, CS295 - Privacy and Data Management

2 Outline
Intro & Background
Clustering and Perturbation Techniques
Spatio-Temporal Cloaking (Generalization) Techniques
Future Research

3 Location Privacy
Growing prevalence of location-aware devices: mobile phones and GPS devices
Two analysis groups
– Online
  Real-time monitoring of moving objects and motion patterns
  Development of location-based services (LBS), such as Google Maps on the iPhone
– Offline
  Collection of traces left by moving objects
  Offline analysis to extract behavioral knowledge, such as for public transportation

4 Privacy Concerns
Location data allows for intrusive inferences
– Reveals habits
– Social customs
– Religious and sexual preferences
– Unauthorized advertisement
– User profiling

5 Offline Analysis
Traffic management application
– Paths (trajectories) of vehicles with GPS are recorded
Geographic Privacy-aware Knowledge Discovery and Delivery (GeoPKDD)
– Traffic data published for the city of Milan (Italy)
– Car identifiers were replaced with pseudonyms
Daily commute example
– Bob's home and workplace can be traced from the location data and act as quasi-identifiers (QIDs)
– Joining the data with a telephone directory can then re-identify him

6 Definitions
Anonymity-preserving data publishing of moving object databases
– How to transform published location data while maintaining utility
Moving Object Database (MOD)
– A set of individuals, time points, and trajectories

7 Background: Location Based Services
Ideals
– Provide service without learning the user's exact position
– Location data is forgotten once service is provided
k-anonymity definition
– A response to a request for location data is k-anonymous when it is indistinguishable from the spatial and temporal information of at least k - 1 other responses sent from different users
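
The k-anonymity definition on this slide can be illustrated with a minimal Python sketch. It assumes that a cloaked response is an axis-aligned rectangle plus a time interval and that the anonymizer sees every user's reported positions. The names `Cloak`, `users_in_cloak`, `is_k_anonymous` and the sample data are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloak:
    """Hypothetical cloaked region: a rectangle in space plus a time interval."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float

    def contains(self, x: float, y: float, t: float) -> bool:
        return (self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max
                and self.t_min <= t <= self.t_max)


def users_in_cloak(cloak, reports):
    """Return the distinct users with at least one report inside the cloak.

    `reports` is an iterable of (user_id, x, y, t) tuples (assumed format).
    """
    return {user for user, x, y, t in reports if cloak.contains(x, y, t)}


def is_k_anonymous(cloak, reports, k):
    """The requester plus at least k - 1 other users must fall inside the cloak."""
    return len(users_in_cloak(cloak, reports)) >= k


# Made-up sample data for illustration only.
reports = [
    ("alice", 1.0, 1.0, 10.0),
    ("bob", 1.5, 0.5, 11.0),
    ("carol", 9.0, 9.0, 10.5),
]
cloak = Cloak(0.0, 2.0, 0.0, 2.0, 9.0, 12.0)
print(is_k_anonymous(cloak, reports, k=2))  # alice and bob are covered
print(is_k_anonymous(cloak, reports, k=3))  # carol lies outside the rectangle
```

A real anonymizer would grow the rectangle or the time interval until the check passes; the sketch only shows the test itself.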

8 LBS: Location k-Anonymity
Spatial requirements
– Ubiquity: a user visits at least k regions
– Congestion: the number of users is at least k
One way to achieve this: mix zones
– An area where LBS providers cannot trace a specific user's movement
– Identity is replaced with pseudonyms
  Users entering these zones at the same time are mixed together

9 LBS: Location Based Quasi-Identifier
A spatio-temporal pattern that can uniquely identify one individual
– Set of spatial areas and time intervals plus a recurrence formula
– AreaCondominium [7am, 8am], AreaOfficeBldg [8am, 9am], AreaOfficeBldg [4pm, 6pm], AreaCondominium [5pm, 7pm]
– Recurrence: 3.Weekdays * 2.Weeks

10 LBS: Historical k-Anonymity
In the offline context
– A set of requests satisfies historical k-anonymity if there exist k - 1 personal histories of locations (trajectories), belonging to k - 1 different users, that are location-time consistent with the requests (indistinguishable from them)

11 Outline
Intro & Background
Clustering and Perturbation Techniques
Spatio-Temporal Cloaking (Generalization) Techniques
Conclusions

12 Clustering and Perturbation
C&P sidesteps the inherent problems with location QIDs:
– Each individual can have their own QIDs, which makes it difficult to define one QID for all individuals
– Area(Home, Office, ??) [??am - ??pm]
– Recurrence: 7.Weekdays * 52.Weeks
Solution: anonymize trajectories instead
– Microaggregation / k-member anonymity

13 Clustering and Perturbation
Trajectories are not polylines, but cylindrical volumes with radius δ (the uncertainty radius)
If another trajectory moves within the cylinder of a given trajectory, then the two trajectories are indistinguishable from each other ((k, δ)-anonymity set)

14 Clustering and Perturbation
[Figure: a) uncertain trajectory; b) anonymity set for two trajectories]
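
The indistinguishability test behind a (k, δ)-anonymity set can be sketched in Python. This is a simplified reading of the slides, not the authors' implementation: it assumes all trajectories are sampled at the same timestamps, uses Euclidean distance, and treats a set as anonymous when every pair stays within δ at every timestamp. The function names and sample trajectories are hypothetical.

```python
import math


def co_localized(traj_a, traj_b, delta):
    """True if the two trajectories stay within `delta` of each other at every timestamp.

    Each trajectory is a dict mapping timestamp -> (x, y). Both are assumed to be
    sampled at the same timestamps (an assumption of this sketch).
    """
    if traj_a.keys() != traj_b.keys():
        return False
    return all(math.dist(traj_a[t], traj_b[t]) <= delta for t in traj_a)


def is_anonymity_set(trajectories, k, delta):
    """True if there are at least k trajectories and every pair is co-localized."""
    if len(trajectories) < k:
        return False
    return all(
        co_localized(a, b, delta)
        for i, a in enumerate(trajectories)
        for b in trajectories[i + 1:]
    )


# Made-up trajectories: tr1 and tr2 travel side by side, tr3 is far away.
tr1 = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
tr2 = {0: (0.0, 0.5), 1: (1.0, 0.5), 2: (2.0, 0.8)}
tr3 = {0: (0.0, 3.0), 1: (1.0, 3.0), 2: (2.0, 3.0)}

print(is_anonymity_set([tr1, tr2], k=2, delta=1.0))
print(is_anonymity_set([tr1, tr2, tr3], k=3, delta=1.0))
```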

15 Achieving (k, δ)-anonymity
Achieved by space translation
– Slightly moving some observations in space
Step One: cluster trajectories of similar sizes
– NWA (Never Walk Alone)
  All equivalence classes have the same time span and special timestamp requirements π (e.g. π = 60: only full hours, such as 1:00PM-2:00PM)

16 Achieving (k, δ)-anonymity
Step Two: perturb trajectories within uncertainty radius δ (i.e. transformation into an anonymity set)
– Grouping and reconstruction
  Finding the nearest matching points to group
  Reconstruct a generalization for utility
  Multi TGA and Fast TGA algorithms
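
Space translation within one cluster can be sketched as follows. This is one plausible way to do it, stated as an assumption rather than as the NWA procedure itself: at each timestamp, every point farther than δ/2 from the cluster centroid is pulled along the straight line toward the centroid until it sits exactly δ/2 away. Any two translated points are then at most δ apart. The function name `translate_cluster` and the data layout are hypothetical.

```python
import math


def translate_cluster(cluster, delta):
    """Move points toward the per-timestamp centroid until each lies within delta / 2 of it.

    `cluster` is a list of trajectories, each a dict timestamp -> (x, y),
    all assumed to share the same timestamps.
    """
    radius = delta / 2.0
    result = [dict() for _ in cluster]
    for t in cluster[0]:
        points = [traj[t] for traj in cluster]
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        for out, (x, y) in zip(result, points):
            d = math.hypot(x - cx, y - cy)
            if d > radius:
                # Shrink the offset from the centroid so the point lands on the circle.
                scale = radius / d
                x, y = cx + (x - cx) * scale, cy + (y - cy) * scale
            out[t] = (x, y)
    return result


# Made-up cluster with a single timestamp; the centroid is (2, 0).
cluster = [{0: (0.0, 0.0)}, {0: (4.0, 0.0)}, {0: (2.0, 0.0)}]
moved = translate_cluster(cluster, delta=2.0)
print(moved)
```

Points already inside the δ/2 circle are left untouched, which keeps the distortion small for trajectories that were close to begin with.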

17 Outline
Intro & Background
Clustering and Perturbation Techniques
Spatio-Temporal Cloaking (Generalization) Techniques
Conclusions

18 Trajectory Generalization
[Figure: anonymization of three trajectories tr1, tr2 and tr3, based on point matching and removal, and spatio-temporal generalization]

19 Trajectory Reconstruction
Reference: Aggarwal, C.C., Yu, P.S.: A condensation approach to privacy preserving data mining.

20 Quasi-identifier Methods
QIDs are a sequence of locations with multiple sensitive values (locations)
– Values are different from the perspective of each adversary
Yet, must consider linkage attacks from all adversaries

21 Quasi-identifier Methods
Possible attack
– T5 and t5: a match! We know that person visited b1

22 Space Generalization
Each position is an exact point on a grid
Generalizations become rectangles of nearby points
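
A minimal Python sketch of this generalization step, assuming positions are integer grid coordinates and that the generalized region is the smallest axis-aligned rectangle covering a group of nearby points. The helper names `generalize` and `covers` are hypothetical.

```python
def generalize(points):
    """Return the smallest axis-aligned rectangle covering the grid points.

    The rectangle is returned as ((x_min, y_min), (x_max, y_max)).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


def covers(rect, point):
    """True if the point lies inside the rectangle (borders included)."""
    (x_min, y_min), (x_max, y_max) = rect
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


# Three nearby grid positions are published as one rectangle.
nearby = [(1, 2), (2, 2), (2, 3)]
rect = generalize(nearby)
print(rect)  # ((1, 2), (2, 3))
```

Publishing `rect` instead of the exact points means every position inside it, including (1, 3), looks the same to an observer.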

23 Attack Graph
Privacy breach on prior example
Definitions
– I-nodes (individuals)
– O-nodes (moving object IDs)

24 Attack Graph
If I1 is mapped to O2, there is no consistent mapping for I2 and I3
– Both I2 and I3 would have to map to O3
Conclusion
– O1 must map to I1
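
The reasoning on this slide amounts to discarding edges that cannot be part of any one-to-one assignment of individuals to object IDs. The Python sketch below shows this by brute force. The example graph is hypothetical, chosen only to be consistent with the slide text (the original figure is not available), and `consistent_edges` is an illustrative name.

```python
from itertools import permutations


def consistent_edges(attack_graph):
    """Return the edges that appear in at least one perfect matching.

    `attack_graph` maps each individual (I-node) to the set of object IDs (O-nodes)
    it could correspond to. Brute force over permutations: fine for tiny examples only.
    """
    individuals = sorted(attack_graph)
    objects = sorted(set().union(*attack_graph.values()))
    surviving = set()
    for assignment in permutations(objects, len(individuals)):
        if all(o in attack_graph[i] for i, o in zip(individuals, assignment)):
            surviving.update(zip(individuals, assignment))
    return surviving


# Hypothetical attack graph: I1 looks 2-anonymous, I2 and I3 share O2 and O3.
graph = {
    "I1": {"O1", "O2"},
    "I2": {"O2", "O3"},
    "I3": {"O2", "O3"},
}
edges = consistent_edges(graph)
print(("I1", "O2") in edges)  # no full assignment uses this edge
print(("I1", "O1") in edges)  # so I1 is pinned to O1
```

Although I1 has two candidate objects, only one of them survives, which is the privacy breach the slide describes.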

25 Attack Graph
Shortcomings of the basic k-anonymity definition
– Standard k-anonymity states there should be at least k paths originating from I (based on grouping)
– What if we group O to have at least k paths?

26 Attack Graph
Privacy breach
– Assume I2 and O5 are a pair
– I1 maps to both O1 and O2, but this is impossible! I5 must map to O5

27 Final k-Anonymity Definition
Every I-node has degree k or more
The attack graph is symmetric
– For every edge (Ii, Oj) there is also an edge (Ij, Oi)
[Figure: 2-anonymous attack graph]
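
Both conditions of this definition are easy to check mechanically. The Python sketch below assumes the attack graph is given as a set of index pairs (i, j), meaning that individual Ii is linked to object Oj, and that individuals and objects share one index set. The function name and the sample graphs are hypothetical.

```python
def is_k_anonymous_graph(edges, k):
    """Check the two conditions on an attack graph.

    `edges` is a set of (i, j) pairs meaning "individual I_i is linked to object O_j".
    Every index that appears anywhere is treated as an I-node (sketch assumption).
    """
    degree = {}
    for i, _ in edges:
        degree[i] = degree.get(i, 0) + 1
    nodes = {i for i, _ in edges} | {j for _, j in edges}
    has_min_degree = all(degree.get(n, 0) >= k for n in nodes)
    is_symmetric = all((j, i) in edges for i, j in edges)
    return has_min_degree and is_symmetric


# Made-up graphs for illustration.
symmetric_pair = {(1, 1), (1, 2), (2, 1), (2, 2)}
missing_back_edge = {(1, 1), (1, 2), (2, 2)}
lonely_node = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)}

print(is_k_anonymous_graph(symmetric_pair, k=2))
print(is_k_anonymous_graph(missing_back_edge, k=2))
print(is_k_anonymous_graph(lonely_node, k=2))
```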

28 Future Research
Ad-hoc anonymization techniques for intended use of data
Privacy-preserving data mining
– Focus on the analysis methods instead of the publishing

29 Questions?

