Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking
Baik Hoh, Marco Gruteser, Hui Xiong, Ansaf Alrabady
All images are credited to “ACM”, Hoh et al. (2007), pp. 161-170.


Problem
GPS traces are taken from “probe” vehicles to provide services such as a traffic monitoring application. The traces contain GPS location, heading, and speed data. Other research has shown that even if this data is anonymized, individual routes can be identified.

Problem: Traffic Monitoring
GPS points are mapped to a road segment, the average speed of those vectors is calculated, and congestion is inferred. Requirements: spatial accuracy and road coverage, achieved by “penetration rate” (the share of vehicles acting as probes). Initial deployments fall short, and privacy suffers.
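
The traffic-monitoring step above can be sketched roughly as follows. This is a minimal illustration, not the deployed system: it assumes the samples are already map-matched to a segment, and the function names, the `free_flow_kmh` table, and the 0.5 congestion ratio are all hypothetical.

```python
from collections import defaultdict


def mean_speed_by_segment(samples):
    """Average probe speed per road segment.

    samples: iterable of (segment_id, speed_kmh) pairs that are assumed
    to be map-matched already (the matching step is not shown here).
    """
    speeds = defaultdict(list)
    for segment_id, speed_kmh in samples:
        speeds[segment_id].append(speed_kmh)
    return {seg: sum(v) / len(v) for seg, v in speeds.items()}


def congested_segments(samples, free_flow_kmh, ratio=0.5):
    """Flag segments whose mean speed is below ratio * free-flow speed.

    The ratio is an illustrative assumption, not a value from the slides.
    """
    means = mean_speed_by_segment(samples)
    return {seg for seg, m in means.items() if m < ratio * free_flow_kmh[seg]}


samples = [("A", 20.0), ("A", 30.0), ("B", 90.0)]
free_flow = {"A": 60.0, "B": 100.0}
print(mean_speed_by_segment(samples))
print(congested_segments(samples, free_flow))
```

With few probes per segment the averages become noisy or missing, which is why coverage depends on the penetration rate.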

Problem: How Privacy Is Compromised
Individuals can be identified by the starting and ending points in the GPS trace. Data points can be linked together using target tracking and “Maximum Likelihood Detection”: for a set of possible points, select the point with the highest probability of belonging to this route.
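
A rough sketch of the linking step, under stated assumptions: the attacker predicts the next position from the last known position and velocity, scores each candidate with an isotropic Gaussian around that prediction, and picks the best one. The motion model, the `sigma` value, and all names are illustrative choices, not taken from the slides.

```python
import math


def predict(position, velocity, dt):
    """Constant-velocity prediction of the next position (metres, seconds)."""
    x, y = position
    vx, vy = velocity
    return (x + vx * dt, y + vy * dt)


def likelihood(candidate, predicted, sigma):
    """Unnormalised Gaussian likelihood of a candidate given the prediction."""
    d2 = (candidate[0] - predicted[0]) ** 2 + (candidate[1] - predicted[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))


def most_likely_next(position, velocity, dt, candidates, sigma=100.0):
    """Pick the candidate point most likely to continue the route."""
    predicted = predict(position, velocity, dt)
    return max(candidates, key=lambda c: likelihood(c, predicted, sigma))


# A vehicle at the origin heading east at 10 m/s, next sample 60 s later.
candidates = [(580.0, 20.0), (0.0, 600.0), (1500.0, 0.0)]
print(most_likely_next((0.0, 0.0), (10.0, 0.0), 60.0, candidates))
```

Repeating this choice sample after sample is what lets an attacker stitch anonymous points back into a route.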

Problem: Existing Algorithms
Existing anonymity algorithms severely degrade the utility of the data.

Problem: Existing Algorithms
K-anonymity using CliqueCloak modifies trace data beyond usability, even though it was thought to be the most accurate system with any anonymity guarantee. Even with an anonymity set as small as 3, location accuracy drops to between [...] m, even with 2000 probes. Increasing the penetration rate would help, but higher penetration rates are not possible in early deployment, and lower-density areas of the map would never be accurate.

Factors to Consider
The longer the attacker can follow an individual trace, the better they are able to guess who you are and where you are going.

Relative Weighted Coverage Metric
When samples are withheld, road coverage decreases. Congestion monitoring is more important on popular routes. Coverage is limited by the original data set, so it can't get better; it can only go down. High level: the metric measures the coverage delta between the original data set and the confused data set. It is a measure of data quality.
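
One way to picture such a metric is sketched below. The slides do not give the formula, so this formulation is an assumption: each road segment gets a weight reflecting how popular it is, and the score is the weighted coverage of the released data divided by that of the original data. All names and weights are hypothetical.

```python
def weighted_coverage(covered_segments, weights):
    """Sum of the weights of the distinct segments that have samples."""
    return sum(weights[s] for s in set(covered_segments))


def relative_weighted_coverage(original_segments, released_segments, weights):
    """Released coverage as a fraction of original coverage (0.0 to 1.0).

    Segments are only counted if the original data covered them, so the
    score can never exceed 1.0.
    """
    original = weighted_coverage(original_segments, weights)
    if original == 0:
        raise ValueError("original data set covers no weighted segments")
    still_covered = set(released_segments) & set(original_segments)
    return weighted_coverage(still_covered, weights) / original


# Hypothetical weights: busier roads matter more.
weights = {"highway": 6.0, "arterial": 3.0, "side_street": 1.0}
original = ["highway", "arterial", "side_street"]
released = ["highway", "arterial"]
print(relative_weighted_coverage(original, released, weights))
```

In this toy case, losing the lightly weighted side street costs only a tenth of the score, whereas losing the highway would cost six tenths.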

Time-to-confusion Metric
The mean time-to-confusion (MTTC) is meant to be a measurement of privacy. The lower the average trackable trip time, the more privacy you have as an individual in the overall system. The time-to-confusion threshold limits how long an individual can be tracked. High level: time-to-confusion is the time for which you can be “tracked” in the anonymized data.
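
As a rough illustration of how such a metric could be computed, the sketch below assumes each sample in a trip carries a flag saying whether a tracker could confidently link it to the previous sample. The data layout and names are hypothetical; the slides do not specify them.

```python
from statistics import mean, median


def times_to_confusion(trip):
    """Durations (seconds) of each stretch a tracker can follow without a break.

    trip: list of (timestamp_s, linked) pairs in time order. `linked` is
    True when the tracker could confidently link this sample to the
    previous one; the flag on the first sample is ignored.
    """
    if not trip:
        return []
    durations = []
    start = last = trip[0][0]
    for timestamp, linked in trip[1:]:
        if linked:
            last = timestamp
        else:
            # Confusion point: close the current stretch, start a new one.
            durations.append(last - start)
            start = last = timestamp
    durations.append(last - start)
    return durations


trip = [(0, True), (60, True), (120, True), (180, False), (240, True)]
stretches = times_to_confusion(trip)
print(stretches, mean(stretches), median(stretches), max(stretches))
```

The mean, median, and maximum of these stretches across all trips are the summary values a privacy evaluation would report.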

“Uncertainty-aware” Algorithm
Calculates the probability that a particular point belongs to a “trip” and verifies that the trip cannot be followed, because other points could just as probably fit that trip. High level: ensures that a specific level of uncertainty is maintained for every “trip” in the trace data.
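
One common way to quantify this kind of uncertainty is Shannon entropy over the candidate points. The slides do not state the formula, so treating entropy as the uncertainty measure is an assumption here, and the function names are hypothetical.

```python
import math


def normalize(likelihoods):
    """Turn non-negative candidate scores into probabilities summing to 1."""
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def tracking_entropy(likelihoods):
    """Shannon entropy (bits) of the tracker's belief over candidate points.

    0 bits means one candidate is certain; higher values mean several
    candidates fit the trip about equally well.
    """
    probabilities = normalize(likelihoods)
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


print(tracking_entropy([1.0]))            # a single candidate
print(tracking_entropy([1.0, 1.0]))       # two equally likely candidates
print(tracking_entropy([0.9, 0.05, 0.05]))  # one dominant candidate
```

Under this reading, a point is safe to release when the entropy stays above a chosen level, because the tracker cannot tell which candidate continues the trip.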

Put It All Together
Given all the points in a particular slice of time, if a single point could have been tracked longer than the time-to-confusion threshold, AND the point in this time slice can be correlated to that trace with high probability, that point is omitted from the set of published data. This allows tracking for a limited time, but prevents tracking the entire trip. The starting location and ending location are not connected, so it's not possible to identify who the individual is or where they are going; thus privacy is preserved. Mean time-to-confusion is the average time between omitted points on a “trip”.
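
The release rule described above can be sketched as a filter over one time slice. This is a simplified illustration under assumptions: each point is assumed to arrive with the time its trace has been trackable so far and the probability with which it links to that trace. The field names and thresholds are hypothetical, not the authors' implementation.

```python
def filter_time_slice(points, max_ttc_s, max_link_prob):
    """Split one time slice into released and withheld points.

    points: list of dicts with hypothetical keys
      'id'         - sample identifier
      'tracked_s'  - how long the trace has been trackable so far (seconds)
      'link_prob'  - probability that this point links to that trace
    A point is withheld only when BOTH conditions hold: the trace has been
    trackable longer than max_ttc_s AND the link probability is high.
    """
    released, withheld = [], []
    for point in points:
        tracked_too_long = point["tracked_s"] > max_ttc_s
        easily_linked = point["link_prob"] > max_link_prob
        if tracked_too_long and easily_linked:
            withheld.append(point)
        else:
            released.append(point)
    return released, withheld


points = [
    {"id": 1, "tracked_s": 400, "link_prob": 0.95},  # long and certain
    {"id": 2, "tracked_s": 400, "link_prob": 0.50},  # long but ambiguous
    {"id": 3, "tracked_s": 120, "link_prob": 0.99},  # certain but short
]
released, withheld = filter_time_slice(points, max_ttc_s=300, max_link_prob=0.9)
print([p["id"] for p in released], [p["id"] for p in withheld])
```

Only the first point is withheld in this example, which mirrors the idea that short-term tracking is tolerated while long, confident tracking is cut off.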

Data
The data was collected from 233 volunteer vehicles over 7 days. It covers a 70 km by 70 km metropolitan area (70 km = 43.5 miles). Samples are taken every 1 minute while the ignition is “on”.


Results: Off-Peak, High Density
10am – 11:30am. Gray dots are released; black dots are excluded.

Results: On-Peak, High Density
5pm – 6:30pm. Gray dots are released; black dots are excluded.

Results: Comparison
Figure panels: Off-Peak, On-Peak.

Results: Maximum TTC
If the uncertainty threshold (UT) is 40% and TTC = 5 min, 92.5% of points may be published. If UT is 99% and TTC = 5 min, still over 65% of points may be published. If only 92.5% of points are published and they are randomly selected, at least one route is traceable for 35 minutes.

Results: Median TTC
If UT is 40% and TTC = 5 min, MTTC is 1 minute for the data set. If UT is 99% and TTC = 5 min, MTTC is 1 minute for the data set. Publishing 80% of points randomly still identified 15% of routes for over 10 minutes (median not specified).

Results: Relative Weighted Road Coverage
When the uncertainty threshold is 95% and TTC = 5 min, 81% of data samples are released and road coverage is still 95%. If 20% of data samples are removed randomly, 80% of samples are published but road coverage is only 79.3%. Randomly throwing out data causes significantly more degradation.

Other Considerations
The authors also consider algorithm modifications to address reacquisition: maximum TTC is still preserved, but quality is only marginally better than when data points are randomly removed. The authors do not make their algorithm aware of real topography, which an attacker could take advantage of; if topography were also considered, this problem could be averted. There are many open research areas (in 2007).

Conclusion
Intelligently removing data points to confuse a de-anonymization algorithm is successful even for low-penetration deployments.