1 (Un)Trustworthy Wireless: What your wireless traffic says about you… Jeff Pang with Ben Greenstein, Ramki Gummadi, Tadayoshi Kohno, David Wetherall (UW/Intel.


1 (Un)Trustworthy Wireless: What your wireless traffic says about you… Jeff Pang with Ben Greenstein, Ramki Gummadi, Tadayoshi Kohno, David Wetherall (UW/Intel Seattle), and Srini

2 What are we trying to achieve?
Time to rethink privacy implications of wireless networks
–Identify the shortcomings of current designs and how an adversary might exploit them
–Propose some directions for thwarting these attacks
Initial focus on Wi-Fi, but aim is to address other protocols as well, e.g., Bluetooth, RFID, GSM

3 What is wireless privacy?
Traditional notions:
–data encryption
–user authentication
Anonymity is also important
–Traditional notion is not quite right: it covers end-to-end (e2e) privacy only
–Data encryption doesn’t preserve anonymity
  A 3rd party can still track where a user goes, with whom he might be communicating, what sorts of data he might be exchanging, and what sorts of applications he might be running
  This is traditionally known as traffic analysis, but it is much easier to do with ubiquitous wireless

4 What information is being leaked?
The link between a wireless card and its associated AP
–Where a user has been
  Thug tracks a user from the bank’s network to the dark alley’s network
–Who has been in an area
  Jealous boyfriend monitors girlfriend’s apartment network
Timestamps of user transmissions
–When are people talking and how much are they saying (chatter)
–Who is talking to whom? (assumes monitors at both edges)
  A dermatologist shares records with an oncologist near patient X, ergo X may have melanoma

5 An initial problem statement
Adversary:
–Can passively sniff all 802.11 traffic at various locations (e.g., café, library, your home, conference)
Goal:
–Wants to know where you were and when you visited
Question:
–Given a traffic sample, how accurately can an adversary classify it as belonging to you or not?
Assumptions:
–Adversary has some traffic samples “known” to come from you (e.g., collected while sitting next to you)
–Adversary has collected a library of traffic samples from other (random) users in the targeted locations

6 The obvious answer
Yes!
–Trivially, by looking at MAC addresses
  globally unique
  always transmitted in the clear
But that is also trivially thwarted
–Can change MAC address each time you associate to an AP
–Suppose the next wireless driver patch does this
  Knowledgeable users can do this themselves, of course
But is this a sufficient fix to advertise “improved privacy”?
Revised question:
–How accurately can the adversary classify a traffic sample if MAC addresses change, say, each hour?

7 Initial approach
Fairly generic machine learning algorithm:
–Compute a “profile” based on known traffic from target user
–Based on profile, generate features for each traffic sample
–Use known traffic samples to train a naïve Bayes model (e.g., generate a probability table for each feature)
–Given a new sample, model outputs a probability p that sample came from target user
–Assume positive match if p > T, for some threshold T
Two types of profile features:
–802.11 specific (ctrl pkt contents, driver timing behavior, etc.)
  Ben Greenstein working on this
–802.11 agnostic (IP/application traffic features)
  I’m working on this
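The train-then-threshold steps on this slide can be sketched roughly as follows. This is only an illustration under assumptions, not the authors' implementation: features are assumed to be already discretized into bin indices, and the function names, the `n_bins` parameter, and the Laplace smoothing constant `alpha` are all hypothetical.

```python
import math

def train_naive_bayes(samples, labels, n_bins, alpha=1.0):
    """Build one probability table per feature and per class.

    samples: list of feature vectors; each value is a bin index in range(n_bins)
    labels:  list of booleans, True if the sample came from the target user
    alpha:   Laplace smoothing constant (an assumption, not from the slides)
    """
    n_features = len(samples[0])
    classes = (True, False)
    # counts[c][i][b] = smoothed count of feature i falling in bin b for class c
    counts = {c: [[alpha] * n_bins for _ in range(n_features)] for c in classes}
    class_totals = {c: 0 for c in classes}
    for vec, label in zip(samples, labels):
        class_totals[label] += 1
        for i, b in enumerate(vec):
            counts[label][i][b] += 1
    # Normalize each row of counts into a probability table.
    tables = {c: [[v / sum(row) for v in row] for row in counts[c]] for c in classes}
    n = len(samples)
    # Smoothed class priors, so neither prior is ever zero.
    priors = {c: (class_totals[c] + alpha) / (n + 2 * alpha) for c in classes}
    return priors, tables

def posterior_target(vec, priors, tables):
    """Return p, the model's probability that vec came from the target user."""
    log_score = {}
    for c in (True, False):
        s = math.log(priors[c])
        for i, b in enumerate(vec):
            s += math.log(tables[c][i][b])
        log_score[c] = s
    # Normalize in log space for numerical stability.
    m = max(log_score.values())
    e = {c: math.exp(s - m) for c, s in log_score.items()}
    return e[True] / (e[True] + e[False])

def is_match(vec, priors, tables, threshold):
    """Positive match if p > T."""
    return posterior_target(vec, priors, tables) > threshold
```

Raising `threshold` makes the classifier more conservative about declaring a match, which is the knob the slide calls T.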

8 Initial features
Conjecture: the sites you visit identify you
–e.g., only you visit slashdot, cnn, joe’s blog, etc.
Profile P:
–Set of IP destinations we observe you talking to
Feature:
–Set similarity of the IPs seen in the traffic sample S and your profile P; i.e., |intersection(P, S)| / |union(P, S)|
–Higher scores mean the traffic sample visited more of the same sites
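The set-similarity feature on this slide is the ratio of intersection size to union size (commonly called the Jaccard index). A minimal sketch, assuming destinations are represented as strings; the function name and the choice to score two empty sets as 0 are assumptions:

```python
def set_similarity(profile, sample):
    """Size of intersection(P, S) divided by size of union(P, S)."""
    profile, sample = set(profile), set(sample)
    union = profile | sample
    if not union:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(profile & sample) / len(union)

# Example with documentation-range addresses (illustrative only).
P = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
S = {"192.0.2.2", "192.0.2.3", "192.0.2.4"}
score = set_similarity(P, S)  # 2 shared destinations out of 4 distinct ones
```

Here two of the four distinct destinations are shared, so the score is 0.5.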

9 Initial features (2)
Problem: User can mask IP packet contents
–AP can use WEP/WPA
–User might tunnel traffic through a VPN
Attempt to use other exposed features
–Object sizes: previous work shows that the sizes of the objects on a website identify it accurately
  use packet timings to group packets into “objects”
  feature: set similarity based on the set of object sizes users accessed
  challenges: overlapping flows, dynamic web content
–Other possibilities: infer site RTT, site bandwidth, etc.
–Question: how well can we do?
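One simple way to group packets into “objects” by timing is to start a new object whenever the gap between consecutive packets exceeds some threshold. The slides do not say how the grouping was actually done, so the sketch below is an assumption throughout: the function name, the `(timestamp, size)` packet representation, and the 0.5-second default gap are all hypothetical. It also ignores the overlapping-flows challenge noted above.

```python
def group_into_objects(packets, gap_threshold=0.5):
    """Group packets into objects using inter-packet gaps.

    packets: iterable of (timestamp_seconds, size_bytes) tuples, in any order
    gap_threshold: a silence longer than this (seconds) ends the current object
    Returns a list of object sizes in bytes, in time order.
    """
    sizes = []
    current = 0
    last_time = None
    for ts, size in sorted(packets):
        if last_time is not None and ts - last_time > gap_threshold:
            sizes.append(current)  # the gap closes the previous object
            current = 0
        current += size
        last_time = ts
    if last_time is not None:
        sizes.append(current)  # flush the final object
    return sizes

# Two bursts separated by about two seconds of silence.
trace = [(0.00, 1500), (0.01, 1500), (0.02, 400), (2.00, 1500), (2.05, 100)]
object_sizes = group_into_objects(trace)
```

The set of resulting sizes could then be compared against a profile's set of object sizes with the same set-similarity measure used for IP destinations.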

10 Initial results
Setup
–SIGCOMM ’04 wireless traces
  Wireless traffic from ~200 users across 3 days at the conference
  Limitations: homogeneous location, biased user population, limited timeframe
  Looking for volunteers to collect better data!
–Build profiles and train model using traffic on the first day
–For each hourly traffic sample in the 2nd and 3rd day:
  For each user:
  –Can we determine if a sample comes from that user or not?
Metrics:
–True positive rate: the fraction of samples from that user that are correctly classified
–False positive rate: the fraction of samples not from that user that are misclassified
–Tune the classification threshold T to trade off one for the other
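The two metrics and the effect of tuning T can be sketched as below. The function names and the convention of returning 0.0 when a class has no samples are assumptions; the scores in the example are made up for illustration and are not results from the traces.

```python
def rates_at_threshold(scores, is_target, threshold):
    """Return (true_positive_rate, false_positive_rate) for one threshold T.

    scores:    model probabilities p, one per traffic sample
    is_target: booleans, True if the sample really came from the target user
    """
    tp = fp = pos = neg = 0
    for p, truth in zip(scores, is_target):
        predicted = p > threshold  # positive match if p > T
        if truth:
            pos += 1
            tp += predicted
        else:
            neg += 1
            fp += predicted
    tpr = tp / pos if pos else 0.0  # assumption: report 0.0 when undefined
    fpr = fp / neg if neg else 0.0
    return tpr, fpr

def sweep_thresholds(scores, is_target, thresholds):
    """Evaluate several thresholds to see the trade-off between the two rates."""
    return [(t, *rates_at_threshold(scores, is_target, t)) for t in thresholds]

# Illustrative scores only.
scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
truth = [True, True, True, False, False, False]
curve = sweep_thresholds(scores, truth, [0.5, 0.8])
```

In this toy example, raising T from 0.5 to 0.8 removes the one false positive but also loses one of the true positives.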

11 Mean accuracy

12 Some profiles better than others
[Figure panels: IP Destinations; Object Sizes]

13 Summary + Near Future Work
Using sites visited is one promising feature to identify users
Current inference of object sizes is insufficient as a stand-in when IP traffic is encrypted
–But for some users, does give positive information gain
Next steps:
–Combine with other application traffic features like inferred RTT
–Combine with 802.11 specific features, e.g., SSID broadcasts: 43% of sources had at least one unique SSID
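A statistic like "fraction of sources with at least one unique SSID" could be computed along these lines. This is a sketch under assumptions: the input is taken to be a mapping from a source identifier to the set of SSIDs seen in its broadcasts, "unique" is taken to mean that no other source in the data set broadcast that SSID, and the function name and example data are hypothetical.

```python
from collections import Counter

def fraction_with_unique_ssid(ssids_by_source):
    """Fraction of sources that broadcast at least one SSID no other source did.

    ssids_by_source: dict mapping a source id to an iterable of SSID strings
    """
    if not ssids_by_source:
        return 0.0
    # Count how many distinct sources mention each SSID.
    seen_by = Counter(
        ssid for ssids in ssids_by_source.values() for ssid in set(ssids)
    )
    with_unique = sum(
        1
        for ssids in ssids_by_source.values()
        if any(seen_by[ssid] == 1 for ssid in ssids)
    )
    return with_unique / len(ssids_by_source)

# Made-up example: only source "a" has an SSID nobody else broadcasts.
example = {
    "a": {"home-a", "cafe"},
    "b": {"cafe"},
    "c": {"cafe", "linksys"},
    "d": {"linksys"},
}
fraction = fraction_with_unique_ssid(example)
```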

