De-anonymizing the Internet Using Unreliable IDs


1 De-anonymizing the Internet Using Unreliable IDs
By Yinglian Xie, Fang Yu, and Martín Abadi
Presented by Yinzhi Cao and Ionut Trestian

2 Problem
A free but troublesome network.
Problems we try to solve:
To what extent can we use IP addresses to track hosts?
Can we use the binding information between hosts and IP addresses to strengthen network security?

3 Host-Tracking Graph Formally, we define the host-tracking graph G : H × T → IP, where H is the space of all hosts on the Internet, T is the space of time, and IP is the IP-address space.
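One way to picture the mapping G : H × T → IP is as a per-host list of time windows, each bound to one IP address. The sketch below is only an illustration of that definition, not the authors' implementation: the class name `HostTrackingGraph`, its methods, the integer timestamps, and the inclusive window bounds are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (start_time, end_time, ip); both bounds inclusive (an assumption of this sketch)
Window = Tuple[int, int, str]


@dataclass
class HostTrackingGraph:
    """Hypothetical container for G : H x T -> IP."""
    bindings: Dict[str, List[Window]] = field(default_factory=dict)

    def add_binding(self, host: str, start: int, end: int, ip: str) -> None:
        """Record that `host` used `ip` during [start, end]."""
        if end < start:
            raise ValueError("window end precedes start")
        self.bindings.setdefault(host, []).append((start, end, ip))

    def lookup(self, host: str, t: int) -> Optional[str]:
        """Evaluate G(host, t): the IP bound to the host at time t, if any."""
        for start, end, ip in self.bindings.get(host, []):
            if start <= t <= end:
                return ip
        return None

    def hosts_at(self, ip: str, t: int) -> List[str]:
        """Reverse query: which hosts were bound to `ip` at time t."""
        return sorted(
            host
            for host, windows in self.bindings.items()
            if any(s <= t <= e and w_ip == ip for s, e, w_ip in windows)
        )


graph = HostTrackingGraph()
graph.add_binding("host-A", 0, 10, "192.0.2.1")
graph.add_binding("host-A", 11, 20, "192.0.2.7")
graph.add_binding("host-B", 5, 15, "192.0.2.1")
print(graph.lookup("host-A", 12))
print(graph.hosts_at("192.0.2.1", 7))
```

The reverse query is what a security application would use: given an IP address and a time, it returns the hosts that could be responsible for the traffic.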

4 Host-Tracking Graph

5 Host Representation Since we lack strong authentication mechanisms, we consider leveraging application-level identifiers such as user IDs, messenger login IDs, social network IDs, or cookies.

6 Goals
We would like to generate two outputs:
The first is an identity-mapping table that represents the mappings from unreliable IDs to hosts.
The second is the host-tracking graph that tracks each host's activity across different IP addresses over time.

7 Tracking Host Activities

8 Application-ID Grouping
To quantitatively compute the probability of two independent user IDs u1 and u2 appearing consecutively, let us assume that each host’s connection (hence the corresponding user login) to the Internet is a random, independent event.
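Under the independence assumption stated on this slide, one plausible way to put a number on "appearing consecutively" is a binomial tail. The sketch below is not taken from the slides: it assumes that u1 logs in `n1` times at an IP address, that u2 accounts for a fraction `p2` of all logins there, and that each u1 login is followed by a u2 login independently with probability `p2`. The function name and parameters are hypothetical.

```python
from math import comb


def prob_consecutive_at_least(k: int, n1: int, p2: float) -> float:
    """P(X >= k) for X ~ Binomial(n1, p2).

    X counts how many of u1's n1 logins are immediately followed by a
    login from u2, if the two IDs were truly independent (assumed model).
    """
    if not 0.0 <= p2 <= 1.0:
        raise ValueError("p2 must be a probability")
    if k <= 0:
        return 1.0
    if k > n1:
        return 0.0
    return sum(
        comb(n1, i) * p2**i * (1.0 - p2) ** (n1 - i)
        for i in range(k, n1 + 1)
    )


# A small probability suggests the co-occurrence is unlikely to be chance,
# which would support grouping u1 and u2 as IDs of the same host.
print(prob_consecutive_at_least(k=8, n1=10, p2=0.01))
```

A grouping rule built on this would compare the result against a threshold; the threshold itself is a tuning choice that the slide does not specify.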

9 Host-Tracking Graph Construction

10 Resolving Inconsistency
Proxy Identification: To find both types of proxies/NATs, HostTracker gradually expands all the overlapping, conflicting binding windows associated with a common IP address.
Guest Removal
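The window expansion in the proxy-identification step can be read as an interval merge: binding windows on the same IP address that overlap in time are fused into one larger window, and a fused window that contains more than one host marks a span where the address was shared. The sketch below follows that simplified reading; the function name, the tuple layout, and the rule that any multi-host span counts as a conflict are assumptions of this example and may differ from how HostTracker actually works.

```python
from typing import List, Set, Tuple


def expand_conflict_windows(
    windows: List[Tuple[int, int, str]],
) -> List[Tuple[int, int, Set[str]]]:
    """Merge overlapping (start, end, host) windows on ONE IP address.

    Returns only the merged spans that involve more than one host,
    i.e. the spans where the IP looks like a proxy/NAT candidate.
    """
    merged: List[list] = []
    for start, end, host in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps the current span: grow it and record the host.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2].add(host)
        else:
            # No overlap: start a new span.
            merged.append([start, end, {host}])
    return [(s, e, hosts) for s, e, hosts in merged if len(hosts) > 1]


bindings_on_ip = [(0, 5, "a"), (3, 8, "b"), (7, 10, "c"), (20, 25, "d")]
print(expand_conflict_windows(bindings_on_ip))
```

Host "d" is not reported because its window overlaps no other host's window on this address.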

11 Input Dataset
A month-long user-login trace collected at a large Web-service provider in October 2008 (about 330 GB). Each entry has 3 fields:
(1) an anonymized user ID (550 million)
(2) the IP address that was used to perform the login (220 million)
(3) the timestamp of the login event
For validation: a month-long software-update log collected by a global software provider during the same period of October 2008. Each entry has:
a unique hardware ID for each remote host that performs an update
the IP address of the remote host
the software-update timestamp

12 Tracked events

13 Tracked Hosts vs. Active Hosts

14 Validation results

15 Tracked User Population

16 Signup Date

17 Email sending behavior

18 Applications – Detecting Malicious Activity
In a previous study we identified 5.6 million malicious IDs that were used to conduct spam campaigns.
The intersection between malicious IDs and tracked IDs (220 million) is small (50k).

19 Signup Date - Revisited

20 Host Tracking – Security

21 Country Code Comparison

22 Seed Size Analysis

23 Conclusions
Although accesses provide only a limited view of the Internet, one can use other information for tracking, such as social network IDs and cookies.
It is hard to evade HostTracker and maintain attack effectiveness at the same time.

