LASTor: A Low-Latency AS-Aware Tor Client

Presentation on theme: "LASTor: A Low-Latency AS-Aware Tor Client"— Presentation transcript:

1 LASTor: A Low-Latency AS-Aware Tor Client
Masoud Akhoondi, Curtis Yu, Harsha V. Madhyastha. Hello everyone. My name is Masoud Akhoondi, from UC Riverside. I am going to present our paper, "LASTor: A Low-Latency AS-Aware Tor Client". This talk is about how we can improve both latency and anonymity in Tor with only client-side modifications. I collaborated on this work with Curtis and our advisor, Harsha. ----- Meeting Notes (5/22/12 07:37) -----

2 Tor (The onion router) Anonymity Low latency communication
Tor stands for The Onion Router. Tor has 300,000 users and around 2,700 volunteer relays distributed around the world. Tor uses source routing: when a source S wants to communicate with a destination D, it picks three relays (R1, R2, R3) and extends a path through them to the destination using onion routing. Tor has two goals. The first is anonymity: it protects the privacy of its users. Each hop knows only the previous and next hop on a path; for example, R3 knows only R2 and D on this path. This is a property of onion routing. 90% of Tor traffic is interactive [Mccoy08], for example browsing the web, so the second goal is low-latency communication. But Tor does not meet these two goals perfectly. In the next couple of slides I will talk about how well the current Tor client achieves each of them. (Slide text: 400,000 users, 2700 relays.)

3 How are latencies on Tor?
Experiment. Sources: 50 PlanetLab nodes spread across the globe. Destinations: top 200 websites. Result: 5x inflation in median latency. First I am going to talk about latency. To quantify the latency overhead, we measured the latency of visiting the top 200 websites from 50 PlanetLab nodes, once over Tor and once directly over the Internet. This figure shows the latency in both cases. The y-axis is the CDF of (source, destination) pairs and the x-axis is latency. "No Tor" means that we measured the latency to the destination directly. We see that using Tor inflates median latency by 5x.

4 Profiling attack on Tor
Now let's talk about Tor's other goal, anonymity. I am going to show you that Tor does not achieve this goal well. Recall that onion routing guarantees that every hop on the path knows only the previous and next hop. However, this does not provide perfect anonymity, because of the profiling attack, which works as follows. I define two terms: the entry segment is the part of the path from the client S to the first (entry) relay, and the exit segment is the part of the path from the third (exit) relay to the destination D. Every path goes through a number of ASes (Autonomous Systems). For the purpose of this talk, we can consider an AS equivalent to an ISP. If there is a common AS on both segments, for example the green AS in the figure, it can eavesdrop on both end segments of the path [Murdoch07], profile the traffic, and de-anonymize the path. In other words, the green AS can figure out who is talking to whom. ----- Meeting Notes (5/22/12 07:39) ----- Entry relay, exit relay

5 How severe is profiling attack?
Non-uniform distribution of relays across ASes: 65% of relays are in 20% of all ASes. To quantify how severe the profiling attack on Tor is, we analyzed the distribution of relays across ASes. We mapped each relay to its AS and plotted this figure. It shows that the distribution of relays across ASes is not uniform. For instance, 65% of relays are in 20% of the ASes that contain relays. So the probability of choosing an entry relay and an exit relay in the same AS is fairly high.

6 Potential solution for these problems
Measure latencies and routes from each relay to all end hosts [Sherr09, Alsabah11, Mittall11]. This requires modification of relays and is non-trivial to implement; none of these proposals has been deployed yet. The question now is: what are the solutions to these problems? One potential solution is to measure latencies and routes from each relay to all end hosts, and then choose a path with low latency and no common AS on the entry and exit segments. But this requires modifying relays to capture, distribute, and update this information. Several proposals have taken this approach. I want to point out that none of these proposals for changing relays has been incorporated into real Tor.

7 LASTor: A low-latency AS-aware Tor client
Main insight: client modifications suffice to improve poor latency for interactive communications and to mitigate the profiling attack. What is this paper all about? The key insight of our work is that client modifications suffice for both goals: improving latency and mitigating the profiling attack. We developed LASTor, a modified version of the Tor client. It improves both the latency and the anonymity of Tor.

8 Main insight: Client modifications suffice
Improve poor latency for interactive communications. Solution: modified path selection to reduce latency. Mitigate the profiling attack. Solution: AS-aware path selection. To improve latency, we propose a path selection algorithm. To mitigate the profiling attack, we propose an algorithm that predicts which paths have a common AS on their entry and exit segments, and LASTor avoids choosing those paths. In the first half of my talk, I am going to demonstrate how we improve latency.

9 Sources of latency on Tor
Goal: Improve latency. Sources of latency on Tor: queuing and processing delay (congestion in relays [Panchenko09]) and propagation delay (long paths). To improve latency, we need to know the sources of delay on Tor. One source is queuing and processing delay caused by congestion in relays. Another is propagation delay, especially when the path is very long. For example, in this figure, although the source S and the destination D are close to each other, all packets have to traverse half of the globe several times before they reach the destination. The next experiment investigates which of these factors is dominant.

10 Shortest path vs. Default Tor
Goal: Improve latency. We repeated the experiment, measuring the latency of visiting the top 200 websites from 50 PlanetLab nodes spread across the globe. This time we also ran a shortest-path algorithm: each source maps relays to geographical locations, finds the shortest path from itself to the destination that goes through three relays, and measures the latency on this path. The figure shows the result. "No Tor" means measuring latency directly over the Internet, "SP Tor" is a modified Tor client that runs the shortest-path algorithm, and "Default Tor" is the default Tor client. The shortest-path algorithm improves latency significantly; for example, it reduces median latency by 50%. So shorter paths can greatly reduce latency, which means propagation delay is a significant factor. But we cannot afford a deterministic path, since an adversary could then de-anonymize the path between a source and a destination. That is why we introduce Weighted Shortest Path (WSP).
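To make the notion of a geographical path length concrete, here is a minimal sketch of how the length of a source, three relays, destination path could be computed from coordinates. The haversine great-circle formula, the Earth radius constant, the function names, and the sample coordinates are all assumptions made for illustration; the talk only says that relays are mapped to geographical locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def path_length_km(source, relays, destination):
    """Sum of hop distances along source -> relays... -> destination."""
    hops = [source, *relays, destination]
    return sum(great_circle_km(p, q) for p, q in zip(hops, hops[1:]))


# Hypothetical coordinates: source and destination are close together.
source = (34.0, -117.4)
destination = (37.4, -122.1)
nearby_relays = [(36.2, -115.1), (39.7, -105.0), (37.8, -122.4)]
distant_relays = [(52.5, 13.4), (35.7, 139.7), (48.9, 2.4)]

short = path_length_km(source, nearby_relays, destination)
long_ = path_length_km(source, distant_relays, destination)
print(short < long_)  # the path through nearby relays is shorter
```

A shortest-path client would pick the relay triple minimizing this length, which is exactly the determinism the talk warns about.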

11 Weighted Shortest Path (WSP)
Goal: Improve latency. WSP computes the length of all possible paths between a given source and destination, and then chooses a path with probability inversely proportional to its length. For example, in this figure, the length of the upper path is 8 and the length of the lower path is 10, so the upper path is chosen with probability 0.56 and the lower path with probability 0.44. At a high level, WSP provides lower latency by preferring shorter paths but, unlike shortest path, it is probabilistic, so an attacker cannot infer the path. However, if we use this naive version of WSP, an adversary can mount an attack on it.
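The probabilities on this slide follow directly from weights of 1/length. The sketch below reproduces them for the two-path example; the function names and the use of `random.choices` for the draw are illustrative assumptions, not the LASTor implementation.

```python
import random


def wsp_probabilities(path_lengths):
    """Map each path name to a selection probability proportional to 1/length."""
    weights = {name: 1.0 / length for name, length in path_lengths.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def choose_path(path_lengths, rng=random):
    """Draw one path at random according to the WSP probabilities."""
    probs = wsp_probabilities(path_lengths)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


probs = wsp_probabilities({"upper": 8, "lower": 10})
print({name: round(p, 2) for name, p in probs.items()})
# {'upper': 0.56, 'lower': 0.44}
```

With weights 1/8 and 1/10, the normalizing sum is 0.225, giving 0.125/0.225 ≈ 0.56 and 0.1/0.225 ≈ 0.44.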

12 Attacker controls a relay
Goal: Improve latency. An attack on WSP: the attacker controls a relay. For example, an attacker who controls a relay replicates it 5 times, say by running several virtual machines. The probability of choosing a compromised path then increases: in this figure, from 0.56 to 0.8, while the probability of the other paths drops from 0.44 to 0.2. In general, if an adversary runs several malicious relays close to the direct line between source and destination, the chance of choosing a compromised path increases by some orders of magnitude.

13 Solution: Clustering of relays
Goal: Improve latency. To mitigate this attack, we cluster relays that are in geographically nearby locations. We then run WSP over clusters of relays instead of individual relays and, for the chosen cluster-level path, randomly pick a relay in each cluster.

14 Solution: Clustering of relays
Goal: Improve latency. With clustering of relays, the probability of choosing a compromised path returns to its original value (0.56 for compromised paths, 0.44 for other paths). Even if an adversary replicates malicious relays in one location, the probability of choosing a compromised path is unchanged. As a result, the burden on the adversary increases: he has to run relays in multiple locations in order to attract more traffic. Another advantage of clustering is that it reduces the running time of finding a path from the order of seconds to a few hundred milliseconds.
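The effect of clustering can be seen in a toy model where each candidate path is identified by a single relay and its location. This is a simplified sketch, not the paper's algorithm: the data layout, the relay names, and the choice to weight a cluster by the mean length of its members are assumptions. The relay-level number it prints for the attack is specific to this toy and is not meant to match the figure on the slide.

```python
from collections import defaultdict


def relay_level_probs(relays):
    """WSP over individual relays; relays is a list of (relay_id, location, path_length)."""
    weights = {rid: 1.0 / length for rid, _loc, length in relays}
    total = sum(weights.values())
    return {rid: w / total for rid, w in weights.items()}


def cluster_level_probs(relays):
    """WSP over location clusters, then a uniform pick of one relay inside the cluster."""
    clusters = defaultdict(list)
    for rid, loc, length in relays:
        clusters[loc].append((rid, length))
    # Assumption: a cluster's path length is the mean length of its members.
    cluster_weight = {
        loc: 1.0 / (sum(length for _rid, length in members) / len(members))
        for loc, members in clusters.items()
    }
    total = sum(cluster_weight.values())
    probs = {}
    for loc, members in clusters.items():
        share = cluster_weight[loc] / total / len(members)
        for rid, _length in members:
            probs[rid] = share
    return probs


def attacker_share(probs):
    """Total probability of picking any relay whose id marks it as malicious."""
    return sum(p for rid, p in probs.items() if rid.startswith("evil"))


honest = [("good", "site-B", 10)]
one_evil = [("evil-0", "site-A", 8)]
five_evil = [(f"evil-{i}", "site-A", 8) for i in range(5)]

print(round(attacker_share(relay_level_probs(one_evil + honest)), 2))     # 0.56
print(round(attacker_share(relay_level_probs(five_evil + honest)), 2))    # 0.86
print(round(attacker_share(cluster_level_probs(five_evil + honest)), 2))  # 0.56
```

Replicating a relay in one place adds weight at the relay level but not at the cluster level, so the attacker's share falls back to its value before replication.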

15 Weighted Shortest Path (WSP)
Goal: Improve latency. On this slide I am going to review how WSP works. Preprocessing: cluster all relays. Path selection: compute the length of the possible paths using clusters, choose a cluster-level path with probability inversely proportional to its length, and pick a relay at random in each chosen cluster. Other issues (see paper): handling multi-location destinations and choosing entry relays. We implemented WSP to handle multi-location destinations. For example, Google's web servers are in several different geographical locations around the world, so when a client wants to compute the end-to-end distance between source and destination, it needs to know where the destination is. WSP also handles the selection of entry relays. I refer you to the paper for the details.

16 WSP reduces latency 50 PlanetLab nodes to top 200 websites
Goal: Improve latency. Now that I have explained how WSP works, let's see how well it works. We developed LASTor, a modified version of the Tor client that uses WSP, and measured the latency of visiting the top 200 websites from 50 PlanetLab nodes. WSP improves latency significantly: by 25% in the median and by 20% at the 80th percentile. I want to stress that all of this improvement is available today with only client-side modification. The key insight that enables this is that geographical distance is a good proxy for latency, so we do not need to measure latencies from relays.

17 Tunable path selection in LASTor
Goal: Improve latency. Modify WSP to consider the user's preference between anonymity and latency, using a single parameter α configured by the user: the weight w of a path is changed to w^(1-α), where 0 ≤ α ≤ 1. WSP's preference for shorter paths naturally reduces the entropy of path selection, and not all users may wish to give up entropy for lower latency. That is why we made WSP tunable with a single parameter α, which changes the weights of paths to reflect the user's preference for anonymity or latency. If α is 1, the weight and probability of all paths are the same (highest anonymity). If α is 0, shorter paths have a higher chance of being chosen (lowest latency). To quantify the effect of α on latency and anonymity, we ran some experiments.
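A small sketch of the tunable weighting, under the assumption that the slide's formula reads w^(1-α) with w = 1/length (this reading matches the statement that α = 1 makes all paths equally likely). The function name and the example lengths are illustrative.

```python
def tunable_probabilities(path_lengths, alpha):
    """Selection probabilities with weight (1/length) ** (1 - alpha), 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    weights = {name: (1.0 / length) ** (1.0 - alpha)
               for name, length in path_lengths.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


lengths = {"upper": 8, "lower": 10}
for alpha in (0.0, 0.5, 1.0):
    probs = tunable_probabilities(lengths, alpha)
    print(alpha, {name: round(p, 2) for name, p in probs.items()})
```

At α = 0 this reduces to plain WSP (0.56 and 0.44 for the two example paths); at α = 1 every weight becomes 1 and the choice is uniform.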

18 Tunable path selection in LASTor
Goal: Improve latency. Lower α, lower latency; higher α, higher anonymity. We chose different values of α from 0 to 1. The left figure shows the effect of α on latency; again we measured the latency of visiting the top 200 websites from 50 PlanetLab nodes. It shows that lower values of α lead to lower latency. To quantify the anonymity of different values of α, we use a metric called the Gini coefficient, a measure of inequality in a distribution: 0 is perfect equality and 1 is maximal inequality. The right figure shows the Gini coefficient for different values of α. Higher values of α give higher anonymity, in other words, a lower Gini coefficient.
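For readers unfamiliar with the metric, here is a minimal sketch of one standard way to compute a Gini coefficient from a list of non-negative values. Applying it to path or relay selection probabilities is an assumption made here for illustration; the talk does not say exactly which distribution was measured.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means all values are equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive sum")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


print(gini([0.25, 0.25, 0.25, 0.25]))  # 0.0, uniform selection
print(gini([0.0, 0.0, 0.0, 1.0]))      # 0.75, one option always chosen
```

For a finite list of n values this formula tops out at (n - 1)/n, which approaches the slide's maximum of 1 as n grows.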

19 Main insight: Client modifications suffice
Improve poor latency for interactive communications. Solution: modified path selection to reduce latency. Mitigate the profiling attack. Solution: AS-aware path selection. We are now done with the first goal of LASTor, improving latency. Let's talk about the second goal, mitigating the profiling attack.

20 Goal: Detect common ASes on entry and exit segments
Goal: AS-aware. Profiling attack on a path: the green AS (Autonomous System) can eavesdrop on both end segments of the path [Murdoch07]. To remind you, if there is a common AS on both the entry and exit segments, that AS can de-anonymize the path. Our goal here is to detect paths with a common AS on the entry and exit segments and to avoid choosing those paths. First, let's talk about how Tor mitigates this attack.

21 Simple heuristic does not work
Goal: AS-aware. Default Tor ensures that no two relays on a path are in the same /16. To mitigate this attack, Tor avoids choosing an entry relay and an exit relay that share the same /16 IP prefix. This figure shows the false negative rate of this heuristic, where we define the false negative rate as the fraction of paths with a common AS that are not detected. For example, for the median (source, destination) pair, 57% of common AS instances are missed. So we need a solution for predicting the ASes on each segment.
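The /16 check itself is simple, which is part of why it misses so much: it only compares relay addresses and knows nothing about the ASes that the entry and exit segments traverse. A minimal sketch using Python's standard `ipaddress` module follows; the function name and the example addresses (taken from documentation ranges) are illustrative, and this is not Tor's actual code.

```python
import ipaddress


def same_slash16(ip_a, ip_b):
    """Return True if two IPv4 addresses fall in the same /16 prefix."""
    # strict=False lets us build the enclosing network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/16", strict=False)
    return ipaddress.ip_address(ip_b) in network


print(same_slash16("192.0.2.1", "192.0.77.9"))      # True: both in 192.0.0.0/16
print(same_slash16("198.51.100.1", "203.0.113.1"))  # False: different /16 prefixes
```

Two relays that pass this test can still sit behind, or be reached through, the same AS, which is the case the heuristic cannot detect.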

22 Need for predicting AS paths
Goal: AS-aware. There are two approaches to this problem. Approach 1: measure routes from relays to all end hosts, but this needs modification of relays. Approach 2: infer AS-level routes. Several techniques exist [Mao05, Madhyastha06, Madhyastha09, Lee11], but their accuracy is at best 70%. Given that AS path prediction is hard, we predict a set of ASes instead.

23 Our solution: AS set prediction
Goal: AS-aware. We predict the ASes on all paths (for example, between the exit relay and D) that traffic might go through and that are compliant with routing policies.

24 Our solution: AS set prediction
Goal: AS-aware. By predicting the ASes on all policy-compliant paths between the exit relay and D, rather than a single path, we leave more room for error.

25 Our solution: AS set prediction
Goal: AS-aware. Input [13 MB initially, 1.5 MB weekly]: an AS-level topology graph, an estimate of AS path length, and a compact representation of routing policies as triples (AS1, AS2, AS3) where AS1 → AS2 → AS3. Algorithm: a modified version of Dijkstra's algorithm. Output: the set of ASes on policy-compliant routes. The AS set prediction algorithm downloads the three inputs; the size of this data is 13 MB initially and 1.5 MB weekly thereafter. We then run a modified version of Dijkstra's algorithm to find the set of ASes that are on routes compliant with routing policies and whose length equals the input path length. I refer you to the paper for the details of this algorithm. Let's take a look at the accuracy of AS set prediction.

26 AS set based prediction is accurate
Goal: AS-aware. The figure shows that AS set prediction reduces the false negative rate, the fraction of paths with a common AS that are not detected. For example, it reduces the median false negative rate from 57% to 11% of common AS instances missed. I want to point out that any path selection algorithm, including the current Tor algorithm, can use AS set prediction to avoid the profiling attack.
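Once AS sets have been predicted for the entry and exit segments, using them in path selection amounts to a set-intersection test. The sketch below assumes the predicted sets are already available as Python sets of AS numbers; the data layout, the function names, and the AS numbers (from the private-use range) are hypothetical, and the prediction step itself is not shown.

```python
def has_common_as(entry_as_set, exit_as_set):
    """True if some AS may observe both the entry and the exit segment."""
    return not entry_as_set.isdisjoint(exit_as_set)


def filter_safe_paths(candidates):
    """Keep the ids of candidate paths whose predicted AS sets do not overlap.

    candidates: list of (path_id, entry_as_set, exit_as_set).
    """
    return [path_id for path_id, entry_set, exit_set in candidates
            if not has_common_as(entry_set, exit_set)]


candidates = [
    ("path-1", {64512, 64513}, {64520, 64521}),  # disjoint: kept
    ("path-2", {64512, 64530}, {64530, 64540}),  # AS 64530 on both segments: dropped
]
print(filter_safe_paths(candidates))  # ['path-1']
```

Because the filter only needs the two sets, it can sit in front of any path selection algorithm, which is the point made on the slide.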

27 LASTor Latency 50 PlanetLab nodes to top 200 websites
Finally, we compare the latencies of different algorithms for 50 PlanetLab nodes visiting the top 200 websites. WSP refers to the weighted shortest path algorithm, WSP + AS sets refers to WSP enhanced with AS set prediction, and Default Tor is the default Tor algorithm. WSP + AS sets increases latency compared with WSP, but it is still better than the default Tor client, and it is safer for users.

28 Summary Demonstrated client side changes are sufficient for:
Lower latency, higher anonymity. Designed and implemented LASTor: reduces median latency by 25% and reduces the median false negative rate for common ASes from 57% to 11%. To summarize this talk: client-side modification improves both the latency and the anonymity of Tor. LASTor reduces median latency by 25% and reduces the median false negative rate of paths with a common AS from 57% to 11%.

29 Thank you

30 How does Tor work? (Onion Routing)
How does Tor work? Tor uses a traditional concept called onion routing. Let me explain it through an example. This client wants to connect to the server. The Tor client software routes Internet traffic through a worldwide volunteer network of relays; Tor currently has 2700 relays. There is a key associated with each relay. The Tor client usually chooses three relays and extends a path through them. For example, here the client chooses R1, R2, and R3. The first relay in the circuit is called the entry relay or guard, the second the middle relay, and the last the exit relay. The client encrypts its data first with the public key of the exit relay, then with the public key of the middle relay, and finally with the public key of the entry relay. When packets reach the entry relay, it decrypts them and relays them to the middle relay. The same thing happens at the middle relay. Finally, the exit relay forwards the packets to the destination. (Slide text: 300,000 users. Figure labels: Client, Entry Relay (guard), Middle Relay, Exit Relay, Server, R1 to R5.)

31 Is distance a good estimation of latency?
Choose two different paths, WSP(latency) and WSP(distance), and measure the latency on both, with 50 PlanetLab nodes as sources and the top 200 websites as destinations. Since we are not going to modify Tor relays, we do not know the latency between a source and all candidate relays, and so on, so we need an estimate of latency. We ran another experiment to find out how good an estimate of latency distance is. There is no significant difference between these two metrics.

32 Accuracy of AS-set prediction algorithm
Goal: AS-aware. This diagram shows the accuracy of the prediction algorithm. The figure on the left shows the false …. We compare this algorithm with iPlane, which is one of the existing tools for predicting AS paths. The /8 and /16 prefixes are the current Tor heuristics for avoiding this attack. The figure on the right shows the false negativ…

33 Attack on WSP Clustering of relays reduces: Probability of the attack
Goal: Improve latency. Clustering of relays reduces the probability of the attack and the running time of WSP. 50% reduction. We conducted an experiment to demonstrate the improved resilience of WSP to this attack. We emulate an adversary who replicates the 10% most popular relays 25 times. For each relay, we compute the probability of the chosen path traversing that relay; this is an upper bound on the fraction of cases in which the chosen path will traverse a relay controlled by the adversary. This probability can be decreased further by choosing larger clusters. We also reduced the running time of the algorithm to a few hundred milliseconds.

