Presentation is loading. Please wait.

Presentation is loading. Please wait.

Internet Tomography and Geography What is this area all about? Related work in the area Main Paper WEBMAPPER’s features How WEBMAPPER works WEBMAPPER results.

Similar presentations


Presentation on theme: "Internet Tomography and Geography What is this area all about? Related work in the area Main Paper WEBMAPPER’s features How WEBMAPPER works WEBMAPPER results."— Presentation transcript:

1 Internet Tomography and Geography What is this area all about? Related work in the area Main Paper WEBMAPPER’s features How WEBMAPPER works WEBMAPPER results summarized

2 Internet Tomography / Geography Where, geographically, are my clients located (or servers)? What’s the best server for me to get content from? Given two IP’s, what’s the latency between them? What IP’s all come from the same geographic location? Generally, show me a complete and accurate map of the Internet

3 Related Work B. Krishnamurthy and J. Wang, “On network-aware clustering of webclients,” in Proceedings of SIGCOMM ’00, August 2000 Problem: Which IP’s are the same Does BGP-table based clustering Groups IP’s by common administrative control Passive S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, and L. Zhang, “On the placement of intenet instrumentation,” in Proceedings of INFOCOM ’00, March 2000, pp. 295–304. Problem: What’s latency between 2 IP’s Clusters the internet based on BGP address prefixes Places “Tracers” everywhere in the Internet that actively ping each other

4 Related work An Investigation of Geographic Mapping Techniques for Internet Hosts Padmanabhan, Subramanian, SIGCOMM 2001 Problem: geo-locate from IP 3 solutions  GeoTrack: infer location by DNS / traceroute  GeoPing: probe target from known locations, triangulate [active]  GeoCluster: do passive BGP, IP-prefix based clustering

5 Related work Predicting Internet Network Distance with Coordinate-Based Approaches Problem: Given 2 IP’s, what’s latency Ng, Zhang, CMU, INFOCOM 2002 Active network of landmarks, pinging each other Better math model (GNP) for interpolating distance than IDMaps Proprietary Solutions (Akamai, etc) Problem: What’s the closest server to an IP?

6 Clustering and Server Selection Using Passive Monitoring M. Andrews, B. Sheperd, A. Srinivasan, P. Winkler, F. Zane (Bell Labs), INFOCOM 2002

7 Problem: Server Selection Given a client, and a set of servers, all with identical content, tell me the “best” server For the client, “best” means lowest latency or highest throughput For the whole system, it may mean something else

8 What is Passive Monitoring? Content Servers don’t ping each other Content Servers don’t ping clients No pinging is done: no additional network traffic is introduced Instead, servers record the Round-Trip Time of TCP handshake from a client

9 WEBMAPPER

10 WEBMAPPER: Clusterer Clusterer uses the client-server latency pairs reported by the servers Determines which address prefixes correspond to the same network location, and which don’t Does more than just find the closest server to each cluster; assigns probabilities to each to balance the network flow

11 WEBMAPPER Output Client cluster Content server 1 Content server 2 Content server 3 25.135.64.0/190.90.05 132.0.0.0/80.8500.15

12 WEBMAPPER: Big Tree Giant Binary Tree of all IP addresses Well, not all… assume last 8 bits always clustered Root of tree: 0.0.0.0/0 Children 0.0.0.0/1, 128.0.0.0/1 Leaves 123.123.123.123/24

13 WEBMAPPER: Big Tree Leaves also store Sum of recorded distances (per server) Squared sum of recorded distances Number of recorded distances Leaf data is periodically aged exponentially (I = I * 0.9)

14 WEBMAPPER: Small Tree Clusters are formed from big tree by folding children into their parents when they’re “similar” to their siblings Uses statistical test to determine “similar” (two-sample t-test) Threshold based

15 Assigning Clusters to Servers Assigning servers to clusters is complicated 1. Testing Index  Futzes with the probability of a cluster being assigned to a server  Based on multi-armed bandit solution

16 Assigning Clusters to Servers

17 Calculate server capacity Each cluster assignment has a latency cost Try to minimize the total cost without breaking any server’s capacity Graph theory to the rescue: min-cost flow

18 Results Experiment 1: 28 day log of busy web traffic Recorded client IP, time, RTT Clustering produced 17,270 clusters Here are 12 example clusters for high traffic days

19

20 Results Experiment 2: Set up a west-coast and east-coast server Force clients to download something from each (to get actual measurments) Have clients download something from the “either-or” server (WEBMAPPER powered)

21

22 Opinions Statistics and weighing formulae are cool What’s a good way to tell if the clustering is any good aside from eyeing samples? Two server test is all we can get out of Bell Labs?


Download ppt "Internet Tomography and Geography What is this area all about? Related work in the area Main Paper WEBMAPPER’s features How WEBMAPPER works WEBMAPPER results."

Similar presentations


Ads by Google