On the Power of Off-line Data in Approximating Internet Distances Danny Raz Technion - Israel Institute.

1 On the Power of Off-line Data in Approximating Internet Distances
Danny Raz (danny@cs.technion.ac.il), Technion - Israel Institute of Technology
and Prasun Sinha (prasunsinha@lucent.com), Bell Labs, Lucent Technologies

2 Outline
Internet Distance
Off-line metrics
–Geographic distance, # hops, # AS, depth
Linear regression for Internet distance estimation
Multi-variable linear regression
Accuracy of picking closest mirror site
The next step

3 Internet Distance
Internet Distance: one-way delay between hosts
Components of Internet Distance
–Dynamic
  Server load
  Network congestion / router load
–Static
  Propagation delay over the links
  Router processing delay
  Edge-router processing delay
Goal: To study the power of estimating the Static Internet Distance using off-line metrics

4 Importance of Internet Distance Estimation
Picking closest mirror-site/cache
For use in Content Distribution Networks

5 Approaches
Dynamic
–Dynamic probing [Dykes et al. Infocom ’00]
–Passive monitoring [Andrews et al. Infocom ’02]
Static
–Semi-active probing (IDMaps) [Jamin et al. Infocom ’00]
Other relevant work:
–Geographic Distance and RTT: [Padmanabhan Sigcomm ’02]

6 Static Internet Distance
Propagation delay: geographical distance
Router processing delay: # hops
Edge-router processing delay: # AS
Static Internet Distance = α · geo-distance + β · hop-count + γ · AS-count ?
[Figure: a path crossing AS #1, AS #2, and AS #3, showing core routers and edge routers. AS: Autonomous System]

7 Data Collection
Clients: 2500 public libraries in the US
Servers (mirrors/caches): 8 traceroute locations in the US
The location (latitude, longitude) is known for every host.
For every client-server pair
–Run multiple (10) traceroutes
–Pick the traceroute result with the smallest RTT
–Compute
  Geo-distance: based on latitude and longitude
  Hop-count: from traceroute
  AS-count: from traceroute, based on names of routers and IP address prefixes
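The per-pair processing on this slide can be sketched as follows. The slides do not say which formula was used to turn latitude and longitude into a distance; the haversine great-circle formula and the mean Earth radius below are assumptions, and the `rtt_ms` record field is a hypothetical name chosen for illustration.

```python
import math

# Mean Earth radius in km. An assumption: the slides do not state the formula or radius used.
EARTH_RADIUS_KM = 6371.0


def geo_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pick_min_rtt(traceroutes):
    """Return the record with the smallest RTT among repeated traceroute runs.

    Each record is a dict with a hypothetical 'rtt_ms' key.
    """
    return min(traceroutes, key=lambda record: record["rtt_ms"])
```

Taking the minimum over repeated runs is a common way to filter out transient queueing delay, which fits the stated goal of estimating the static component of the distance.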

8 Linear Regression (Geo-distance and Hop-count)
minRTT vs. Geo-distance: SE (Std. Error) = 26.93
minRTT vs. Hop-count: SE (Std. Error) = 25.71
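A single-metric fit of this kind can be sketched as an ordinary least-squares line. The slides do not define how SE was computed; the sketch assumes the usual standard error of the regression, sqrt(SSE / (n - 2)), and includes an intercept term, which is also an assumption.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def regression_standard_error(xs, ys, slope, intercept):
    """Standard error of the regression: sqrt(SSE / (n - 2)).

    Assumed definition; the slides only report the resulting numbers.
    """
    n = len(xs)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))
```

Here `xs` would hold one off-line metric per client-server pair (geo-distance or hop-count) and `ys` the corresponding minRTT values.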

9 Multiple Linear Regression (Multiple metrics)
minRTT vs. Geo-distance, Hop-count: SE = 21.52
minRTT vs. Geo-distance, AS-count: SE = 23.80
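A two-metric fit can be sketched with the normal equations, solved here by plain Gaussian elimination so that the example needs no third-party library. This is an illustrative sketch, not the authors' tooling; the intercept column and the SE definition sqrt(SSE / (n - k)), with k counting the intercept, are assumptions.

```python
import math


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Least-squares fit; returns [intercept, coef_1, ..., coef_p].

    rows holds one tuple of predictor values per observation.
    The intercept is an assumption; the slides show only the weighted sum.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def multiple_standard_error(rows, ys, coefs):
    """Standard error of the regression: sqrt(SSE / (n - k)), k = number of coefficients."""
    sse = 0.0
    for r, y in zip(rows, ys):
        predicted = coefs[0] + sum(c * v for c, v in zip(coefs[1:], r))
        sse += (y - predicted) ** 2
    return math.sqrt(sse / (len(ys) - len(coefs)))
```

Each row would be, for example, `(geo_distance, hop_count)` for one client-server pair, with the matching minRTT in `ys`.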

10 minRTT = α · geo-distance + β · hop-count + γ · AS-count ?
Term           Coefficient   p-value
Geo-distance   12.53 (α)     <0.0001
Hop-count      2.45 (β)      <0.0001
AS-count       -0.64 (γ)     0.0387
High correlation between hop-count and AS-count (highest among all pairs of metrics)
Hop-count and AS-count should not be used together
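The collinearity observation can be checked with a pairwise correlation between predictors. The slides do not name the correlation measure; the Pearson coefficient below is an assumption, shown as a minimal sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

When two predictors are strongly correlated, their individual coefficients become unstable and hard to interpret, which is the usual reason for dropping one of them from a joint model.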

11 A new Off-line metric: Depth
Hop-count: requires dynamic probing
Introduce an alternate metric: Depth
–Average hop-count to the nearest backbone network (a hand-made list of 30 big core networks)
–Constant per host (client/server)
–Alternatively, measure in units of time rather than hops
–(Client depth + Server depth) as a metric
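One way to read the Depth definition is sketched below. It assumes each traceroute from a host has already been mapped to a sequence of network labels, one per hop; that mapping, the label names, and the choice to skip paths that never reach a backbone are all hypothetical and not taken from the slides.

```python
def hops_to_backbone(path_networks, backbone_networks):
    """Hop index (1-based) of the first router inside a backbone network, or None."""
    for hop_index, network in enumerate(path_networks, start=1):
        if network in backbone_networks:
            return hop_index
    return None


def depth(paths, backbone_networks):
    """Average hop-count from a host to the nearest backbone over several paths.

    Paths that never reach a listed backbone are skipped (an assumption).
    Returns None when no path reaches a backbone.
    """
    counts = []
    for path in paths:
        hops = hops_to_backbone(path, backbone_networks)
        if hops is not None:
            counts.append(hops)
    if not counts:
        return None
    return sum(counts) / len(counts)


def pair_depth(client_depth, server_depth):
    """The combined metric from the slide: client depth + server depth."""
    return client_depth + server_depth
```

Because a host's depth is computed once and reused for every pair, the combined metric needs no probing between a specific client and server.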

12 Linear Regression (Depth)
minRTT vs. Depth: SE = 41.02
minRTT vs. Depth and Geo-distance: SE = 24.52

13 Squared Errors in Estimating minRTT
Metric                    SE (Standard Error)
Geo-distance, Hop-count   21.52
Geo-distance, AS-count    23.80
Geo-distance, Depth       24.52
Hop-count                 25.71
Geo-distance              26.93
Depth                     41.02

14 Accuracy of picking the nearest mirror site
Allowed Delta   Random   Geo-distance   Hop-count   Geo-distance, Hop-count   Geo-distance, Depth
0               12.50%   37.84%         44.32%      38.41%                    33.98%
10ms            21.15%   53.07%         58.98%      55.91%                    50.45%
20ms            33.75%   73.18%         76.70%      74.89%                    70.91%
30ms            46.25%   90.91%         88.75%      91.36%                    89.43%
880 clients and 8 servers
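The accuracy figures can be read as follows, under an assumed interpretation of "Allowed Delta": a client's choice counts as correct when the measured RTT to the server picked by the estimate is within the allowed delta of the smallest measured RTT over all servers. The function and argument names are hypothetical.

```python
def selection_accuracy(true_rtts, estimates, allowed_delta_ms):
    """Fraction of clients whose estimated-nearest server is within the allowed delta.

    true_rtts[i][s]  measured minRTT (ms) from client i to server s
    estimates[i][s]  estimated distance from client i to server s (any unit)
    """
    hits = 0
    for rtts, est in zip(true_rtts, estimates):
        chosen = min(range(len(est)), key=lambda s: est[s])
        if rtts[chosen] - min(rtts) <= allowed_delta_ms:
            hits += 1
    return hits / len(true_rtts)
```

Under this reading, a uniformly random pick among 8 servers succeeds with probability 1/8 = 12.5% at delta 0 when there are no ties, which agrees with the Random column.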

15 Summary
Combination of hop-count and geographic distance improves over individual metrics
Using Depth along with Geo-distance improves performance and is completely off-line
For closest mirror selection with 30 ms allowed deviation, almost any metric gives about 90% accuracy
Is there much room to improve?

16 The Next Step
Global Data
–Collection and analysis of data based on clients and servers spread across the globe
Using both off-line and on-line
–Techniques to combine the power of off-line estimation with on-line estimation

