Server-based Inference of Internet Performance V. N. Padmanabhan, L. Qiu, and H. Wang.

Motivation
[Figure: web server, diagnosis engine, Ethernet, and ISP clouds (C&W, AT&T, Sprint, UUNet, Qwest, Earthlink, AOL). Captions: “It’s so slow!” and “Why is it so slow?”]

Network Diagnosis
[Figure: Netmon/tcpdump traces and the network topology feed the diagnosis engine, which outputs trouble spot locations]
Diagnosis results:
- Qwest access link: 63.232.180.230->63.232.33.134
- Peering between UUNET and AOL: 64.45.216.154->172.139.89.74

Network Diagnosis (Cont.)
Goal: Determine internal network characteristics using passive end-to-end measurements
Primary focus: identifying lossy links
Applications:
- Troubleshooting
- Server selection
- Server placement
- Overlay network path construction

Previous Work
Active probing to infer link loss rate:
- multicast probes
- striped unicast probes
Pros & cons:
- accurate, since individual loss events are identified
- expensive, because of extra probe traffic
[Figure: probe tree with nodes S, A, and B]

Problem Formulation
[Figure: tree rooted at the server with links l_1 to l_8; the clients at the leaves observe end-to-end loss rates p_1 to p_5]
$(1 - l_1)(1 - l_2)(1 - l_4) = 1 - p_1$
$(1 - l_1)(1 - l_2)(1 - l_5) = 1 - p_2$
…
$(1 - l_1)(1 - l_3)(1 - l_8) = 1 - p_5$
Challenges:
- Under-constrained system of equations
- Measurement errors
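
To see why the system is under-constrained, the following Python sketch uses only the three path equations written out above (the elided ones are left out) and shows two different link loss assignments that produce identical end-to-end loss rates. The dictionary layout, the function name, and the numeric loss values are illustrative assumptions, not taken from the slides.

```python
from math import isclose, prod

# Links on each path whose equation is written out on the slide.
PATHS = {
    "p1": ["l1", "l2", "l4"],
    "p2": ["l1", "l2", "l5"],
    "p5": ["l1", "l3", "l8"],
}
LINKS = sorted({link for links in PATHS.values() for link in links})


def path_loss_rates(link_loss):
    """End-to-end loss rate per path: 1 - product of (1 - l_i) over its links."""
    return {
        path: 1.0 - prod(1.0 - link_loss[link] for link in links)
        for path, links in PATHS.items()
    }


# Hypothetical assignment A: all loss sits on the shared link l1.
loss_a = dict.fromkeys(LINKS, 0.0)
loss_a["l1"] = 0.1

# Hypothetical assignment B: the same loss sits one level down, on l2 and l3.
loss_b = dict.fromkeys(LINKS, 0.0)
loss_b["l2"] = 0.1
loss_b["l3"] = 0.1

rates_a = path_loss_rates(loss_a)
rates_b = path_loss_rates(loss_b)

# Both assignments explain the client observations equally well.
same = all(isclose(rates_a[p], rates_b[p]) for p in PATHS)
print(same)
```

Because the clients cannot tell these two assignments apart, end-to-end measurements alone do not pin down a unique solution, which is what motivates the inference methods on the following slides.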

3 methods
1. Random sampling
2. Linear optimization
3. Bayesian inference using Gibbs sampling
(We’ll focus on the last one)

Gibbs Sampling
- $D$: observed packet transmission and loss at the clients
- $\Theta$: ensemble of loss rates of links in the network
Goal: determine the posterior distribution $P(\Theta \mid D)$
Approach:
- Use Markov Chain Monte Carlo with Gibbs sampling to obtain samples from $P(\Theta \mid D)$
- Draw conclusions based on the samples

Gibbs Sampling (Cont.)
Applying Gibbs sampling to network tomography:
1) Initialize link loss rates arbitrarily
2) For j = 1 : warmup, for each link i, compute $P(l_i \mid D, \{l_i'\})$ and draw a new value of $l_i$ from it, where $l_i$ is the loss rate of link i and $\{l_i'\} = \bigcup_{k \neq i} l_k$
3) For j = 1 : realSamples, for each link i, compute $P(l_i \mid D, \{l_i'\})$ and draw a new value of $l_i$ from it
Use all the samples obtained at step 3 to approximate $P(\Theta \mid D)$
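
The three steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation. It assumes a likelihood in which each packet on a path is lost independently, a uniform prior on every link loss rate, and a discretized grid of candidate loss rates from which the conditional distribution is sampled. The function names, the grid size, and the packet counts in the usage lines are all hypothetical.

```python
import math
import random


def log_likelihood(link_loss, paths, counts):
    """log P(D | link loss rates), assuming independent losses per packet."""
    total = 0.0
    for client, links in paths.items():
        delivered, lost = counts[client]
        success = math.prod(1.0 - link_loss[link] for link in links)
        total += delivered * math.log(success) + lost * math.log(1.0 - success)
    return total


def gibbs_sample(paths, counts, warmup, real_samples, grid_size=50, seed=0):
    rng = random.Random(seed)
    # Candidate loss rates strictly inside (0, 1), so every log() is defined.
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]
    links = sorted({link for path in paths.values() for link in path})
    # Step 1: initialize link loss rates arbitrarily.
    state = {link: rng.choice(grid) for link in links}
    samples = []
    for sweep in range(warmup + real_samples):
        for link in links:
            # Conditional of l_i given D and all other links, up to a constant
            # (with the assumed uniform prior it is proportional to the likelihood).
            log_w = []
            for value in grid:
                state[link] = value
                log_w.append(log_likelihood(state, paths, counts))
            top = max(log_w)
            weights = [math.exp(w - top) for w in log_w]
            state[link] = rng.choices(grid, weights=weights)[0]
        # Steps 2 and 3 run the same sweep; only post-warmup states are kept.
        if sweep >= warmup:
            samples.append(dict(state))
    return samples


# Hypothetical (delivered, lost) packet counts per client path.
paths = {"p1": ["l1", "l2", "l4"], "p2": ["l1", "l2", "l5"], "p5": ["l1", "l3", "l8"]}
counts = {"p1": (900, 100), "p2": (990, 10), "p5": (995, 5)}
samples = gibbs_sample(paths, counts, warmup=100, real_samples=300)
posterior_mean = {
    link: sum(s[link] for s in samples) / len(samples) for link in samples[0]
}
print(max(posterior_mean, key=posterior_mean.get))
```

In this toy input only the client behind l4 sees heavy loss while its sibling behind l5 does not, so the samples concentrate the loss on l4 rather than on the shared links l1 and l2.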

Performance Evaluation
- Simulation experiments
- Trace-driven validation

Simulation Experiments
Advantage: no uncertainty about link loss rate!
Methodology:
- Topologies used:
  - randomly generated: 20 - 3000 nodes, max degree = 5 - 50
  - real topology obtained by tracing paths to microsoft.com clients
- Randomly generated packet loss events at each link
- A fraction f of the links are good, and the rest are “bad”
  - LM1: good links: 0 – 1%, bad links: 5 – 10%
  - LM2: good links: 0 – 1%, bad links: 1 – 100%
- Link loss processes: Bernoulli and Gilbert
Goodness metrics:
- Coverage: # correctly inferred lossy links
- False positive: # incorrectly inferred lossy links
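
A rough Python sketch of this setup follows: it assigns loss rates under LM1 or LM2, simulates a Bernoulli loss process on one link, and counts the two goodness metrics. Several details are assumptions made for illustration: loss rates are drawn uniformly within each range, exactly round(f * n) links are marked good, the "bad" links are treated as the truly lossy ones, and all function names are hypothetical. The Gilbert process is omitted because the slides give no parameters for it.

```python
import random

# Loss-rate ranges from the slide, as fractions: (low, high).
LOSS_MODELS = {
    "LM1": {"good": (0.00, 0.01), "bad": (0.05, 0.10)},
    "LM2": {"good": (0.00, 0.01), "bad": (0.01, 1.00)},
}


def assign_loss_rates(links, f, model, rng):
    """Mark a fraction f of links good, the rest bad, and draw their loss rates."""
    shuffled = list(links)
    rng.shuffle(shuffled)
    n_good = round(f * len(shuffled))
    rates, bad = {}, set()
    for index, link in enumerate(shuffled):
        kind = "good" if index < n_good else "bad"
        low, high = LOSS_MODELS[model][kind]
        rates[link] = rng.uniform(low, high)  # assumption: uniform within the range
        if kind == "bad":
            bad.add(link)
    return rates, bad


def bernoulli_losses(loss_rate, n_packets, rng):
    """Bernoulli loss process: each packet is dropped independently."""
    return sum(rng.random() < loss_rate for _ in range(n_packets))


def coverage(true_lossy, inferred_lossy):
    """Number of correctly inferred lossy links."""
    return len(true_lossy & inferred_lossy)


def false_positives(true_lossy, inferred_lossy):
    """Number of incorrectly inferred lossy links."""
    return len(inferred_lossy - true_lossy)


rng = random.Random(1)
links = [f"link{i}" for i in range(20)]
rates, bad = assign_loss_rates(links, f=0.9, model="LM1", rng=rng)
good = set(links) - bad

# Hypothetical inference result: one truly bad link and one good link flagged.
inferred = {min(bad), min(good)}
print(coverage(bad, inferred), false_positives(bad, inferred))
```

Because the simulator chooses the loss rates itself, the set of truly lossy links is known exactly, which is the advantage the slide points out.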

Comparative Results
- Random sampling generates many false positives
- LP has low coverage (30-60%)
- Gibbs sampling performs very well (80% coverage with a 5% false positive rate)

Random topologies
The confidence estimate for Gibbs sampling works well and can be used to rank-order the inferred lossy links.

Trace-driven Validation
Validation approach:
- Divide client traces into two sets: tomography and validation
- Tomography set → loss inference
- Validation set → check if clients downstream of the inferred lossy links experience high loss
Experimental setup:
- Real topologies and loss traces collected with traceroute and tcpdump at microsoft.com on Dec. 20, 2000 and Jan. 11, 2002
Results:
- For the small subset of inferences that could be validated, all the inferences are correct
- Likely candidates for lossy links:
  - links crossing an inter-AS boundary
  - links having a large delay (e.g. transcontinental links)
  - links that terminate at clients
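
The validation check can be sketched in Python as below. The slides do not say what counts as "high loss" or how many downstream clients must show it, so the 5% threshold and the requirement that every downstream validation client exceed it are assumptions, as are the function names and the sample data. Inferred links with no downstream client in the validation set are skipped, mirroring the remark that only a subset of inferences could be validated.

```python
def observed_loss_rate(delivered, lost):
    """Fraction of packets lost on a client's path."""
    return lost / (delivered + lost)


def validate_inferences(inferred_lossy, paths, validation_counts, high_loss=0.05):
    """Return {link: verdict} for every inferred link that can be checked."""
    verdicts = {}
    for link in inferred_lossy:
        # Validation-set clients whose path crosses this link and that saw traffic.
        downstream = [
            client
            for client, links in paths.items()
            if link in links and sum(validation_counts.get(client, (0, 0))) > 0
        ]
        if not downstream:
            continue  # this inference cannot be validated
        verdicts[link] = all(
            observed_loss_rate(*validation_counts[client]) >= high_loss
            for client in downstream
        )
    return verdicts


# Hypothetical topology and validation-set counts as (delivered, lost).
paths = {
    "c1": ["l1", "l2", "l4"],
    "c2": ["l1", "l2", "l5"],
    "c3": ["l1", "l3", "l8"],
}
validation_counts = {"c1": (90, 10), "c2": (88, 12)}
verdicts = validate_inferences({"l2", "l8"}, paths, validation_counts)
print(verdicts)
```

Here l2 is confirmed because both clients behind it lose at least 5% of their packets, while l8 gets no verdict since its only downstream client, c3, is absent from the validation set.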

Summary
- Passive network tomography is feasible
- Gibbs sampling yields high coverage (over 80%) and a low false positive rate (below 5-10%)
- Future work: perform loss inference in real time
