Presentation on theme: "D ARRYL V EITCH The University of Melbourne A CTIVE M EASUREMENT IN N ETWORKS MetroGrid Workshop GridNets 2007 19 October 2007."— Presentation transcript:
D ARRYL V EITCH The University of Melbourne A CTIVE M EASUREMENT IN N ETWORKS MetroGrid Workshop GridNets 2007 19 October 2007
A R ENAISSANCE IN N ETWORK M EASUREMENT Data centric view of networking Must arise from real problems, or observations on data Abstractions based on data Results validated by data In fact: the scientific method in networking Not just getting numbers, Discovery Knowing, understanding, improving network performance Not just monitoring Traffic patterns: link, path, node, applications, management Quality of Service: delay, loss, reliability Protocol dynamics: TCP, VoIP,.. Network infrastructure: routing, security, DNS, bottlenecks, latency..
T HE R ISE OF M EASUREMENT Now have dedicated conferences: Passive and Active Measurement Conference (PAM)2000 … ACM Internet Measurement Conference (IMC)2001... Papers between 1966-87 ( P.F. Pawlita, ITC-12, Italy ) Queueing theory:several thousand Traffic measurement:around 50
A CTIVE VERSUS P ASSIVE M EASUREMENT Typical Active Aims “End-to-End” or “Network edge” End-to-End Loss, Delay, Connectivity, “Discovery” ….. Long/short term monitoring: Network health; Route state Internet view: Application performance and robustness Typical Passive Aims “At-a-point” or “Network Core” Link utilisation, Link traffic patterns, Server workloads Long term monitoring: Dimensioning, Capacity Planning, Source modelling Engineering view: Network performance
A CTIVE P ROBING VERSUS N ETWORK T OMOGRAPHY Network Tomography Typically “network wide”: multiple destinations and/or sources Simple black box node/link models, strong assumptions Classical inference with twists Typical metrics:per link/path loss/delay/throughput Active Probing Typically over a single path Use tandem FIFO queue model Exploit discrete packet effects in semi-heuristic queueing analysis Typical metrics:link capacities, available bandwidth
T WO E NTREES Loss Tomography Removing the temporal independence assumption Arya, Duffield, Veitch, 2007 Active Probing Towards (rigorous) optimal probing Baccelli, Machiraju, Veitch, Bolot, 2007
N ETWORK T OMOGRAPHY The Metrics Link Traffic(volume, variance) (Traffic Matrix estimation) Link Loss(average, temporal) Link Delay(variance, distribution) Link Topology Path Properties(network `kriging’) Joint problems(use loss or delay to infer topology) Began with Vardi  “Network Tomography: estimating source-destination traffic intensities from link data” Classes of Inversion Problems End-to-end measurements internal metrics Internal measurements path metrics
T HE E ARLY L ITERATURE (INCOMPLETE!) Loss/Delay/Topology Tomography AT&T( Duffield, Horowitz, Lo Presti, Towsley et al. ) Rice( Coates, Nowak et al. ) Evolution: loss, delay topology Exact MLE EM MLE Heuristics Multicast Unicast (striping ) Traffic Matrix Tomography AT&T( Zhang, Roughan, Donoho et al. ) Sprint ATL( Nucci, Taft et al. )
L OSS T OMOGRAPHY USING M ULTICAST P ROBING 10 1 1 11 1 0 10 0 1... multicast probes... Loss Estimator Loss rates in logical Tree
Stochastic loss process on link acts deterministically on probes arriving to Node and Link Processes loss process on link probe `bookkeeping’ process for node T HE L OSS M ODEL 11 00 … 1 1 0 0 … 1 00 … Bookkeeping process Link loss process 0
Independent Probes Spatial: loss processes on different links independent Temporal: losses within each link independent A DDING P ROBABILITY: L OSS D EPENDENCIES Model reduces to a single parameter per link, the passage or transmission probabilities
F ROM L INK P ASSAGE TO P ATH P ASSAGE P ROBABILITIES Path probabilities: only ancestors matter Sufficient to estimate path probabilities
E STIMATION : JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE Aim: estimate 1 0 0 1 1 1 1 1 0 0 1 1 1 1 0 … …… 10 / 19 Original MINC loss estimator for binary tree [Cáceres,Duffield, Horowitz, Towsley 1999] A CCESSING I NTERNAL P ATHS Obtain a quadratic in
E STIMATION : JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE 10 / 19 O BTAIN P ATHS R ECURSIVELY
Before Spatial: link loss processes independent Temporal: link loss processes Bernoulli Parameters: link passage probabilities T EMPORAL I NDEPENDENCE: H OW F AR TO R ELAX ? After Spatial: link loss processes independent Temporal: link loss processes stationary, ergodic Parameters: joint link passage probabilities over index sets Full characterisation/identification possible!
Importance Impacts delay sensitive applications like VoIP (FEC tuning) Characterizes bottleneck links Pr 1 3 … 2 Loss-run length Loss/pass bursts … … Loss run-length distribution ( density, mean ) T ARGET L OSS C HARACTERISTIC
Sufficiency of joint link passage probabilities Mean Loss-run length Loss-run distribution e.g. A CCESSING T EMPORAL P ARAMETERS: G ENERAL P ROPERTIES 7
J OINT P ASSAGE P ROBABILITIES Joint link passage probability Joint path passage probability
E STIMATION: J OINT P ATH P ASSAGE P ROBABILITY computed recursively over index sets No of equations equal to degree of node k = 0 for
Estimation of in general trees Requires solving polynomials with degree equal to the degree of node k Numerical computations for trees with large degree Recursion over smaller index sets Simpler variants Subtree-partitioning Requires solutions to only linear or quadratic equations No loss of samples Also simplifies existing MINC estimators Avoid recursion over index sets by considering only subsets of receiver events which imply E STIMATION: J OINT P ATH P ASSAGE P ROBABILITY...
S IMULATION E XPERIMENTS Setup Loss process Discrete-time Markov chains On-off processes: pass-runs geometric, loss-runs truncated Zipf Estimation Passage probability Joint passage probability for a pair of consecutive probes Mean loss-run length: Relative error:
E XPERIMENTS Estimation for shared path in case of two-receiver binary trees Markov chain On-off process
E XPERIMENTS Markov chain On-off process Estimation for shared path in case of two-receiver binary trees
E XPERIMENTS Estimation of for larger trees Link 1 Trees taken from router-level map of AT&T network produced by Rocketfuel (2253 links, 731 nodes) Random shortest path multicast trees with 32 receivers. Degree of internal nodes from 2 to 6, maximum height 6
V ARIANCE Estimation for shared path in case of tertiary tree Standard errors shown for various temporal estimators
C ONCLUSIONS Estimators for temporal loss parameters, in addition to loss rates Estimation of any joint probability possible for a pattern of probes Class of estimators to reduce computational burden Subtree-partition: simplifies existing MINC estimators Future work Asymptotic variance MLE for special cases (Markov chains) Hypothesis tests Experiments with real traffic
FRANÇOIS BACCELLI, SRIDHAR MACHIRAJU, DARRYL VEITCH, JEAN BOLOT O PTIMAL P ROBING IN C ONVEX N ETWORKS
The Case Against Poisson Probing Zero-sampling bias not unique to Poisson (in non-intrusive case) PASTA talks about bias, is silent on variance PASTA is not optimal for variance or MSE PASTA is about sampling only, is silent on inversion P REVIOUS W ORK 10 Probe Pattern Separation Rule: select inter-probe (or probe pattern) separations as i.i.d. positive random variables, bounded above zero. Example: Uniform (i.i.d.) separations on [ 0.9μ, 1.1μ ] Aims for variance reduction Avoids phase locking (leads to sample path `bias’) Allows freedom of probe stream design
Results derived in context of delay only No optimality result (expected to be highly system and traffic dependent) L IMITATIONS 10
Extended all results to loss case (loss and delay in uniform framework) For convex networks, have universal optimum for variance But: sample-path bias Give family of probing strategies with Zero bias and strong consistency Variance as close to optimal as desired (tunable) Fast simulation Consistent with spirit of Separation Rule N EW W ORK 10 But are networks convex? Some systems when answer is known to be Yes virtual work of M/G/1 loss process of M/M/1/K Insight into when No Real data when answer seems to be Yes
Sampling For end-to-end delay of a probe of size x Only have probe samples T WO P ROBLEMS: S AMPLING AND I NVERSION 10 Inversion From the measured delay data of perturbed network, may want: Unperturbed delays Link capacities Available bandwidth Cross traffic parameters at hop 3 TCP fairness metric at hop 5 …. Here we focus on sampling only, do so using non-intrusive probing
Non-intrusive Delay Delay process using zero sized probes still meaningful Each probe carries a delay: samples available T HE Q UESTION OF G ROUND T RUTH 10 Non-intrusive Loss Losses with x=0 are hard to find.. meaningful but useless Lost probes are not available (where? How?) Probes which are not lost tell us ? about loss? Cannot base non-intrusive probing on virtual (x=0) probes
Approach Define end-to-end process directly in continuous time Must be a function of system in equilibrium - no probes, no perturbation! Non-intrusive probing defined directly as a sampling of this process: Interpretation: what probe would have seen if sent in at time. G ENERAL G ROUND T RUTH 10 Note Process may be very general, not just isolated probes but probe patterns! Applies equally to loss or delay This is the only way, even virtual probes can be intrusive!
L OSS G ROUND T RUTH E XAMPLES 10 1-hop FIFO, buffer size K bytes, droptail If packet based instead, becomes independent of x ! 2-hop Depends on x Other examples: Packet pair jitter Indicator of packet loss in a train Largest jitter in a chain, or number lost
NIPSTA: N ON- I NTRUSIVE P ROBING S EES T IME A VERAGES 10 Strong Consistency : Empirical averages seen by probe samples converge to true expectation of continuous time process, assuming: Stationarity and ergodicity of CT Stationarity and ergodicity of PT (easy) Independence of CT and PT (easy) Joint ergodicity between CT and PT Result follows just as in delay work, using marked point processes, Palm Calculus and Ergodic Theory. Not Restricted to Means : Temporal quantities also, eg jitter Probe-train sampling strategies
O PTIMAL V ARIANCE OF S AMPLE M EAN 10 Covariance function Sample Mean Estimator Periodic Probing
P ERIODIC P ROBING IS O PTIMAL ! 10 Compare integral terms : If R convex, then can use Jensen’s inequality: Problem : Periodic is not mixing, so joint ergodicity not assured (phase lock) If occurs, get `sample path bias’ (estimator not strongly consistent) No amount of probe train design can beat periodic!
G AMMA R ENEWAL P ROBING 10 Gamma Law Gamma Renewal Family: As increases, variance drops at constant mean Poisson is beaten once Process tends to periodic as Sensitivity to periodicities increases however Proof: Again uses Jensen’s inequality Uses a conditioning trick and a technical result on Gamma densities