Origins of Long Range Dependence Myths and Legends Aleksandar Kuzmanovic 01/08/2001.


1 Origins of Long Range Dependence Myths and Legends Aleksandar Kuzmanovic 01/08/2001

2 Outline
Definitions
Why is LRD important?
Heavy tails
Producing self-similar traffic
Physical interpretation in LAN and WAN networks
–Different hypotheses from around 10 papers

3 On the Self-Similar Nature of Ethernet Traffic, W. Willinger, 1994

4 Definitions
Long range dependent process
–if its autocorrelation function is nonsummable
Self-similar process
–scaling behavior of finite dimensional distributions
  X = m^(1-H) * X^(m) in distribution, where X^(m) is X averaged over non-overlapping blocks of size m
Second order self-similar process
–aggregated processes possess the same non-degenerate AC functions as the original process
  X and m^(1-H) * X^(m) have the same AC function
Self-similar processes have hyperbolically decaying autocorrelation functions
–LRD can be characterized by a single parameter H
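
The aggregation in these definitions can be tried numerically. The sketch below is a minimal illustration, not taken from the slides: it builds the block-averaged series X^(m) and estimates H with the variance-time method, which relies on Var(X^(m)) ~ m^(2H - 2) for a second-order self-similar process. The function names and the white-noise test input are assumptions made for the example; i.i.d. noise has no long range dependence, so the estimate should come out near H = 0.5.

```python
import math
import random


def aggregate(x, m):
    """Return X^(m): means of consecutive non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(x):
    """Population variance of a list of numbers."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def hurst_variance_time(x, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) versus log m.

    Var(X^(m)) ~ m^(2H - 2), so H = 1 + slope / 2.
    """
    pts = [(math.log(m), math.log(variance(aggregate(x, m)))) for m in block_sizes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    slope = (sum((a - mx) * (b - my) for a, b in pts)
             / sum((a - mx) ** 2 for a, _ in pts))
    return 1.0 + slope / 2.0


# Hypothetical test input: i.i.d. Gaussian noise (no LRD, so H should be near 0.5).
rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
h_white = hurst_variance_time(white_noise, [1, 2, 4, 8, 16, 32, 64])
print(f"H estimate for i.i.d. noise: {h_white:.2f}")
```

For LRD traffic the same plot decays more slowly than 1/m, which pushes the estimate above 0.5 and towards 1.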

5 Heavy tails (Noah effect)
Heavy-tailed distributions
–LLCD (log-log complementary distribution) plot
Pareto is a typical example
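
As a rough illustration of the LLCD idea, the sketch below draws Pareto samples and fits a straight line to log10 P[X >= x] against log10 x; for a Pareto tail the slope is approximately -alpha. The sample size, the seed, the cut-off `min_ccdf` and the function names are assumptions chosen for the example, and a plain least-squares fit is only one of several ways to read a tail index off such a plot.

```python
import math
import random


def llcd_points(samples):
    """Return (log10 x, log10 P[X >= x]) for every sample, sorted by x."""
    xs = sorted(samples)
    n = len(xs)
    return [(math.log10(x), math.log10((n - i) / n)) for i, x in enumerate(xs)]


def tail_index_from_llcd(samples, min_ccdf=0.01):
    """Estimate alpha as minus the least-squares slope of the LLCD plot.

    Points whose empirical CCDF is below min_ccdf are dropped because the
    extreme tail is based on very few samples and is noisy.
    """
    floor = math.log10(min_ccdf)
    pts = [(lx, lp) for lx, lp in llcd_points(samples) if lp >= floor]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    slope = (sum((a - mx) * (b - my) for a, b in pts)
             / sum((a - mx) ** 2 for a, _ in pts))
    return -slope


# Hypothetical parameters: alpha = 1.5 gives finite mean but infinite variance.
rng = random.Random(7)
true_alpha = 1.5
samples = [rng.paretovariate(true_alpha) for _ in range(20000)]
alpha_hat = tail_index_from_llcd(samples)
print(f"true alpha = {true_alpha}, LLCD estimate = {alpha_hat:.2f}")
```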

6 Producing Self-Similar Traffic
1. Multiplexing ON/OFF sources that have a fixed rate in ON periods and ON/OFF period lengths that are heavy tailed.
–Aggregate traffic is fBm with
2. queue model
–implies that multiplexing constant-rate connections with Poisson connection arrivals and a heavy-tailed distribution for connection lifetimes would result in self-similar traffic
3. Inter-arrival packet times are i.i.d. Pareto with
–and then consider the corresponding count process (the number of arrivals in consecutive intervals), we have “pseudo self-similar” traffic (Paxson, Floyd) (or even self-similar (L. Lipsky)?)
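
The first construction can be sketched as a small discrete-time simulation. This is an illustrative toy under stated assumptions, not the model from any of the cited papers: time is slotted, each source sends at rate 1 while ON, period lengths are Pareto samples rounded up to whole slots, and the tail indices, the number of sources and all function names are picked for the example.

```python
import math
import random


def on_off_source(n_slots, alpha_on, alpha_off, rng):
    """Per-slot rate (0 or 1) of one source with Pareto ON and OFF period lengths."""
    out = []
    on = rng.random() < 0.5  # random initial state
    while len(out) < n_slots:
        alpha = alpha_on if on else alpha_off
        # Period length in whole slots, truncated so the list never exceeds n_slots.
        length = min(math.ceil(rng.paretovariate(alpha)), n_slots - len(out))
        out.extend([1 if on else 0] * length)
        on = not on
    return out


def multiplex(n_sources, n_slots, alpha_on, alpha_off, rng):
    """Aggregate traffic: number of sources that are ON in each slot."""
    sources = [on_off_source(n_slots, alpha_on, alpha_off, rng)
               for _ in range(n_sources)]
    return [sum(col) for col in zip(*sources)]


# Hypothetical parameters for illustration only.
rng = random.Random(42)
n_sources, n_slots = 50, 10000
traffic = multiplex(n_sources, n_slots, alpha_on=1.7, alpha_off=1.2, rng=rng)
print("first 20 slots:", traffic[:20])
print("mean load per slot:", sum(traffic) / n_slots)
```

Feeding `traffic` into a Hurst estimator at several aggregation levels is the natural next step for checking whether the aggregate looks long range dependent.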

7 Questions we want to answer
What physical activity causes LRD?
What is the role of protocols (TCP and MAC layer protocols)?
What is the role of limited resources (i.e. bandwidth)?
What model fits best to each of the assumptions?
What is the largest time-scale over which the correlation is present?
Self-similarity vs. pseudo self-similarity and relevance

8 Statistical Analysis of Ethernet LAN Traffic at the Source Level, W. Willinger, 1997, I

9 Statistical Analysis of Ethernet LAN Traffic at the Source Level, W. Willinger, 1997, II
Model 1 (heavy tailed ON/OFF activity at the source level) is widely accepted
Result proven theoretically
Noah effect (heavy-tailed periods)
–ON periods alpha = 1.7
–OFF periods alpha = 1.2
TCP traffic measured most of the time...
Higher load - H increases
WAN measurements do not fit into this model
–connections typically do not last long

10 Wide Area Traffic: The Failure of Poisson Modeling, V. Paxson, S. Floyd, 1995
Summary of ways to produce LRD traffic
WAN (TCP) traffic for TELNET and FTP applications
–TELNET connection arrivals appear to be Poisson, but packet arrivals are not
–Single TELNET connection is LRD
  Model 3: Inter-arrival times are i.i.d. Pareto
–Aggregate is also LRD, but there is no analytical proof (*)
FTP traffic is also LRD, yet none of the models fit because of limited resources.
Aggregated traffic is not fBm (single H is not enough)
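
Model 3 is easy to reproduce in a few lines. The sketch below generates arrivals whose inter-arrival times are i.i.d. Pareto and returns the count process, i.e. the number of arrivals in consecutive intervals. The tail index, the interval length and the function name are assumptions for illustration; the Pareto samples have minimum value 1, so time is measured in units of the shortest possible gap.

```python
import random


def pareto_arrival_counts(n_intervals, interval_len, alpha, rng):
    """Count arrivals per interval when inter-arrival times are i.i.d. Pareto.

    interval_len is a whole number of time units; every gap is at least 1 unit.
    """
    counts = [0] * n_intervals
    horizon = n_intervals * interval_len
    t = rng.paretovariate(alpha)  # time of the first arrival
    while t < horizon:
        counts[int(t // interval_len)] += 1
        t += rng.paretovariate(alpha)
    return counts


# Hypothetical parameters for illustration only.
rng = random.Random(11)
n_intervals, interval_len = 2000, 10
counts = pareto_arrival_counts(n_intervals, interval_len, alpha=1.2, rng=rng)
print("first 20 counts:", counts[:20])
print("empty intervals:", counts.count(0), "of", n_intervals)
```

Long silent stretches followed by dense bursts show up as runs of zeros in `counts`, which is the burstiness that a Poisson count process with the same mean rate would not show.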

11 Explaining WWW Traffic Self-Similarity, M. Crovella, 1995
WWW traffic is self-similar
–but only when load is high (i.e. in busiest hours)
Authors force model 1 (ON/OFF model)
–The distribution of:
  transfer times (alpha = 1.21)
  user requests for documents (alpha = 1.06)
  document sizes available in the Web (alpha = 1.05)
  user think times (alpha = 1.5)
H increases as the load increases (same as in LAN)

12 On the Relationship between File Sizes, Transport Protocols, and Self-Similar Network Traffic, M. Crovella, 1996
Model 1: The success of this simple model is surprising given that it ignores nonlinearities arising in real networks
Hypothesis:
–Heavy tailed file size distributions together with TCP are responsible for LRD
  if UDP is used, there is little or no LRD
Explanation
–“In some sense, the effect of the unaccounted for nonlinearity is reflected back as a stretching in time effect, thus conforming to the model’s original suppositions”
Other interesting stuff: mix of Pareto and exp. background traffic

13 On the Propagation of LRD in the Internet, A. Veres, 2000, I
Not about roots, but about propagation of self-similarity by TCP
A(t) = C - B(t)
TCP is a linear system beyond a characteristic time scale
–if it adapts well to background traffic, it itself becomes self-similar
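
The relation A(t) = C - B(t) already explains why a perfectly adapting flow inherits the correlation structure of the background: subtracting from a constant flips the sign of every deviation from the mean, and the product of two flipped deviations is unchanged, so the autocorrelation is identical at every lag. The sketch below checks this numerically. The capacity value, the smoothed-noise background series and the function name are assumptions for the example; the background here is only short-range correlated and is not meant to be LRD.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


rng = random.Random(3)
capacity = 100.0  # hypothetical link capacity C

# Hypothetical background traffic B(t): exponentially smoothed uniform noise,
# which stays within [0, capacity] and is correlated over short lags.
background = []
level = 50.0
for _ in range(5000):
    level = 0.9 * level + 0.1 * rng.uniform(0.0, capacity)
    background.append(level)

# Idealized adaptive flow that uses exactly the leftover capacity.
adaptive = [capacity - b for b in background]

for lag in (1, 10, 50):
    print(lag,
          round(autocorrelation(background, lag), 4),
          round(autocorrelation(adaptive, lag), 4))
```

Real TCP only approximates this idealized behaviour, which is why the slide restricts the claim to time scales beyond a characteristic one.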

14 On the Propagation of LRD in the Internet, A. Veres, 2000, II
Experimental proof:
–NY-Budapest file transfer, source is not LRD - traffic is LRD (H=0.76)
–Max time scale = 8 min
Also, if there are a number of on-off TCP connections, they can spread LRD
W. Willinger obviously does not like this paper:
–“This is a fraud and has no relevance for LRD observed on link level...”
–“Protocols have no impact on LRD, they just have to send the data generated by applications...”

15 TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, I
Switch to Model 3 (Heavy-tailed inter-packet arrivals)
Although heavy-tailed flow lengths are commonly associated with heavy-tailed file sizes, there is no strong correlation between file sizes and transmission times
It has been shown that TCP can show heavy-tailed inter-arrival times under some conditions
Because most of the connections are short lived ( ! ) only slow start and exp. back-off were considered

16 TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, II
Simple Markov chain model for exp. backoff and slow start with the loss probability p as parameter
State probability with different loss rates
For alpha to be between 1 and 2, p has to be between 1/8 and 1/4
...but for different model p increases => H increases

17 TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, III
Pathological TCP connections: 15 packets
Analytical model not that good (borders are loose)
For this set-up, correlation up to 1000 sec
For larger file sizes, up to 200-300 sec
Under certain conditions, heavy tailed transmission times can occur even in the absence of any variability in file sizes
Future work: to consider the variability in round-trip time estimation

18 On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, I
Answer to previous two papers:
–TCP can create self-similarity but over a finite range of time scales - “pseudo self-similarity”
  but everything in nature is finite (thus “pseudo”)
–They also criticize the pathological model of the previous paper, but they themselves use a pathological model of a different kind (always packets model)
Separate Markovian models for Congestion Avoidance (CA) and Time Out (TO)
Simulated these two models with different loss probability parameters

19 On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, II
Range of time scales observed from the simulation (2^6*RTT*(2.5 to 10)) => 2^9*RTT
Explanation on why aggregate is self-similar
–independent bottlenecks (at the edge)
–aggregate of independent pseudo-self-similar flows should be self-similar itself (**)

20 On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, III
About Veres paper
–compute loss probability (0.08 to 0.14)
–TO model predicts H=0.69-0.72 (really measured 0.74)
–Time scale goes up to 2^6 RTO (also near measured value)
Experiments (file transfers)
–North-South America
  Measurements: p = 0.13, H = 0.77, ts = (2^7 to 2^8)*RTT
  TO model: p = 0.12, H = 0.72, ts = (2^7 to 2^9)*RTT
–East - West Coast
  Measurements: p = 0.018, H = 0.86, ts = 2^6*RTT
  CA model: p = 0.018, H = 0.75, ts = 2^4*RTT
One should be careful when attributing the origin of traffic characteristics to a specific cause

21 Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997, I
How a basic retransmission mechanism can cause self-similarity
No model, only experimental investigation
Simple single queue (bottleneck) model
Input traffic - Poisson; retransmissions are bursty
As time-scale gets larger, burstiness from original Poisson traffic decreases, but burstiness from retransmissions stays the same!
It is unlikely that traffic from the retransmission mechanism causes truly self-similar traffic, rather pseudo self-similarity

22 Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997, II
Pictorial “proof”

23 Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997, III
Cut-off time scales observed:
–150Mbps link rate, 500 bits packets, RTT 60 msec
  TS = 5 minutes
–10Mbps Ethernet, No. of retransmissions=5, To=125
  TS in range of minutes
–For larger To, it is possible to reach time scales measured at Bellcore
–I have computed cut-off time-scale for Veres paper
  128 Kbps, Tout=10*RTT=2 sec, TS=8min
If this effect is found to be as strong in more complex models, this could be a significant cause

24 The Second-order Characteristics of TCP, J.Y.Boudec, 1996, I
Pseudo self-similarity (TS=20-30 sec)
–Minimum bottleneck bandwidth 34Mbps (?)
Two main reasons (both heavy-tailed)
–Burst length arrivals
–Round trip time
Real network measurements
Figure - missing

25 The Second-order Characteristics of TCP, J.Y.Boudec, 1996, II
Even for 34Mbps link and utilization of 25%, the arrival bursts are eliminated and the inter packet times are dependent on the round trip times
The aggregate of TCP connections has the same H as a single TCP connection (***)
“It seems likely that the heavy tailed distributions observed in Willinger’s work were a result of, among other things, the heavy tailed distribution of a round trip time”

26 More on RTTs
Why are round trip times heavy-tailed?
–Because of TCP congestion control?
–Because of retransmissions?
–Because of variety of destinations?
It can be heavy-tailed even without any congestion protocol or different destinations!
–Measurement and Analysis of LRD Behavior of Internet Packet Delay, M. Borella, Infocom 97
  Constant UDP transmissions - LRD response
Is cross-traffic heavy-tailed? Or multiple bottlenecks assumption?
–Simple example (not through bandwidth adaptation, but through RTT adaptation)

27 Summary
Heavy-tailed parameters
–File sizes
–Connection life-times
–Inter-arrival packet times
–Document sizes available in the web
–User think times
–TELNET packet arrivals
–Round trip times
Pseudo self-similarity
–it should be clear that the range of time scales covered is far beyond dominant time scales, and as long as packet loss is concerned, this is relevant

28 Conclusions
One should be careful when attributing the origin of traffic characteristics to a specific cause
There is more than one physical activity causing LRD
Protocol (TCP) influence is more than relevant
–Time scales covered are relevant in the generation, time-stretching and propagation hypotheses
Model 3 (inter-arrival times i.i.d. Pareto) plus heavy-tailed file sizes (introducing congestion) is promising
Analytical proof for aggregate is missing (simulation proof reported in 3 papers)
Round-trip times hypothesis might be promising - supports Veres idea in a slightly different way

