CMPT 855, Module 7.0: Network Traffic Self-Similarity. Carey Williamson, Department of Computer Science, University of Saskatchewan.


1 Network Traffic Self-Similarity: Carey Williamson, Department of Computer Science, University of Saskatchewan

2 Introduction
- A recent measurement study has shown that aggregate Ethernet LAN traffic is self-similar [Leland et al. 1993]
- Self-similarity is a statistical property that is very different from the traditional Poisson-based models
- This presentation: definition of network traffic self-similarity, Bellcore Ethernet LAN data, implications of self-similarity

3 Measurement Methodology
- Collected lengthy traces of traffic on Ethernet LAN(s) at Bellcore
- High-resolution time stamps
- Analyzed statistical properties of the resulting time series data
- Each observation represents the number of packets (or bytes) observed per time interval (e.g., 10 4 8 12 7 2 0 5 17 9 8 8 2 ...)
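
The step from time-stamped packets to a count-per-interval time series can be sketched as follows. This is a minimal illustration, not the Bellcore tooling: the function name `counts_per_interval`, the integer microsecond timestamps, and the sample arrival times are all assumptions made up for the example.

```python
from collections import Counter


def counts_per_interval(timestamps_us, interval_us, start_us=0):
    """Count packets in consecutive fixed-width intervals.

    timestamps_us: packet arrival times in integer microseconds (assumed unit).
    interval_us:   interval width in microseconds.
    Returns one count per interval, including empty intervals.
    """
    if not timestamps_us:
        return []
    bins = Counter((t - start_us) // interval_us for t in timestamps_us)
    return [bins.get(i, 0) for i in range(max(bins) + 1)]


# Hypothetical arrival times (made up), binned into 10 msec intervals.
arrivals_us = [1000, 4000, 12000, 13000, 19000, 35000, 36000]
series = counts_per_interval(arrivals_us, interval_us=10000)
print(series)
```

Re-running the same function with a larger `interval_us` gives the coarser time scales that the following slides compare.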

4 Self-Similarity: The Intuition
- If you plot the number of packets observed per time interval as a function of time, then the plot looks "the same" regardless of what interval size you choose
- E.g., 10 msec, 100 msec, 1 sec, 10 sec, ...
- The same applies if you plot the number of bytes observed per interval of time

5 Self-Similarity: The Intuition
- In other words, self-similarity implies a "fractal-like" behaviour: no matter what time scale you use to examine the data, you see similar patterns
- Implications:
  - Burstiness exists across many time scales
  - No natural length of a burst
  - Traffic does not necessarily get "smoother" when you aggregate it (unlike Poisson traffic)

6 Self-Similarity: The Mathematics
- Self-similarity is a rigorous statistical property (i.e., a lot more to it than just the pretty "fractal-like" pictures)
- Assumes you have time series data with finite mean and variance (i.e., a covariance stationary stochastic process)
- Must be a very long time series (infinite is best!)
- Can test for the presence of self-similarity

7 Self-Similarity: The Mathematics
- Self-similarity manifests itself in several equivalent fashions:
  - Slowly decaying variance
  - Long range dependence
  - Non-degenerate autocorrelations
  - Hurst effect

8 Slowly Decaying Variance
- The variance of the sample mean decreases more slowly than the reciprocal of the sample size
- For most processes, the variance of a sample diminishes quite rapidly as the sample size is increased, and soon stabilizes
- For self-similar processes, the variance decreases very slowly, even when the sample size grows quite large

9 Variance-Time Plot
- The "variance-time plot" is one means to test for the slowly decaying variance property
- Plots the variance of the sample versus the sample size, on a log-log plot
- For most processes, the result is a straight line with slope -1
- For self-similar processes, the line is much flatter (slope between -1 and 0)

10-17 Variance-Time Plot (figure): log-log plot of the variance of the sample (vertical axis, 0.0001 to 100.0) against the sample size m (horizontal axis, 1 to 10^7). Slope = -1 for most processes; slope flatter than -1 for a self-similar process.
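
The variance-time test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper names `aggregate` and `variance_time_slope` are invented for the example, the input is synthetic independent data rather than real traffic, and the slope is fitted by ordinary least squares on the log-log points.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance_time_slope(series, block_sizes):
    """Least-squares slope of log10(variance) against log10(m)."""
    xs = [math.log10(m) for m in block_sizes]
    ys = [math.log10(statistics.pvariance(aggregate(series, m)))
          for m in block_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic independent observations (not real traffic data).
random.seed(1)
iid = [random.expovariate(1.0) for _ in range(20000)]
slope = variance_time_slope(iid, [1, 2, 4, 8, 16, 32, 64])
print(f"variance-time slope: {slope:.2f}")
```

For independent data like this, the fitted slope should land near -1. If the slope of a measured trace is written as -β, the Hurst parameter is commonly estimated as H = 1 - β/2, so a flatter line corresponds to a larger H.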

18 Long Range Dependence
- Correlation is a statistical measure of the relationship, if any, between two random variables
- Positive correlation: both behave similarly
- Negative correlation: behave as opposites
- No correlation: behaviour of one is unrelated to behaviour of the other

19 Long Range Dependence (Cont'd)
- Autocorrelation is a statistical measure of the relationship, if any, between a random variable and itself, at different time lags
- Positive correlation: a big observation is usually followed by another big one, or a small one by a small one
- Negative correlation: a big observation is usually followed by a small one, or a small one by a big one
- No correlation: observations are unrelated

20 Long Range Dependence (Cont'd)
- The autocorrelation coefficient can range between +1 (very high positive correlation) and -1 (very high negative correlation)
- Zero means no correlation
- The autocorrelation function shows the value of the autocorrelation coefficient for different time lags k
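
As a rough sketch of how the autocorrelation coefficient at lag k can be computed from a finite series, the following uses the usual sample estimator (lag-k covariance divided by the lag-0 sum of squares). The function name `autocorrelation` is an assumption for illustration; the input series is the made-up example series used elsewhere in these slides.

```python
def autocorrelation(series, k):
    """Sample autocorrelation coefficient of the series at lag k."""
    n = len(series)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(series)")
    mean = sum(series) / n
    total = sum((x - mean) ** 2 for x in series)
    if total == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    cov = sum((series[i] - mean) * (series[i + k] - mean)
              for i in range(n - k))
    return cov / total


# Made-up example series from the slides.
x = [2, 7, 4, 12, 5, 0, 8, 2, 8, 4, 6, 9, 11, 3, 3, 5, 7, 2, 9, 1]
for lag in range(4):
    print(lag, round(autocorrelation(x, lag), 3))
```

Evaluating the function for lags k = 0, 1, 2, ... gives the points of the autocorrelation function; lag 0 always yields +1.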

21-27 Autocorrelation Function (figure): autocorrelation coefficient (vertical axis) against lag k (horizontal axis, 0 to 100). Annotations: +1 is the maximum possible positive correlation; -1 is the maximum possible negative correlation; 0 is no observed correlation at all. The example curve shows significant positive correlation at short lags and no statistically significant correlation beyond a certain lag.

28 Long Range Dependence (Cont'd)
- For most processes (e.g., Poisson, or compound Poisson), the autocorrelation function drops to zero very quickly (usually immediately, or exponentially fast)
- For self-similar processes, the autocorrelation function drops very slowly (i.e., hyperbolically) toward zero, but may never reach zero
- Non-summable autocorrelation function
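
The contrast between the two decay rates can be written out as follows. This is a sketch using the standard asymptotic forms; the constants c and ρ and the exponent β are notation introduced here and do not appear on the slides.

```latex
% Short-range dependence: exponential (geometric) decay, summable
r(k) \sim \rho^{k}, \quad 0 < \rho < 1
\quad\Longrightarrow\quad \sum_{k=1}^{\infty} r(k) < \infty

% Long-range dependence: hyperbolic decay, non-summable
r(k) \sim c\,k^{-\beta} \ \text{as } k \to \infty, \quad 0 < \beta < 1
\quad\Longrightarrow\quad \sum_{k=1}^{\infty} r(k) = \infty
```

The second sum diverges because a power law with exponent smaller than 1 decays too slowly for its tail to add up to a finite value.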

29-33 Autocorrelation Function (figure): autocorrelation coefficient against lag k (0 to 100), comparing a typical short-range dependent process with a typical long-range dependent process.

34 Non-Degenerate Autocorrelations
- For self-similar processes, the autocorrelation function for the aggregated process is indistinguishable from that of the original process
- If the autocorrelation coefficients match for all lags k, the process is called exactly self-similar
- If the autocorrelation coefficients match only for large lags k, the process is called asymptotically self-similar

35-38 Autocorrelation Function (figure): autocorrelation coefficient against lag k (0 to 100) for the original self-similar process and the aggregated self-similar process.

39 Aggregation
- Aggregation of a time series X(t) means smoothing the time series by averaging the observations over non-overlapping blocks of size m to get a new time series $X^{(m)}(t)$

40-55 Aggregation: An Example
- Suppose the original time series X(t) contains the following (made up) values: 2 7 4 12 5 0 8 2 8 4 6 9 11 3 3 5 7 2 9 1 ...
- Then the aggregated series for m = 2 is: 4.5 8.0 2.5 5.0 6.0 7.5 7.0 4.0 4.5 5.0 ...
- Then the aggregated series for m = 5 is: 6.0 4.4 6.4 4.8 ...
- Then the aggregated series for m = 10 is: 5.2 5.6 ...
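
The aggregation example can be reproduced with a short function. This is a minimal sketch; the name `aggregate` is chosen for illustration, and any trailing observations that do not fill a complete block are dropped (an assumption, since the slides only show complete blocks).

```python
def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m.

    Observations left over after the last complete block are ignored.
    """
    if m < 1:
        raise ValueError("block size m must be at least 1")
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


# The made-up series from the slides.
x = [2, 7, 4, 12, 5, 0, 8, 2, 8, 4, 6, 9, 11, 3, 3, 5, 7, 2, 9, 1]
for m in (2, 5, 10):
    print(m, aggregate(x, m))
```

For m = 2, 5 and 10 the printed values match the aggregated series listed on the slides.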

56 Autocorrelation Function (figure): autocorrelation coefficient against lag k (0 to 100) for the original self-similar process and the aggregated self-similar process.

57 The Hurst Effect
- For almost all naturally occurring time series, the rescaled adjusted range statistic (also called the R/S statistic) for sample size n obeys the relationship $E[R(n)/S(n)] = c\,n^{H}$, where:
  - $R(n) = \max(0, W_1, \ldots, W_n) - \min(0, W_1, \ldots, W_n)$
  - $S^2(n)$ is the sample variance, and
  - $W_k = \sum_{i=1}^{k} X_i - k\,\bar{X}_n$ for $k = 1, 2, \ldots, n$

58 The Hurst Effect (Cont'd)
- For models with only short range dependence, H is almost always 0.5
- For self-similar processes, 0.5 < H < 1.0
- This discrepancy is called the Hurst Effect, and H is called the Hurst parameter
- Single parameter to characterize self-similar processes

59-73 R/S Statistic: An Example
- Suppose the original time series X(t) contains the following (made up) values: 2 7 4 12 5 0 8 2 8 4 6 9 11 3 3 5 7 2 9 1
- There are 20 data points in this example
- For R/S analysis with n = 1, you get 20 samples, each of size 1:
  - Block 1: $\bar{X}_n = 2$, $W_1 = 0$, R(n) = 0, S(n) = 0
  - Block 2: $\bar{X}_n = 7$, $W_1 = 0$, R(n) = 0, S(n) = 0
- For R/S analysis with n = 2, you get 10 samples, each of size 2:
  - Block 1: $\bar{X}_n = 4.5$, $W_1 = -2.5$, $W_2 = 0$, R(n) = 0 - (-2.5) = 2.5, S(n) = 2.5, R(n)/S(n) = 1.0
  - Block 2: $\bar{X}_n = 8.0$, $W_1 = -4.0$, $W_2 = 0$, R(n) = 0 - (-4.0) = 4.0, S(n) = 4.0, R(n)/S(n) = 1.0
- For R/S analysis with n = 3, you get 6 samples, each of size 3 (intermediate values shown rounded):
  - Block 1: $\bar{X}_n = 4.3$, $W_1 = -2.3$, $W_2 = 0.3$, $W_3 = 0$, R(n) = 0.3 - (-2.3) = 2.6, S(n) = 2.05, R(n)/S(n) = 1.30
  - Block 2: $\bar{X}_n = 5.7$, $W_1 = 6.3$, $W_2 = 5.7$, $W_3 = 0$, R(n) = 6.3 - (0) = 6.3, S(n) = 4.92, R(n)/S(n) = 1.28
- For R/S analysis with n = 5, you get 4 samples, each of size 5:
  - Block 1: $\bar{X}_n = 6.0$, $W_1 = -4.0$, $W_2 = -3.0$, $W_3 = -5.0$, $W_4 = 1.0$, $W_5 = 0$, S(n) = 3.41, R(n) = 1.0 - (-5.0) = 6.0, R(n)/S(n) = 1.76
  - Block 2: $\bar{X}_n = 4.4$, $W_1 = -4.4$, $W_2 = -0.8$, $W_3 = -3.2$, $W_4 = 0.4$, $W_5 = 0$, S(n) = 3.2, R(n) = 0.4 - (-4.4) = 4.8, R(n)/S(n) = 1.5
- For R/S analysis with n = 10, you get 2 samples, each of size 10
- For R/S analysis with n = 20, you get 1 sample of size 20
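
The block-by-block arithmetic above can be checked with a short script. This is a sketch under the definitions used in the worked example: S(n) is taken as the square root of the variance with divisor n, which is what reproduces the slide values. The function names `rs_statistic` and `rs_per_block` are illustrative, and returning `None` when S(n) = 0 (as happens for n = 1) is a choice made here, not something the slides specify.

```python
import math


def rs_statistic(block):
    """Return R(n)/S(n) for one block, or None when S(n) is zero."""
    n = len(block)
    mean = sum(block) / n
    # W_k = (X_1 + ... + X_k) - k * mean, for k = 1..n
    w, running = [], 0.0
    for k, value in enumerate(block, start=1):
        running += value
        w.append(running - k * mean)
    r = max(0.0, *w) - min(0.0, *w)
    s = math.sqrt(sum((value - mean) ** 2 for value in block) / n)
    return r / s if s > 0 else None


def rs_per_block(series, n):
    """R/S statistic for each complete non-overlapping block of size n."""
    return [rs_statistic(series[i:i + n])
            for i in range(0, len(series) - n + 1, n)]


# The made-up series from the slides.
x = [2, 7, 4, 12, 5, 0, 8, 2, 8, 4, 6, 9, 11, 3, 3, 5, 7, 2, 9, 1]
for n in (2, 3, 5, 10, 20):
    print(n, [round(v, 2) for v in rs_per_block(x, n)])
```

The same loop also produces the R/S values for n = 10 and n = 20, which the slides leave unevaluated.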

74 R/S Plot
- Another way of testing for self-similarity, and estimating the Hurst parameter
- Plot the R/S statistic for different values of n, with a log scale on each axis
- If the time series is self-similar, the resulting plot will have a straight line shape with a slope H that is greater than 0.5
- Called an R/S plot, or R/S pox diagram
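
Estimating H from the R/S plot amounts to fitting a straight line to the log-log points. The sketch below is a simplification made for illustration: instead of fitting through every point of a pox diagram, it averages the R/S values for each block size and fits a least-squares line through those averages. The function names and the synthetic Gaussian input are assumptions, not part of the slides.

```python
import math
import random


def rs_statistic(block):
    """Return R(n)/S(n) for one block, or None when S(n) is zero."""
    n = len(block)
    mean = sum(block) / n
    w, running = [], 0.0
    for k, value in enumerate(block, start=1):
        running += value
        w.append(running - k * mean)
    r = max(0.0, *w) - min(0.0, *w)
    s = math.sqrt(sum((value - mean) ** 2 for value in block) / n)
    return r / s if s > 0 else None


def estimate_hurst(series, block_sizes):
    """Slope of log10(mean R/S) against log10(n) over the given block sizes."""
    xs, ys = [], []
    for n in block_sizes:
        blocks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        values = [v for v in map(rs_statistic, blocks) if v is not None]
        if values:
            xs.append(math.log10(n))
            ys.append(math.log10(sum(values) / len(values)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    den = sum((a - mean_x) ** 2 for a in xs)
    return num / den


# Synthetic independent observations (not real traffic data).
random.seed(7)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
h = estimate_hurst(noise, [8, 16, 32, 64, 128, 256, 512])
print(f"estimated H: {h:.2f}")
```

For independent data the estimate should fall in the neighbourhood of 0.5, though R/S estimates on short series tend to be biased somewhat upward; a self-similar trace would give a clearly larger slope.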

75-84 R/S Pox Diagram (figure): log-log plot of the R/S statistic R(n)/S(n) (vertical axis) against the block size n (horizontal axis). Reference lines of slope 0.5 and slope 1.0 are shown; the self-similar process follows a line of slope H (0.5 < H < 1.0), the Hurst parameter.

85 Summary
- Self-similarity is an important mathematical property that has recently been identified as present in network traffic measurements
- Important property: burstiness across many time scales; traffic does not smooth out when aggregated
- There exist several mathematical methods to test for the presence of self-similarity, and to estimate the Hurst parameter H
- There exist models for self-similar traffic

