Understanding and Predicting Host Load Peter A. Dinda Carnegie Mellon University

1 Understanding and Predicting Host Load Peter A. Dinda Carnegie Mellon University http://www.cs.cmu.edu/~pdinda

2 Talk in a Nutshell
– Load is self-similar
– Load exhibits epochal behavior
– Load prediction benefits from capturing self-similarity
Basis: statistical analysis of two sets of week-long, 1 Hz resolution traces of load on ~40 machines, and an evaluation of linear time series models for load prediction

3 Why Study Load?
– Load partially determines execution time
– We want to model and predict load
[Diagram labels: Interactive Application; Short tasks with deadlines; Unmodified Distributed System; [t_min, t_max]?]

4 Load and Execution Time

5 Outline
– Measurement methodology
– Load traces
– Load variance
– New results
  –Self-similarity
  –Epochal behavior
– Benefits of capturing self-similarity in linear models
– Conclusions

6 Measurement Methodology
[Diagram: in the Digital Unix kernel, the length of the ready queue is sampled every T = 2 seconds (len_t, len_{t-T}, len_{t-2T}, ..., len_{t-29T}, len_{t-30T}, ...) and fed to an exponential average, the 1-minute load "average" (avg_t, avg_{t-0.5T}, avg_{t-T}, ...). A user-level measurement tool samples that average at 1 Hz; these samples are our measurements.]
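The kernel-side averaging on this slide can be sketched as follows. This is a minimal illustration, not the Digital Unix code: only T = 2 seconds and the 1-minute window come from the slide. The decay factor exp(-T / 60 s) is an assumption borrowed from the form many Unix kernels use, and the function name is hypothetical.

```python
import math

def exponential_load_average(queue_lengths, sample_period_s=2.0, window_s=60.0):
    """Exponentially average ready-queue lengths sampled every sample_period_s.

    Assumption: the decay factor is exp(-T / window). The slide only states
    T = 2 seconds and a 1-minute average, not the exact kernel formula.
    """
    decay = math.exp(-sample_period_s / window_s)
    avg = 0.0
    averages = []
    for length in queue_lengths:
        # Old average fades by `decay`; the new sample fills the remainder.
        avg = avg * decay + length * (1.0 - decay)
        averages.append(avg)
    return averages

# One always-runnable process for 10 minutes (300 samples at T = 2 s).
trace = exponential_load_average([1] * 300)
print(round(trace[29], 3), round(trace[-1], 3))
```

Under these assumptions the average reaches 1 - 1/e (about 0.632) after one minute and approaches 1.0 only after several minutes, which shows how much smoothing the reported load already contains before any analysis is applied.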

7 Load Traces

8 Absolute Variation

9 Relative Variation

10 Load Autocorrelation
[Figure panels: autocorrelation (x-axis: time lag) and periodogram (x-axis: frequency)]
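For readers without the figure, the two quantities plotted on this slide can be computed as in the sketch below. It is a plain-Python illustration under stated assumptions: the estimators are the textbook biased sample autocorrelation and the squared-DFT periodogram, the function names are hypothetical, and a synthetic sinusoid stands in for a real load trace.

```python
import cmath
import math

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def periodogram(x):
    """Squared DFT magnitude / n at frequencies k/n cycles per sample, k = 0..n//2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    out = []
    for k in range(n // 2 + 1):
        s = sum(dev[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        out.append(abs(s) ** 2 / n)
    return out

# Synthetic stand-in for a load trace: a sinusoid with a period of 8 samples.
signal = [1.0 + math.sin(2 * math.pi * t / 8) for t in range(64)]
acf = autocorrelation(signal, 8)
pgram = periodogram(signal)
peak = max(range(len(pgram)), key=pgram.__getitem__)  # index of the dominant frequency
```

The direct DFT here is quadratic in the trace length, which is fine for a short example; a week of 1 Hz samples would call for an FFT instead.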

11 Visual Self-Similarity

12 The Hurst Parameter

13 Self-similarity Statistics
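The transcript does not say how the Hurst parameter was estimated for these statistics. As an illustration only, the sketch below uses one common estimator, the aggregated-variance (variance-time) method: for a self-similar series the variance of m-sample block means scales as m^(2H - 2). The function name, block sizes, and white-noise test input are all assumptions made for the example.

```python
import math
import random

def hurst_aggregated_variance(x, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log Var(block means) versus log block size.

    Var(block mean) ~ m**(2H - 2), so H = 1 + slope / 2.
    """
    points = []
    for m in block_sizes:
        nblocks = len(x) // m
        means = [sum(x[i * m:(i + 1) * m]) / m for i in range(nblocks)]
        mu = sum(means) / nblocks
        var = sum((v - mu) ** 2 for v in means) / (nblocks - 1)
        points.append((math.log(m), math.log(var)))
    # Ordinary least-squares slope through the log-log points.
    mx = sum(px for px, _ in points) / len(points)
    my = sum(py for _, py in points) / len(points)
    slope = (sum((px - mx) * (py - my) for px, py in points)
             / sum((px - mx) ** 2 for px, _ in points))
    return 1.0 + slope / 2.0

# Independent Gaussian noise has no long-range dependence, so H should be near 0.5.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h = hurst_aggregated_variance(white_noise)
```

Values of H well above 0.5 indicate long-range dependence; applying the same function to a load trace instead of `white_noise` gives a rough, single-estimator reading that should be cross-checked with other methods.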

14 Why is Self-Similarity Important?
– Complex structure
  –Not completely random, nor independent
  –Short-range dependence: excellent for history-based prediction
  –Long-range dependence: possibly a problem
– Modeling implications
  –Suggests models that can capture it: ARFIMA, FGN, TAR

15 Load Exhibits Epochal Behavior

16 Epoch Length Statistics

17 Why is Epochal Behavior Important?
– Complex structure
  –Non-stationary
– Modeling implications
  –Suggests models: ARIMA, ARFIMA, etc.; non-parametric spectral methods
  –Suggests problem decomposition

18 Linear Time Series Models
– Choose weights $\psi_j$ to minimize $\sigma_a^2$
– $\sigma_a$ determines the confidence interval for t+1 predictions
[Diagram: Unpredictable Random Sequence -> Fixed Linear Filter -> Partially Predictable Load Sequence]
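To make the "fixed linear filter" idea concrete, here is a minimal sketch of the simplest member of this model family, AR(p), fitted from sample autocovariances with the Levinson-Durbin recursion. The talk does not state which fitting procedure was used, so treat this as one standard option rather than the author's method. All function names are hypothetical, and a synthetic AR(1) series stands in for a load trace.

```python
import math
import random

def autocovariances(x, max_lag):
    """Return the sample mean and biased autocovariances r[0..max_lag]."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return mean, [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
                  for k in range(max_lag + 1)]

def fit_ar(x, p):
    """Fit AR(p) by solving the Yule-Walker equations with Levinson-Durbin.

    Returns (mean, coefficients, sigma_a, sigma_z): sigma_a is the estimated
    standard deviation of the one-step prediction error, sigma_z that of the
    raw series.
    """
    mean, r = autocovariances(x, p)
    phi, err = [], r[0]
    for k in range(1, p + 1):
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / err
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
    return mean, phi, math.sqrt(err), math.sqrt(r[0])

def predict_next(x, mean, phi):
    """One-step-ahead (t+1) prediction from the last len(phi) samples."""
    return mean + sum(c * (x[-1 - j] - mean) for j, c in enumerate(phi))

# Synthetic stand-in for a load trace: AR(1) with coefficient 0.8, unit-variance noise.
rng = random.Random(1)
series = [0.0]
for _ in range(5000):
    series.append(0.8 * series[-1] + rng.gauss(0.0, 1.0))

mean, phi, sigma_a, sigma_z = fit_ar(series, 2)
forecast = predict_next(series, mean, phi)
```

The gap between `sigma_z` and `sigma_a` is the payoff of the model: the smaller the prediction-error deviation relative to the raw deviation, the tighter the confidence interval around each t+1 forecast.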

19 Realizable Pole-Zero Models
– ARFIMA(p,d,q): self-similarity, d related to the Hurst parameter
– ARIMA(p,d,q): non-stationarity, d integer
– ARMA(p,q)
– AR(p), MA(q)
p and q are the numbers of parameters; d is the degree of differencing
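What separates ARFIMA from ARIMA is that d may be fractional. The sketch below shows the differencing operator (1 - B)^d expanded into filter weights, using the standard recurrence w_0 = 1, w_k = w_{k-1} (k - 1 - d) / k. The slide only says d is "related to Hurst"; the usual textbook relation is d = H - 1/2, which is not stated in the transcript. Function names and the truncation length are assumptions for the example.

```python
def fractional_difference_weights(d, count):
    """First `count` coefficients of (1 - B)**d, where B is the backshift operator."""
    weights = [1.0]
    for k in range(1, count):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def fractional_difference(x, d, count=100):
    """Apply the truncated (1 - B)**d filter to a sequence x."""
    w = fractional_difference_weights(d, count)
    return [sum(w[k] * x[t - k] for k in range(min(t + 1, count)))
            for t in range(len(x))]

# Integer d = 1 collapses to ordinary first differencing (the ARIMA case).
w_one = fractional_difference_weights(1, 4)
# Fractional d = 0.5 gives slowly decaying weights (the ARFIMA case).
w_half = fractional_difference_weights(0.5, 3)
diffed = fractional_difference([1, 3, 6, 10], 1)
```

With integer d the weights vanish after d + 1 terms, while with fractional d they decay slowly and never reach zero, which is how the model carries long-range dependence.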

20 Real World Benefits of Models
$\sigma_a$ determines the confidence interval for t+1 predictions
Map work that would take 100 ms at zero load
axp0: $\sigma_z$ = 0.54, $\mu$ = 1.0, $\sigma_a$(ARMA(4,4)) = 0.109, $\sigma_a$(ARFIMA(4,d,4)) = 0.108
  no model: 1.0 +/- 1.06 (95%) => 100 to 306 ms
  ARMA:     1.0 +/- 0.22 (95%) => 178 to 222 ms
  ARFIMA:   1.0 +/- 0.21 (95%) => 179 to 221 ms
  ARFIMA vs. ARMA: 1 %
axp7: $\sigma_z$ = 0.14, $\mu$ = 0.12, $\sigma_a$(ARMA(4,4)) = 0.041, $\sigma_a$(ARFIMA(4,d,4)) = 0.025
  no model: 0.12 +/- 0.27 (95%) => 100 to 139 ms
  ARMA:     0.12 +/- 0.08 (95%) => 104 to 120 ms
  ARFIMA:   0.12 +/- 0.05 (95%) => 107 to 117 ms
  ARFIMA vs. ARMA: 40 %
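The millisecond ranges on this slide can be reproduced with the short sketch below. The mapping is an inference, not something the transcript states: it assumes execution time equals the zero-load time multiplied by (1 + load), with the lower load bound clamped at zero. That assumption matches every row of the slide; the function name is hypothetical.

```python
def execution_time_range_ms(zero_load_ms, predicted_load, half_width):
    """Map a load confidence interval to an execution-time interval.

    Assumption: time = zero_load_ms * (1 + load), and load cannot be negative.
    """
    low_load = max(0.0, predicted_load - half_width)
    high_load = predicted_load + half_width
    return zero_load_ms * (1.0 + low_load), zero_load_ms * (1.0 + high_load)

# Rows from the slide: (host, model, predicted load, 95% half-width).
rows = [
    ("axp0", "no model", 1.0, 1.06),
    ("axp0", "ARMA", 1.0, 0.22),
    ("axp0", "ARFIMA", 1.0, 0.21),
    ("axp7", "no model", 0.12, 0.27),
    ("axp7", "ARMA", 0.12, 0.08),
    ("axp7", "ARFIMA", 0.12, 0.05),
]
ranges = {}
for host, model, load, half_width in rows:
    low, high = execution_time_range_ms(100.0, load, half_width)
    ranges[(host, model)] = (round(low), round(high))
    print(f"{host} {model}: {round(low)} to {round(high)} ms")
```

Seen this way, a tighter load interval translates directly into a tighter deadline estimate for a short task, which is the practical benefit the slide is arguing for.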

21 t+1 prediction

22 t+8 prediction

23 Conclusions
– Load has high variance
– Load is self-similar
– Load exhibits epochal behavior
– Capturing self-similarity in linear time series models improves predictability

24 Load Traces
– Would a web-accessible load trace database be useful?
– Would you like to contribute?

