Published by Rosemary Croson. Modified over 2 years ago.

1
Tuan V. Dinh, Lachlan Andrew and Yoni Nazarathy
Modelling a supercomputer with the model
Australia and New Zealand Applied Probability Workshop

2
http://caia.swin.edu.au/cv/tdinh
10 July 2013, Australia and New Zealand Applied Probability Workshop
Supercomputer clusters
Large-scale simulation: climate, genome, astronomy, etc.
Foundation of cloud computing.
Big data and exascale computing: more computing power desired.
Electricity bills.
Heat: thermal management.
Investment: cooling systems, hardware, etc.

3
Power proportionality
[Figure: power versus load for a single server, ideal versus reality; in reality the server draws about 60% of peak power at zero load. (1)]
An idle server draws ~60% of peak power, so turn off idle servers.
Challenges: switching cost (setup, wear and tear) and performance impacts.
Swinburne Supercomputer.
(1) Barroso, "The case for energy proportional", 2007.

4
An energy saving framework
Control framework: system congestion model.
Inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).

5
Congestion model
Control framework inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).

6
Congestion model
Servers 1, 2, 3, …
Arrivals: batch Poisson with a rate function; batch size distribution given by its c.d.f.; i.i.d. service times.
Why?
Jobs arrive in a "batch" manner, i.e. within seconds, from the same user.
The system is mostly under-utilized, so an infinite-server approximation is used.
There are substantial daily variations.
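The congestion model above can be explored with a small simulation. The following is only a sketch under stated assumptions: the function name `simulate_occupancy`, the thinning method used to generate time-varying Poisson batches, and all samplers passed in are hypothetical choices for illustration, not taken from the slides.

```python
import random

def simulate_occupancy(rate_fn, rate_max, batch_sampler, service_sampler,
                       horizon, slot, rng):
    """Return the number of jobs in the system at the end of each slot.

    Assumptions (not from the slides):
    - batch arrival epochs come from thinning a homogeneous Poisson
      process of rate rate_max, which requires rate_fn(t) <= rate_max;
    - there are unlimited servers, so every job starts service on arrival.
    """
    jobs = []  # (arrival_time, departure_time) for every job
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)          # candidate batch epoch
        if t >= horizon:
            break
        if rng.random() < rate_fn(t) / rate_max:  # keep with prob rate/rate_max
            for _ in range(batch_sampler(rng)):   # all jobs of a batch arrive together
                jobs.append((t, t + service_sampler(rng)))
    n_slots = int(horizon // slot)
    return [sum(1 for a, d in jobs if a <= (k + 1) * slot < d)
            for k in range(n_slots)]
```

A caller would supply its own rate function, batch-size sampler and service-time sampler, for example a daily periodic rate and samplers fitted to logs.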

7
Discrete-time cost
[Timeline: t is the current time, with the currently running jobs; t+k is a future slot; the horizon ends at T+t.]
Load at t+k = {jobs arriving in (t, t+k], still around at t+k} + {jobs arriving before t, still around at t+k}.
C(k) = C_1(k) + C_2(k) + C_3(k), where
C_1(k): energy, the term in n(k);
C_2(k): switching, the term in |n(k) − n(k−1)|;
C_3(k): performance penalty.
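To make the three cost terms concrete, here is a minimal sketch of a per-slot cost. The weights `energy_w`, `switch_w` and `penalty_w` are hypothetical, and the form of the performance penalty (a charge on the shortfall of servers against load) is an assumption, since the slide text does not preserve it.

```python
def slot_cost(n_now, n_prev, load, energy_w=1.0, switch_w=1.0, penalty_w=1.0):
    """Cost of one slot: energy + switching + performance penalty."""
    energy = energy_w * n_now                      # C_1(k): proportional to n(k)
    switching = switch_w * abs(n_now - n_prev)     # C_2(k): |n(k) - n(k-1)|
    # C_3(k): ASSUMED shortfall penalty, charged only when load exceeds servers
    penalty = penalty_w * max(load - n_now, 0)
    return energy + switching + penalty
```

For example, with unit weights, running 10 servers after 8 in the previous slot against a load of 12 costs 10 for energy, 2 for switching and 2 for the shortfall.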

8
Optimization formulation
C(k) = C_1(k) + C_2(k) + C_3(k), with C_1(k): energy (term in n(k)), C_2(k): switching (term in |n(k) − n(k−1)|), C_3(k): performance penalty.
(*) is the optimization problem built on C(k).
Solving (*) requires load estimation in the far future.
The system can feed back the ACTUAL load U(s) for s < k.

9
A Model Predictive Control framework
Control framework: MPC.
Inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).

10
Model Predictive Control execution
At time t (window from t to T+t): solve (**), obtain {n*(0), n*(1), …}. ONLY "execute" n*(0).
At time t+1 (window from t+1 to T+t+1): solve (**), obtain {n*(0), n*(1), …}. ONLY "execute" n*(0).
(**) Limited look-ahead:
1. Less sensitive to load estimation accuracy.
2. Uses "on-going" information: we know how many jobs actually arrived in (t, t+1].
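The receding-horizon execution described on this slide can be sketched as a loop. This is an illustrative skeleton only: `run_mpc`, `solve_horizon` and `observe_load` are hypothetical names, and the solver for (**) is left as a pluggable function because its exact form is not preserved in the slide text.

```python
def run_mpc(solve_horizon, observe_load, n_initial, steps, lookahead):
    """Receding-horizon loop: plan over the look-ahead, execute the first move.

    solve_horizon(t, n_prev, history, lookahead) must return a plan
    [n*(0), n*(1), ...]; observe_load(t) returns the actual load seen in
    slot t. Both are supplied by the caller (assumed interfaces).
    """
    n_prev = n_initial
    history = []      # actual loads observed so far, fed back to the solver
    executed = []     # the n*(0) values actually applied
    for t in range(steps):
        plan = solve_horizon(t, n_prev, history, lookahead)
        n_now = plan[0]                  # ONLY the first decision is executed
        executed.append(n_now)
        history.append(observe_load(t))  # on-going information for the next solve
        n_prev = n_now
    return executed
```

The rest of each plan is discarded and recomputed at the next slot, which is what makes the scheme less sensitive to estimation errors far in the future.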

11
Solving the optimization problem
C(k) = C_1(k) + C_2(k) + C_3(k), with C_1(k): energy (term in n(k)), C_2(k): switching (term in |n(k) − n(k−1)|), C_3(k): performance penalty, for k = 0, 1, …, K−1.
Normal approximation gives (***): minimise the sum of { n(k) + |u(k)| } subject to constraints for k = 0, 1, …, K−1, with |u(k)| in place of the switching term.
(***) is solved numerically using LP.
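The constraint in (***) is not preserved in the slide text. One plausible reading, stated here purely as an assumption, is that the normal approximation turns a probabilistic requirement "load does not exceed n(k) with probability at least ε" into a lower bound on n(k). The sketch below computes such a bound; the function name `required_servers` and this interpretation of ε are hypothetical.

```python
import math
from statistics import NormalDist

def required_servers(mean_load, var_load, eps):
    """Smallest integer n with P(load <= n) >= eps under a normal approximation.

    ASSUMPTION: load ~ Normal(mean_load, var_load) and eps in (0, 1) is the
    target probability of having enough servers.
    """
    z = NormalDist().inv_cdf(eps)               # standard normal quantile
    bound = mean_load + z * math.sqrt(var_load)
    return max(0, math.ceil(bound))             # a server count cannot be negative
```

With bounds of this kind the problem stays linear: the absolute value |u(k)| can be handled in the usual LP way by writing u(k) as the difference of two non-negative variables and minimising their sum.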

12
X(k): new arrivals
X(k) = {jobs arriving in (t, t+k], still around at t+k}.
[Carrillo,89]: X(k) is a compound Poisson RV, the sum of N batch sizes b_i, where N ~ Poisson( ) and the b_i are i.i.d. batch sizes with given mean and variance.
The batch rate is evaluated at s = (k+1/2)Δ, where Δ is the slot time.
Even if the arrival process is NOT Poisson, see [Whitt,99].
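Since the approximation only needs the first two moments, the mean and variance of a compound Poisson sum are worth writing out. For N ~ Poisson(λ) and i.i.d. batch sizes with mean m and variance v, the standard identities are E[X] = λm and Var[X] = λ(v + m²). The function name and argument names below are hypothetical, and λ is passed in directly because its expression is not preserved in the slide text.

```python
def compound_poisson_moments(poisson_mean, batch_mean, batch_var):
    """Mean and variance of X = b_1 + ... + b_N with N ~ Poisson(poisson_mean).

    Uses E[X] = lambda * m and Var[X] = lambda * (v + m^2), i.e. lambda times
    the second moment of the batch size.
    """
    mean = poisson_mean * batch_mean
    var = poisson_mean * (batch_var + batch_mean ** 2)
    return mean, var
```

These two numbers are what a normal approximation of X(k) would take as its parameters.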

13
U(k): existing jobs
U(k) = {jobs arriving before t, still around at t+k}.
[Carrillo,91]: U(k) is a binomial RV, with its two parameters evaluated at s = (k+1/2)Δ, where Δ is the slot time.
One can use job elapsed runtimes to calculate it [Whitt,99].
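One way to use job elapsed runtimes, offered here as an assumption rather than as the authors' exact method, is to give each running job its own conditional survival probability and sum independent Bernoulli variables. When all jobs share the same probability p this reduces to the binomial case, with mean np and variance np(1−p). The names `remaining_job_moments` and `survival` are hypothetical.

```python
def remaining_job_moments(elapsed_times, survival, s):
    """Mean and variance of the number of current jobs still running after s.

    elapsed_times: how long each currently running job has already run.
    survival(x): P(service time > x), supplied by the caller (ASSUMED known).
    A job with elapsed time a survives a further s with probability
    survival(a + s) / survival(a).
    """
    probs = []
    for a in elapsed_times:
        alive_now = survival(a)
        probs.append(survival(a + s) / alive_now if alive_now > 0 else 0.0)
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)   # independent Bernoulli terms
    return mean, var
```

With exponential service times the conditional probability is the same for every job, so the elapsed times drop out; heavier-tailed service times are where the elapsed runtimes carry real information.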

14
Summary of analytical framework
Control framework: MPC, LP optimization, normal approximation.
Inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).

15
Numerical evaluation
[Diagram: Swinburne supercomputer logs drive a supercomputer simulator. The simulator passes system states to the CONTROLLER, which returns the control decision. The simulator outputs the cost performance.]

16
Scheme 1: All up (no turn off)
[Diagram: Swinburne supercomputer logs drive the supercomputer simulator, which outputs the cost performance. The controller is replaced by NO CONTROL.]

17
Scheme 2: t_wait heuristic
[Diagram: Swinburne supercomputer logs drive the supercomputer simulator, which outputs the cost performance. The controller is the t_wait heuristic.]
Server idle for t_wait => turn OFF.
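The t_wait rule is simple enough to sketch directly. The version below works in discrete slots and assumes, for illustration only, that jobs always occupy the lowest-numbered servers and that a switched-off server comes back instantly when needed; neither assumption comes from the slides, and `t_wait_heuristic` is a hypothetical name.

```python
def t_wait_heuristic(busy_per_slot, n_servers, t_wait):
    """Number of powered-on servers per slot under the t_wait rule.

    busy_per_slot[k]: number of busy servers in slot k.
    A server is turned OFF once it has been idle for t_wait consecutive slots
    and is turned back on as soon as it is needed (ASSUMED zero setup time).
    """
    idle = [0] * n_servers       # consecutive idle slots per server
    on = [True] * n_servers      # all servers start powered on
    active = []
    for busy in busy_per_slot:
        for i in range(n_servers):
            if i < busy:         # ASSUMED packing: lowest indices serve jobs
                idle[i] = 0
                on[i] = True
            else:
                idle[i] += 1
                if idle[i] >= t_wait:
                    on[i] = False
        active.append(sum(on))
    return active
```

For instance, with 4 servers, t_wait of 2 slots and busy counts 3, 1, 1, 1, 3, the powered-on counts are 4, 3, 1, 1, 3.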

18
Scheme 3: predictive control
[Diagram: Swinburne supercomputer logs drive the supercomputer simulator, which outputs the cost performance. The controller is MPC, with model inputs estimated from historical data.]

19
S.3: rate function
[Figure: arrival rate versus time of day, 2010 and 2011.]
Use daily periodic rates.
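A daily periodic rate can be estimated from logged arrival times by folding every arrival onto its time of day and averaging over the number of days observed. The sketch below assumes a simple histogram estimator; the function name `daily_rate_profile`, the bin count and the time units are hypothetical choices, not details from the slides.

```python
def daily_rate_profile(arrival_times, n_days, bins_per_day=24, day_length=24.0):
    """Estimate arrivals per unit time for each time-of-day bin.

    arrival_times: arrival epochs in the same unit as day_length.
    n_days: number of days covered by the log.
    """
    bin_width = day_length / bins_per_day
    counts = [0] * bins_per_day
    for t in arrival_times:
        time_of_day = t % day_length                   # fold onto one day
        counts[int(time_of_day // bin_width)] += 1
    # average count per day in each bin, divided by the bin width
    return [c / (n_days * bin_width) for c in counts]
```

Whether arrival epochs are those of jobs or of batches depends on how the log is pre-processed; for a batch Poisson model the batch epochs would be the natural input.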

20
S.3: service time & batch size
Models in the literature: [Lublin et al.,2003]: Hyper-Gamma, Log-uniform; [Li et al.,2005]: Log Normal, Weibull.
[Figures: c.d.f. of service time G (time in sec) and c.d.f. of batch size X (size in CPUs), empirical (2010) versus Gamma.]
Our approximations only concern the MEAN and VARIANCE of X.
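Because only the mean and variance matter for the approximations, a Gamma distribution can be matched to data through those two moments. This is a sketch of the method of moments, given as an assumption; the slides do not say how the Gamma curve was fitted, and `gamma_from_moments` is a hypothetical name.

```python
from statistics import mean, variance

def gamma_from_moments(m, v):
    """Gamma (shape, scale) with mean m and variance v.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so shape = m^2 / v and scale = v / m.
    """
    shape = m * m / v
    scale = v / m
    return shape, scale

def gamma_from_sample(data):
    """Match a Gamma distribution to the sample mean and sample variance."""
    return gamma_from_moments(mean(data), variance(data))
```

For example, a mean of 6 and a variance of 12 give shape 3 and scale 2.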

21
S.3: cost performance
[Figure: normalised cost versus ε.]
ε ~ service availability.
Cost_1 = total cost when there is NO CONTROL (energy only).
Simulation period: 1 year.

22
Cost performance: all schemes
[Figure: cost of S.1, S.2 and S.3 (ε = 0.58), compared with the "offline" optimal cost [Lu et al., 12], which has no performance penalty.]
Consider predictive settings (S.3) whose demand penalty cost is the same as that of the t_wait heuristic (S.2).
After all, the model is there to estimate the θ(k)s.
Still > 20% to gain.

23
Remarks and considerations
1. Room for improvement: ~20% to gain!
2. Examining our estimations: rate function not accurate? Use job elapsed times. Normal approximation?
3. Fundamental bound on what can be achieved given uncertainty? [Dinh, Andrew and Branch, CCGrid13]

24
Thank you
Control framework: MPC, LP optimization, normal approximation.
Inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).

25
The objective cost
Control framework inputs: historical implications, ongoing system states, arrival characteristics, job elapsed times.
Output: number of active servers needed.
Objective: min (energy + performance penalty + switching).
