
1 Power Control for Data Centers
Ming Chen
Oct. 8th, 2009
ECE 692 Topic Presentation

2 Why Power Control in Data Centers?
Power is one of the most important computing resources.
- Facility over-utilized
  - Dangerous: system failures and overheating.
  - Power must be kept below the facility capacity.
- Facility under-utilized
  - Power facilities are expensive; the investment should be amortized economically.
  - Provision servers so that the power facility is fully utilized.

3 SHIP: Scalable Hierarchical Power Control for Large-Scale Data Centers
Xiaorui Wang, Ming Chen (University of Tennessee, Knoxville, TN)
Charles Lefurgy, Tom W. Keller (IBM Research, Austin, TX)

4 Introduction
- Power overload may cause system failures.
  - Power provisioning CANNOT guarantee that overload never occurs.
  - Over-provisioning leads to unnecessary expense.
- Data centers are expanding to meet new business requirements.
  - Expanding the power facility is cost-prohibitive.
  - Upgrades of power/cooling systems lag far behind.
  - Example: the NSA data center.
Power control for an entire data center is therefore necessary.

5 Challenges
- Scalability: can one centralized controller manage thousands of servers?
- Coordination: if multiple controllers are designed, how do they interact with each other?
- Stability and accuracy: workload is time-varying and unpredictable.
- Performance: how should power budgets be allocated among different servers, racks, etc.?

6 State of the Art
- Reduce power by improving energy efficiency: [Lefurgy], [Nathuji], [Zeng], [Lu], [Brooks], [Horvath], [Chen]
  - Does NOT enforce a power budget.
- Power control for a single server [Lefurgy], [Skadron], [Minerick] or a single rack [Wang], [Ranganathan], [Femal]
  - Cannot be directly applied to data centers.
- "No 'Power' Struggles" presents a multi-level power manager [Raghavendra]
  - NOT designed around the power supply hierarchy.
  - NO rigorous overall stability analysis.
  - Only simulation results, for 180 servers.

7 What is This Paper About?
SHIP: a highly Scalable Hierarchical Power control architecture for large-scale data centers
- Scalability: decompose the power control of a data center into three levels.
- Coordination: the hierarchy follows the power distribution system of the data center.
- Stability and accuracy: theoretically guaranteed by Model Predictive Control (MPC) theory.
- Performance: differentiate power budgets based on performance demands, i.e., utilization.

8 Power Distribution Hierarchy
A simplified example of a three-level data center:
- Data center level
- PDU level
- Rack level
Thousands of servers in total.

9 Control Architecture
[Architecture diagram: each server has a utilization monitor (UM) and frequency modulator (FM); each rack has a power monitor and a Rack Power Controller (RPC); each PDU has a PDU-level power monitor and a PDU Power Controller. The rack-level loop is from the authors' HPCA'08 paper; the PDU and data center levels are contributed by this paper.]

                       Rack level                      PDU level                    Data center level
Controlled variable    total power of the rack         total power of the PDU       total power of the data center
Manipulated variable   CPU frequency of each server    power budget of each rack    power budget of each PDU

10 PDU-level Power Model
- System model: pp(k+1) = pp(k) + sum_i db_i(k), where pp(k) is the total power of the PDU and db_i(k) is the change of power budget for rack i.
- Uncertainties: the actual power change of rack i is g_i * db_i(k), where g_i is the power change ratio.
- Actual model: pp(k+1) = pp(k) + sum_i g_i * db_i(k).
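
As a small numerical illustration of this model (my own sketch, not the authors' code; the rack count, budgets, and gains are made up):

```python
import numpy as np

# Minimal sketch of the PDU-level power model:
# pp(k+1) = pp(k) + sum_i g_i * db_i(k), where db_i is the budget change of rack i
# and g_i is the (unknown) power change ratio of rack i.

def pdu_power_step(pp_k, budget_changes, gains):
    """Advance the total PDU power by one control period."""
    return pp_k + np.dot(gains, budget_changes)

# Example: 3 racks, PDU currently at 9 kW, each rack budget raised by 0.5 kW.
pp = 9.0
db = np.array([0.5, 0.5, 0.5])   # budget changes (kW)
g = np.array([1.0, 0.8, 1.2])    # actual power change ratios (assumed to be 1 at design time)
print(pdu_power_step(pp, db, g))  # -> 10.5 kW
```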

11 Model Predictive Control (MPC)
Design steps:
- Design a dynamic model of the controlled system.
- Design the controller.
- Analyze stability and accuracy.
Control objective: drive the total power of the PDU to its power budget by adjusting the rack budgets.

12 MPC Controller Design
[Controller block diagram: the measured power and the power budget define a reference trajectory (the ideal trajectory for tracking the budget); a cost function combines the tracking error with a control penalty; a least-squares solver minimizes the cost subject to constraints derived from the system model and outputs the budget changes.]
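
A one-step sketch of the underlying idea (my own simplification; the paper's controller is a multi-step constrained MPC): trade off the budget-tracking error against a penalty on budget changes, weighting the penalty so that highly utilized racks receive larger budget increases.

```python
import numpy as np

# One-step, unconstrained sketch of the budget-allocation idea (illustrative only).
# Choose rack budget changes db so that the predicted PDU power pp + sum(db)
# tracks the PDU budget, while penalizing large budget changes (control penalty).

def allocate_budget_changes(pp, pdu_budget, n_racks, weights, penalty=0.1):
    """Weighted least-squares trade-off between tracking error and control effort."""
    A = np.vstack([np.ones((1, n_racks)),                       # tracking term: sum of budget changes
                   np.sqrt(penalty) * np.diag(1.0 / weights)])  # penalize moves, less for busy racks
    b = np.concatenate([[pdu_budget - pp], np.zeros(n_racks)])
    db, *_ = np.linalg.lstsq(A, b, rcond=None)
    return db

# Example: PDU at 9 kW, budget 10.5 kW, utilizations used as optimization weights.
print(allocate_budget_changes(9.0, 10.5, 3, np.array([1.0, 0.8, 0.5])))
```

Racks with higher utilization (larger weights) receive larger budget increases, which matches the utilization-based differentiation shown later on the testbed slides.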

13 Stability
- Local stability
  - g_i is assumed to be 1 at design time, but is unknown a priori.
  - The system remains stable for 0 < g_i < 14.8, i.e., the actual power change may be up to 14.8 times the allocated budget change.
- Global stability
  - Decouple controllers at different levels by running them on different time scales.
  - The period of the upper-level control loop must be longer than the settling time of the lower-level loop.
  - This condition is sufficient but not necessary.
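
A worked check of this condition using the settings reported on the implementation and results slides (my arithmetic, not a figure from the paper): the rack controller runs every 5 s and the rack power settles within about 4 control periods, i.e. roughly 20 s, so the 30 s PDU control period is longer than the rack loop's settling time and the decoupling condition is satisfied.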

14 System Implementation
- Physical testbed
  - 10 Linux servers
  - Power meter (Wattsup); sampling period: 1 s
  - Workloads: HPL, SPEC
  - Controller periods: 5 s for the rack level, 30 s for the PDU level
- Simulator (C++)
  - Simulates large-scale data centers at all three levels.
  - Utilization trace files from 5,415 servers in real data centers.
  - The power model is based on experiments on real servers.

15 Precise Power Control (Testbed)
- Power can be precisely controlled at the budget.
- The budget is reached within 4 control periods.
- The power of each rack is controlled at its own budget.
- Budgets are allocated in proportion to the estimated maximum power consumptions.
- Tested for many power set points (see the paper for more results).

16 Power Differentiation (Testbed)
- Capability to differentiate budgets based on workload, to improve performance.
- The controller takes CPU utilization as the optimization weights.
- Other possible differentiation metrics: response time, throughput.
[Figure: three servers running at 100%, 80%, and 50% CPU utilization; budgets are first allocated in proportion to estimated maximum consumption, then differentiated by utilization.]

17 Simulation for Large-Scale Data Centers
- 6 PDUs, 270 racks
- 750 kW
- Real data traces
- 3 randomly generated data center configurations

18 Budget Differentiation for PDUs
Power differentiation in large-scale data centers:
- Minimize the difference from the estimated maximum power consumption.
- Utilization is used as the weight.
- The ordering of the differences is consistent with the ordering of utilizations (e.g., PDU5 vs. PDU2 in the figure).

19 Scalability of SHIP
[Figure: execution time of the MPC controller vs. the number of servers, comparing the overhead of SHIP with the maximum scale of a centralized controller.]

                         Centralized   SHIP
Levels                   One level     Multiple levels
Computation overhead     Large         Small
Communication overhead   Long          Short
Scalable?                NO            YES

20 Conclusion
- SHIP: a highly Scalable HIerarchical Power control architecture for large-scale data centers
  - Three levels: rack, PDU, and data center
  - MIMO controllers based on optimal control theory (MPC)
  - Theoretically guaranteed stability and accuracy
  - Discussion of coordination among controllers
- Experiments on a physical testbed and a simulator
  - Precise power control
  - Budget differentiation
  - Scalable to large-scale data centers

21 Power Provisioning for a Warehouse-sized Computer
Xiaobo Fan, Wolf-Dietrich Weber, Luiz Andre Barroso
Acknowledgments: the organization and contents of some slides are based on Xiaobo Fan's slides (PDF).

22 Introduction
- Strong economic incentives to fully utilize power facilities:
  - The investment is best amortized.
  - Upgrades are possible without any new power facility investment.
  - Power facilities cost $10-$20 per watt and are used for roughly 10-18 years, while electricity costs less than $0.8 per watt-year.
- But over-subscription runs the risk of outages or costly SLA violations.
- Problem: power provisioning given the budget.
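
Putting the slide's numbers together (my arithmetic, not a figure from the paper): amortizing $10-$20 per watt of facility cost over 10-18 years comes to roughly $0.6-$2 per watt-year, which is comparable to or larger than the <$0.8 per watt-year spent on electricity, so every watt of provisioned-but-unused capacity is expensive.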

23 Reasons for Facility Under-utilization
- Staged deployment: new facilities are rarely fully populated.
- Fragmentation
- Conservative machine power ratings (nameplate values)
- Statistical effects: the larger the machine population, the lower the probability of simultaneous peaks.
- Variable load

24 What is This Paper About?
- Investigate the potential of over-subscription to increase power facility utilization.
  - A lightweight and accurate model for estimating power.
  - A long-term characterization of the simultaneous power usage of a large number of machines.
- Study techniques for saving energy as well as peak power.
  - Power capping (physical testbed)
  - DVS (simulation)
  - Reducing idle power (simulation)

25 Data Center Power Distribution
[Power distribution diagram: main supply, transformer, ATS, and a backup generator feed the switchboard (~1,000 kW); UPS and STS units feed the PDUs (~200 kW); PDUs feed panels (~50 kW) and rack circuits (~2.5 kW).]
- Rack level: 40-80 servers
- PDU level: 20-40 racks
- Data center level: 5-10 PDUs

26 Power Estimation Model
- Direct power measurements are not always available.
- A model is built for each family of machines; the main interest is the power of a group of machines.
- Input: CPU utilization u.
- Models:
  - P_idle + (P_busy - P_idle) * u
  - P_idle + (P_busy - P_idle) * (2u - u^r)
  - Measure and derive the model parameters.
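
A minimal sketch of the two utilization-based models listed above (the 160 W / 250 W numbers and the exponent r = 1.4 are illustrative placeholders, not values from the paper; P_idle, P_busy, and r are machine-specific fit parameters):

```python
def power_linear(u, p_idle, p_busy):
    """Linear model: P = P_idle + (P_busy - P_idle) * u, with u in [0, 1]."""
    return p_idle + (p_busy - p_idle) * u

def power_empirical(u, p_idle, p_busy, r):
    """Empirical model: P = P_idle + (P_busy - P_idle) * (2u - u**r)."""
    return p_idle + (p_busy - p_idle) * (2 * u - u ** r)

# Example (hypothetical server idling at 160 W and peaking at 250 W).
for u in (0.0, 0.5, 1.0):
    print(u, power_linear(u, 160, 250), power_empirical(u, 160, 250, r=1.4))
```

Both models agree at u = 0 and u = 1 by construction; they differ only in how power rises at intermediate utilizations.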

27 Model Validation
- PDU-level validation example (800 machines).
- Almost constant offset between measured and modeled power, due to loads not accounted for in the model (e.g., networking equipment).
- The relative error is below 1%.

28 Analysis Setup
- Data center setup
  - More than 5,000 servers picked for each workload.
  - Rack: 40 machines; PDU: 800 machines; cluster: 5,000+ machines.
- Monitoring period: 6 months, sampled every 10 minutes.
- Distribution of power usage
  - Aggregate power at each time interval at different levels.
  - Normalized to the aggregated peak power.

Workload           Description
Websearch          Online serving; activity correlates with time of day; computation-intensive.
Webmail            Disk I/O intensive.
Mapreduce          Offline batch jobs; less correlation between activity and time of day.
Real data center   Machines picked at random from real data centers.
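
A minimal sketch (my own illustration with synthetic data, not the authors' analysis code) of the aggregation step: sum per-server power over a rack- or PDU-sized group at each interval and normalize by that group's aggregated peak (the sum of the individual machines' peaks), which is what makes the dynamic range narrow as more machines are aggregated.

```python
import numpy as np

# power: array of shape (num_servers, num_intervals), one sample per 10-minute interval.

def normalized_group_power(power, group_size):
    """Sum power over fixed-size server groups and normalize by each group's aggregated peak."""
    n = (power.shape[0] // group_size) * group_size
    groups = power[:n].reshape(-1, group_size, power.shape[1]).sum(axis=1)
    peak = power[:n].max(axis=1).reshape(-1, group_size).sum(axis=1, keepdims=True)
    return groups / peak  # fraction of aggregated peak, per group and interval

# Example with synthetic data: 800 "servers", 10 days of 10-minute samples.
rng = np.random.default_rng(0)
power = 160 + 90 * rng.random((800, 10 * 144))
rack = normalized_group_power(power, 40)    # rack level: 40 machines per group
pdu = normalized_group_power(power, 800)    # PDU level: 800 machines
print(rack.min(), rack.max(), pdu.min(), pdu.max())  # PDU range is visibly narrower
```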

29 Webmail
[Figure: CDFs of normalized power at rack, PDU, and cluster levels; annotated values: 65%, 72%, 86%, 88%, 92% of aggregate peak.]
- The higher the level, the narrower the range: it is more difficult to improve facility utilization at lower levels.
- The peak lowers as more machines are aggregated: 16% more machines could be deployed.

30 Websearch
[Figure: CDFs of normalized power at rack, PDU, and cluster levels; annotated values: 45%, 52%, 93%, 98% of aggregate peak.]
- The peak lowers as more machines are aggregated: 7% more machines could be deployed.
- The higher the level, the narrower the range: it is more difficult to improve facility utilization at lower levels.

31 Real Data Centers
- Clusters have a much narrower dynamic range than racks.
- Cluster power peaks at 72% of aggregate peak, so 39% more machines could be deployed.
- Mapreduce shows similar results.

32 Summary of Characterization

Workload           Avg power   Power range   Machine increase
Websearch          68%         52%-93%       7%
Webmail            78%         72%-86%       16%
Mapreduce          70%         54%-90%       11%
Real data center   60%         51%-72%       39%

- Average power: how well the power facility is utilized.
- Dynamic range: how difficult it is to improve facility utilization.
- Peak power: the potential for deployment over-subscription.

33 Power Capping
[Figure: two CDF sketches of power over time, illustrating the small fraction of time spent in power capping and the resulting saving in peak power.]
- Only a small fraction of time is spent in power capping.
- Substantial saving in peak power.
- Capping also provides a safety valve when workload behavior is unexpected.
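
A minimal sketch (illustrative, using a synthetic trace rather than the paper's data) of how the two quantities in the figure can be read off a power trace: the fraction of intervals above the cap, and the peak power saved by provisioning at the cap instead of the observed peak.

```python
import numpy as np

def capping_stats(power_trace, cap):
    """Return (fraction of intervals spent capped, peak power saving vs. the uncapped peak)."""
    time_capped = np.mean(power_trace > cap)
    peak_saving = (power_trace.max() - cap) / power_trace.max()
    return time_capped, peak_saving

# Example: synthetic trace normalized to aggregate peak, capped at 0.9.
rng = np.random.default_rng(1)
trace = np.clip(rng.normal(0.7, 0.08, 10_000), 0, 1)
print(capping_stats(trace, cap=0.9))  # well under 1% of time capped for roughly a 10% peak reduction
```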

34 Results for Power Capping
- Applicable to workloads with loose SLAs or low priority; Websearch and Webmail are therefore excluded.
- Capping can be enforced by de-scheduling tasks or by DVFS.

35 CPU Voltage/Frequency Scaling
- Motivation
  - A large portion of dynamic power is consumed by the CPU.
  - DVS is widely available in modern CPUs.
- Method
  - Oracle-style policy, evaluated in simulation.
  - Utilization thresholds: 5%, 20%, 50%.
  - CPU power is halved when DVS is triggered.
[Figure: CPU power vs. utilization, with the DVS threshold marked.]
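
A minimal sketch of the oracle-style DVS simulation; note that the trigger condition is an assumption on my part (the slide lists the thresholds but not the direction), taken here as CPU utilization falling below the threshold.

```python
def dvs_power(cpu_power, utilization, threshold):
    """Return CPU power for one interval after applying the oracle DVS policy.

    Assumption: DVS is triggered when utilization is below the threshold,
    and CPU power is halved while triggered.
    """
    return cpu_power * 0.5 if utilization < threshold else cpu_power

# Example: a 90 W CPU at 3% utilization with a 5% threshold is scaled down to 45 W.
print(dvs_power(90.0, 0.03, 0.05))
```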

36 Results for DVS
- Energy savings are larger than peak power reductions.
- The biggest savings are at the data center level.
- Benefits vary with the workload.

37 Lower Idle Power
- Motivation
  - Idle power is high (more than 50% of peak).
  - Most of the time is spent at non-peak activity levels.
- What if idle power were only 10% of peak, with peak power unchanged? (Evaluated in simulation.)
[Figure: CPU power vs. utilization, with idle power lowered from 0.6 of peak to 0.1 of peak.]
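
Under the linear power model sketched after the estimation-model slide (an assumption on my part; this slide does not spell out the formula), the scenario amounts to setting P_idle = 0.1 * P_busy while keeping P_busy fixed, i.e. P(u) = P_busy * (0.1 + 0.9 * u) instead of roughly P_busy * (0.6 + 0.4 * u), matching the 0.6 and 0.1 idle fractions marked on the figure.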

38 Conclusions
- Power provisioning is important for amortizing the facility investment.
- Load variation and statistical effects lead to facility under-utilization.
- Over-subscribed deployment is more attractive at the cluster level than at the rack level.
- Three simple strategies improve facility utilization: power capping, DVS, and lower idle power.

39 Comparison of the Two Papers

              SHIP                                   Power Provisioning
Target        Power capacity of data centers         Power capacity of data centers
Goal          Control power to the budget to         Give power provisioning guidelines
              avoid facility over-utilization        to avoid facility under-utilization
Methodology   MIMO optimal control                   Statistical analysis
Solutions     A complete control-based solution      Strategies suggested based on real data analysis
Experiments   Physical testbed and simulation        Detailed analysis of real trace files
              based on real trace files              and simulations

40 Critiques
- Paper 1 (SHIP)
  - The workloads are not typical of real data centers.
  - The power model might also incorporate CPU utilization.
  - No convincing baseline is compared against.
- Paper 2 (Power Provisioning)
  - The trade-off between power provisioning and performance violations is not examined.
  - The power model is workload-sensitive.
  - What is the estimation accuracy at the rack level?
  - Quantitative analysis of idle power and peak power reduction is missing.

41 Thank you!

