CoolAir: Temperature- and Variation-Aware Management for Free-Cooled Datacenters
Íñigo Goiri, Thu D. Nguyen, and Ricardo Bianchini


1 CoolAir: Temperature- and Variation-Aware Management for Free-Cooled Datacenters
  Íñigo Goiri, Thu D. Nguyen, and Ricardo Bianchini

2 Hybrid: typical + free cooling
  - Typical datacenter cooling (diagram): cooling tower, water chiller, air handling unit, server racks
  - Free cooling (diagram): outside air, filters, evaporative cooler, fans, server racks
  - Photo: Microsoft DC in Chicago

3 Free cooling limitations
  - Potentially negative impact on hardware reliability, especially disks:
    - High temperature
    - Wide temperature variation
    - High humidity
  - Figure (outside, inlet, and disk temperatures): outside temperature directly impacts inlet and disk temperatures
  - Daily temperature variation can be large

4 Roadmap
  - Motivation and background
  - CoolAir: Managing free-cooled datacenters
    - Cooling modeling
    - Cooling management
    - Compute management
  - CoolAir for Parasol
  - Evaluation and general lessons
  - Conclusions

5 CoolAir: Managing free-cooled datacenters
  - Energy-aware management of cooling and workload
  - Minimize hardware reliability issues:
    - Limit temperature and relative humidity
    - Reduce temperature variation
  - Major tasks:
    1. Predict conditions and energy
    2. Select best cooling settings
    3. Apply cooling settings
    4. Place and schedule load
  - Diagram: CoolAir (Cooling Modeler, Cooling Manager, Compute Manager) takes the weather forecast as input and manages the datacenter (cooling, servers)

6 Cooling modeling
  - Predictions based on a linear regression model
  - Diagram: Datacenter, Historic Data, Cooling Learner, Cooling Model
  - Quantities labeled in the diagram:
    - Temperature inside, humidity inside, cooling power
    - Temperature outside, location in the datacenter, datacenter utilization, cooling setting
    - Temperature inside/outside, humidity outside, cooling setting, cooling operation
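A minimal sketch of the kind of linear regression fit the slide describes, assuming the model maps outside temperature, datacenter utilization, and a cooling setting to inside temperature. The function names, the choice of three features, and the training data are hypothetical (the data is synthetic, generated from a made-up linear rule so the fit can be checked); the deck does not specify CoolAir's actual features or fitting procedure.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no zero pivots after pivoting).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_model(rows, targets):
    """Least-squares fit of target ~ w . row + bias via the normal equations."""
    xs = [list(r) + [1.0] for r in rows]  # constant column for the bias term
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Apply the fitted coefficients (weights followed by bias) to one row."""
    return sum(w * v for w, v in zip(coef, list(row) + [1.0]))


# Hypothetical history: (outside temp in °C, utilization 0-1, fan speed 0-1).
history = [(10, 0.2, 0.5), (20, 0.5, 0.3), (25, 0.9, 1.0),
           (15, 0.4, 0.8), (30, 0.7, 0.6)]
# Synthetic inside temperatures from a made-up rule, for illustration only.
inside = [0.8 * t + 5.0 * u - 3.0 * f + 6.0 for t, u, f in history]

coef = fit_linear_model(history, inside)
print(predict(coef, (22.0, 0.6, 0.4)))
```

In practice one such model would be fit per predicted quantity (inside temperature, inside humidity, cooling power), which is consistent with the three outputs named on the slide.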

7 Cooling management
  - Use predictions from the cooling model
  - Reduce variation with a temperature band based on the expected outside temperature
  - Maintain temperature within the band
    - Middle of the band: forecast outside temperature + offset
  - Periodically:
    - Predict environmentals and energy
    - Select best settings using a utility function
    - Apply cooling settings
  - Figure (band selection example): temperature vs. hour (06, 12, 18, 24), showing the outside temperature forecast, its average, and the offset
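The band selection and setting selection steps could be sketched as follows. This is an illustration under stated assumptions: the band is centered on the average forecast plus an offset (as the slide's figure suggests), while the band width, the allowed limits, the candidate settings, and the utility (least band violation first, then least power) are all hypothetical and not taken from the deck.

```python
def temperature_band(forecast_c, offset_c, width_c, min_c, max_c):
    """Return (low, high): a band centered on the mean forecast plus an offset.

    The band is shifted, not shrunk, so that it stays within [min_c, max_c].
    width_c, min_c, and max_c are hypothetical parameters.
    """
    middle = sum(forecast_c) / len(forecast_c) + offset_c
    low, high = middle - width_c / 2, middle + width_c / 2
    if low < min_c:
        low, high = min_c, min_c + width_c
    if high > max_c:
        low, high = max_c - width_c, max_c
    return low, high


def choose_setting(candidates, band):
    """Pick the setting with the smallest band violation, then the lowest power.

    candidates maps a setting name to (predicted inside temp in °C, power in W).
    This lexicographic utility is an assumption made for the sketch.
    """
    low, high = band

    def utility(item):
        temp, power = item[1]
        violation = max(low - temp, 0.0, temp - high)
        return (violation, power)

    return min(candidates.items(), key=utility)[0]


band = temperature_band([10, 14, 18, 14], offset_c=8, width_c=5,
                        min_c=10, max_c=30)
setting = choose_setting(
    {"fan_low": (26.0, 50), "fan_high": (23.0, 120), "ac": (21.0, 900)},
    band,
)
print(band, setting)  # (19.5, 24.5) fan_high
```

Because the band follows the forecast average rather than a fixed setpoint, the inside temperature is allowed to drift with the weather from day to day while staying within a narrow range within each day.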

8 Compute management
  - Spatial placement: distribute load to servers
    - Group servers into "pods" of similar behavior, which reduces solving and modeling complexity
    - Favor pods with higher heat recirculation
      - This goes against common practice in non-free-cooled datacenters
      - Lower-recirculation pods are closer to the cooling and so more exposed to temperature variation
  - Temporal scheduling: when to execute deferrable loads (see paper)
  - Figure: front view of Parasol's racks (Rack 1, Rack 2), marking sensors, servers, and pods
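The spatial placement rule above can be illustrated with a small greedy sketch: fill pods in order of decreasing heat recirculation. The pod names, recirculation values, capacities, and the function itself are hypothetical; the deck does not describe how CoolAir actually solves the placement.

```python
def place_load(pods, servers_needed):
    """Assign active servers to pods, highest heat recirculation first.

    pods is a list of (name, recirculation, capacity) tuples; the
    recirculation metric is a hypothetical per-pod score.
    Returns a dict mapping pod name to the number of active servers.
    """
    assignment = {}
    remaining = servers_needed
    for name, _recirc, capacity in sorted(pods, key=lambda p: p[1], reverse=True):
        if remaining <= 0:
            break
        used = min(capacity, remaining)
        assignment[name] = used
        remaining -= used
    return assignment


pods = [("pod_a", 0.10, 8), ("pod_b", 0.40, 8), ("pod_c", 0.25, 8)]
print(place_load(pods, 12))  # {'pod_b': 8, 'pod_c': 4}
```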

9 Roadmap
  - Motivation and background
  - CoolAir: Managing free-cooled datacenters
  - CoolAir for Parasol
  - Evaluation and general lessons
  - Conclusions

10 Case study: Parasol
  - Default cooling controller:
    - Outside temperature ≤ 30°C → free cooling with variable fan speed
    - Outside temperature > 30°C → AC cycling with hysteresis
  - Figures: external view; internal layout (top view) showing Rack 1, Rack 2, cold aisle, hot aisle, free cooling unit, air conditioner, exhaust, air duct, partition, door, cooling controller, and relays
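A minimal sketch of a reactive controller of the kind the slide describes. Only the 30°C switch between free cooling and AC comes from the slide; the inside setpoint, the hysteresis width, the minimum fan speed, and the linear fan ramp are hypothetical values chosen for illustration.

```python
def default_controller(outside_c, inside_c, ac_on,
                       setpoint_c=25.0, hysteresis_c=1.0):
    """Return (mode, ac_on, fan_speed) for one control step.

    setpoint_c, hysteresis_c, and the fan ramp are assumptions,
    not Parasol's actual parameters.
    """
    if outside_c <= 30.0:
        # Free cooling: fan speed grows linearly with how far the inside
        # temperature is above the setpoint, clamped to [0.15, 1.0].
        fan = min(1.0, max(0.15, (inside_c - setpoint_c) / 5.0))
        return "free_cooling", False, fan
    # AC with hysteresis: switch on above the upper threshold, off below
    # the lower threshold, and otherwise keep the previous state.
    if inside_c >= setpoint_c + hysteresis_c:
        ac_on = True
    elif inside_c <= setpoint_c - hysteresis_c:
        ac_on = False
    return "ac", ac_on, 0.0


print(default_controller(20.0, 27.5, False))  # ('free_cooling', False, 0.5)
print(default_controller(35.0, 26.5, False))  # ('ac', True, 0.0)
print(default_controller(35.0, 25.5, True))   # ('ac', True, 0.0)
```

The third call shows the hysteresis: at 25.5°C the AC stays in whatever state it was already in, which avoids rapid on/off switching around the setpoint.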

11 CoolAir for Parasol
  - Data collection and model learning
    - Historical sensor info for two months
    - Generated extreme settings to learn faster
    - Figure annotation: >90% with <0.5°C errors
  - Cooling configurer
    - Interface with Parasol's "thermostat"
    - Control fan speed and AC
  - Compute configurer for Hadoop
    - Send idle worker nodes to sleep while keeping data available
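One way to read "send idle worker nodes to sleep while keeping data available" is that a node may sleep only if every data block it holds still has a replica on an awake node. The greedy sketch below illustrates that constraint; the function, the node and block names, and the replica map are hypothetical, and the deck does not say how CoolAir's compute configurer actually makes this decision.

```python
def nodes_to_sleep(idle_nodes, replicas):
    """Greedily pick idle nodes that can sleep without losing data access.

    replicas maps a block id to the set of nodes holding a copy.
    A node is put to sleep only if each of its blocks keeps at least
    one replica on a node that is still awake.
    """
    asleep = set()
    for node in idle_nodes:
        safe = all(
            holders - asleep - {node}  # non-empty set means a replica stays awake
            for holders in replicas.values()
            if node in holders
        )
        if safe:
            asleep.add(node)
    return asleep


replicas = {
    "block_1": {"n1", "n2"},
    "block_2": {"n2", "n3"},
    "block_3": {"n1", "n3"},
}
print(nodes_to_sleep(["n1", "n2", "n3"], replicas))  # {'n1'}
```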

12 Example of CoolAir on Parasol

13 Evaluation methodology
  - Parasol as the baseline system
    - 64 Atom servers: 8 pods in 2 racks
  - Hadoop workloads: non-deferrable Facebook (see paper for others)
  - Real experiments and validated simulations
  - Evaluated policies (see paper for others):

  Policy    Temperature     Humidity  Energy  Spatial placement
  Baseline  Reactive <30°C  ✔         ✔       ✘
  CoolAir   Adaptive band   ✔         ✔       High recirculation

14 Baseline vs CoolAir
  - Figure annotations:
    - Warmer locations are more inefficient
    - Up to 4°C reduction
    - Up to 60% reduction

15 Multiple geographical locations
  - Figure panels: power efficiency (PUE) improvement; reduction in max temperature range
  - Improves PUE in warmer locations, where PUE is worse
  - Reduces variation the most in colder locations, where variation is highest
  - Sacrifices PUE slightly in cold locations (figure annotation: -0.02 to -0.01)

16 Principles and lessons learned
  - Variation management requires fine-grain cooling and load control
  - Management challenges depend on the climate
    - Warm: managing absolute temperature costs more than variation
    - Cold: managing temperature variation is more critical and successful
  - Temp band and spatial placement are key; temporal scheduling is not
  - Other lessons in the paper

17 Conclusions
  - CoolAir successfully manages:
    - Absolute temperature and temperature variation
    - Relative humidity
    - Energy
  - CoolAir broadens the set of areas where free cooling can be used
  - Principles should apply to larger datacenters


19 Motivation
  - Typical cooling in datacenters
    - Chillers, cooling tower, air handlers
    - Very energy hungry
  - Bring cool air from outside: free cooling
    - Reduces cooling energy
    - Typically used in cooler and drier climates
  - Warmer locations: hybrid
    - Free cooling when external temperature and humidity are suitable
  - Photo: Microsoft DC in Chicago

20 Validation of models and simulations
  - Figure annotation: ~80% with <0.5°C errors
  - Figure (real behavior vs. simulation): simulation close to real behavior

21 Example of CoolAir on Parasol

22 Principles and lessons learned (long version)
  - Absolute temperatures and variations are high in many locations
  - Variability management requires fine-grain cooling and load control
  - Management challenges depend on the climate
    - Warm: managing absolute temperature costs more than variation
    - Cold: managing temperature variation is more critical and successful
  - Temp band and spatial placement are key; temporal scheduling is not
  - Management is "easier" when allowed temperatures can be higher
  - Weather forecast inaccuracy is not a problem (temp band)
  - Energy cost of CoolAir is low even in hot climates

23 Why simulation?
  - Limitations of a real system → simulation
    - External conditions change → can't compare runs
    - Results for a whole year at multiple locations around the world
    - Average simulation errors < 6%
  - Parasol cooling changes are too abrupt → variable-speed AC

24 Absolute temperature violation
  - Figure annotations: no violations, small violations, large violations; CoolAir highlighted

25 Maximum daily temperature variation
  - Figure annotations: reduced variation; CoolAir highlighted
  - Note on slide: show only maximum

26 Power efficiency
  - Metric: PUE (Power Usage Effectiveness)
  - Warmer locations are more inefficient
  - Small increase over the Energy version
  - Figure annotations: low efficiency; CoolAir highlighted
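For reference, PUE is the ratio of total facility energy to the energy used by the IT equipment, so 1.0 is the ideal and higher values mean more overhead. The sketch below computes it; the function name and the example energy figures are hypothetical and are not results from the deck.

```python
def pue(it_kwh, cooling_kwh, other_kwh=0.0):
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return (it_kwh + cooling_kwh + other_kwh) / it_kwh


# Hypothetical example: 100 kWh of IT load with 8 kWh of cooling overhead.
print(pue(100.0, 8.0))  # 1.08
```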

27 Evaluated policies
  - Policies isolate the impact of CoolAir characteristics

  Policy       Temperature       Humidity  Energy  Spatial placement
  Baseline     Reactive <30°C    ✔         ✔       ✘
  Temperature  Predictive <30°C  ✔         ✔       Low recirculation
  Variation    Adaptive band     ✔         ✘       High recirculation
  Energy       Predictive <30°C  ✔         ✔       Low recirculation
  CoolAir      Adaptive band     ✔         ✔       High recirculation

