Matt Warner, Future Facilities: Proactive Airflow Management in Data Centre Operation - using CFD simulation to improve resilience, energy efficiency and utilisation
Design Intent (Chart: Utilisation, up to 100% of Design Capacity, against Time; time markers: Mid life, Expected end of life; curve: Design intent.)
Operational Reality… "The biggest challenge for 80%+ of Owner/Operators is obtaining the right balance between Space, Power and Cooling." Gartner (Chart: Utilisation against Time, ticks at 65% and 100%; time markers: Mid life, Expected end of life; curves: Design intent, Typical Operation; regions: Lost Lifespan, Stranded Capacity.)
The Physics of Cooling through the Data Centre Supply Chain: Chip Manufacturer, Device Manufacturer, IT Deployment, Data Centre Manager. Lack of communication between the equipment suppliers and the Data Centre industry causes inefficiency in operation.
Efficient Data Centre Management is all about Airflow Management: ACU to floor grilles; grilles to equipment inlets; equipment exhaust to ACU.
Typical changes in a Data Centre: Infrastructure, IT Equipment, Cabinets
Introducing the Virtual Facility (Infrastructure, IT Equipment, Cabinets). The Virtual Facility enables the data centre designer and operator to understand the consequence of any physical change before committing to it.
What is the Virtual Facility? The Virtual Facility is a full 3D mathematical representation of the data centre that simulates and visualises the physical impact of any change in the data centre.
Resilience at cabinet level
Data centre management often monitor temperatures and react to the problems.
The Virtual Facility illustrates 3 ways of achieving resilience:
Starting point: supply temperature 15°C, max inlet temperature 32°C
1. Restack: supply temperature 15°C, max inlet temperature 19°C (expensive IT operation)
2. Turn down ACU set points by 4°C: supply temperature 11°C, max inlet temperature 28°C (higher energy costs)
3. Block gap under cabinet: supply temperature 15°C, max inlet temperature 16°C (cheap and simple: brush or foam)
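The figures on this slide can be compared in a few lines. The following is a minimal Python sketch, not part of the Virtual Facility itself: the function name `summarise` and the data layout are illustrative, and the 30°C limit is the typical IT resilience figure quoted later in this presentation, used here as an assumed threshold.

```python
# Supply and worst-case inlet temperatures (in °C) as quoted on the slide.
SCENARIOS = {
    "baseline": (15.0, 32.0),
    "restack": (15.0, 19.0),
    "turn_down_set_points": (11.0, 28.0),
    "block_gap": (15.0, 16.0),
}

# Assumed threshold: the "typically resilient to 30°C" figure from a later slide.
INLET_LIMIT_C = 30.0


def summarise(scenarios, limit_c):
    """Return one summary row per scenario.

    rise_c      -- how far the worst inlet sits above the supply temperature
    headroom_c  -- margin between the worst inlet and the assumed limit
    """
    rows = []
    for name, (supply_c, max_inlet_c) in scenarios.items():
        rows.append({
            "name": name,
            "rise_c": max_inlet_c - supply_c,
            "headroom_c": limit_c - max_inlet_c,
            "within_limit": max_inlet_c <= limit_c,
        })
    return rows


if __name__ == "__main__":
    for row in summarise(SCENARIOS, INLET_LIMIT_C):
        print(row)
```

With the slide's figures, turning down the ACU set points leaves the 17°C rise between supply and worst inlet unchanged, whereas restacking and blocking the gap cut it to 4°C and 1°C.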
Resilience at cabinet level
Problem: Each IT device has its own airflow and heat characteristics. Each IT device has the potential to affect the resilience of every other device in the rack. Therefore the stacking of the IT Equipment determines resilience.
Symptom: Some rack configurations will cause IT Equipment to overheat.
Reaction: A typical reaction to overheating IT devices is to reduce the cooling set points in the area of the devices. This reduces the efficiency of ACUs and reduces cooling capacity.
Solution: A simpler and cheaper solution can usually be found using simulation to visualise internal cabinet problems and test potential fixes… but the ideal solution is to be proactive: simulate the deployment first, and avoid both the problems and the knock-on energy costs.
Resilience at room level – Deploying IT devices in a room
Scenario: Small room at 65% capacity; space needs to be allocated for two Sun SPARC Enterprise M5000 servers:
10U
3738W (nameplate)
4x power supplies
2x RJ45 LAN ports (+ 1 management port?)
283 l/s
After checking available Space, Power, Networking and rack inlet temperatures there are at least 3 options:
Option 1, Like for Like: Install one in each of the two racks that already have Sun M5000s installed.
Option 2, Mix server types: Install one in each of the server racks with the most available space.
Option 3: Install both in the empty rack in the row allocated to blades.
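The nameplate power and airflow quoted for the server give a rough feel for the load it places on a rack. The sketch below applies the standard sensible-heat balance, temperature rise = power / (air density x specific heat x volume flow). It takes the quoted figures as per-server values; the air density of 1.2 kg/m³ and specific heat of 1005 J/(kg·K) are assumed round values for room-temperature air, and the function name is illustrative. Nameplate power typically overstates real draw, so treat the result as an upper-bound estimate, not a measurement.

```python
# Assumed round values for air at room temperature (not from the presentation).
AIR_DENSITY_KG_M3 = 1.2
AIR_CP_J_PER_KG_K = 1005.0

# Server figures quoted on the slide, taken here as per-server values.
M5000_POWER_W = 3738.0      # nameplate
M5000_AIRFLOW_L_S = 283.0


def air_temperature_rise_k(power_w, airflow_l_per_s):
    """Inlet-to-exhaust air temperature rise from a sensible-heat balance."""
    if airflow_l_per_s <= 0:
        raise ValueError("airflow must be positive")
    airflow_m3_per_s = airflow_l_per_s / 1000.0
    mass_flow_kg_per_s = AIR_DENSITY_KG_M3 * airflow_m3_per_s
    return power_w / (mass_flow_kg_per_s * AIR_CP_J_PER_KG_K)


rise_k = air_temperature_rise_k(M5000_POWER_W, M5000_AIRFLOW_L_S)

# Extra load the two new servers add, wherever they are placed.
total_power_w = 2 * M5000_POWER_W
total_airflow_l_s = 2 * M5000_AIRFLOW_L_S

print(f"rise per server: {rise_k:.1f} K")
print(f"added load: {total_power_w:.0f} W, {total_airflow_l_s:.0f} l/s")
```

Under these assumptions each server heats its air by roughly 11 K at nameplate load, and the pair adds about 7.5 kW of heat and 566 l/s of airflow demand. Whether a given rack can absorb that without recirculation is the question the three options pose, and the one a CFD simulation is meant to answer.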
Resilience at room level – Laying out cabinets in a group. Same equipment (Servers, Switch, Storage), 2 layouts: one layout is resilient, another layout overheats. Different designs, different technologies, different power densities, different cooling requirements disrupt airflow and lead to hotspots…
Resilience at room level
Problem: Each IT device has its own airflow and heat characteristics. Each IT device has the potential to affect the resilience of every other device in the room. Therefore the physical configuration of the IT Equipment determines resilience.
Symptom: Some configurations of the IT Equipment will cause hotspots.
Reaction: A typical reaction to thermal hotspots is to reduce the cooling set points for the entire room. This reduces the efficiency of ACUs and increases energy costs of chiller units.
Solution: The ideal solution is to be proactive: simulate cabinet deployments first, and avoid both the problems and the knock-on energy costs.
Thermal Resilience and Efficiency. The purpose of a data centre is to provide space, power, cooling and networking for every IT device. The challenge is to provide these in the most energy efficient manner without giving up the required resilience. To reduce energy costs of cooling, air temperatures in the data halls must be raised. Hotspots at rack or room level prevent air temperatures from being raised (and in practice often force them below the design values). Data centres typically supply air at about 15°C. IT devices are typically resilient to 30°C. There are many hotspots caused by poor airflow management that are masked by low supply air temperatures. These must be fixed before air temperatures can be raised.
Maximising utilisation of the data centre. The original design assumes a hot aisle/cold aisle layout and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone. (Diagram labels: 80kW, 60kW, 80kW; Total 200kW. Chart: Utilisation, up to 100%, against Time.)
Maximising utilisation of the data centre. The original design assumes a hot aisle/cold aisle layout and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone. Option 1 is to locate the new cabinets at one end of the room, but after 55% power load hotspots will start to develop. (Diagram labels: 80kW, 80kW, 60kW, 10kW, 20kW, 110kW; Total 200kW. Chart: Utilisation against Time, ticks at 55% and 100%; regions: Lost Lifespan, 45% Stranded Capacity.)
Maximising utilisation of the data centre. The original design assumes a hot aisle/cold aisle layout and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone. Option 2 is to locate the new cabinets in the centre of the room, but after 75% power load hotspots will start to develop. (Diagram labels: 60kW, 35kW, 110kW, 80kW, 60kW; Total 200kW. Chart: Utilisation against Time, ticks at 75% and 100%; regions: Lost Lifespan, 25% Stranded Capacity.)
Utilisation and Stranded Capacity
Problem: Any IT deployed in a data centre will disrupt the airflow and cooling, even in empty zones of the room.
Symptom: The simple example we just looked at illustrates a common feature of data centres that is not well understood: if your data centre is running at 40% of design load, you do not have 60% capacity left! Thermal hotspots will occur before the data centre reaches capacity.
Reaction: Resilience concerns prevent further installation of IT devices. Confusion arises between Facilities, IT and Management. More data centres are built to gain more capacity.
Solution: Alternative configurations of IT load can be pre-tested using CFD simulation to evaluate whether they will result in stranded capacity in a data centre. Proactive airflow management is required to maximise the utilisation of a data centre.
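The arithmetic behind the example can be sketched as follows. This is an illustrative Python calculation only: the function names are hypothetical, and the 200kW design load and the 55% and 75% hotspot thresholds are the figures from the two layout options in the preceding slides. In practice those thresholds would come from a CFD simulation of the specific room, not from a formula.

```python
def usable_and_stranded_kw(design_kw, hotspot_onset_pct):
    """Split the design load into usable and stranded kW.

    hotspot_onset_pct is the load, as a percentage of design, beyond which
    hotspots start to develop for a given layout.
    """
    usable_kw = design_kw * hotspot_onset_pct / 100
    return usable_kw, design_kw - usable_kw


def real_headroom_kw(design_kw, current_load_pct, hotspot_onset_pct):
    """Load that can still be added before hotspots appear (never negative)."""
    return max(0.0, design_kw * (hotspot_onset_pct - current_load_pct) / 100)


DESIGN_KW = 200  # total design load from the example room

option_1 = usable_and_stranded_kw(DESIGN_KW, 55)  # cabinets at one end
option_2 = usable_and_stranded_kw(DESIGN_KW, 75)  # cabinets in the centre

# A room at 40% load: headroom implied by the design figure versus the
# headroom left if the room follows the Option 1 layout.
naive_headroom_kw = DESIGN_KW * (100 - 40) / 100
option_1_headroom_kw = real_headroom_kw(DESIGN_KW, 40, 55)

print("Option 1 (usable, stranded) kW:", option_1)
print("Option 2 (usable, stranded) kW:", option_2)
print("Headroom at 40% load, naive vs Option 1:",
      naive_headroom_kw, option_1_headroom_kw)
```

For the 200kW room, Option 1 leaves 90kW stranded and Option 2 leaves 50kW. If a room at 40% load followed the Option 1 layout, it would have about 30kW of usable headroom, not the 120kW that the design figure suggests.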
The benefits of the Virtual Facility: Reclaimed Capacity and Reclaimed Lifespan. "Data centers seldom meet the operational and capacity requirements of their initial design." Gartner. The Virtual Facility enables data centre operators to reclaim stranded capacity and extend the life of their existing data centres. (Chart: Utilisation, up to 100%, against Time; time markers: Year 1, Mid life, Expected end of life; curves: Design intent, Typical Operation, Using a Virtual Facility; regions: Lost Lifespan, Stranded Capacity, Reclaimed Capacity, Reclaimed Lifespan.)
The choice: reclaim stranded cooling capacity or add to estate. Option 1: Add to estate and continue with typical operation. Option 2: Reclaim lost capacity in the existing estate. (Chart labels: Existing estate; Expanding the estate; + 60%; +20% = 60%; axes: Utilisation against Time.)
The greenest data centre is the one you don't need to build... Matt Warner, Future Facilities. "Continuous improvement of tools and processes has enabled us to see where to push beyond perceived limits and how to reclaim capacity in our existing data centres" – Ashley Davis, JPMChase