1
Energy Efficiency in Data Centers
"After pouring millions of dollars into in-house data centers, companies may soon find that it's time to start shutting them down. IT is shifting from being an asset companies own to a service they purchase." -- Andrzejak, A., Arlitt, M., & Rolia, J. (2002, November 27). Bounding the resource savings of utility computing models.
"What matters most to the computer designers at Google is not speed, but power - low power, because data centers can consume as much electricity as a city." -- Eric Schmidt, CEO of Google
Diljot Singh Grewal
2
Some Facts: Data centers consumed 235 billion kWh of energy worldwide in 2010 [2]. Data centers accounted for 1.3% of total world electricity consumption (as of August 2011). In 2000 they used 0.53%, which almost doubled to 0.97% by 2005; by 2010 it rose only to 1.3%. A rack drawing 20 kW at 10 cents per kWh uses more than $17,000 of electricity per year. A study by Hewlett-Packard Labs of six corporate data centers found that most of their 1,000 servers were using only 10 to 25 percent of their capacity. - Andrzejak, A., Arlitt, M., & Rolia, J. (2002, November 27). Bounding the resource savings of utility computing models. Asset sprawl still pervades US data centers [3]. The Amazon cloud is backed by about 400,000 servers (40 servers per rack is average).
3
Energy Efficiency: run a data-center-wide workload and measure the energy consumed.
$$\text{Performance Efficiency} = \frac{\text{Computational Work Performed}}{\text{Energy Used}} = \frac{1}{\text{PUE}} \times \frac{1}{\text{SPUE}} \times \frac{\text{Computation}}{\text{Total Energy to Electronic Components}}$$
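A minimal sketch of this three-factor decomposition, as a small calculation; the PUE, SPUE, and operations-per-joule values below are made-up illustrative numbers, not figures from the slides:

```python
def performance_efficiency(pue, spue, ops_per_joule_to_electronics):
    """Computational work per joule of facility energy:
    (1/PUE) * (1/SPUE) * (work per joule reaching the electronic components)."""
    return (1.0 / pue) * (1.0 / spue) * ops_per_joule_to_electronics

# Hypothetical numbers, purely for illustration.
print(performance_efficiency(pue=2.0, spue=1.3, ops_per_joule_to_electronics=1e9))
```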
4
Power Usage Effectiveness (PUE)
$$\text{PUE} = \frac{\text{Total Building Power}}{\text{IT Power}}$$
In 2006, 85% of data centers were estimated to have a PUE greater than 3.0; another study put the typical value at 2.0 [6]. In a state-of-the-art facility a PUE of 1.1 is achievable [7].
5
Reasons: staged deployment, fragmentation, following nameplate ratings,
variable load, excessive/inefficient cooling, excessive/inefficient humidity controls…
6
According to a study done by Lawrence Berkeley National Laboratory in 2007 [8]
7
ASHRAE
8
Losses along the power path: the 115 kV to 13.2 kV transformation loses ~0.5%; the UPS loses 6-12%; wiring losses are ~1-3%. Chillers consume 30-50% of the IT load, and CRAC units another 10-30% of the IT load. PDUs (rated in kW each) transform 480 V to 110 V, and individual circuits power a rack or a fraction of a rack.
9
Delta conversion UPS [8]: the input AC feeds the output AC directly, with no full conversion to DC and back, so losses are lower.
Analogy for moving a package from floor 4 to floor 5: delta conversion goes from floor 4 to floor 5 directly; double conversion goes from floor 4 down to the ground floor and back up to floor 5. Flywheel UPS: no batteries, but a wheel that is spun up while charging and slows as it discharges.
10
Best practices [8]:
• Sealing cable or other openings in under-floor distribution systems.
• Blanking unused spaces in and between equipment racks.
• Careful placement of computer room air conditioners and floor tile openings, often through the use of computational fluid dynamics (CFD) modeling.
• Collection of heated air through high overhead plenums or ductwork to efficiently return it to the air handler(s).
• Minimizing obstructions to proper airflow.
11
Improving Infrastructure
Raising the cold-aisle temperature to 27°C from 20°C. Isolating hot exhaust air from the intake. Using high-efficiency UPSs and other gear. Google achieved a PUE of 1.1 [9] through better airflow and exhaust handling, a cold-aisle temperature of 27°C, cooling towers that use water evaporation, and a per-server UPS with 99.99% efficiency instead of a facility-wide UPS. Liquid cooling can be more efficient because it avoids mixing hot and cold air, reportedly using about 70% of the energy of conventional air systems.
12
Google’s PUE over the years
13
Humidity Control: condensation on the cooling coils removes moisture and lowers humidity.
Low humidity (<40% RH) can lead to static buildup (sparks that can damage chips). Steam humidifiers are energy-expensive. Where are the energy savings? Using evaporative cooling on incoming air, or using evaporative cooling to humidify and cool the hot exhaust air, which is then used to cool the incoming air.
14
SPUE
$$\text{SPUE} = \frac{\text{Total Server Input Power}}{\text{Power Consumed by Components}}$$
Losses are due to power supplies, fans, and voltage regulators. Maximum efficiencies: power supplies ~80%, motherboard VRMs ~70%. "A recent evaluation of a small sample of data center-grade server power supplies with single 12 V outputs found peak efficiencies in the range of 85 to 87 percent." - Fortenbery, B., Vairamohan, B., & May-Ostendorp, P. (2006, September). Power supply efficiency in high density data center servers. "Inside the server, only about one-half of the power (100 to 200 watts) actually makes it to the processor, even when operating at the manufacturer's rated capacity." Total PUE: $\text{TPUE} = \text{PUE} \times \text{SPUE}$. If both stand at 1.2, only about 70% of the energy is actually used for computation.
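A quick check of that last claim, using only the numbers quoted on this slide (PUE and SPUE both at 1.2, and the ~80%/~70% PSU/VRM maxima); the compounding of the server-internal losses onto the facility losses is my own illustration:

```python
pue, spue = 1.2, 1.2
tpue = pue * spue                       # "Total PUE" = PUE * SPUE
to_electronics = 1.0 / tpue             # ~0.694, i.e. roughly 70% of building power
print(f"TPUE = {tpue:.2f}, fraction reaching electronics = {to_electronics:.1%}")

# Compounding the slide's component maxima (PSU ~80%, VRM ~70%) on top of that:
psu_eff, vrm_eff = 0.80, 0.70
print(f"Fraction of building power reaching the processor path = {to_electronics * psu_eff * vrm_eff:.1%}")
```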
15
Energy efficiency of several server power supplies [18]
16
Efficiency of Computing
$$\frac{\text{Computation}}{\text{Total Energy to Electronic Components}}$$
This is the hardest term to measure. How do we benchmark it? New benchmarks: JouleSort and SPECpower. There are no benchmarks for memory or switches, and different systems consume different amounts of energy for different applications. Measurement approach: load the facility, measure energy at the cluster or server level, and do the calculations. SPECpower runs a suite of applications at several utilization levels and reports an overall average.
17
Breakdown
18
The CPU uses up to 50% of server power at peak but drops to about 30% at low activity.
Dynamic ranges: CPU ~3.5x, memory ~2x, disks ~1.3x, switches ~1.2x. The CPU's power scales more than any other component's, and it could scale even better if some performance constraints were relaxed; CPUs in the consumer space do much better on efficiency.
19
The performance-to-power ratio drops dramatically as the target load decreases, because system power decreases much more slowly than performance. (SPEC® and the benchmark name SPECpower_ssj™ are trademarks of the Standard Performance Evaluation Corporation.)
20
Activity profile of 5,000 Google servers over 6 months [10]
Periods of idleness last just a few milliseconds; even at low load, servers still handle hundreds of queries per second.
21
10
22
Energy Proportional Computing.
Aim for low idle power and power proportional to load thereafter: energy spent would be roughly halved by energy proportionality alone if the system idled at 10% of peak power [11]. This can pay off even if peak efficiency is unremarkable, since server utilization rates of 10 to 15 percent are not uncommon [4].
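A hedged sketch of the linear power model behind this argument; the 300 W peak, the 50% idle fraction for the conventional server, and the 30% utilization point are assumptions for illustration, only the 10%-idle target comes from the slide:

```python
def server_power(util, idle_frac, peak_w=300.0):
    """Linear model: P(u) = peak * (idle_frac + (1 - idle_frac) * u)."""
    return peak_w * (idle_frac + (1.0 - idle_frac) * util)

util = 0.30                                          # typical utilization (assumed)
conventional = server_power(util, idle_frac=0.50)    # idles at 50% of peak (assumed)
proportional = server_power(util, idle_frac=0.10)    # idles at 10% of peak
print(conventional, proportional, 1.0 - proportional / conventional)
```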
23
Figure [10]: power vs. load level (% of peak). Red: a normal system; green: an energy-proportional system; blue: a system with sublinear power vs. load between 0 and 50% of peak.
24
Savings by Energy proportional computing (green line)
Gray: savings in peak power. Black: energy savings from energy-proportional computing with idle power at 10% of peak [11].
25
Dynamic Voltage and Frequency Scaling
$$P = C v^2 f + P_{\text{leak}}$$
The time to wake up from a low-voltage state depends on the voltage differential. Is it less useful on multicore architectures? Switching speed (frequency) depends on voltage, so low voltage means low frequency, i.e., a lower instruction issue rate and slower execution. Roughly, $P_{\text{static}} \approx k_1 v^2$ and $P_{\text{dynamic}} \approx k_2 v^2 f$.
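A small sketch of that power model, showing why lowering the voltage (and with it the sustainable frequency) cuts dynamic power sharply; the capacitance, leakage term, and the two voltage/frequency operating points are made-up values, only the "not below 1 GHz" bound echoes a later slide:

```python
def cpu_power(v, f, c=1.0e-9, p_leak=5.0):
    """P = C * v^2 * f + P_leak  (the simplified model on the slide)."""
    return c * v**2 * f + p_leak

# Hypothetical operating points; frequency roughly tracks voltage.
high = cpu_power(v=1.2, f=3.0e9)   # ~1.2 V at 3 GHz
low  = cpu_power(v=0.9, f=1.0e9)   # ~0.9 V at 1 GHz (the lower bound used for servers)
print(high, low, low / high)
```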
26
The CPU States (ACPI)
The power-management component of the kernel signals the processor driver to switch to a state. States: C0 is normal operation; C1 and C2 stop clocks; C3 is C2 plus reduced voltage; C4 is C3 plus turning off the memory cache. CPU power management used to be handled by the BIOS, which was replaced by ACPI.
27
Mode | Name | What it does
C0 | Operating State | CPU fully turned on
C1 | Halt | Stops the main internal clock via software; bus and APIC keep running
C1E | Enhanced Halt | C1 + reduced voltage
C2 | Stop Grant / Stop Clock | Stops the clock via hardware; bus and APIC keep running
C2E | Extended Stop Clock | C2 + reduced voltage
C3 | Sleep | Stops clocks (internal or both)
C4 | Deeper Sleep | Reduces CPU voltage
C4E/C5 | Enhanced Deeper Sleep | Reduces CPU voltage even more and turns off the cache
C6 | Deep Power Down | Reduces voltage even further (~0 V)
ACPI defines C0-C3 only; the other states are defined by individual manufacturers. Extra states may be present in some architectures and missing in others, while C0-C3 are present in all ACPI-compliant hardware.
28
12
29
Figure shows power consumed by an HP DL360 G4 based on Xeon processors [12]
30
Energy Savings [10]: the system is not scaled below 1 GHz because that would violate service-level agreements, as latency rises above the tolerable limits. This is one of the reasons why voltage scaling does not work well for servers.
31
Results of scaling at Datacenter Level
Three DVFS policies [11]: CPU power is halved whenever CPU usage drops below 50%, below 20%, or below 5%, respectively; the figures show the resulting savings at the data-center level (see the sketch below).
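A hedged, toy estimate of how such threshold policies could be evaluated from a utilization trace; this is not the paper's method, the random trace, the 40% CPU share of server power, and the "halved" scaled power are all assumptions for illustration:

```python
import random

def dvfs_savings(trace, threshold, cpu_share=0.4, scaled_power=0.5):
    """Fraction of total server energy saved if CPU power is halved whenever
    utilization is below `threshold`. cpu_share = CPU's share of server power (assumed)."""
    low = sum(1 for u in trace if u < threshold) / len(trace)
    return low * cpu_share * (1.0 - scaled_power)

trace = [random.random() for _ in range(10_000)]   # stand-in utilization samples
for thr in (0.50, 0.20, 0.05):
    print(f"threshold {thr:.0%}: savings ~{dvfs_savings(trace, thr):.1%}")
```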
32
33
The Multicore Problem: Clock Gating and Voltage Gating
Core-level clock gating; voltage gating? With a shared supply, the voltage must follow the core with the highest utilization. The wake-up penalty can be lowered by keeping state in the cache; new architectures have penalties of about 60 µs, down from 250 µs. Power gating (via the Power Control Unit); separate power planes for the core and un-core parts; per-core frequency scheduling, so each core can run at a different clock rate. Separate power planes also let the CPU sleep while DMA keeps working.
Clock gating: stop the clock for part of the circuit; for example, since floating-point units are rarely used in many server workloads, that part of the circuit can be gated. Core-level clock gating: stop the clock for a particular core. Power gating: cut power to a circuit to avoid static (leakage) power loss; Intel's Power Control Unit (PCU) is a hardware switch that, when the OS requests C2, can override it and take the core to C6 (power gated). Leakage currents can be controlled by using strained silicon. On-chip per-core voltage regulators (to gate the voltage) are still under research.
34
The Leakage Power: as feature sizes decrease, dynamic power consumption also decreases, but some of those savings are lost to increased leakage; this is an ongoing research problem. Leakage occurs because the insulating layer is now so thin that quantum tunneling takes place: electrons disappear from one side and appear on the other side of the insulation. This leakage grows as sizes shrink and represents a physical limit to transistor scaling.
35
Software's Role: well-tuned code can reduce consumption.
Code that generates excessive interrupts or snoop requests is bad for power. The OS power manager predicts future processing requirements and makes decisions according to the settings selected by the user.
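A toy sketch of the kind of policy an OS power manager applies, similar in spirit to a step-up/step-down frequency governor; the thresholds and the frequency ladder are assumptions, not any particular operating system's defaults:

```python
FREQS_MHZ = [1000, 1800, 2600, 3400]        # hypothetical P-state ladder

def next_freq(idx, recent_util, up=0.80, down=0.30):
    """Step up aggressively on high utilization, step down conservatively."""
    if recent_util > up and idx < len(FREQS_MHZ) - 1:
        return idx + 1
    if recent_util < down and idx > 0:
        return idx - 1
    return idx

idx = 0
for util in (0.90, 0.95, 0.40, 0.20, 0.10):
    idx = next_freq(idx, util)
    print(util, FREQS_MHZ[idx], "MHz")
```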
36
CPU isn’t the only culprit
10
38
Let's talk storage: it consumes about 27% of the power.
Data-intensive servers (proxy, file, and database servers) need high-performance disks to match processor speed, which means disks with multiple platters and high rotational speeds, so their power consumption is high. According to a 2008 IDC report, the total power to run and cool a drive is 48 watts [13]: 12 watts to run the HDD, 12 watts for the storage shelf (HBAs, fans, power supply), and 24 watts to cool the HDDs and the shelf.
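A back-of-the-envelope check of those IDC figures (12 + 12 + 24 = 48 W per deployed drive), turned into an annual energy and cost estimate; the $0.07/kWh price is borrowed from the provisioning slide later in the deck, and this calculation is mine, not IDC's:

```python
drive_w, shelf_w, cooling_w = 12.0, 12.0, 24.0
total_w = drive_w + shelf_w + cooling_w        # 48 W per deployed drive
kwh_per_year = total_w * 8766 / 1000           # ~421 kWh per drive per year
print(kwh_per_year, kwh_per_year * 0.07)       # ~$29 per drive per year at $0.07/kWh
```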
39
Power Consumption of a 2.5” drive
Because of SLAs we want low latency, so in most data centers the disks run at "performance idle": they spin at full RPM even when not reading or writing.
40
Electronics & Software
Adaptive voltage and frequency reduction in low-power modes; queuing algorithms to minimize rotational delays; algorithms to manage transitions between low- and high-power modes.
41
Mechanical: lighter materials, better motor design.
Using helium in a sealed case reduces air drag; WD claims 23% energy savings along with 40% higher capacity. Load/unload: the slider can be unloaded while the disk is spinning, which reduces aerodynamic drag and the power required, though there may be some latency when access resumes. Power-management states: ready, partial, slumber. Energy saved in standby or sleep mode should be greater than the energy required to spin the drives back up.
42
Tiered System [14]: manage workloads efficiently among multiple spindle speeds in a storage system. Tiered storage: Tier 0 with solid-state drives (5%), Tier 1 with high-performance HDDs (15%), Tier 2 with low-power HDDs (80%). Energy saved in standby or sleep mode should exceed the energy required to spin drives back up, and multiple drives in the same rack spinning up from standby simultaneously raise peak current requirements.
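A hedged sketch of how the tier mix above could translate into an average power per drive; only the 5/15/80% split comes from the slide, the per-tier wattages are invented purely for illustration:

```python
tiers = {                         # (fraction of drives, assumed watts per drive)
    "tier0_ssd":       (0.05,  6.0),
    "tier1_fast_hdd":  (0.15, 12.0),
    "tier2_low_power": (0.80,  7.0),
}
avg_w = sum(frac * watts for frac, watts in tiers.values())
print(avg_w)                      # weighted average power per drive in the pool
```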
43
Tiered Storage
44
Mixed Approach [14]: mirror each high-performance disk onto a low-power disk and serve from the low-power copy under light load; the low-performance disks use significantly less energy than high-performance ones. Other approaches: multispeed disks that can change spin speed [14]; lower rotational speed but multiple heads; or the same idea using more, slower disks. It is not uncommon to have a 2 GB database with writes of only 1-2 MB/s.
45
Solid-state disks require up to 90% less power [15]
and offer up to 100 times higher performance [15]. An SSD's lifespan depends on the I/O operations performed and is not yet good enough for servers (MLC vs. SLC). We need fewer SSDs if the extra disks were being deployed for speed rather than storage capacity.
46
File system problems? Google File system:
It distributes data chunks across a large number of machines (the entire cluster) for resiliency, but that means all machines run at low activity and never go idle.
47
Memory. SRAM requires constant voltage and uses 6 transistors per bit.
DRAM: a charged capacitor is a 1, a discharged one is a 0; each bit needs 1 transistor and 1 capacitor. Since the capacitors leak charge, every row must be refreshed every 64 ms (JEDEC). With 2^13 rows, a row must be refreshed every 7.8 µs.
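The per-row refresh interval follows directly from spreading the 64 ms JEDEC retention window over the 2^13 rows; a one-line check:

```python
rows = 2 ** 13                         # 8192 rows
interval_us = 64e-3 / rows * 1e6       # 64 ms window spread over all rows
print(interval_us)                     # ~7.8 microseconds between row refreshes
```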
48
Alternatives: Low-Voltage RAM (LoVo) runs at 1.25 V (vs. 1.8 V for DDR2 and 1.5 V for DDR3) and draws 2-3 W per 2 GB module. SSD as RAM [17]. Future: Ferroelectric RAM (FeRAM) and Magnetoresistive RAM (MRAM).
FeRAM: the applied charge polarizes the atoms, and the polarization is retained when the charge is removed. Speed: DRAM is limited by the time to discharge the capacitor, FeRAM by the time to realign the atoms, which is faster (used by TI and Fujitsu; about 32.7 million in sales). MRAM: two ferromagnetic layers separated by a very thin insulator, so electrons tunnel from one layer to the other (quantum mechanically); one layer's magnetization can be switched by an external magnetic field. If the two magnetizations are parallel, tunneling occurs and resistance is low; otherwise there is little tunneling and resistance is high. [17]: slides and paper available as "SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy".
49
Is Performance Per Watt all we need?
Are a few 'bulls' better than a flock of 'chickens'? If performance per watt were all that mattered, we should buy ARM servers with smaller RAM and laptop HDDs: roughly 20 times lower power but 5 times lower performance, i.e., high response times. According to Google's study, users prefer 10 results in 0.4 s over 25 results in 0.9 s. Most of the major server and processor companies now tout improved performance per watt, with claimed improvements of 35 to 150 percent or more over previous product generations (a rough range compiled by the presenter from reviews).
50
Power Provisioning Costs
Building a data center that can deliver power to the servers can cost more than the electricity itself: $10-22 per deployed IT watt (provisioning cost). Assuming $0.07 per kWh, a PUE of 2.0, and 8766 hours in a year, the electricity cost of 1 watt of IT power is
$$\frac{8766}{1000} \times 0.07 \times 2.0 \approx \$1.23 \text{ per watt per year,}$$
so cost savings from efficiency can save even more in provisioning than in energy.
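The same per-watt electricity cost as a small calculation; only the $0.07/kWh price, the PUE of 2.0, the 8766 hours per year, and the $10-22/W provisioning range come from the slide, the comparison at the end is mine:

```python
hours_per_year = 8766
price_per_kwh = 0.07
pue = 2.0

energy_cost_per_watt_year = (hours_per_year / 1000) * price_per_kwh * pue
print(energy_cost_per_watt_year)              # ~$1.23 per IT watt per year

provisioning_per_watt = (10, 22)              # capital cost range from the slide
print(provisioning_per_watt[0] / energy_cost_per_watt_year)   # years of electricity per $10 of capex
```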
51
Cumulative distribution of power usage [11]; the CDF starts at about 0.45 of peak power. Websearch: some racks reach 98% of provisioned power, the entire cluster 93%, allowing about 7.5% more machines. Webmail: 92% and 86%, about 16% more machines. MapReduce: 100% and 90%, about 11% more machines. Mixed workload: a peak of 85%, i.e. 17% slack. A real (not well-tuned) system: 52-72%, allowing about 39% more machines.
52
Safety Mechanism and over subscription
Adding more machines raises the risk of exceeding capacity, but DVFS and power capping can ensure safety. Since the CDF reaches the top with a flat slope, there are only a few intervals when the cluster is close to full load; de-scheduling tasks or applying DVFS and power capping during those intervals allows even more machines. A proper study is needed for each data center to ensure it is properly provisioned.
53
Virtualization: energy cost can be reduced by consolidating applications as multiple virtual machines on fewer hosts; "consolidation of applications onto a single machine can often be done without sacrificing performance or reliability" - Hiller, op. cit. Virtualized servers have an associated overhead, and different types behave differently: paravirtualization (Xen) vs. full virtualization (VMware Server). Xen is a type-1 hypervisor that interfaces directly with the underlying hardware. Xen lets guests make system calls without invoking the host OS kernel, whereas KVM incurs additional kernel operations to support I/O; those extra operations translate into extra CPU cycles and memory accesses, which lead to extra energy usage. Virtual CPU: Xen uses vCPUs, while KVM schedules guests as ordinary Linux processes. Networking: Xen uses a virtual firewall/router so all domains appear as hosts on the network, while KVM goes through the Linux kernel, making it less energy efficient. I/O: Xen requires the guest OS to be modified so that calls can bypass the host kernel; KVM uses the Linux kernel and thus incurs the additional kernel operations.
54
Para Virtualization (XEN)
55
Virtualization Overheads
SPEC INT 2000: measures the performance of a system's processor, memory system, and compiler quality; the suite performs little I/O and has little interaction with the OS. Linux build: building Linux with gcc 2.96. OSDB: Open Source Database Benchmark using PostgreSQL (online transaction processing). SPECweb99: 30% of requests require dynamic content generation, 16% are HTTP POST operations, and 0.5% execute a CGI script (includes disk writes). User-mode Linux: runs multiple VMs as applications within a general-purpose OS. Legend: L = native Linux, X = Xen, V = VMware Workstation, U = User-mode Linux. [16]
56
Performance on Java Server Benchmark
Virtualization overhead on SPECjbb 2005 [16], comparing SUSE SLES 10 under Xen 3.0.3 and VMware ESX 3.0.2 for varying numbers of VMs and CPUs per VM; overheads range from about 1% up to 19%.
57
Power Management in Virtualized Systems
58
Concluding: power efficiency in data centers is constrained by the performance requirements imposed on them. High-efficiency gear, smart design, and proper consolidation can lead to huge gains. Efficiency in server components is an ongoing research problem. Data centers have many components that affect overall consumption, and coordination across them is needed to ensure both performance and efficiency.
59
References
Morgan, T. P. (2006, February 28). The server market begins to cool in Q4. The Linux Beacon.
EPA Report, 2006.
Hiller, A. (2006, January). A quantitative and analytical approach to server consolidation. CiRBA White Paper, p. 4.
Personal correspondence with Dale Sartor of LBNL (August 9, 2006).
M. Kalyanakrishnam, Z. Kalbarczyk, and R. Iyer, "Failure data analysis of a LAN of Windows NT based computers," 18th IEEE Symposium on Reliable Distributed Systems, 1999.
Green Grid, "Seven strategies to improve data center cooling efficiency."
X. Fan, W.-D. Weber, and L. A. Barroso, "Power provisioning for a warehouse-sized computer," Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA '07), San Diego, CA, June 9-13, 2007.
S. Greenberg, E. Mills, and B. Tschudi, "Best practices for data centers: lessons learned from benchmarking 22 data centers," 2006 ACEEE Summer Study on Energy Efficiency in Buildings.
Google Inc., "Efficient Data Center Summit, April 2009."
Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool Publishers, 2009.
60
References (continued)
Technology brief on power capping in HP systems.
International Data Corporation, Annual Report, 2008.
Enrique V. Carrera, Eduardo Pinheiro, and Ricardo Bianchini, "Conserving Disk Energy in Network Servers," International Conference on Supercomputing, 2003.
"Solid State Drives for Enterprise Data Center Environments," HGST whitepaper.
Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield, "Xen and the Art of Virtualization," University of Cambridge Computer Laboratory.
Anirudh Badam and Vivek S. Pai, "SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy," 8th USENIX Conference on Networked Systems Design and Implementation, 2011, p. 16.
M. Ton and B. Fortenbury, "High performance buildings: data centers, server power supplies," Lawrence Berkeley National Laboratory and EPRI, December 2005.
61
ARM Server (Calxeda): more of a cluster in the size of a server.
It currently holds 12 EnergyCards in one server; each EnergyCard has 4 EnergyCore nodes (1.1-1.4 GHz) with a larger L2 cache. It runs Linux (Ubuntu Server or Fedora 17). There is no need to virtualize: give each application its own node (quad-core, 4 MB L2, 4 GB RAM). A node cannot support more than 4 GB of RAM, since the ARM cores are 32-bit with a 32-bit address bus. The Cortex-A57 is coming with 64-bit support, and single-threaded performance is increasing quickly.
62
ECX-1000 is the ARM server; the others are Intel.
The ARM server has 12 nodes, each with 4 SoCs (48 threads total, 4 GB RAM per node, 4 MB L2 per core). The Xeon system is rated at 1280 (x2) W, the ARM server at 750 W. The Xeon runs Linux on VMware, so some overhead might reduce its performance. Results: AnandTech.