K. Kant, Modeling Challenges in Distributed Energy Adaptive Computing. Challenges in Distributed Energy Adaptive Computing. K. Kant, NSF and GMU.


1 Challenges in Distributed Energy Adaptive Computing
K. Kant, NSF and GMU

2 Information & Communication Technology (ICT) has a problem
Performance centric to energy & sustainability centric: how do we get there?

3 ICT Power Growth until 2020
Increase in spite of power-efficient designs
–Clients: 8X in number, 3X in power
–Data centers: > 2X increase
–Network: 3X increase
[Figure labels: Network, Clients, Data Center, Transmission, conversion & distribution]

4 Current State: Unsustainable Computing

5 Data Center Infrastructure
Resource intensive: water, cabling, metal, …
~50% of power wasted before getting to racks

6 Distribution Infrastructure
[Figure: power distribution chain feeding the IT load. Voltage labels: 115 kV, 13.2 kV, 480 V, 208 V. Stage labels: 0.3% loss (99.7% efficient), 0.5% loss (99.5% efficient), 1.0% loss (99.0% efficient), 6% loss (94% efficient), ~1% loss in switch gear and conductors. Other labels: UPS; 2.5 MW generator, ~180 gallons/hour.]
~10% distribution loss + high carbon impact
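As a rough check on the "~10% distribution loss" figure, the per-stage efficiencies on the distribution slide can be multiplied together. The sketch below assumes the five listed stages sit in series between the utility feed and the IT load; the stage names are placeholders, since the transcript does not say which piece of equipment each percentage belongs to.

```python
from math import prod

# Per-stage efficiencies as listed on the slide. Treating them as a simple
# series chain is an assumption; the stage names are placeholders.
STAGE_EFFICIENCY = {
    "stage_1": 0.997,                    # 0.3% loss
    "stage_2": 0.995,                    # 0.5% loss
    "stage_3": 0.990,                    # 1.0% loss
    "stage_4": 0.940,                    # 6% loss
    "switchgear_and_conductors": 0.990,  # ~1% loss
}


def chain_efficiency(efficiencies):
    """Overall efficiency of stages in series: the product of stage efficiencies."""
    return prod(efficiencies)


overall = chain_efficiency(STAGE_EFFICIENCY.values())
loss_pct = (1.0 - overall) * 100.0
print(f"overall efficiency: {overall:.3f}, loss: {loss_pct:.1f}%")
```

Under that series assumption the chain comes out at roughly 91% efficient, which is in line with the slide's rounded "~10%" figure.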

7 ~50% Rack Power Wasted
Component | Total | Used | Comments
CPU | 80 | 60 | Operating at 100% utilization
Fans | 50 | 25 | Temp. directed fan at 100% util
Memory (32 GB) | 88 | 24 | 2 GB DIMMs, 4 W idle, 19 W active
Hard drives | | | SATA drives, 25% busy
I/O adapters | 20 | 4 | 25% disk, 15% network
Motherboard | 22 | 12 | N/S bridges & devices, VRs, …
Total DC power | | |
Power supply loss | 50 | 7 | 14% / 5% loss of AC input pwr
AC input power | | |
> 50% of power is wasted

8 Sustainable Computing

9 Renewable Energy Push
Limit energy draw from grid
–Less infrastructure
–Fewer losses
–But variable supply
Need better power adaptability

10 High Temperature DCs
Chiller-less operation
–Less energy/materials, but space inefficient
High temperature operation
–Smaller T_outlet − T_inlet
–More throttling
–More failure prone (?)
Need smarter thermal adaptability

11 Overdesign
Overdesign is the norm today
–Huge power supplies, fans, heat sinks, server cases, high rack capacity, UPS capacity, …
–Engineered for worst case, which is rarely encountered
–Huge power wastage, waste of materials, energy, …
Better energy adaptability to deal with frugal design
What if we right-size everything?
Highly energy efficient but need smarter control

12 Energy Adaptive Computing
EAC strives for dynamic end-to-end adjustment via
–Workload adaptation for graceful QoS degradation under energy limitations
–Infrastructure adaptation to cope with temporary energy deficiencies
Requires coordinated power/thermal mgmt of computation, network & storage
Enhances sustainability of IT infrastructure

13 EAC Instances

14 Client-server EAC
Transparently adapt to client energy states
–State = {on-AC, normal, low-battery, …}
–Service contract C_i = {setup QoS, operational QoS}
Adaptation challenges
–Communicating & enforcing contracts
–Group adaptation of clients forced by network/servers?
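One way to picture the state-to-contract idea on the client-server slide is a small lookup from client energy state to a service contract. The sketch below is illustrative only: the three state names come from the slide, but the QoS labels ("full", "reduced", "minimal") and the `CONTRACTS` table are hypothetical, since the slides do not say what a contract actually contains.

```python
from dataclasses import dataclass
from enum import Enum


class EnergyState(Enum):
    # State names taken from the slide; the slide's "…" suggests there are more.
    ON_AC = "on-AC"
    NORMAL = "normal"
    LOW_BATTERY = "low-battery"


@dataclass(frozen=True)
class ServiceContract:
    # Both fields mirror C_i = {setup QoS, operational QoS}.
    # The string labels used below are hypothetical placeholders.
    setup_qos: str
    operational_qos: str


# Hypothetical contract table: one contract per client energy state.
CONTRACTS = {
    EnergyState.ON_AC: ServiceContract("full", "full"),
    EnergyState.NORMAL: ServiceContract("full", "reduced"),
    EnergyState.LOW_BATTERY: ServiceContract("reduced", "minimal"),
}


def contract_for(state: EnergyState) -> ServiceContract:
    """Return the contract the server would apply for a reported client state."""
    return CONTRACTS[state]


print(contract_for(EnergyState.LOW_BATTERY))
```

The slide's open questions (communicating and enforcing contracts, group adaptation) are not addressed by a lookup like this; it only shows the data shape.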

15 Cluster EAC
Adaptation to intra & inter-DC limits
–Multi-level: server, rack & DC levels
Adaptation challenges
–Estimate & collect power deficits/surplus at multiple levels
–Coordination across large range of devices: location based services, coordination across levels
–Simultaneously handle client-server loop

16 P2P EAC
Adaptation based on available energy
–Content: video resolution, audio coding, …
–Network: modulate wireless radio usage (?)
Energy proportional use of peer resources
Energy driven content replication & reorganization
Adaptation challenges
–Satisfying QoS?
–Balancing src/dest usage vs. relay node energy usage?

17 Challenges: Some Specific Issues

18 Power Estimation Challenges
Notion of effective power?
–Additive relationship: Workload power
–Why is this hard? Interference
Available power
–Determined by power, thermal & perhaps other issues (noise)
–Required at multiple levels: facility, enclosure, machine, …

19 Network Role in EAC
Energy adaptation
–Aggressive control of switch/router ports: speed, state & width controls
–Traffic consolidation across paths
Adaptation induced congestion
–Propagation (e.g., ECN, EBCN) & response
Computation-communication tradeoff? Redirection?
Network protocol support for adaptation?

20 Other Issues
EAC security
–Attacks on power sources
–Energy attacks on IT, e.g., demanding too much, cyclic demands, …
Storage adaptation
–Storage devices, controllers & network
Coordinated end-to-end control is hard!
Formal models to understand impact of energy adaptation

21 Energy Adaptation in Data Centers

22 Adaptation Methods
Workload adaptation
–Coarse grain: shut down low priority tasks
–Fine grain: graceful QoS degradation, e.g., batched service, poorer resolution, …
Infrastructure adaptation
–Operation at lower speeds (DVFS)
–Effective use of low power modes & width control
Workload adaptation always done first
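The coarse-grain step above (shut down low priority tasks) can be sketched as a simple shedding loop. This is a minimal illustration, not the author's method: the `Task` fields, the "larger number means more important" convention, and the per-task power figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int    # assumption: larger value = more important
    power_w: float   # assumption: a per-task power estimate is available


def shed_low_priority(tasks, budget_w):
    """Coarse-grain workload adaptation: drop the lowest-priority tasks
    until the summed power estimate fits within budget_w.

    Returns (kept, shed), both lists of Task.
    """
    kept = sorted(tasks, key=lambda t: t.priority, reverse=True)
    shed = []
    while kept and sum(t.power_w for t in kept) > budget_w:
        shed.append(kept.pop())  # lowest-priority task sits at the end
    return kept, shed


tasks = [Task("batch", 1, 40.0), Task("web", 3, 60.0), Task("report", 2, 30.0)]
kept, shed = shed_low_priority(tasks, budget_w=100.0)
print([t.name for t in kept], [t.name for t in shed])
```

Fine-grain degradation (batching, poorer resolution) and infrastructure measures such as DVFS would come into play after a step like this; they are not modeled here.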

23 Infrastructure Adaptation
Need a multilevel scheme
–Individual assets up to entire data center
Need both supply & demand side adaptations

24 Supply Side Adaptation
Supply side limits
–Hard caps at higher levels (true limit) vs. soft (artificial) caps at lower levels
–Limits may be a result of thermal/cooling issues
Load consolidation
–An essential part of energy efficient operation
–Load consolidation vs. soft capping
Need to address workload adaptation changes as a result of supply increase & decrease

25 Demand Side Adaptation
Adaptation to fluctuating demand
–Transactional workload: migrate queries or app VMs?
Issues with combined supply & demand side adaptations
–Imbalance: one node squeezed while another has surplus power
–Ping-pong control: oscillatory migration of workload
–Error accumulation down the hierarchy

26 A Proposed Algorithm
Unidirectional control
–Load migration moves up the hierarchy, from local to global
–Local migrations are temporary & do not trigger changes to soft caps on supply
Target node selection
–Based on bin packing (best-fit decreasing)
–Allows for more imbalance, which can be exploited for workload consolidation
Properties
–Avoids ping-pong, attempts to minimize imbalance
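The target node selection step names best-fit decreasing bin packing. A generic sketch of that heuristic follows, with workloads as items and each node's power surplus as bin capacity. It is not the algorithm from the talk: the dictionary inputs, the names, and the idea of returning unplaced workloads so a caller could escalate them to the next level of the hierarchy are assumptions made for illustration.

```python
def best_fit_decreasing(demands, surplus):
    """Assign workloads to nodes using best-fit decreasing.

    demands: {workload_name: power needed}   (hypothetical inputs)
    surplus: {node_name: spare power}        (hypothetical inputs)

    Workloads are taken largest first; each goes to the node whose remaining
    surplus is the smallest that still fits it. Returns
    (placement, unplaced, remaining).
    """
    remaining = dict(surplus)
    placement = {}
    unplaced = []
    largest_first = sorted(demands.items(), key=lambda kv: kv[1], reverse=True)
    for item, need in largest_first:
        # Candidate nodes that can absorb this workload, keyed by leftover slack.
        candidates = [(left - need, node)
                      for node, left in remaining.items() if left >= need]
        if not candidates:
            unplaced.append(item)  # a caller might escalate this one level up
            continue
        _, node = min(candidates)  # tightest fit wins
        placement[item] = node
        remaining[node] -= need
    return placement, unplaced, remaining


demands = {"a": 50, "b": 30, "c": 20, "d": 70}
surplus = {"n1": 60, "n2": 40, "n3": 25}
placement, unplaced, remaining = best_fit_decreasing(demands, surplus)
print(placement, unplaced, remaining)
```

Because best fit packs each workload into the tightest node that can hold it, some nodes fill up while others keep their slack, which matches the slide's remark that the resulting imbalance can be exploited for consolidation.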

27 Experimental Results
Scenario
–3 levels, 18 identical servers ( )
–3 applications, total of 25 app instances
–Any app can run on any server
–Demand Poisson (active power utilization)

28 Migration Frequency
Migration drivers: consolidation vs. energy deficiency
–Low utilization: consolidation; high utilization: energy deficiency
Other characteristics
–Migration frequency low in all cases
–No ping-pong observed

29 Thermal Impacts
Additional issues
–Energy consumption limited by thermal/cooling issues, not energy availability
–Migrations required to limit temperature
Temperature & power have a nonlinear relationship
Need to account for both power & thermal effects

30 Results with Thermal Effects
Imbalanced cooling
–Servers 1-14: T_a = 25 °C, Servers 15-18: T_a = 40 °C
–Temperature limit: 65 °C
Power demand is adjusted by the algorithm to account for higher temperature

31 Conclusions
Need to go beyond energy efficiency
–Design devices/systems to minimize life-cycle energy footprint
–Creatively adapt to available energy to operate at the edge
Ongoing/future work
–Coordinated server, network & storage mgmt.
–Explore tradeoffs between QoS, power savings and admission control performance

32 Thank you!

33 Power Inefficiencies
[Figure: server power path. Labels as they appear: Server PSU; Rack supply; 70-90% efficient; ±12, ±5V; Voltage regulators, 90-95% efficient; CPU, wasted leakage & clock power; Fans; DRAM & mem controller; Adapters; Storage; 280V, 95% efficient; Idle wasted power]

34 Operating Regimes

35 So, What's the Problem?
Local constraints & controls have end-to-end impacts
–DC to DC load shift: service disruption & post-shift impact
–Client request to alter content: less or more work for server
Potential conflicting controls
[Figure labels: Client, Network, Server1, storage, DC1, Server2, storage, DC2, Core Network]

