Thermal-aware Task Placement in Data Centers (part 4)

Presentation transcript:

Thermal-aware Task Placement in Data Centers (part 4). Sandeep Gupta, Department of Computer Science and Engineering, School of Computing and Informatics, Ira A. Fulton School of Engineering, Arizona State University, Tempe, Arizona, USA. sandeep.gupta@asu.edu

Thermal-aware Task Placement Problem: given an incoming task, find a task partitioning and a placement of the subtasks that minimizes the (increase of) peak inlet temperature. Thermal model: $T_{in} = T_{sup} + D\,P$ with power vector $P = a + bU$, that is, $T_{in} = T_{sup} + D\,(a + bU)$, where $T_{in}$ holds the inlet temperatures, $T_{sup}$ the supplied air temperatures, $D$ is the heat distribution matrix, and $U$ is the utilization vector. XInt algorithm: an approximation solution based on a genetic algorithm. Take a feasible solution and perform mutations until a certain number of iterations is reached.
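The search step can be illustrated with a minimal Python sketch. It assumes the linear model the slide appears to describe, T_in = T_sup + D (a + b U), and uses a mutation-only search (no population, no crossover), which is a simplification of a genetic algorithm and not the authors' XInt implementation. The function names, the step size, and the matrix values in the usage lines are hypothetical.

```python
import random


def inlet_temps(util, D, t_sup, a, b):
    """Inlet temperatures under the assumed model T_in = T_sup + D (a + b*U)."""
    power = [a + b * u for u in util]
    n = len(power)
    return [t_sup + sum(D[i][j] * power[j] for j in range(n)) for i in range(n)]


def peak_inlet(util, D, t_sup, a, b):
    """Objective to minimize: the hottest inlet temperature."""
    return max(inlet_temps(util, D, t_sup, a, b))


def mutate(util, step, rng):
    """Shift up to `step` utilization between two random nodes.

    The total utilization is unchanged and every entry stays in [0, 1],
    so a feasible placement stays feasible.
    """
    src, dst = rng.sample(range(len(util)), 2)
    delta = min(step, util[src], 1.0 - util[dst])
    child = list(util)
    child[src] -= delta
    child[dst] += delta
    return child


def place(total_util, D, t_sup, a, b, iterations=2000, step=0.1, seed=0):
    """Start from a feasible placement and keep mutations that lower the peak."""
    rng = random.Random(seed)
    n = len(D)
    best = [total_util / n] * n  # feasible start: spread the load evenly
    best_peak = peak_inlet(best, D, t_sup, a, b)
    for _ in range(iterations):
        candidate = mutate(best, step, rng)
        candidate_peak = peak_inlet(candidate, D, t_sup, a, b)
        if candidate_peak < best_peak:
            best, best_peak = candidate, candidate_peak
    return best, best_peak


# Hypothetical 3-node example: node 2 recirculates the most heat.
D_example = [[0.02, 0.01, 0.00],
             [0.01, 0.02, 0.01],
             [0.00, 0.01, 0.05]]
placement, peak = place(1.5, D_example, t_sup=15.0, a=100.0, b=200.0)
```

Because a mutation is kept only when it lowers the peak, the result is never worse than the starting placement. A full genetic algorithm would add a population and crossover on top of this loop.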

Recirculation coefficients, a fast thermal model: reduce the "thermal map" concept to the points of interest, the equipment air inlets. The coefficients can be computed from CFD models or simulations. They form a matrix A whose entry a_ij is the portion of the heat exhausted from node i that goes directly to node j.
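To make the definition of a_ij concrete, here is a small Python sketch that applies a recirculation matrix to per-node exhaust heat. It assumes that whatever heat is not recirculated to another inlet returns to the cooling unit (so each row of A sums to at most 1). The function names and the numbers are hypothetical.

```python
def heat_received(A, exhaust_heat):
    """Recirculated heat arriving at each inlet j: sum over i of a_ij * Q_i."""
    n = len(A)
    return [sum(A[i][j] * exhaust_heat[i] for i in range(n)) for j in range(n)]


def heat_returned_to_cooler(A, exhaust_heat):
    """Heat from each node that is not recirculated (assumed to reach the CRAC)."""
    return [q * (1.0 - sum(row)) for row, q in zip(A, exhaust_heat)]


# Hypothetical 3-node matrix and exhaust heat in watts.
A_example = [[0.10, 0.05, 0.00],
             [0.05, 0.10, 0.05],
             [0.00, 0.20, 0.10]]
Q_example = [1000.0, 2000.0, 1500.0]

received = heat_received(A_example, Q_example)
returned = heat_returned_to_cooler(A_example, Q_example)
```

Under that assumption the two lists together account for all exhausted heat, which is a quick sanity check on a matrix extracted from CFD.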

Scheduling Impacts Cooling Setting: different schedules create different demands for cooling capacity. [Figure: inlet temperature distribution without cooling and with cooling, for Scheduling 1 and Scheduling 2, each annotated 25 °C.]

Contrasted scheduling approaches. Uniform Outlet Profile (UOP): assigns tasks so as to achieve a uniform outlet temperature distribution; more tasks go to nodes with a low inlet temperature (a water-filling process). Minimum computing energy: assigns tasks so as to keep the number of active (powered-on) chassis as small as possible, taking the server with the coolest inlet temperature first. Uniform Task (UT): assigns every chassis the same amount of tasks (power consumption), so all nodes experience the same power consumption and temperature rise. [Figure: outlet temperature and inlet temperature profiles.]
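Two of these baselines are simple enough to sketch in a few lines of Python. The sketch below is an illustration under stated assumptions, not the code used in the study: tasks are treated as a divisible load, every chassis has the same capacity, and the function names are hypothetical. UOP is left out because its water-filling step needs the thermal model.

```python
def uniform_task(total_tasks, n_chassis):
    """Uniform Task (UT): every chassis gets the same share of the load."""
    return [total_tasks / n_chassis] * n_chassis


def coolest_first(total_tasks, inlet_temps, capacity):
    """Minimum computing energy: fill chassis one at a time, coolest inlet first.

    Chassis that receive no load can stay powered off.
    """
    order = sorted(range(len(inlet_temps)), key=lambda i: inlet_temps[i])
    allocation = [0.0] * len(inlet_temps)
    remaining = total_tasks
    for i in order:
        if remaining <= 0:
            break
        allocation[i] = min(capacity, remaining)
        remaining -= allocation[i]
    if remaining > 0:
        raise ValueError("total_tasks exceeds the capacity of all chassis")
    return allocation


# Hypothetical example: 50 task units, 4 chassis with capacity 20 each.
ut = uniform_task(50, 4)
mce = coolest_first(50, [20.0, 18.0, 22.0, 19.0], capacity=20)
```

In the example, UT powers all four chassis, while the coolest-first scheme leaves the chassis with the 22 °C inlet idle.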

Simulated Environment: used Flomerics Flovent to simulate a small-scale data center with physical dimensions 9.6 m × 8.4 m × 3.6 m and industry-standard 42U racks arranged in two rows. CRAC supply at 8 m³/s. There are 10 racks, each equipped with 5 chassis, for 1000 processors in the data center. Power draw is 232 kW at full utilization.
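The per-chassis figures implied by this setup follow from simple arithmetic. The short Python sketch below assumes the load and the processors are spread evenly over identical chassis, which matches the homogeneity assumption stated elsewhere in the deck. The variable names are hypothetical.

```python
# Figures taken from the slide.
racks = 10
chassis_per_rack = 5
processors = 1000
full_power_kw = 232.0

# Derived values, assuming identical chassis.
chassis = racks * chassis_per_rack               # 50 chassis
processors_per_chassis = processors / chassis    # 20 processors per chassis
kw_per_chassis = full_power_kw / chassis         # 4.64 kW per chassis at full load
```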

Results (1): Recirculation Coefficients. Consistent with data center observations. Large values are observed along the diagonal: strong recirculation among neighboring servers, or between bottom servers and top servers. [Figure: recirculation coefficient matrix with the diagonal marked; rack labels 1 to 10.]

Performance Results: XInt outperforms the other algorithms. Data centers almost never run at 100% utilization, so there is plenty of room for benefits.

Power Vector Distribution: XInt contradicts the "rule of thumb" of placing load at the bottom.

Supply Heat Index (SHI): a metric developed by HP Labs that quantifies the overall heat recirculation of a data center. XInt consistently has the lowest SHI.
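The slide does not give the SHI formula. The Python sketch below assumes the temperature-based form commonly attributed to the HP Labs work: the summed inlet temperature rise over the supply (reference) temperature, divided by the summed outlet temperature rise, with equal airflow through every rack. Treat both the formula and the function name as assumptions.

```python
def supply_heat_index(inlet_temps, outlet_temps, t_ref):
    """SHI under the assumed form: sum(T_in - T_ref) / sum(T_out - T_ref).

    0 means no hot air reaches the inlets; values closer to 1 mean more
    recirculation. Equal airflow through all racks is assumed.
    """
    inlet_rise = sum(t - t_ref for t in inlet_temps)
    outlet_rise = sum(t - t_ref for t in outlet_temps)
    if outlet_rise <= 0:
        raise ValueError("outlet temperatures must exceed the reference temperature")
    return inlet_rise / outlet_rise


# Hypothetical readings in degrees Celsius with a 15 degree supply.
shi = supply_heat_index([17.0, 19.0, 21.0], [30.0, 32.0, 34.0], t_ref=15.0)
```

With this form, a placement that lowers inlet temperatures for the same outlet temperatures lowers SHI.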

Conclusions: Thermal-aware task placement can significantly reduce heat recirculation. XInt thrives at around 50% CPU utilization; not much can be done at 100% utilization. Cooling savings can exceed 30% in comparison to other schemes. Cost of operation is reduced by 15% if the initial computing-to-cooling cost ratio is 1:1.
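The 15% figure follows from the 30% cooling savings once the 1:1 ratio is taken into account: cooling is then half of the operating cost, and 30% of one half is 15%. A small Python sketch of that arithmetic, assuming the computing cost itself is unchanged by the placement scheme (the function name is hypothetical):

```python
def operating_cost_reduction(cooling_savings, computing_to_cooling_ratio=1.0):
    """Fraction of total operating cost saved when only cooling cost drops.

    With a computing-to-cooling ratio r, cooling is 1 / (1 + r) of the total.
    """
    cooling_share = 1.0 / (1.0 + computing_to_cooling_ratio)
    return cooling_savings * cooling_share


reduction = operating_cost_reduction(0.30)  # 1:1 ratio, 30% cooling savings
```

With a 2:1 computing-to-cooling ratio, the same 30% cooling savings would cut total cost by only 10%.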

Related Work in Progress. Waiving simplifying assumptions: equipment heterogeneity [INFOCOM 2008]; stochastic task arrival. Thermal maps through machine learning: automated, non-invasive, cost-effective [GreenCom 2007]. Implementations: thermal-aware Moab scheduler; thermal-aware SLURM; SiCortex product thermal management.

Algorithm Assumptions. HPC model in mind: long-running jobs (finish time is the same: infinity); one-time arrival (starting time is the same); utilization homogeneity (same utilization throughout a task's length); non-preemptive, non-movable tasks. Data center equipment homogeneity: power consumption and computational capability. Cooling is self-controlled.

Thank You Questions? Comments? Suggestions? http://impact.asu.edu/

Additional Slides

Contributions. Developed thermal models of data centers: analytical thermal models using theoretical thermodynamic formulations, and online thermal models using machine learning techniques. Designed thermal-aware task placement algorithms, including a genetic-algorithm-based task placement algorithm that minimizes the heat recirculation among the servers and the peak inlet temperature. Created a software architecture for dynamic thermal management of data centers. Developed CFD models of real-world data centers for testing and validation of the thermal models and task placement algorithms.

Data Center Thermal Management: increasing need for thermal awareness.

Power density increases: circuit density increases by a factor of 3 every 2 years, while energy efficiency increases by a factor of 2 every 2 years, so effective power density increases by a factor of 1.5 every 2 years [Kenneth Brill: The Invisible Crisis in the Data Center]. Maintenance/TCO is rising: data center TCO doubles every three years; by 2009, the three-year cost of electricity will exceed the purchase cost of the server; virtualization/consolidation is a one-time, short-term solution. Thermal management corresponds to an increasing portion of expenses, and thermal-aware solutions are becoming prominent.

Thermal-aware solutions at various levels. Physical dimension: IC, case/chassis, room. Software dimension: firmware, O/S, application (middleware). Techniques: dynamic voltage scaling, dynamic frequency scaling, circuitry redundancy, fan speed scaling, CPU load balancing, thermal-aware VM, data center job scheduling. A dynamic thermal-aware control platform is necessary for online thermal evaluation.

Thermal issues: heat recirculation, which increases as equipment density exceeds the planned cooling capacity, and hot spots. Effect of heat recirculation: cooling has to be set low enough to keep all inlet temperatures in the safe operating range.

Opportunities and challenges: data centers don't run at full utilization, so a job can be allocated to one of multiple CPUs, each with a different thermal impact. This creates a need for fast thermal evaluation. Data centers are temporally and spatially heterogeneous, in equipment and in workload.

[Figure: cooling and computation cost by year, on a $1M to $100M scale. Without thermal-aware management, cooling cost downplays the advantages of dense server deployment; the figure contrasts this with thermal-aware management.]
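The power-density claim is a ratio of the two growth rates: 3 / 2 = 1.5 every 2 years. The Python sketch below shows that step and, as a purely hypothetical extrapolation, what it would compound to if both trends held steady. The function name and the 10-year horizon are assumptions for illustration.

```python
def effective_power_density_growth(years, circuit_factor=3.0,
                                   efficiency_factor=2.0, period_years=2.0):
    """Growth in effective power density after `years`.

    Per period, density rises by circuit_factor while efficiency gains
    offset it by efficiency_factor, so the net factor is their ratio.
    Steady compounding is an assumption, not a claim from the slide.
    """
    per_period = circuit_factor / efficiency_factor
    return per_period ** (years / period_years)


two_year_growth = effective_power_density_growth(2)    # the slide's 1.5x
ten_year_growth = effective_power_density_growth(10)   # 1.5 ** 5 if the trend held
```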