Coordinated Performance and Power Management Yefu Wang.


2 ECE Power/Performance Problems in Datacenters Power related problems – Power/thermal control (Capping) – Power optimization Performance related problems – Performance control – Performance optimization Problem Scale – Datacenter level – Cluster level – Server level – Application level

Co-Con: Coordinated Control of Power and Application Performance for Virtualized Server Clusters Xiaorui Wang and Yefu Wang Department of EECS University of Tennessee, Knoxville

4 ECE Power and Performance Control Most prior work on power/performance control: control one and optimize the other – Power control: power capping to avoid power overload or thermal failures due to increasingly high server density. – Performance control: provide guarantees for Service-Level Agreements. Performance-oriented: [Chase’01], [Chen’05], [Elnozahy’02], [Sharma’03], [Wang’08], etc. Power-oriented: [Minerick'02], [Lefurgy'08], [Wang'08], [Juang'05], etc. Performance-oriented controller: performance measurement and performance target in, control decision out (minimize power); may violate the power constraint. Power-oriented controller: power measurement and power target in, control decision out (maximize performance); performance is not guaranteed.

5 ECE Coordinated Control of Power and Performance [Diagram labels: Power Controller [HPCA’08]; Power Budget; Performance Requirements; VM1, VM2, VM3, VM4; Performance Monitors; Performance Controllers; CPU allocation; Cluster-level CPU Resource Coordinator]

6 ECE Response Time Controller [Diagram labels: VM; CPU allocation; Response time; Response Time Controller; Response time set point] PID (Proportional-Integral-Derivative) controller – System modeling – Controller design – Controller analysis Example: a measured response time of 750 ms against a 700 ms set point gives an error of 50 ms; the controller increases the CPU allocation by 2.4%. Disturbances – Workload variation – Frequency variation

7 ECE Response Time Model Response time model: system identification – Model orders – Parameters [Figure: Model Orders and Error]

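8 ECE System Identification in Practice Operating point – Linearize the system model locally White noise – Generate a white-noise input Least squares method – Given the measured data, find the model parameters that make the model best fit it
    # Pseudocode: allocate, get_response_time and log are testbed helpers not shown on the slide.
    open T, "white_noise.log";          # one white-noise sample per line
    while (<T>) {
        chomp;
        $rand = int($_ * $p);           # scale the sample by $p
        $cpu = $rand;
        allocate $cpu;                  # apply the CPU allocation to the VM
        $t = get_response_time;         # measure the resulting response time
        log $cpu, $t;                   # record the input/output pair
        sleep $step;                    # wait one sampling period
    }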
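The least squares step can be illustrated with a minimal sketch. It assumes a first-order model t(k) = a·t(k−1) + b·c(k−1), where t is the response time and c is the CPU allocation, both taken as deviations from the operating point. The model order, the parameter values, and the synthetic data are assumptions for illustration only; the slides do not state them.

```python
import random


def fit_first_order(cpu, resp):
    """Least-squares fit of resp[k] = a*resp[k-1] + b*cpu[k-1]."""
    # Accumulate the 2x2 normal equations for the unknowns (a, b).
    s_tt = s_tc = s_cc = s_ty = s_cy = 0.0
    for k in range(1, len(resp)):
        t_prev, c_prev, y = resp[k - 1], cpu[k - 1], resp[k]
        s_tt += t_prev * t_prev
        s_tc += t_prev * c_prev
        s_cc += c_prev * c_prev
        s_ty += t_prev * y
        s_cy += c_prev * y
    det = s_tt * s_cc - s_tc * s_tc
    if det == 0:
        raise ValueError("input does not excite the system enough to fit")
    # Solve the normal equations with Cramer's rule.
    a = (s_ty * s_cc - s_cy * s_tc) / det
    b = (s_tt * s_cy - s_tc * s_ty) / det
    return a, b


# Synthetic, noise-free data standing in for the white-noise experiment log.
random.seed(0)
TRUE_A, TRUE_B = 0.5, -8.0  # hypothetical plant parameters
cpu = [random.uniform(-10.0, 10.0) for _ in range(200)]
resp = [0.0]
for k in range(1, len(cpu)):
    resp.append(TRUE_A * resp[k - 1] + TRUE_B * cpu[k - 1])

a, b = fit_first_order(cpu, resp)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real measurements the fit would not be exact, and the residual error for several candidate model orders is what a "model orders and error" comparison would look at.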

9 ECE Controller Design PID controller – Proportional – Integral – Derivative Design: pole placement [Diagram labels: Response time set point; Error; CPU allocation; VM; Response time]
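A discrete PID loop of this kind can be sketched as follows. This is an illustration under stated assumptions, not the controller from the paper: the gains are hand-picked rather than derived by pole placement, and the plant is a hypothetical first-order model linearized around an operating point of 50% CPU and 700 ms.

```python
class PID:
    """Discrete positional PID controller.

    The error is measured - set_point, so a response time that is too
    high (positive error) produces a positive CPU allocation change.
    """

    def __init__(self, kp, ki, kd, set_point):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.set_point = set_point
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measured):
        error = measured - self.set_point
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def simulate(steps=80, workload_shift=25.0):
    """Run the loop against a toy plant and return (cpu, response_time) pairs."""
    a, b = 0.5, -8.0              # hypothetical plant parameters
    cpu_op, rt_op = 50.0, 700.0   # assumed operating point (% CPU, ms)
    pid = PID(kp=0.02, ki=0.03, kd=0.005, set_point=rt_op)  # hand-picked gains
    rt = rt_op
    history = []
    for _ in range(steps):
        cpu = cpu_op + pid.update(rt)
        # A workload increase appears as a constant shift in response time.
        rt = rt_op + a * (rt - rt_op) + b * (cpu - cpu_op) + workload_shift
        history.append((cpu, rt))
    return history


history = simulate()
final_cpu, final_rt = history[-1]
print(f"final CPU allocation: {final_cpu:.2f}%, response time: {final_rt:.1f} ms")
```

In this toy setting the integral term is what removes the steady-state error after the workload shift: the allocation settles at a higher value while the response time returns to the set point.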

10 ECE Coordination Coordination of the two control loops: the power control loop acts, the CPU frequency changes (diagram labels: 1 GHz, 3 GHz), and the response time model changes. Does the response time control loop still work? Stability range: Settling time < 24 s The control period of the power control loop is selected to be longer than the settling time of the response time control loop.
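The timing argument can be sketched numerically. The example below assumes a toy first-order plant under integral control and models a CPU frequency change as a scaling of the plant gain. It checks that the inner (response time) loop still settles for a few gain scalings and then picks an outer (power) control period longer than the worst-case settling time. All numbers, including the 4 s inner period, are hypothetical.

```python
def settling_steps(gain_scale, ki=0.03, steps=200, tol=1.0):
    """Return the step count after which |error| stays within tol, or None.

    Toy plant (assumed): e(k+1) = a*e(k) + gain_scale*b*u(k) + w, where
    gain_scale models the plant gain change caused by a frequency change
    and w is a constant workload disturbance.
    """
    a, b, w = 0.5, -8.0, 25.0
    e, u = 0.0, 0.0
    last_violation = -1
    for k in range(steps):
        u += ki * e                         # integral control action
        e = a * e + gain_scale * b * u + w  # plant response
        if abs(e) > tol:
            last_violation = k
    if last_violation == steps - 1:
        return None                         # still outside the band at the end
    return last_violation + 1


INNER_PERIOD_S = 4.0        # hypothetical response time control period (s)
scales = [0.5, 1.0, 2.0]    # hypothetical gain changes across frequencies

worst = max(settling_steps(g) for g in scales)
power_period_s = (worst + 1) * INNER_PERIOD_S
print(f"worst-case settling: {worst * INNER_PERIOD_S:.0f} s, "
      f"power control period: {power_period_s:.0f} s")
```

Choosing the outer period this way lets the inner loop converge between two consecutive power control actions, so each loop can be analyzed as if the other were in steady state. A gain scaling far outside the checked range (13.0 in this toy model) makes `settling_steps` return `None`, which is the analogue of leaving the stability range.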

11 ECE System Implementation Servers – 2 Intel servers – 2 AMD servers – Storage server (NFS) VMs – 512 MB RAM, 10 GB storage via NFS, 2 VCPUs – Xen 3.1 with Credit scheduler – CPU allocation: cap in Credit scheduler Workload – PHP + Apache benchmark [Diagram labels: Server1, Server2, Server4, Storage (NFS)]

12 ECE Response Time Control Workload increase on VM2: the response time of VM2 is controlled to 700 ms by increasing its CPU resource allocation.

13 ECE Response Time Control Change CPU frequency Change CPU frequency Change workload – Set point: 700 ms – Standard deviation: 51 – Set point: 700 ms – Standard deviation: 57

14 ECE Coordination: Power Budget Reduction Compare with baseline: power control only. Co-Con: power and response time guarantee. Baseline (power control only; [Minerick'02], [Lefurgy'08], [Wang'08], [Juang'05], etc.): violation of performance requirements. Performance control only: – Power budget violation – Undesired server shutdown

15 ECE Conclusion Co-Con: Coordinated control of power and application performance – Simultaneous control of power and performance Cluster-level power budget guarantee for server racks Application-level performance guarantee – Effective control despite workload/ CPU frequency variations

No “Power” Struggles: Coordinated Multi-level Power Management for the Data Center Ramya Raghavendra*, Parthasarathy Ranganathan†, Vanish Talwar†, Zhikui Wang†, Xiaoyun Zhu† *University of California, Santa Barbara †HP Labs, Palo Alto

17 ECE The Problem [Figure labels: Average power, Peak thermal power, Peak electrical power; CPU, Server, Enclosure, Rack; OS-wlm, OS-gwlm, SIM, Vmotion, VM-res.all, LSF; VM heterogeneity; local optima, global optima; performance] CHAOS!! (“Power” Struggle)

18 ECE Research Questions Co-ordination Design – How to ensure correctness, stability, efficiency? – How to make local decisions with incomplete global info? – How to build in support for dynamism? Implications of Co-ordination – Can we simplify or consolidate controllers? – Do we revisit policies and mechanisms of the controllers? – How sensitive is the design to apps and systems considered?

19 ECE A “Representative” Subset of Problems Overlap in objective functions Overlap in actuators Different time constants Different problem formulations

20 ECE Solution in This Paper First unified architecture for data center power management – Interfaces and information exchange between loops – Leverages feedback control theory – Evaluation on real-world traces: significant savings Insights on design trade-offs – Architectural alternatives for various objective functions – Implementation alternatives (time constants and hw/sw) – Mechanisms (p-states, VMs) & policies (pre-emptive, fair-share, …)

21 ECE System Models Power model: Performance model:

22 ECE Unified and Extensible Architecture

23 ECE Coordination SM:Expose API to EM and GM to change power budget EC:Expose API to SM to change r_ref EM: Expose API to GM to change power budget VMC: Use "real utilization"; use power budgets as constraints
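The interfaces listed on this slide can be sketched as a small class hierarchy. The sketch reads EC, SM, EM and GM as efficiency controller, server manager, enclosure manager and group manager; the slide gives only the abbreviations, so these expansions are an assumption. The method names, the equal-share budget split, and the rule the SM uses to adjust r_ref are hypothetical and only illustrate the direction of the calls.

```python
class EfficiencyController:
    """EC: tracks a utilization reference r_ref (assumed range 0.1 to 1.0)."""

    def __init__(self, r_ref=0.75):
        self.r_ref = r_ref

    def set_r_ref(self, r_ref):
        # API exposed to the SM.
        self.r_ref = min(max(r_ref, 0.1), 1.0)


class ServerManager:
    """SM: enforces a per-server power budget by adjusting the EC's r_ref."""

    def __init__(self, ec, power_budget, gain=0.001):
        self.ec = ec
        self.power_budget = power_budget
        self.gain = gain  # hypothetical adjustment gain

    def set_power_budget(self, watts):
        # API exposed to the EM and the GM.
        self.power_budget = watts

    def step(self, measured_power):
        # Assumed policy: over budget raises the utilization target,
        # which pushes the EC toward a lower frequency.
        delta = self.gain * (measured_power - self.power_budget)
        self.ec.set_r_ref(self.ec.r_ref + delta)


class EnclosureManager:
    """EM: splits its budget across servers (equal shares, an assumption)."""

    def __init__(self, servers, power_budget):
        self.servers = servers
        self.set_power_budget(power_budget)

    def set_power_budget(self, watts):
        # API exposed to the GM.
        self.power_budget = watts
        share = watts / len(self.servers)
        for sm in self.servers:
            sm.set_power_budget(share)


class GroupManager:
    """GM: splits a group budget across EMs or stand-alone SMs."""

    def __init__(self, children):
        self.children = children

    def distribute(self, watts):
        share = watts / len(self.children)
        for child in self.children:
            child.set_power_budget(share)


ec1, ec2 = EfficiencyController(), EfficiencyController()
sm1, sm2 = ServerManager(ec1, 300.0), ServerManager(ec2, 300.0)
em = EnclosureManager([sm1, sm2], 600.0)
gm = GroupManager([em])

gm.distribute(500.0)   # GM -> EM -> SM: each server now has a 250 W budget
sm1.step(270.0)        # over budget: r_ref rises
sm2.step(230.0)        # under budget: r_ref falls
print(sm1.power_budget, round(ec1.r_ref, 3), round(ec2.r_ref, 3))
```

The point of the layering is that each controller only needs the narrow setter of the level below it, so a budget change at the group level propagates downward without any level knowing another's internals.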

24 ECE Implementation Not implemented in hardware testbed – Requires many servers – Requires DVFS support – Each controller must be individually configured – Requires real world applications Simulation – Trace-driven simulation – Power / performance models from real hardware

25 ECE Results: Benefits from coordination, compared with a baseline without control

26 ECE VM Migration vs. Local Power Control Coordinated solution provides the most power savings

27 ECE Guaranteeing Stability (1) This paper provides a stability guarantee for EC and SM – Server-level performance and power control Stability of EC – Assumptions: CPU frequency is continuous; frequency demand of workloads is constant; CPU utilization is defined as – Control law – Stability proof Since, this paper proves

28 ECE Guaranteeing Stability (2) Stability of SM – Assumptions: the settling time of EC is shorter than the control period of SM; power consumption can be modeled as – Controller – Closed-loop system – System is stable

29 ECE Conclusions Coordination architecture for five individual solutions Simulations based on close to 200 server traces from real-world enterprise deployments Compared with non-coordinated solution – Fewer constraint violations – More power efficient

30 ECE Critiques of Co-Con Average response time is not an ideal performance metric – Can be extended to 90th-percentile response time The response time monitor is not perfectly implemented Only the CPU resource is considered – Extension to IO, network, etc. Evaluation is based on simple workloads – A simple PHP script – Single tier – No IO/database operations

31 ECE Critiques of No “Power” Struggles Controllers are highly coupled Performance model is oversimplified Coordination between VMC and EC is oversimplified – How can CPU be allocated to VMs? – How will DVFS affect the performance of multiple VMs? – How about heterogeneous servers? Lack of implementation in real hardware

32 ECE Comparison of Two Papers
Aspect | Co-Con | No “Power” Struggles
Performance metric | Response time | Percentage of work done
Number of levels | 3 | 5
Coordination | Two control loops are designed independently with coordination analysis | Control loops are coupled with APIs
Evaluation | Testbed | Simulation
Power-aware VM consolidation | No | Yes
Stability proof | Time domain + z-domain | Time domain

33 ECE Q&A Acknowledgments: Some slides are adapted from the slides of Vanish Talwar

Backup Slides

Cluster-level CPU Resource Coordinator

Response Times and CPU Allocation of the VMs Under Different CPU Frequencies

Response Times and CPU Allocation of the VMs Under Different Workloads

38 ECE VMC in No “Power” Struggles

39 ECE Controllers in No “Power” Struggles