Benchmarking for Power and Performance


Benchmarking for Power and Performance
Heather Hanson (UT-Austin), Karthick Rajamani (IBM/ARL), Juan Rubio (IBM/ARL), Soraya Ghiasi (IBM/ARL), Freeman Rawson (IBM/ARL)
IBM Austin Research Lab

The Future
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY, WASHINGTON, D.C. 20460, OFFICE OF AIR AND RADIATION
December 28, 2006
Dear Enterprise Server Manufacturer or Other Interested Stakeholder,
“The purpose of this letter is to inform you that the U.S. Environmental Protection Agency (EPA) is initiating its process to develop an ENERGY STAR specification for enterprise computer servers. In the coming months, EPA will conduct an analysis to determine whether such a specification for servers is viable given current market dynamics, the availability and performance of energy-efficient designs, and the potential energy savings…”
IBM Austin Research Lab

Current EPA protocol for power/performance benchmarking
EPA Server Energy Measurement Protocol: http://www.energystar.gov/ia/products/downloads/Finalserverenergyprotocol-v1.pdf
The recommendation is that system vendors provide curves showing power consumption under different loads:
- Run at maximum load (100%)
- Repeat with runs at reduced loads, until load reaches 0%
Consumers are expected to use this curve to estimate their own overall energy consumption:
- Multiply average utilization by the appropriate point on the curve to get costs (see the sketch after this slide)
Considerations:
- End-to-end numbers can give the wrong results
- Average utilization does not necessarily correlate with average power: what is the distribution of utilization? When does utilization peak?
- Which power management techniques are used? Commercially available processors and systems are beginning to incorporate much more aggressive dynamic power management, which the existing protocol does not capture well.
Figure 1 from the EPA Server Energy Measurement Protocol document
IBM Austin Research Lab
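
A minimal sketch of the cost-estimation step the protocol recommends: interpolate the vendor-supplied power curve at the consumer's average utilization and convert to an annual cost. The curve points, electricity rate, and the 42% average (the NYSE figure cited on a later slide) are illustrative assumptions, not values from the EPA document.

    # Sketch: estimating energy cost from an EPA-style power curve.
    # Curve points and electricity rate below are illustrative assumptions.
    import numpy as np

    load_pct = np.array([0, 25, 50, 75, 100])        # benchmark load levels
    power_w  = np.array([150, 210, 260, 300, 330])   # assumed measured wall power (W)

    def power_at(avg_util_pct):
        # Linear interpolation between the measured load levels.
        return np.interp(avg_util_pct, load_pct, power_w)

    avg_util = 42.0                                  # e.g., the NYSE average
    rate = 0.10                                      # $/kWh, assumed
    annual_kwh = power_at(avg_util) * 8760 / 1000.0
    print(f"estimated draw: {power_at(avg_util):.0f} W, "
          f"annual cost: ${annual_kwh * rate:,.0f}")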

Current EPA protocol for power/performance benchmarking (continued)
Sample savings from aggressive management: the same figure, modified to show the behavior of a power-managed system
IBM Austin Research Lab

Overview
- Power and thermal problems in computer systems are becoming more common.
- Power and thermal management techniques have significant implications for system performance.
- Researchers rely on benchmarks to develop models of system behavior and to experimentally evaluate new ideas.
- The recent EPA announcement and SPEC OSG activities add urgency to resolving the power/performance benchmarking issues.
- We present our experiences with adapting performance benchmarks for use in power/performance research, focusing on two problems: variability and its effect on system power management, and collecting correlated power and performance data.
- Benchmarking for combined power and performance analysis has features distinct from traditional performance benchmarking.
IBM Austin Research Lab

What should power/performance benchmarks expose?
Intensity variation: workload variation in system utilization
- Workloads differ from one another
- A single workload may vary over time
Nature-of-activity variation: workload variation in program characteristics
- A single workload may change how it uses system components over time
- The rate at which this change occurs also varies over time
System response
- How quickly does the system respond to changing characteristics?
- How well does the system respond to changing characteristics? Specifically for power-managed systems: how well does it meet power constraints, meet performance constraints (response time? throughput?), manage the power-versus-performance trade-off, and maximize its energy efficiency?
Component-level variation
- “Identical” components are not actually identical in power
IBM Austin Research Lab

Utilization variation
- Workload intensity varies over time, and it also varies with the purpose of the system.
- The EPA protocol suggests that sites should expect to pay for their average utilization.
- For example, the NYSE profile averages 42% of peak utilization, but a system provisioned for the average may be unable to handle the peak load. The sketch below illustrates how an average-based estimate and a trace-based one can diverge.
IBM Austin Research Lab
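
To make the point concrete, this sketch compares the protocol-style estimate (power at the average utilization) against the average of the power curve evaluated over a time-varying trace. The trace is synthetic and the power curve is the same assumed one as in the earlier sketch; the two numbers differ whenever the curve is nonlinear or the system manages power dynamically.

    # Sketch: average-utilization estimate vs. integrating over the trace.
    # The utilization trace and power curve are synthetic assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    trace = np.clip(42 + 30 * np.sin(np.linspace(0, 20, 1440))
                    + rng.normal(0, 8, 1440), 0, 100)   # minute-level util (%)

    load_pct = np.array([0, 25, 50, 75, 100])
    power_w  = np.array([150, 210, 260, 300, 330])      # assumed curve

    p_avg   = np.interp(trace.mean(), load_pct, power_w)   # protocol-style estimate
    p_trace = np.interp(trace, load_pct, power_w).mean()   # trace-based average power
    print(f"avg-based: {p_avg:.1f} W  trace-based: {p_trace:.1f} W")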

A closer look at the NYSE trace
How can the changes in utilization be exploited to reduce power consumption without unduly harming performance?
- DVFS is the most common solution: lower the frequency and voltage while still meeting the response-time criterion (a first-order model follows this slide).
- Other techniques can be employed instead: throttling, or CPU packing with deep power saving.
- Different techniques give different results depending on the system design and workload: DVFS may provide better results for System 1 with Workload A, while CPU packing provides better results for System 2 running Workload B.
The current EPA protocol does not capture the time-varying nature of system utilization. Any new power/performance benchmark should capture this behavior: response to time-varying behavior is a key feature of any power management implementation.
IBM Austin Research Lab
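
A rough model of why DVFS helps, assuming the textbook dynamic-power relation P ≈ C·V²·f. The voltage/frequency pairs and capacitance below are illustrative placeholders, not values for any real part; the point is the super-linear savings from scaling voltage together with frequency.

    # Sketch: first-order DVFS savings, P_dyn ~ C * V^2 * f (textbook model).
    # Voltage/frequency pairs and capacitance are illustrative, not measured.
    f_nom, v_nom = 2.0e9, 1.3          # nominal frequency (Hz) and voltage (V)
    c_eff = 1.0e-9                     # effective switched capacitance (F), assumed

    def dyn_power(f, v):
        return c_eff * v * v * f

    p_nom = dyn_power(f_nom, v_nom)
    for scale, volt in [(1.0, 1.3), (0.8, 1.15), (0.6, 1.0)]:
        p = dyn_power(f_nom * scale, volt)
        print(f"{scale:.0%} freq: {p / p_nom:.0%} of nominal dynamic power")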

Cautionary note on scaling transactions for benchmarking
Trivial example:
- Let the maximum permissible utilization be 85% per processor (340% on the previous graph), reached at 600 transactions/minute (approximately 10 transactions per second).
- 10% of the maximum permissible utilization is 34%; assume this corresponds to 60 transactions/minute.
- There are many different ways to distribute those 60 transactions over time, and different ways of scaling transactions will give different results depending on which power management methods are available. The total number of transactions is the same in the three cases below (sketched after this slide).
- A distribution centered on the average is probably the most realistic option; other ways of scaling transactions may make it easy to “cheat” on the results.
How the time-varying nature of workloads changes is important because it determines whether certain techniques are responsive enough, both in the rapidity and in the range of their response.
IBM Austin Research Lab
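
A sketch of the scaling choices discussed above: three ways to place the same 60 transactions within one minute (uniform, a single burst, and a bell-shaped spread around the average). The pattern shapes are illustrative choices, not prescribed by any benchmark; each stresses a power management scheme differently even though the totals match.

    # Sketch: three arrival patterns with the same total (60 txn/minute).
    # The shapes are illustrative, not taken from any benchmark specification.
    import numpy as np

    seconds = 60
    uniform = np.full(seconds, 1.0)                   # 1 txn every second
    burst = np.zeros(seconds); burst[:6] = 10.0       # all 60 in the first 6 s
    t = np.arange(seconds)
    bell = np.exp(-0.5 * ((t - 30) / 8.0) ** 2)
    bell *= 60.0 / bell.sum()                         # bell spread, total 60

    for name, pattern in [("uniform", uniform), ("burst", burst), ("bell", bell)]:
        assert abs(pattern.sum() - 60.0) < 1e-6       # same total work
        print(f"{name:8s} peak rate: {pattern.max():5.2f} txn/s")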

Variation in the nature of a workload’s activity
- What about systems under identical utilization but running different workloads? How a workload uses system resources, including the processor, has a significant impact on power consumption.
- On the Pentium M, power consumption at the same system utilization can vary by a factor of 2.
- Power variation due to workload will increase: processors will have more clock gating and will employ other, more aggressive power-saving techniques more extensively, and additional components are adding power-reducing techniques such as memory power-down, disk idle power reduction, and so on.
IBM Austin Research Lab

Variation during a SPEC CPU2000 run
IBM Austin Research Lab

A closer look at gcc and gzip
gcc
- Behavior is nearly chaotic at OS-level time scales
- Power management techniques must respond on microarchitectural-level time scales
gzip
- Exhibits a number of different stable phases at OS-level time scales
- Power management techniques can be slower to respond and can have some associated overhead, which gets amortized
IBM Austin Research Lab

Why does the nature of activity matter?
What should be considered here?
- What system and processor resources are being used by the workload? Is it memory-bound? CPU-bound with many mispredictions?
- How rapidly is the behavior of the workload changing? Are there stable phases, and at what time scale? (A sketch of one way to test this follows.)
- How rapidly can the various power management techniques respond to changing conditions, and when? Sometimes “too fast” can be as bad as “too slow”.
Strive to capture realistic temporal variation in workload activity, so that the power management solutions of different systems can be contrasted.
The current EPA protocol relies only on level scaling of an application’s throughput relative to its maximum:
- Ignores differences between and within applications
- Has no variation in intensity over very long periods of time
- Can breed dependence on slow-response techniques only (DVFS, low-power sleep modes)
IBM Austin Research Lab
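
One way to make the “stable phases, at what time scale?” question concrete: a sliding-window coefficient-of-variation test over an activity trace, flagging windows stable enough for slow-response techniques such as DVFS. The traces here are synthetic stand-ins for gzip-like and gcc-like behavior, and the window size and threshold are assumptions, not values from the presentation.

    # Sketch: classify windows of an activity trace (IPC) as stable or chaotic.
    # Traces, window size, and threshold are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(1)
    gzip_like = np.repeat(rng.uniform(0.5, 2.0, 6), 200)   # long stable phases
    gcc_like = rng.uniform(0.5, 2.0, 1200)                 # near-chaotic samples

    def stable_fraction(ipc, window=50, cv_max=0.1):
        wins = ipc[: len(ipc) // window * window].reshape(-1, window)
        cv = wins.std(axis=1) / wins.mean(axis=1)          # coefficient of variation
        return (cv < cv_max).mean()

    print(f"gzip-like stable windows: {stable_fraction(gzip_like):.0%}")
    print(f"gcc-like  stable windows: {stable_fraction(gcc_like):.0%}")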

Other sources can impact power, but are harder to capture
- “Identical” components from the same manufacturer consume different amounts of power: process variation, part binning.
- “Identical” components from different manufacturers consume different amounts of power: different design criteria may produce the same functional spec but different implementations.
- Power supply efficiency varies with the load and across power supplies.
- Different environments cause components to consume different amounts of power: temperature, humidity, type of heat sink, etc. For example, a hotter datacenter can increase power consumption because cooling becomes less effective (smaller ΔT) and leakage power increases (exponential dependence on T; see the toy model below).
IBM Austin Research Lab
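
A toy illustration of the leakage point, assuming a simple exponential model P_leak(T) = P0·exp((T − T0)/k). The constants are placeholders; real parts would need fitted values, but the shape shows why a few degrees of datacenter warming compounds.

    # Sketch: leakage power vs. temperature under an assumed exponential model.
    # P0, T0, and k are placeholder constants, not measured values.
    import math

    def leakage_w(temp_c, p0=20.0, t0=50.0, k=25.0):
        return p0 * math.exp((temp_c - t0) / k)

    for temp in (50, 60, 70, 80):
        print(f"{temp} C: {leakage_w(temp):5.1f} W leakage")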

Variation across “identical” Pentium M processors
Five Pentium M parts were measured under the same thermal conditions and a constant workload (constant with respect to both power and instruction throughput).
IBM Austin Research Lab

Variation across “identical” DRAM from different vendors
These DRAM parts have identical performance characteristics according to their datasheets, but different power characteristics.
IBM Austin Research Lab

Consequences of additional variation?
Example: a vendor benchmarks a system with 100 processors of type A fabricated by manufacturer X and 200 DIMMs of memory of type B fabricated by manufacturer Y, placed in an “ideal” datacenter.
- The vendor can benchmark with energy-efficient parts that may not match the system purchased by the customer.
- What happens when parts start failing and are replaced? The initial estimate becomes incorrect as more parts are replaced with “identical” parts: power consumption can drift as expensive parts are replaced with cheaper parts that consume more energy (a rough simulation follows this slide).
- The system may eventually raise the temperature of the datacenter beyond the threshold at which it cools efficiently; system temperatures then rise in response to rising datacenter temperatures, and datacenter temperatures rise in response to rising system temperatures.
- The initial benchmark measurements may be useless.
What can be done to address these issues?
1 – Competitive benchmarking could be independently assessed
2 – Users need comprehensive information on the benchmarking environment to decide whether benchmarking results apply to them
3 – Site-specific benchmarking might be more appropriate for customers
IBM Austin Research Lab
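
A rough Monte Carlo of the replacement-drift scenario: start with parts binned at the efficient end of a power distribution, replace a few failures each year with parts drawn from the full spread, and watch fleet power climb. All distributions and failure rates here are invented for illustration.

    # Sketch: power drift as vendor-binned parts are replaced over time.
    # Part-power distributions and failure counts are illustrative inventions.
    import numpy as np

    rng = np.random.default_rng(2)
    n_parts = 100
    fleet = rng.normal(25.0, 0.5, n_parts)        # binned efficient parts (W)

    for year in range(1, 6):
        failed = rng.choice(n_parts, size=5, replace=False)   # ~5 failures/year
        fleet[failed] = rng.normal(27.0, 2.0, 5)              # off-the-shelf spread
        print(f"year {year}: fleet power {fleet.sum():.0f} W")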

Conclusions
Systems exhibit many types of variation that need to be captured in power/performance benchmarking:
- Intensity variation: workload variation in system utilization (workloads differ from one another; a single workload may vary over time)
- Nature-of-activity variation: workload variation in program characteristics (a single workload may change how it uses system components over time)
- Component-level variation: “identical” components are not actually identical in power or performance
- System response: how quickly and how well does the system respond to changing characteristics?
IBM Austin Research Lab

Questions? IBM Austin Research Lab

IBM/ARL – Power Aware Systems Group
- Focus is on power and thermal management techniques.
- Such techniques cannot be evaluated without good benchmarks.
- Developing and adapting benchmarks for power/performance characterization since 2002.
IBM Austin Research Lab

Correlating traces
- Send the power high to mark the beginning of a trace (see the sketch below).
- Capture both power and performance traces.
- Send the power high again to mark the end of the trace.
- Done through various utilities for Windows and Linux.
IBM Austin Research Lab
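
A hedged sketch of the marker idea: briefly spin every core in a busy loop so an external power meter records a spike aligned with the start and end of the performance trace. The marker duration is an assumption, and the power-capture side is whatever meter or utility the lab uses, so only the marker is shown.

    # Sketch: emit a power "marker" by briefly loading every core, so an
    # external power meter shows a spike aligned with the perf trace.
    # The busy-loop duration is an assumption; meter capture is external.
    import multiprocessing as mp
    import time

    def _spin(seconds):
        end = time.time() + seconds
        while time.time() < end:
            pass                                  # burn cycles -> power spike

    def power_marker(seconds=0.5):
        procs = [mp.Process(target=_spin, args=(seconds,))
                 for _ in range(mp.cpu_count())]
        for p in procs: p.start()
        for p in procs: p.join()

    if __name__ == "__main__":
        power_marker()                            # marks trace start
        time.sleep(5)                             # ... workload under test ...
        power_marker()                            # marks trace end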