Improving Dynamic Voltage Scaling Algorithms with PACE
Jacob R. Lorch and Alan Jay Smith, University of California, Berkeley
June 18, 2001
To make the most of limited energy, pace yourself.

Outline
– The role of dynamic voltage scaling in energy management
– The importance of deadlines in voltage scheduling
– Why a fundamental accepted principle about optimal voltage scheduling is wrong
– The optimal formula for improving voltage scheduling algorithms (the PACE approach)
– Practically applying the optimal formula using statistical modeling
– Results: PACE reduces CPU energy consumption by as much as 49% with no impact on performance

Dynamic Voltage Scaling (DVS) for Energy Management
Energy management is important
– Limits on battery life and weight
– Battery technology advancing slowly
– High electricity prices
Dynamic voltage scaling (DVS)
– The ability to quickly and efficiently change the supply voltage of the CPU without rebooting
– Starting to appear in commercial CPUs (Transmeta, AMD, etc.)
– Trades off power consumption for performance, because lower voltages necessitate lower speeds
– Energy consumption per cycle is CV², and maximum speed is roughly proportional to voltage, so CPU energy per cycle is roughly proportional to speed squared
When is lower energy worth the lower speed?
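As a quick numerical illustration of the speed-squared approximation stated above (a sketch, not from the slides; the speeds are made up):

```python
def relative_energy(speed, ref_speed):
    """Per-cycle energy at `speed` relative to `ref_speed`, using the
    energy-per-cycle ~ speed**2 approximation from the slide."""
    return (speed / ref_speed) ** 2

# Halving the clock (and hence the voltage) quarters the energy per cycle,
# at the cost of doubling the time each cycle takes.
print(relative_energy(100, 200))  # -> 0.25
```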

Scheduling Goals: Deadline-Based
Applicable scenarios
– The CPU is working on one or more tasks, each with a deadline
– As long as each task completes by its deadline, performance is considered equivalent
– The deadline may be soft, i.e., the CPU may not need to complete the task by then with probability 100%
Examples
– A CD/DVD player that does not buffer frames
– User-interface events such as key presses and mouse operations: very common in traditional notebook computer use
Goals
– Complete as many tasks by their deadlines as possible
– Minimize delay: time taken beyond deadlines
– Minimize energy consumption
Various algorithms exist (Weiser, Pering, Grunwald, etc.)

Model
CPU
– Some minimum speed m and maximum speed M; can take any value in between
– Energy consumption per cycle proportional to speed squared
Tasks
– Known deadline D
– Unknown work requirement W (number of CPU cycles needed)
– Completion time depends on the speed used
– Effective completion time is the completion time or D, whichever is larger; this reflects the fact that tasks completing before their deadlines might as well have completed at their deadlines
– Excess is the amount of work left over at the deadline
– Delay is the time spent after the deadline doing excess work
– One task at a time

Speed Schedules for Tasks with Deadlines
– A speed schedule has a pre-deadline part and a post-deadline part
– A schedule has a certain number of pre-deadline cycles (PDC)
  – The deadline is missed if W > PDC
  – The excess will be W − PDC
– Two schedules are performance equivalent if, no matter what W is, the task has the same effective completion time under both
– If two schedules have equal PDCs and identical post-deadline parts, they are performance equivalent
– We can improve a DVS algorithm by replacing its speed schedule with a performance-equivalent one with lower expected energy consumption
(Figure: two speed schedules, each with a pre-deadline part and a post-deadline part; PDC = 9,500,000 cycles for both schedules.)
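The definitions above can be made concrete in code. The sketch below represents a schedule as a list of (cycles, speed) segments; the two example schedules, their speeds, and the 300 MHz post-deadline part are made up for illustration, but both have the same PDC (10 Mc) and identical post-deadline parts, so they are performance equivalent:

```python
def completion_time(schedule, work):
    """Time to finish `work` cycles under `schedule`, a list of
    (cycles, speed_hz) segments executed in order."""
    t = 0.0
    for cycles, speed in schedule:
        if work <= cycles:
            return t + work / speed
        t += cycles / speed
        work -= cycles
    raise ValueError("schedule does not cover the requested work")

def effective_completion_time(schedule, work, deadline):
    """Completion time or the deadline, whichever is larger."""
    return max(completion_time(schedule, work), deadline)

deadline = 0.050  # 50 ms
# Both schedules: PDC = 10 Mc (the pre-deadline part ends exactly at the
# deadline) and an identical post-deadline part (5 Mc at 300 MHz).
sched_a = [(10_000_000, 200e6), (5_000_000, 300e6)]
sched_b = [(5_000_000, 125e6), (5_000_000, 500e6), (5_000_000, 300e6)]

# Performance equivalent: the effective completion times match for every W.
for w in (3_000_000, 10_000_000, 12_000_000):
    print(w, effective_completion_time(sched_a, w, deadline),
          effective_completion_time(sched_b, w, deadline))
```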

A Constant Speed Is Not Ideal
When W is known, the ideal schedule uses a fixed speed, since power is a concave-up function of speed. However, W is generally unknown, and in that case a constant-speed schedule is not always ideal.
Intuition: the task may be long or short; if we run slowly at first, we may never get to the high-energy late part.
Example using a 50 ms deadline
– Work requirement is 5 Mc with probability 75% and 10 Mc with probability 25%
– The ideal constant speed is 200 MHz
– Another speed schedule: run at 163 MHz for the first 30.7 ms, then 259 MHz for the remaining 19.3 ms
– The two are performance equivalent, but the latter's expected energy consumption is 13.3% less
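The energy saving in this example can be checked with a short calculation. This sketch takes per-cycle energy as speed squared (arbitrary units; only the ratio matters) and assumes the two-speed schedule switches speeds after the first 5 Mc of work:

```python
def expected_energy(schedule, work_dist):
    """Expected energy of `schedule` (a list of (cycles, speed) segments)
    over `work_dist` (a list of (work_cycles, probability) pairs),
    with per-cycle energy taken as speed**2."""
    total = 0.0
    for work, prob in work_dist:
        energy, remaining = 0.0, work
        for cycles, speed in schedule:
            done = min(remaining, cycles)
            energy += done * speed ** 2
            remaining -= done
            if remaining == 0:
                break
        total += prob * energy
    return total

dist = [(5_000_000, 0.75), (10_000_000, 0.25)]    # W: 5 Mc or 10 Mc
constant = [(10_000_000, 200.0)]                  # 200 MHz throughout
paced = [(5_000_000, 163.0), (5_000_000, 258.7)]  # accelerate only if needed

saving = 1 - expected_energy(paced, dist) / expected_energy(constant, dist)
print(f"expected energy saving: {saving:.1%}")    # about 13%
```

The constant schedule always pays the high 200 MHz rate; the paced schedule pays the expensive 259 MHz rate only on the 25% of tasks that turn out to be long.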

What Is the Ideal Replacement for the Pre-Deadline Speed Schedule?
It depends on the probability distribution of the task's work W.
Define F to be the cumulative distribution function of W
– F(w) is the probability that W ≤ w; Fc(w) is the probability that W > w
Express the speed schedule as a function of work completed, s(w)
– Makes the optimization problem easier
– Straightforward to convert to a function of time
We want to minimize expected pre-deadline energy consumption, ∫₀^PDC Fc(w) s(w)² dw, subject to ∫₀^PDC 1/s(w) dw = D.
Ideal speed schedule: s(w) = S₀ [Fc(w)]^(−1/3), bounded between m and M.
– The constant of proportionality S₀ is chosen so the task meets its deadline
We call this replacement approach PACE: Processor Acceleration to Conserve Energy
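For a discrete work distribution, the optimal formula can be applied piecewise. The sketch below is an illustrative implementation, not the paper's: it omits the [m, M] clamping, and it schedules every possible cycle before the deadline (i.e., it takes PDC to be the maximum possible work):

```python
def pace_schedule(work_dist, deadline):
    """Pre-deadline PACE schedule s(w) = S0 * Fc(w)**(-1/3) for a discrete
    work distribution, given as (work_cycles, probability) pairs sorted by
    work.  Returns a list of (cycles, speed) segments."""
    segments, prev_w, fc = [], 0, 1.0
    for w, p in work_dist:
        segments.append((w - prev_w, fc ** (-1 / 3)))  # unscaled speed
        prev_w, fc = w, fc - p
    # Pick S0 so the whole pre-deadline part takes exactly `deadline`.
    s0 = sum(cycles / speed for cycles, speed in segments) / deadline
    return [(cycles, s0 * speed) for cycles, speed in segments]

# The two-point example from the previous slide: this recovers the
# 163 MHz / 259 MHz schedule quoted there.
sched = pace_schedule([(5_000_000, 0.75), (10_000_000, 0.25)], 0.050)
for cycles, speed in sched:
    print(f"{cycles:,} cycles at {speed / 1e6:.0f} MHz")
```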

Estimating the Work Probability Distribution
Obtain a sample of recent tasks' work requirements
– Possibly weight more recent ones more heavily
– Keep separate samples for different task types, for example tasks initiated by key presses in Microsoft Word or tasks initiated by releasing the left mouse button in Microsoft Excel
Infer the distribution from the sample
(Figure: a table of per-task cycle counts feeding a distribution estimation method, which outputs a probability distribution over cycles.)

Sampling Methods
Future
– Use all tasks in the workload, even ones that haven't occurred yet
– Impractical, but can be simulated to show the idealized effect
All
– Use all tasks seen so far, equally weighted
Recent-k
– Use the most recent k tasks seen, equally weighted
LongShort-k
– Use the most recent k tasks seen, weighting the most recent k/4 three times more heavily
Aged-a
– Use all tasks seen so far
– Weight the kth most recent by a^k (a ≤ 1)
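The Aged-a weighting can be sketched in a few lines. This assumes k starts at 0 for the most recent task (the slide does not pin down the indexing; starting at 1 only rescales all weights by a), and the sample values below are made up:

```python
def aged_weights(sample, a):
    """Aged-a weights for `sample`, ordered most recent first: the k-th
    most recent task gets weight a**k (0 < a <= 1), normalized to sum to 1."""
    raw = [a ** k for k in range(len(sample))]
    total = sum(raw)
    return [w / total for w in raw]

# With a = 0.95, recent tasks dominate but old ones never vanish entirely.
print(aged_weights([12_000, 9_500, 14_200, 8_000], 0.95))
```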

Distribution Estimation Methods
Parametric
– Assume a given form of distribution
– Estimate the parameters of that distribution
– Examples: Normal (mean and standard deviation), Gamma (shape and scale)
Nonparametric
– Do not assume any form for the distribution
– Let the data "speak for themselves"
– Kernel density estimation: each sample point is represented by a small subdistribution; we choose triangular subdistributions due to high efficiency and ease of implementation
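A kernel estimate with triangular subdistributions can be sketched as follows. What PACE actually needs is the survival function Fc(w) = P(W > w); the half-width h is a free smoothing parameter, and this is an illustrative implementation, not the paper's exact one:

```python
def triangular_cdf(x, center, h):
    """CDF at x of a triangular subdistribution on [center - h, center + h]
    peaking at `center`."""
    u = (x - center) / h
    if u <= -1.0:
        return 0.0
    if u < 0.0:
        return (u + 1.0) ** 2 / 2.0
    if u < 1.0:
        return 1.0 - (1.0 - u) ** 2 / 2.0
    return 1.0

def survival_estimate(w, sample, h):
    """Kernel estimate of Fc(w) = P(W > w): each sample point contributes
    one triangular subdistribution, and the contributions are averaged."""
    return 1.0 - sum(triangular_cdf(w, x, h) for x in sample) / len(sample)

# Hypothetical sample of per-task cycle counts, smoothed with h = 1 Mc.
tasks = [4_800_000, 5_100_000, 5_300_000, 9_900_000]
print(survival_estimate(6_000_000, tasks, 1_000_000))
```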

Simulation Results
(Simulated CPU: 100–500 MHz, maximum power consumption 3 W)

Which Sampling Method Works Best?
Future is sometimes good but often quite bad
– Shows the value of recent information
– Nice, considering the method can't be implemented
Aged is a bit better than LongShort, which is a bit better than Recent
– The differences are minor enough that implementation efficiency is most important

Which Distribution Estimation Method Works Best?
Statistical methods show a low probability of either parametric model fitting the data
– Nevertheless, for our limited purposes, parametric models work reasonably well
– The Gamma model never increases energy by more than 2.3%
– The Normal model is not as good

Effect of PACE on Existing Algorithms
Existing algorithms simulated
– Past/Weiser-style: similar to Weiser et al.
– LongShort/Chan-style: practical version of one of Chan et al.'s best
– Flat/Chan-style: another of Chan et al.'s best; always runs at a constant speed
– Past/Peg: Grunwald et al.'s best
Workloads simulated
– Word: letter key presses in Microsoft Word
– Excel: releases of the left mouse button in Microsoft Excel
– GroupWise: releases of the left mouse button in GroupWise
– Low-Level: key presses detected at a low level
– MPEG-One: playing the "Red's Nightmare" movie
– MPEG-Many: playing seven different movies sequentially

Effect of PACE with Word Workload

Effect of PACE with Excel Workload

Effect of PACE with GroupWise Workload

Effect of PACE with Low-Level Workload

Effect of PACE with MPEG-One Workload

Effect of PACE with MPEG-Many Workload

Summary of Results: Effect of PACE on Existing Algorithms
– PACE always improves the existing algorithms
– By definition, PACE never has any performance impact
– Also by definition, PACE does not alter post-deadline energy
– The Gamma model reduces CPU energy consumption by 2.4–49.0%
– The kernel method reduces CPU energy consumption by 1.4–49.5%
– Overall, PACE reduces energy consumption by up to 49%, with an average of 21%, with no performance impact

Overhead of Implementing PACE
Computing PACE schedules takes time and energy
Suggestion: match sampling and distribution estimation methods for maximum efficiency
– The kernel method goes well with Recent-k, and Gamma with Aged-a
Simulations show overhead is low all around
– Aged-0.95/Gamma is faster than Recent-28/Kernel

Conclusions
– Dynamic voltage scaling can substantially reduce energy consumption at the cost of slower operation
– A common scenario where dynamic voltage scaling is useful is a task with a known deadline but unknown CPU requirements
  – The user should not notice slower performance as long as the deadline is met
  – Common in user-interface-driven operation (which most operation is), as well as at other times, such as when playing media
– Existing scheduling algorithms can be improved using PACE
– PACE can be implemented practically and efficiently
– PACE reduces CPU energy consumption by up to 49%, with an average of 21%, with no impact on performance