Accurate Power and Energy Measurement on Kepler-based Tesla GPUs Martin Burtscher Department of Computer Science.

1 Accurate Power and Energy Measurement on Kepler-based Tesla GPUs
Martin Burtscher
Department of Computer Science

2 Introduction
- GPU-based accelerators
  - Quickly spreading in PCs and even handheld devices
  - Widely used in high-performance computing
- Power and energy efficiency
  - Heat dissipation is a problem
  - Electric bill and battery life are of growing concern
  - Exascale requires 50x boost in performance per watt
- Important research area
  - Need to develop techniques to reduce power and energy
  - Have to be able to measure power/energy of programs

3 GPU Power Sensors
- Hardware
  - High-end compute GPUs include power sensors
  - For example, K20/K40 Tesla cards have a built-in sensor
  - These cards are the target of this talk
- Software
  - Can query sensor with NVIDIA Management Library (NVML)
  - http://developer.nvidia.com/nvidia-management-library-nvml

4 Problems
- Power sensor data behaves strangely
  - Running the same kernel twice yields different energy
    - First launch: 114 J, second launch: 147 J (29% more energy)
  - Running a kernel 2x as long more than doubles energy
    - 1x input: 732 J, 2x input: 1579 J (8% above doubling)
  - Power sensor sampling rate varies greatly
    - Sampling interval ranges from 0.266 ms to 130 ms (3760 Hz down to 7.7 Hz)
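The percentages and rates quoted on this slide follow directly from the raw figures. A quick arithmetic check, using hypothetical helper names that are not part of the talk:

```python
def percent_above(baseline: float, value: float) -> float:
    """Return how far value exceeds baseline, in percent."""
    return (value / baseline - 1.0) * 100.0


def interval_ms_to_hz(interval_ms: float) -> float:
    """Convert a sampling interval in milliseconds to a rate in hertz."""
    return 1000.0 / interval_ms


print(percent_above(114.0, 147.0))       # second launch vs. first launch
print(percent_above(2 * 732.0, 1579.0))  # 2x input vs. twice the 1x energy
print(interval_ms_to_hz(0.266))          # shortest observed interval
print(interval_ms_to_hz(130.0))          # longest observed interval
```

Rounded, these come out to roughly 29%, 8%, 3760 Hz, and 7.7 Hz, which agrees with the numbers on the slide.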

5 Methodology
- Hardware
  - Two K20c, two K20m, two K20X, and two K40m GPUs
- Measurement
  - Query power and time in loop on “idle” CPU core
- Test code
  - Compute-intensive regular n-body kernel
  - Constant computation rate of over 2 TFlops on a K20c
  - No data dependences; vary n to adjust kernel runtime
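The measurement step (query power and time in a loop) might look like the sketch below. It is not the author's code: `poll_power` and its `read_power_w` argument are hypothetical names, and the reader is passed in as a callable so the loop can be tried without a GPU. With the pynvml bindings, such a callable could wrap `nvmlDeviceGetPowerUsage`, which reports milliwatts.

```python
import time
from typing import Callable, List, Tuple


def poll_power(read_power_w: Callable[[], float],
               duration_s: float,
               clock: Callable[[], float] = time.perf_counter
               ) -> List[Tuple[float, float]]:
    """Busy-poll a power reader and return (elapsed_s, power_w) pairs.

    read_power_w is a hypothetical callable returning watts.
    clock is injectable so the loop can be tested deterministically.
    """
    samples: List[Tuple[float, float]] = []
    start = clock()
    while True:
        now = clock()
        if now - start >= duration_s:
            break
        # Record the time stamp together with the reading so that later
        # processing does not have to assume evenly spaced samples.
        samples.append((now - start, read_power_w()))
    return samples
```

Recording a time stamp with every reading is the important design choice here, since the sensor's update interval is not constant.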

6 Expected Power Profile
[Figure] Annotations: Kernel starts executing; Kernel stops executing; GPU idle power; Measurement loop runtime

7 Measured Power Profile
[Figure] Annotations: Power ramps up slowly; Power ramps down slowly; Switch to step shape; Idle power reached; Macroscopic phenomena (time spans labeled 5 s, 3 s, and 4 s)

8 Energy = Area Under Power Curve
[Figure] Annotations: Integrate to where?; Unclear how big energy is; Missing energy?; Delayed energy?

9 Ramp-up Behavior of 2 Short Runs
[Figure] Annotations: Short run same as longer run; 2nd run starts higher but also follows curve; Ramp down doesn’t follow

10 Ramp-down Behavior of Several Runs
[Figure] Annotations: Shape depends on power at t2; Power increases after kernel done; Shape always the same; Steps down every second; Driver lowers power level

11 Sampling Interval Lengths
[Figure] Annotations: Short intervals; Wide range of intervals; Very long interval; Driver activity can prevent sampling

12 Sampling Interval Lengths (zoomed-in)
[Figure] Annotations: Identical values; Many short intervals; Very long interval; Sampled power only ever changes after long interval

13 Correcting the Measurements

14 Sampling Frequency
- Eliminate redundant samples
  - Only sample once every 15 ms (66.7 Hz)
  - Cannot accurately measure kernels under ~150 ms
- Account for the variation in interval length
  - Use high-resolution time stamps
- Example: energy from t1 to t4
  - Dotted (fixed intervals): 1205 J
  - Solid (variable intervals): 1066 J
  - 13% discrepancy
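The two ideas on this slide, dropping redundant samples and integrating over the actual interval lengths, can be sketched as follows. The function names are hypothetical, the trapezoidal rule is an assumed choice of integration scheme, and the numbers in the usage example are made up for illustration (they are not the 1205 J and 1066 J from the slide).

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (time_s, power_w)


def thin_samples(samples: Sequence[Sample], min_gap_s: float = 0.015) -> List[Sample]:
    """Keep at most one sample per min_gap_s (15 ms corresponds to 66.7 Hz)."""
    kept: List[Sample] = []
    for t, p in samples:
        if not kept or t - kept[-1][0] >= min_gap_s:
            kept.append((t, p))
    return kept


def energy_variable(samples: Sequence[Sample]) -> float:
    """Trapezoidal energy in joules using the real time stamps."""
    return sum(0.5 * (p0 + p1) * (t1 - t0)
               for (t0, p0), (t1, p1) in zip(samples, samples[1:]))


def energy_fixed(samples: Sequence[Sample], period_s: float) -> float:
    """Trapezoidal energy that wrongly assumes every interval is period_s long."""
    return sum(0.5 * (p0 + p1) * period_s
               for (_, p0), (_, p1) in zip(samples, samples[1:]))


# Made-up trace with one long gap between the second and third sample.
trace = [(0.0, 100.0), (1.0, 100.0), (3.0, 200.0)]
print(energy_variable(trace))    # uses the 1 s and 2 s intervals
print(energy_fixed(trace, 1.0))  # pretends both intervals are 1 s
```

Whenever interval lengths vary, the fixed-interval estimate drifts away from the time-stamped one, which is the kind of discrepancy the slide reports.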

15 True Power
- Sensor hardware
  - Seems to asymptotically approach true power
  - Reminiscent of capacitor charging
- True instant power
  - $P_{true}$ is a function of the slope of the power profile, $dP_{sensor}/dt$, and the power measured by the sensor, $P_{sensor}$:
    $P_{true} = P_{sensor} + C \times \frac{dP_{sensor}}{dt}$
- “Capacitance” of sensor
  - C ≈ 0.84 s on all tested K20 GPUs
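A minimal sketch of applying this formula to a list of time-stamped sensor readings is given below. The function name is hypothetical, and estimating the slope with a central difference over the neighboring samples (one-sided at the two ends) is an assumption; the talk only says that neighboring samples are used.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (time_s, power_w)


def correct_power(samples: Sequence[Sample], c_s: float = 0.84) -> List[Sample]:
    """Apply P_true = P_sensor + C * dP_sensor/dt to each sample.

    c_s is the sensor "capacitance" in seconds (0.84 s per the slide).
    The slope is approximated from the previous and next sample.
    """
    n = len(samples)
    corrected: List[Sample] = []
    for i, (t, p) in enumerate(samples):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        dt = samples[hi][0] - samples[lo][0]
        slope = (samples[hi][1] - samples[lo][1]) / dt if dt > 0 else 0.0
        corrected.append((t, p + c_s * slope))
    return corrected


# A sensor reading that rises by 10 W per second is corrected upward by C * 10 W.
ramp = [(0.0, 10.0), (1.0, 20.0), (2.0, 30.0)]
print(correct_power(ramp, c_s=0.5))
```

While the reading is rising, the corrected value lies above it, and while it is falling, below it, which pulls the slow ramps back toward a step shape.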

16 Back-calculated from Expected Profile
[Figure] Annotations: ‘Capacitor’ function matches measured values perfectly; Minimized absolute errors to determine C
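One way such a back-calculation could be done is sketched here. It assumes the sensor behaves like a first-order lag responding to a step from idle to active power, which is the solution of the talk's relation between true and sensed power for a constant true power. The grid search, the function names, and all parameter values are illustrative assumptions; the slide states only that absolute errors were minimized.

```python
import math
from typing import List, Sequence


def sensor_response(times_s: Sequence[float], p_idle: float, p_active: float,
                    t_start: float, c_s: float) -> List[float]:
    """Modeled sensor reading for a power step at t_start (ramp-up only)."""
    out = []
    for t in times_s:
        if t < t_start:
            out.append(p_idle)
        else:
            out.append(p_active - (p_active - p_idle) * math.exp(-(t - t_start) / c_s))
    return out


def fit_capacitance(times_s: Sequence[float], measured_w: Sequence[float],
                    p_idle: float, p_active: float, t_start: float,
                    candidates: Sequence[float]) -> float:
    """Return the candidate C with the smallest sum of absolute errors."""
    best_c, best_err = candidates[0], float("inf")
    for c in candidates:
        model = sensor_response(times_s, p_idle, p_active, t_start, c)
        err = sum(abs(m - s) for m, s in zip(model, measured_w))
        if err < best_err:
            best_c, best_err = c, err
    return best_c


# Synthetic "measurement" generated with C = 0.84 s, then recovered by the fit.
times = [0.1 * i for i in range(50)]
measured = sensor_response(times, 25.0, 100.0, 1.0, 0.84)
grid = [0.50 + 0.01 * k for k in range(100)]
print(fit_capacitance(times, measured, 25.0, 100.0, 1.0, grid))
```

A real fit would use measured sensor data rather than a synthetic trace, and a proper optimizer could replace the grid search.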

17 Corrected Power Profile
[Figure] Annotations: Wobbles due to sampling errors; Corrected profile matches expected rectangular profile; ‘Active idle’ power level

18 Correction of 2 Short Runs
[Figure] Annotation: Corrected power profile matches expected profile

19 Second K20c GPU
[Figure] Annotation: Identical to original K20c

20 K20m GPU
[Figure] Annotation: Similar profile but higher power level

21 K20X GPU
[Figure] Annotations: Profile is good, no correction needed!; Huge 600 ms gap

22 K40m GPU
[Figure] Annotation: K40m again requires correction

23 Application to Full CUDA Program
- Implementation of Barnes Hut n-body algorithm
- Taken from LonestarGPU benchmark suite
- Contains multiple regular and irregular kernels
- Highly optimized, but still suffers from load imbalance, divergence, and uncoalesced accesses
- Main kernel is ‘regularized’ (warp-based)
[Image credit: NASA/JPL-Caltech/SSC]

24 Barnes Hut Power Profile (1 Step)
[Figure] Annotations: Slow then fast drop-off; “Wave” in profile; Original profile is hard to interpret

25 Barnes Hut Power Profile (Kernels)
[Figure] Annotations: Slow then fast drop-off; “Wave” in profile; Original profile is hard to interpret

26 Corrected Barnes Hut Power Profile
[Figure] Annotations: Decrease due to load imbalance; Two similar irregular kernels; One more irregular kernel; Very short regular kernel; Regularized main kernel; Corrected profile reveals important info

27 K20Power Tool
- Output
  - Corrected profile and corresponding ‘active’ energy
- Features
  - Computes instant power using ‘capacitor’ formula
  - Employs high-resolution time steps
  - Samples at true frequency of 66.7 Hz
- Dissemination
  - Open source, research license
  - http://cs.txstate.edu/~burtscher/research/K20power/

28 Marcher System
- Tool will be part of Marcher system at Texas State
  - NSF-funded green computing infrastructure
- Marcher is a power-measurable cluster system
  - 832 general-purpose cores
  - 12,000 GPU and MIC cores
  - 1.2 TB of DDR3 with power throttling and scaling
  - 50 TB of hybrid storage with hard drives and SSDs
  - Component-level power measurement tools (e.g., CPU, DRAM, Disk, GPU, Xeon Phi)

29 Summary
- Correctly measuring K20/K40 power and energy
  - Sample at 66.7 Hz and include time stamps
  - Compute true power with presented formula
    - Use neighboring power samples to approximate slope
  - Compute true energy by integrating true power
    - Over intervals where power is above ‘active idle’
- K20Power tool
  - Software tool that implements this methodology
  - Paper at http://cs.txstate.edu/~burtscher/papers/gpgpu14.pdf
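The last step of this summary, integrating only where power is above the 'active idle' level, might be sketched as follows. It assumes the samples have already been converted to true power, that the caller supplies the active-idle threshold, and that a segment counts only when both of its end points exceed the threshold. That boundary rule and the function name are assumptions, not details taken from the talk or the K20Power tool.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (time_s, true_power_w)


def active_energy(samples: Sequence[Sample], active_idle_w: float) -> float:
    """Trapezoidal energy in joules over segments above the active-idle level."""
    energy_j = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        # Assumed rule: count a segment only if both end points are active.
        if p0 > active_idle_w and p1 > active_idle_w:
            energy_j += 0.5 * (p0 + p1) * (t1 - t0)
    return energy_j


# Made-up rectangular profile: idle, three active samples, idle again.
profile = [(0.0, 20.0), (1.0, 100.0), (2.0, 100.0), (3.0, 100.0), (4.0, 20.0)]
print(active_energy(profile, active_idle_w=30.0))
```

Restricting the integral this way avoids the question of where to stop integrating, because the idle tail before and after the kernel contributes nothing.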

30 Acknowledgments
- Collaborators
  - Ivan Zecena and Ziliang Zong
- U.S. National Science Foundation
  - DUE-1141022, CNS-1217231, and CNS-1305359
- NVIDIA Corporation
  - Grants and equipment donations
- Texas State University
  - Research Enhancement Program

