Heterogeneous CPU/GPU co-processor clusters. Michael Fruchtman.

1 Heterogeneous CPU/GPU co-processor clusters. Michael Fruchtman

2 Current State
Eight of the top ten most efficient clusters are heterogeneous [1]
Power law of efficiency

3 Current State
At today’s efficiencies:
– An exascale (10^18) cluster will require 200 megawatts [2]
– Cluster efficiency must grow by 66% a year to keep up with Moore’s Law
– The most efficient cluster improved at a normalized average of 61.4% per year
This gap represents the increase in power requirements to grow from petascale to exascale
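The gap between the required and the observed growth rate compounds year over year. A minimal Python sketch of that arithmetic follows, assuming both rates stay constant over the period. The function name and the choice of horizons are hypothetical; only the 66% and 61.4% figures come from the slide.

```python
def compounded_gap(required_rate, observed_rate, years):
    """Factor by which efficiency falls short of the target after `years`.

    Both rates are fractional annual growth rates (0.66 means 66% per year)
    and are assumed to hold constant, which is a simplification.
    """
    return ((1.0 + required_rate) / (1.0 + observed_rate)) ** years


if __name__ == "__main__":
    # 66% required vs. 61.4% observed, taken from the slide.
    for years in (1, 5, 10):  # hypothetical horizons
        gap = compounded_gap(0.66, 0.614, years)
        print(f"after {years:2d} years: shortfall factor {gap:.3f}")
```

A shortfall factor above 1 means the power budget has to grow by that factor to reach the same performance target.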

4 Power Efficient Amdahl’s Law [3]

5 Given W_c = 0.25, S_c = 0.5, K_c = 0.60
N varies with the power budget, K = 1
Top: f = 0.3
Bottom: f = 0.9
P+c* is superior with increased parallelization
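The comparison can be sketched by treating energy as power times time. The slide's equations did not survive transcription, so the form below is an assumption: a full core has performance 1 and power 1 and idles at fraction K of its power, while an efficient core has performance S_c and power W_c and idles at fraction K_c of W_c. P* is n full cores, c* is n efficient cores, and P+c* is one full core for the sequential part plus n - 1 efficient cores for the parallel part. Function names and the values of n are hypothetical.

```python
def perf_per_watt_p_star(f, n, k):
    """n full cores: one runs the sequential part while n - 1 idle."""
    return 1.0 / (1.0 + (n - 1) * k * (1.0 - f))


def perf_per_watt_c_star(f, n, s_c, w_c, k_c):
    """n efficient cores, each with performance s_c and power w_c."""
    return s_c / (w_c + (n - 1) * w_c * k_c * (1.0 - f))


def perf_per_watt_p_plus_c_star(f, n, k, s_c, w_c, k_c):
    """One full core plus n - 1 efficient cores (requires n >= 2).

    Sequential phase: the full core is active, the efficient cores idle.
    Parallel phase: the efficient cores are active, the full core idles.
    Performance per watt equals 1 / total energy in normalized units.
    """
    m = n - 1
    seq_energy = (1.0 - f) * (1.0 + m * w_c * k_c)
    par_energy = (f / (m * s_c)) * (k + m * w_c)
    return 1.0 / (seq_energy + par_energy)


if __name__ == "__main__":
    w_c, s_c, k_c, k = 0.25, 0.5, 0.60, 1.0  # parameters from the slide
    for f in (0.3, 0.9):                      # the two panels on the slide
        for n in (4, 16, 64):                 # hypothetical core counts
            print(f, n,
                  perf_per_watt_p_star(f, n, k),
                  perf_per_watt_c_star(f, n, s_c, w_c, k_c),
                  perf_per_watt_p_plus_c_star(f, n, k, s_c, w_c, k_c))
```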

6 GPU Architecture [4]

7 P-E Amdahl’s Law and GPU
W_c = 0.00417, 0.5 watts per core, K = 120
– Intel i7 980 XE
K_c = 0.115
– Turning on a GPU is 71% of power draw [5]
S_c is harder to measure: is the workload memory bound or computation bound?
GPU memory architecture makes this difficult to measure.
S_c = 0.172, assuming a computation-bound workload on the GTX 580
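The W_c value appears to be the per-core GPU power normalized by the CPU's power. The short Python check below reproduces that arithmetic under two labeled assumptions: that K = 120 is the CPU's power in watts, and that the 0.5 watts per core is the GTX 580's 244 W load power (quoted on slide 11) spread over its 512 CUDA cores and rounded.

```python
GPU_LOAD_WATTS = 244.0  # GTX 580 load power, as quoted on slide 11
GPU_CORES = 512         # CUDA cores on a GTX 580
CPU_WATTS = 120.0       # assumption: K = 120 is the CPU's power in watts

# Per-core GPU power before rounding to the slide's 0.5 W figure.
watts_per_core = GPU_LOAD_WATTS / GPU_CORES

# Normalized efficient-core power, using the slide's rounded 0.5 W per core.
w_c = 0.5 / CPU_WATTS

if __name__ == "__main__":
    print(f"watts per GPU core: {watts_per_core:.3f}")
    print(f"W_c = 0.5 / {CPU_WATTS:.0f} = {w_c:.5f}")
```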

8 Threads, Blocks and Performance [5]

9 Formal Power Modeling [6]
Average Geometric Error of Power Prediction = 9.18%

10 Temperature Model [6]
RC_Rise = 35 and RC_Decay = 65
GPU-dependent constants
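A first-order RC model of this kind can be sketched as an exponential approach to a maximum temperature while the GPU is loaded, followed by an exponential decay toward idle once the load stops. The form below is an assumption about how the two constants are used, not a transcription of the slide. The idle and peak temperatures are hypothetical placeholders, and the time unit is assumed to match that of the RC constants.

```python
import math

RC_RISE = 35.0   # from the slide
RC_DECAY = 65.0  # from the slide


def temperature_rise(t, idle_temp, max_temp, rc_rise=RC_RISE):
    """Temperature t time units after load starts, rising toward max_temp."""
    delta = max_temp - idle_temp
    return idle_temp + delta * (1.0 - math.exp(-t / rc_rise))


def temperature_decay(t, event_time, idle_temp, decay_temp, rc_decay=RC_DECAY):
    """Temperature at time t after load stopped at event_time.

    decay_temp is the temperature at the moment the load stopped.
    """
    gamma = decay_temp - idle_temp
    return idle_temp + gamma * math.exp(-(t - event_time) / rc_decay)


if __name__ == "__main__":
    idle, peak = 50.0, 80.0  # hypothetical temperatures in degrees Celsius
    for t in (0, 35, 70, 140):
        print(t, round(temperature_rise(t, idle, peak), 2))
```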

11 Conditions for GPU Use
GTX 580 draws 244 W on load
– Speedup must be greater than 2 (3 for safety)
– f must be very high, preferably 0.9 or higher
Improved energy efficiency is based on performance
– Example: GPUDB SQL queries
– Without joins: speedup of 20+ [7]
– With joins: speedup of 2-7 [8]
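One plausible reading of the speedup thresholds is an energy break-even: a GPU run saves energy only when its speedup exceeds the ratio of accelerated power to CPU-only power. The Python sketch below works that out under labeled assumptions. The 120 W CPU figure is taken from K = 120 on slide 7 and interpreted as watts, and the function name is hypothetical.

```python
def break_even_speedup(cpu_watts, gpu_watts, cpu_active_during_gpu=True):
    """Minimum speedup at which the accelerated run uses no more energy.

    CPU-only energy is cpu_watts * T. The accelerated run takes T / speedup
    at a higher power draw, so it breaks even when the speedup equals
    accelerated power divided by CPU-only power.
    """
    accelerated_watts = gpu_watts + (cpu_watts if cpu_active_during_gpu else 0.0)
    return accelerated_watts / cpu_watts


if __name__ == "__main__":
    cpu_watts = 120.0  # assumption: K = 120 from slide 7, read as watts
    gpu_watts = 244.0  # GTX 580 load power from the slide
    print(break_even_speedup(cpu_watts, gpu_watts, cpu_active_during_gpu=False))
    print(break_even_speedup(cpu_watts, gpu_watts, cpu_active_during_gpu=True))
```

Under these assumptions the break-even lands just above 2 when only the GPU's draw is counted and just above 3 when the CPU keeps drawing full power alongside it.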

12 Reducing GPU Power Usage
Power gating
Improved Memory Coalescence
– Memory Coalescence Models
Incoherent Branching
– Incoherent Branching Models
NVIDIA Optimus reduces idle power to near zero

13 References
[1] Feng, Wu-chun and Kirk W. Cameron. "The Green 500 List - November 2010." The Green 500. Virginia Tech and Virginia Polytechnic Institute and State University. November 2010. Web. March 15, 2011.
[2] T. Agerwala. Challenges on the road to exascale computing. Proceedings of the 22nd annual international conference on Supercomputing (ICS '08). ACM, New York, NY, USA, 2-2. 2008.
[3] D. Woo and H-H Lee. Extending Amdahl's Law for Energy-Efficient Computing in the Many-Core Era. IEEE Xplore. IEEE Computer Society. December 2008. Web. March 15, 2011.
[4] R. Smith. "NVIDIA's GeForce GTX 580: Fermi Redefined." AnandTech. November 9, 2010. Web. March 16, 2011. http://www.anandtech.com/show/4008/nvidias-geforce-gtx-580
[5] R. Suda and D. Ren. Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels towards Power Optimized High Performance Computing. International Conference on Parallel and Distributed Computing, Applications and Technologies. IEEE Computer Society. pp. 432-438. 2009.
[6] S. Hong and H. Kim. An Integrated GPU Power and Performance Model. ISCA '10 Proceedings of the 37th annual international symposium on Computer architecture. ACM, New York, NY, USA. pp. 280-289. 2010.
[7] P. Bakkum and K. Skadron. Accelerating SQL Database Operations on a GPU with CUDA. GPGPU '10 Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units. ACM, New York, NY, USA. pp. 94-103.
[8] B. He, K. Yang, R. Fang, M. Lu, N. Govindaraju, Q. Luo, and P. Sander. Relational Joins on Graphics Processors. SIGMOD '08 Proceeding on the 2008 ACM SIGMOD international conference on Management of data. ACM, New York, NY, USA. pp. 511-524. 2008.

