
Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes. Published in: 2013 International Conference on High Performance Computing and Simulation (HPCS).


1 Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes. Published in: 2013 International Conference on High Performance Computing and Simulation (HPCS). 2013/12/19

2 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

3 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

4 Introduction: the ubiquitous presence of multiple cores (and at least one GPU) calls for efficient parallelism exploitation.

5 Introduction. Motivation: to determine the division of workload between CPU and GPU, using an analytical performance model for scheduling tasks among CPU and GPU cores such that the global execution time of the overall data parallel pattern is minimized.

6 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

7 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: deciding whether the parallelism exhibited by the application is suitable for GPUs can be addressed by looking only at those parallel patterns that fit the GPU execution model, i.e. by considering data parallel patterns only; deciding how to use the CPU while the GPU is computing.

8 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: figuring out whether or not it is beneficial to split a data parallel computation among CPU and GPU cores; figuring out the percentage of tasks to be run on CPU and GPU cores.

9 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: the MAPE loop (Monitor, Analyze, Plan, Execute).
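A rough illustration of how such a MAPE loop can drive the CPU/GPU split is sketched below in Python: it monitors per-device chunk times and re-plans the split fraction accordingly. This is a minimal sketch under assumptions of this transcript; names such as run_chunk and split_fraction, and the proportional re-planning rule, are placeholders rather than the paper's implementation.

def mape_loop(run_chunk, tasks, split_fraction=0.5, chunk_size=256):
    """Adapt the CPU share of a data parallel computation chunk by chunk."""
    while tasks:
        chunk, tasks = tasks[:chunk_size], tasks[chunk_size:]
        n_cpu = int(len(chunk) * split_fraction)
        # Execute + Monitor: run the chunk with the current partition;
        # run_chunk is assumed to return the measured CPU and GPU times.
        t_cpu, t_gpu = run_chunk(chunk[:n_cpu], chunk[n_cpu:])
        # Analyze: a large time imbalance means the current split is poor.
        total = t_cpu + t_gpu
        if total > 0:
            # Plan: shift work towards the device that finished earlier.
            split_fraction -= 0.5 * (t_cpu - t_gpu) / total
            split_fraction = min(max(split_fraction, 0.0), 1.0)
    return split_fraction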

10 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

11 Evaluating the CPU/GPU Tradeoff: two nodes, CPU + main memory and GPU + GPU memory. The first system owns the data, part of which must be sent to the second system. Data copy between main memory and GPU memory: setup and data transmission. One core of the CPU → K cores.
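The "setup and data transmission" cost of the copy between main memory and GPU memory is commonly modelled as a fixed latency plus a bandwidth term; a minimal sketch in LaTeX, where the symbols t_setup and beta are assumptions of this transcript rather than the paper's notation:

T_{\mathrm{copy}}(s) \;=\; t_{\mathrm{setup}} + \frac{s}{\beta}

where s is the amount of data moved and \beta is the sustained bandwidth of the host-to-device link.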

12 Evaluating the CPU/GPU Tradeoff

13 Evaluating the CPU/GPU Tradeoff

14 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time, total execution time.
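A hedged sketch of how these three quantities can relate, assuming m independent tasks, a fraction f of them run on n CPU cores with per-task time t_c, and the remaining (1-f)m tasks offloaded to the GPU with per-task transfer-plus-compute time t_g (symbols chosen here, not necessarily the paper's exact model):

T_{\mathrm{CPU}}(f) = \frac{f \, m \, t_c}{n}, \qquad T_{\mathrm{GPU}}(f) = t_{\mathrm{setup}} + (1-f)\, m \, t_g, \qquad T_{\mathrm{total}}(f) = \max\{ T_{\mathrm{CPU}}(f),\ T_{\mathrm{GPU}}(f) \}

Since T_CPU grows with f while T_GPU shrinks with it, the total time is minimized by choosing f so that the two branches finish (approximately) together, i.e. T_CPU(f) = T_GPU(f).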

15 Evaluating the CPU/GPU Tradeoff (input data size → computational complexity → output data size): element-wise operation: N → O(N) → N; matrix multiplication: 2N² → O(N³) → N².

16 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time.

17 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time.

18 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

19 Experimental Results: experiment platform.

20 Experimental Results. Benchmark b1: computing the matrix whose elements are the squares of the corresponding elements of the input matrix (N → O(N) → N). Benchmark b2: the simplest matrix multiplication algorithm (three nested loops, no blocking, no further optimization; 2N² → O(N³) → N²).
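For reference, the two benchmark kernels are easy to write down; the Python sketch below follows the descriptions above (deliberately unoptimized, function names chosen here):

def b1_square(a):
    """b1: matrix whose elements are the squares of the input matrix's elements."""
    return [[x * x for x in row] for row in a]

def b2_matmul(a, b):
    """b2: simplest matrix multiplication, three nested loops, no blocking."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c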

21 Experimental Results

22 Experimental Results

23 Experimental Results: reduce sum, reduce min.
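Both benchmarks are instances of the same reduce pattern with a different associative operator, which is also what allows each CPU/GPU share to be reduced independently and the partial results combined; a minimal Python sketch, with a split chosen arbitrarily for illustration:

from functools import reduce
import operator

values = [7, 3, 9, 1, 4]
print(reduce(operator.add, values))   # reduce sum -> 24
print(reduce(min, values))            # reduce min -> 1

# With a CPU/GPU split, each share is reduced on its own device and the
# partial results are combined with the same operator:
cpu_share, gpu_share = values[:3], values[3:]
partials = [reduce(operator.add, cpu_share), reduce(operator.add, gpu_share)]
print(reduce(operator.add, partials))  # same result: 24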

24 Experimental Results

25 Experimental Results

26 Experimental Results: P, 0.8P, 0.9P, 1.1P, 1.2P.

27 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

28 Conclusions. The main contributions of this work: computing the ratio between the number of tasks to be executed on CPU and GPU cores that optimizes the completion time; classical map and reduce patterns that use CPU and GPU cores according to the ratio computed by the model, combining the execution of tasks on GPU and CPU cores.
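As a hedged sketch of the kind of pattern the conclusions describe, the Python fragment below splits the task list according to a model-computed CPU fraction and runs the two shares concurrently; the thread pool and the run_on_cpu/run_on_gpu callables are placeholders of this transcript, not the paper's runtime.

from concurrent.futures import ThreadPoolExecutor

def split_map(f_cpu, tasks, run_on_cpu, run_on_gpu):
    """Map pattern whose tasks are divided by the ratio computed by the model."""
    n_cpu = int(len(tasks) * f_cpu)
    cpu_share, gpu_share = tasks[:n_cpu], tasks[n_cpu:]
    with ThreadPoolExecutor(max_workers=2) as pool:
        cpu_res = pool.submit(run_on_cpu, cpu_share)
        gpu_res = pool.submit(run_on_gpu, gpu_share)
        # The overall completion time is the max of the two branches,
        # which the model-computed ratio tries to balance.
        return cpu_res.result() + gpu_res.result()

# Usage with trivial stand-in backends:
def square(xs):
    return [x * x for x in xs]

print(split_map(0.6, list(range(10)), square, square))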

29 Q&A

30 Thank you for listening

