
Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes. Published in: 2013 International Conference on High Performance Computing and Simulation (HPCS).


1 Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes. Published in: 2013 International Conference on High Performance Computing and Simulation (HPCS). 2013/12/19

2 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

3 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

4 Introduction: the ubiquitous presence of multiple cores (and at least one GPU) calls for efficient parallelism exploitation.

5 Introduction. Motivation: to determine the division of workload between CPU and GPU, using an analytical performance model for scheduling tasks among CPU and GPU cores such that the global execution time of the overall data parallel pattern is minimized.

6 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

7 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: deciding whether the parallelism exhibited by the application is suitable for GPUs can be addressed by looking only at those parallel patterns that fit the GPU execution model, i.e. by considering data parallel patterns only; deciding how to use the CPU while the GPU is computing.

8 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: figuring out whether or not it is beneficial to split a data parallel computation among CPU and GPU cores; figuring out the percentage of tasks to be run on CPU and GPU cores.

9 Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes: the MAPE loop (Monitor, Analyze, Plan, Execute).
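A rough illustration of how such a MAPE loop can drive the CPU/GPU split is sketched below in Python: it monitors per-device chunk times and re-plans the split fraction accordingly. This is a minimal sketch under assumptions of this transcript; names such as run_chunk and split_fraction, and the proportional re-planning rule, are placeholders rather than the paper's implementation.

def mape_loop(run_chunk, tasks, split_fraction=0.5, chunk_size=256):
    """Adapt the CPU share of a data parallel computation chunk by chunk."""
    while tasks:
        chunk, tasks = tasks[:chunk_size], tasks[chunk_size:]
        n_cpu = int(len(chunk) * split_fraction)
        # Execute + Monitor: run the chunk with the current partition;
        # run_chunk is assumed to return the measured CPU and GPU times.
        t_cpu, t_gpu = run_chunk(chunk[:n_cpu], chunk[n_cpu:])
        # Analyze: a large time imbalance means the current split is poor.
        total = t_cpu + t_gpu
        if total > 0:
            # Plan: shift work towards the device that finished earlier.
            split_fraction -= 0.5 * (t_cpu - t_gpu) / total
            split_fraction = min(max(split_fraction, 0.0), 1.0)
    return split_fraction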

10 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

11 Evaluating the CPU/GPU Tradeoff: two nodes, CPU + main memory and GPU + GPU memory. The first system owns the data, part of which must be sent to the second system. Data copy between main memory and GPU memory: setup and data transmission. One core of the CPU → K cores.
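The "setup and data transmission" cost of the copy between main memory and GPU memory is commonly modelled as a fixed latency plus a bandwidth term; a minimal sketch in LaTeX, where the symbols t_setup and beta are assumptions of this transcript rather than the paper's notation:

T_{\mathrm{copy}}(s) \;=\; t_{\mathrm{setup}} + \frac{s}{\beta}

where s is the amount of data moved and \beta is the sustained bandwidth of the host-to-device link.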

12 Evaluating the CPU/GPU Tradeoff

13 Evaluating the CPU/GPU Tradeoff

14 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time, total execution time.
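A hedged sketch of how these three quantities can relate, assuming m independent tasks, a fraction f of them run on n CPU cores with per-task time t_c, and the remaining (1-f)m tasks offloaded to the GPU with per-task transfer-plus-compute time t_g (symbols chosen here, not necessarily the paper's exact model):

T_{\mathrm{CPU}}(f) = \frac{f \, m \, t_c}{n}, \qquad T_{\mathrm{GPU}}(f) = t_{\mathrm{setup}} + (1-f)\, m \, t_g, \qquad T_{\mathrm{total}}(f) = \max\{ T_{\mathrm{CPU}}(f),\ T_{\mathrm{GPU}}(f) \}

Since T_CPU grows with f while T_GPU shrinks with it, the total time is minimized by choosing f so that the two branches finish (approximately) together, i.e. T_CPU(f) = T_GPU(f).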

15 Evaluating the CPU/GPU Tradeoff (input data size → computational complexity → output data size): element-wise operation: N → O(N) → N; matrix multiplication: 2N² → O(N³) → N².

16 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time.

17 Evaluating the CPU/GPU Tradeoff: CPU processing time, GPU processing time.

18 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

19 Experimental Results: experiment platform.

20 Experimental Results. Benchmark b1: computing the matrix whose elements are the squares of the corresponding elements of the input matrix (N → O(N) → N). Benchmark b2: the simplest matrix multiplication algorithm (three nested loops, no blocking, no further optimization; 2N² → O(N³) → N²).
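For reference, the two benchmark kernels are easy to write down; the Python sketch below follows the descriptions above (deliberately unoptimized, function names chosen here):

def b1_square(a):
    """b1: matrix whose elements are the squares of the input matrix's elements."""
    return [[x * x for x in row] for row in a]

def b2_matmul(a, b):
    """b2: simplest matrix multiplication, three nested loops, no blocking."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c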

21 Experimental Results

22 Experimental Results

23 Experimental Results: reduce sum, reduce min.
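Both benchmarks are instances of the same reduce pattern with a different associative operator, which is also what allows each CPU/GPU share to be reduced independently and the partial results combined; a minimal Python sketch, with a split chosen arbitrarily for illustration:

from functools import reduce
import operator

values = [7, 3, 9, 1, 4]
print(reduce(operator.add, values))   # reduce sum -> 24
print(reduce(min, values))            # reduce min -> 1

# With a CPU/GPU split, each share is reduced on its own device and the
# partial results are combined with the same operator:
cpu_share, gpu_share = values[:3], values[3:]
partials = [reduce(operator.add, cpu_share), reduce(operator.add, gpu_share)]
print(reduce(operator.add, partials))  # same result: 24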

24 Experimental Results

25 Experimental Results

26 Experimental Results: P, 0.8P, 0.9P, 1.1P, 1.2P.

27 Outline: Introduction; Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes; Evaluating the CPU/GPU Tradeoff; Experimental Results; Conclusions

28 Conclusions. The main contributions of this work: computing the ratio between the number of tasks to be executed on CPU and GPU cores that optimizes the completion time; classical map and reduce patterns that use CPU and GPU cores according to the ratio computed by the model, combining the execution of tasks on GPU and CPU cores.
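As a hedged sketch of the kind of pattern the conclusions describe, the Python fragment below splits the task list according to a model-computed CPU fraction and runs the two shares concurrently; the thread pool and the run_on_cpu/run_on_gpu callables are placeholders of this transcript, not the paper's runtime.

from concurrent.futures import ThreadPoolExecutor

def split_map(f_cpu, tasks, run_on_cpu, run_on_gpu):
    """Map pattern whose tasks are divided by the ratio computed by the model."""
    n_cpu = int(len(tasks) * f_cpu)
    cpu_share, gpu_share = tasks[:n_cpu], tasks[n_cpu:]
    with ThreadPoolExecutor(max_workers=2) as pool:
        cpu_res = pool.submit(run_on_cpu, cpu_share)
        gpu_res = pool.submit(run_on_gpu, gpu_share)
        # The overall completion time is the max of the two branches,
        # which the model-computed ratio tries to balance.
        return cpu_res.result() + gpu_res.result()

# Usage with trivial stand-in backends:
def square(xs):
    return [x * x for x in xs]

print(split_map(0.6, list(range(10)), square, square))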

29 Q&A

30 Thank you for listening

