OpenMP in a Heterogeneous World Ayodunni Aribuki Advisor: Dr. Barbara Chapman HPCTools Group University of Houston.


1 OpenMP in a Heterogeneous World Ayodunni Aribuki Advisor: Dr. Barbara Chapman HPCTools Group University of Houston

2 Top 10 Supercomputers (June 2011)

3 Why OpenMP
Shared memory parallel programming model
– Extends C, C++, Fortran
Directives-based
– Single code for sequential and parallel versions
Incremental parallelism
– Little code modification
High-level
– Leave multithreading details to the compiler and runtime
Widely supported by major compilers
– Open64, Intel, GNU, IBM, Microsoft, …
– Portable
www.openmp.org

4 OpenMP Example
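The code shown on this slide did not survive in the transcript. As a stand-in, the following is a minimal sketch of the directive style described on the previous slide, assuming a dot product as the workload. The function name `dot` is illustrative and does not come from the talk.

```c
#include <stddef.h>

/* Dot product of two arrays of length n.
 * With OpenMP enabled (for example, gcc -fopenmp), the loop iterations are
 * split across threads and the per-thread partial sums are combined by the
 * reduction clause. Without OpenMP the pragma is ignored and the same
 * source runs as an ordinary sequential loop. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    long i; /* signed loop index, accepted by all OpenMP versions */

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The single pragma is the only difference from the sequential version, which is the "incremental parallelism" point: the loop body and the function signature are unchanged.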

5 Present/Future Architectures & Challenges they pose
[Figure: Nodes 0 to 3 and their memories, an accelerator and its memory, and "many more CPUs". Challenges labeled: Location, Heterogeneity, Scalability.]

6 Heterogeneous Embedded Platform

7 Heterogeneous High-Performance Systems
Each node has multiple CPU cores, and some of the nodes are equipped with additional computational accelerators, such as GPUs.
www.olcf.ornl.gov/wp-content/uploads/.../Exascale-ASCR-Analysis.pdf

8 Programming Heterogeneous Multicore: Issues
Must map data/computations to specific devices
Usually involves substantial rewrite of code
Verbose code
– Move data to/from device x
– Launch kernel on device
– Wait until y is ready/done
Portability becomes an issue
– Multiple versions of same code
– Hard to maintain
Always hardware-specific!

9 Programming Models? Today's Scenario
    // Run one OpenMP thread per device per MPI node
    #pragma omp parallel num_threads(devCount)
    if (initDevice()) {
        // Block and grid dimensions
        dim3 dimBlock(12,12);
        kernel<<< ... >>>();  // launch configuration missing from this transcript
        cudaThreadExit();
    } else {
        printf("Device error on %s\n", processor_name);
    }
    MPI_Finalize();
    return 0;
}
www.cse.buffalo.edu/faculty/miller/Courses/CSE710/heavner.pdf
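The "one OpenMP thread per device" pattern from this slide can be sketched in plain C without MPI or CUDA. This is only an illustration under stated assumptions: `init_device`, `run_on_device`, `run_all_devices`, and `MAX_DEVICES` are hypothetical stand-ins, not part of the talk or of any vendor API, and a loop over device indices is used so that the code also works when OpenMP is disabled.

```c
#include <stdio.h>

#define MAX_DEVICES 16 /* hypothetical upper bound, for illustration only */

/* Hypothetical stand-in for device setup. A real program would call the
 * vendor API here (the slide uses CUDA). Returns 1 on success. */
static int init_device(int dev)
{
    return dev >= 0 && dev < MAX_DEVICES;
}

/* Hypothetical stand-in for launching work on one device. */
static int run_on_device(int dev)
{
    return dev * dev; /* placeholder result */
}

/* Drive dev_count devices, one loop iteration per device. With OpenMP
 * enabled, num_threads(dev_count) and schedule(static, 1) give each
 * iteration its own thread. Returns the number of devices that ran. */
int run_all_devices(int dev_count, int *results)
{
    int ok = 0;
    int dev;

    if (dev_count <= 0) {
        return 0; /* num_threads requires a positive value */
    }

    #pragma omp parallel for num_threads(dev_count) schedule(static, 1) reduction(+:ok)
    for (dev = 0; dev < dev_count; dev++) {
        if (init_device(dev)) {
            results[dev] = run_on_device(dev);
            ok += 1;
        } else {
            fprintf(stderr, "Device error on device %d\n", dev);
        }
    }
    return ok;
}
```

Even in this reduced form, the host code has to know how many devices exist and manage each one explicitly, which is the hardware-specific burden the slide points at.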

10 OpenMP in the Heterogeneous World
All threads are equal
– No vocabulary for heterogeneity or separate devices
All threads must have access to the memory
– Distributed memories common in embedded systems
– Memories may not be coherent
Implementations rely on OS and threading libraries
– Memory allocation, synchronization, e.g. Linux, Pthreads

11 Extending OpenMP Example
[Figure: general-purpose processor cores with main memory holding application data; an HWA with device cores and its own copy of the application data. Arrows labeled: Upload remote data, Download remote data, Remote procedure call.]
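The transcript does not show the extension syntax proposed on this slide. For comparison only, standard OpenMP later added (in version 4.0) a `target` construct with `map` clauses that expresses the same upload, remote execution, and download steps. The sketch below uses that standard syntax, not the talk's own proposal; the function name `scale_on_device` is illustrative.

```c
#include <stddef.h>

/* Scale n floats in place.
 * With an OpenMP 4.0+ compiler and an offload target, map(tofrom: ...)
 * copies the array to the device before the region (upload) and back to
 * the host afterwards (download); the region itself runs on the device.
 * Without offload support the region falls back to the host, and without
 * OpenMP the pragmas are ignored and the loop runs sequentially. */
void scale_on_device(float *data, size_t n, float factor)
{
    long i;
    long len = (long)n;

    #pragma omp target map(tofrom: data[0:len])
    #pragma omp parallel for
    for (i = 0; i < len; i++) {
        data[i] *= factor;
    }
}
```

The data movement that the earlier slides list as separate, verbose steps is folded into one clause here, and the same source still compiles for a host-only build.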

12 Heterogeneous OpenMP Solution Stack
[Figure: OpenMP Parallel Computing Solution Stack]
User layer: OpenMP Application
Prog. layer (OpenMP API): Directives, Compiler; OpenMP library; Environment variables
System layer: Runtime library; OS/system support for shared memory
Hardware: Core 1, Core 2, …, Core n
Additions for heterogeneous platforms: Language extensions; Efficient code generation; Target Portable Runtime Interface (MCAPI, MRAPI, MTAPI)

13 Summarizing My Research
OpenMP on heterogeneous architectures
– Expressing heterogeneity
– Generating efficient code for GPUs/DSPs
Managing memories
– Distributed
– Explicitly managed
– Enabling portable implementations

14 Backup

15 MCA: Generic Multicore Programming
Solve portability issue in embedded multicore programming
Defining and promoting open specifications for
– Communication - MCAPI
– Resource Management - MRAPI
– Task Management - MTAPI
(www.multicore-association.org)

16 Heterogeneous Platform: CPU + Nvidia GPU

