Martin Kruliš 26. 11. 2015 by Martin Kruliš (v1.0)1.


2
▪ Interoperability
  ◦ Allows CUDA code to read and write graphics buffers
    ▪ Works with the OpenGL and Direct3D libraries
  ◦ Motivation
    ▪ Direct visualization of complex simulations
    ▪ Augmenting 3D rendering with visualization routines that are difficult to implement in shaders
  ◦ How it works
    ▪ The graphics resource is registered and represented by struct cudaGraphicsResource
    ▪ The resource may be mapped into the CUDA memory space with cudaGraphicsMapResources(), …

3
▪ Initialization
  ◦ Device must be selected by cudaGLSetGLDevice() (deprecated since CUDA 5.0, where the call is no longer required)
▪ Resources
  ◦ cudaGraphicsGLRegisterBuffer() for buffers
    ▪ The mapped buffers can be accessed in the same way as CUDA-allocated memory
  ◦ cudaGraphicsGLRegisterImage() for images and render buffers
    ▪ The image buffers can also be accessed through the texture and surface mechanisms

4
▪ Direct3D Support
  ◦ Versions 9, 10, and 11 are supported
    ▪ Each version has its own API
  ◦ A CUDA context may operate with only one Direct3D device at a time
    ▪ A special HW mode must also be set on the device
  ◦ Initialization is similar to OpenGL: cudaD3D[9|10|11]SetDirect3DDevice()
  ◦ Available Direct3D resources
    ▪ Buffers, textures, and surfaces
    ▪ All registered using cudaGraphicsD3DXXRegisterResource()

5
▪ GPU SLI Mode
  ◦ Multiple GPUs are physically interconnected and cooperate in rendering the scene
    ▪ AFR (Alternate Frame Rendering) mode: different GPUs render subsequent frames
  ◦ CUDA interoperability issues
    ▪ Any CUDA allocation on one GPU is automatically performed on all SLI-connected GPUs
    ▪ CUDA has to use separate contexts for each GPU
    ▪ cudaGLGetDevices() identifies which devices are in SLI
      ▪ cudaGLDeviceListAll
      ▪ cudaGLDeviceListCurrentFrame
      ▪ cudaGLDeviceListNextFrame


8
▪ CUDA Basic Linear Algebra Subroutines (cuBLAS)
  ◦ CUDA implementation of the standard BLAS library
  ◦ Complete support for all 152 standard BLAS functions on vectors and matrices
    ▪ copy, move, rotate, swap
    ▪ maximum, minimum, multiply by scalar
    ▪ sum, dot products, Euclidean norms
    ▪ matrix multiplications, inverses, linear combinations
  ◦ Some operations have batched versions
  ◦ Supports floats, doubles, and complex numbers
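To make the matrix multiplication entry concrete, here is a minimal CPU sketch of the general matrix multiply C = alpha * A * B + beta * C that BLAS defines, using the column-major storage that BLAS and cuBLAS assume. The function name gemm_reference and its simplified parameter list are hypothetical and chosen for illustration; this is not the cuBLAS API, only a reference for what such a call computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for C = alpha * A * B + beta * C in column-major storage.
// A is m x k, B is k x n, C is m x n; element (i, j) of A is A[i + j * m].
// gemm_reference is a hypothetical helper name, not a cuBLAS function.
void gemm_reference(std::size_t m, std::size_t n, std::size_t k,
                    float alpha, const std::vector<float>& A,
                    const std::vector<float>& B,
                    float beta, std::vector<float>& C) {
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    for (std::size_t j = 0; j < n; ++j) {          // column of C
        for (std::size_t i = 0; i < m; ++i) {      // row of C
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {  // inner dimension
                acc += A[i + p * m] * B[p + j * k];
            }
            C[i + j * m] = alpha * acc + beta * C[i + j * m];
        }
    }
}
```

Calling it with alpha = 1 and beta = 0 yields the plain product A * B; a GPU library would distribute the two outer loops over threads instead of running them sequentially.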

9
▪ CUDA Sparse Linear Algebra
  ◦ Open-source C++ library for sparse linear structures (matrices, linear systems, …)
  ◦ Key features
    ▪ Sparse matrix operations (addition, subtraction, max independent set, polynomial relaxation, …)
    ▪ Supports various matrix formats
      ▪ COO, CSR, DIA, ELL, and HYB
  ◦ Requires CUDA compute capability (CC) 2.0 or higher
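Two of the listed formats can be illustrated on the CPU. The sketch below converts a COO (coordinate) matrix to CSR (compressed sparse row) and multiplies the CSR matrix by a dense vector. The struct and function names (CooMatrix, CsrMatrix, coo_to_csr, spmv) are hypothetical and do not come from the library described on the slide; the code only assumes the usual textbook definitions of both formats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// COO: one (row, col, value) triple per nonzero, in any order.
struct CooMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> row, col;
    std::vector<double> val;
};

// CSR: nonzeros grouped by row; row_ptr[r]..row_ptr[r+1] delimits row r.
struct CsrMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> row_ptr, col;
    std::vector<double> val;
};

// Hypothetical helper: counting sort of the COO entries by row index.
CsrMatrix coo_to_csr(const CooMatrix& a) {
    assert(a.row.size() == a.val.size() && a.col.size() == a.val.size());
    CsrMatrix out{a.rows, a.cols,
                  std::vector<std::size_t>(a.rows + 1, 0),
                  std::vector<std::size_t>(a.val.size()),
                  std::vector<double>(a.val.size())};
    for (std::size_t r : a.row) {
        assert(r < a.rows);
        ++out.row_ptr[r + 1];                      // count nonzeros per row
    }
    for (std::size_t r = 0; r < a.rows; ++r) {
        out.row_ptr[r + 1] += out.row_ptr[r];      // prefix sum gives row starts
    }
    std::vector<std::size_t> next(out.row_ptr.begin(), out.row_ptr.end() - 1);
    for (std::size_t e = 0; e < a.val.size(); ++e) {
        const std::size_t dst = next[a.row[e]]++;  // next free slot in that row
        out.col[dst] = a.col[e];
        out.val[dst] = a.val[e];
    }
    return out;
}

// y = A * x for a CSR matrix A and a dense vector x.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        for (std::size_t e = a.row_ptr[r]; e < a.row_ptr[r + 1]; ++e) {
            y[r] += a.val[e] * x[a.col[e]];
        }
    }
    return y;
}
```

COO is convenient for assembling a matrix entry by entry, while CSR gives each row a contiguous range of nonzeros, which is what makes the row-wise product straightforward to parallelize.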

10
▪ CUDA Fast Fourier Transform (cuFFT)
  ◦ Decomposes a signal into its frequency spectrum
  ◦ 1D, 2D, and 3D transforms (up to 128M elements)
  ◦ Many variations (precision, complex/real types, …)
  ◦ API similar to the FFTW library
    ▪ Create a plan (cufftHandle) which holds the configuration
    ▪ Associate/allocate work space (buffers)
    ▪ cufftExecC2C() (or R2C, C2R) starts execution
  ◦ An FFT plan can be associated with a CUDA stream
    ▪ For synchronization and overlapping
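As an illustration of what a forward complex-to-complex transform computes, the following CPU sketch evaluates the discrete Fourier transform directly from its definition, X[k] = sum over j of x[j] * exp(-2*pi*i*j*k/n). It is an O(n^2) reference with a hypothetical name (dft_forward), not the FFT algorithm and not the cuFFT API; the result is left unnormalized, as in FFTW-style libraries.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Direct O(n^2) evaluation of the unnormalized forward DFT.
// dft_forward is a hypothetical reference helper, not a library call.
std::vector<Complex> dft_forward(const std::vector<Complex>& x) {
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < n; ++k) {          // output frequency bin
        for (std::size_t j = 0; j < n; ++j) {      // input sample
            // Reduce j*k modulo n first; the twiddle factor is n-periodic.
            const double angle = -2.0 * pi * static_cast<double>((j * k) % n)
                                 / static_cast<double>(n);
            X[k] += x[j] * std::polar(1.0, angle);
        }
    }
    return X;
}
```

For the four samples 1, 0, -1, 0 (one period of a cosine) the energy lands in bins 1 and 3, which is the "frequency spectrum" view of the signal.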

11
▪ CUDA Thrust
  ◦ C++ template library based on the STL API
  ◦ The basic idea is to develop parallel C++ applications with minimal overhead
  ◦ STL-like vectors (for devices) and vector operations
    ▪ copy, fill, create sequences, reordering, sorting, …
  ◦ Algorithms
    ▪ Transformations
    ▪ Reductions
    ▪ Prefix sums
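Because Thrust mirrors the STL, the algorithm families above can be sketched with their standard-library counterparts on the host. The example below fills a sequence, squares it, reduces it, and computes a prefix sum using std::iota, std::transform, std::accumulate, and std::partial_sum; the corresponding Thrust algorithms are thrust::sequence, thrust::transform, thrust::reduce, and thrust::inclusive_scan. The names PipelineResult and run_pipeline are hypothetical, and the code runs on the CPU only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct PipelineResult {
    std::vector<int> squares;  // transformed sequence
    int sum;                   // reduction of the squares
    std::vector<int> prefix;   // inclusive prefix sums of the squares
};

// Hypothetical host-side pipeline built from STL algorithms only.
PipelineResult run_pipeline(int n) {
    assert(n >= 0);
    std::vector<int> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1);                        // 1, 2, ..., n
    std::transform(v.begin(), v.end(), v.begin(),
                   [](int x) { return x * x; });             // transformation
    const int sum = std::accumulate(v.begin(), v.end(), 0);  // reduction
    std::vector<int> prefix(v.size());
    std::partial_sum(v.begin(), v.end(), prefix.begin());    // prefix sum
    return {v, sum, prefix};
}
```

With Thrust, the same steps would operate on a device vector, so the data stays on the GPU between the individual algorithms.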

12
▪ GPU AI for Board Games
  ◦ Specific AI library designed for games with a large but well-defined configuration space
  ◦ Requires CUDA CC 2.0
  ◦ Currently supports
    ▪ Game Tree Split: alpha-beta pruning
    ▪ Single and multiple recursion (with large depths)
    ▪ Zero-sum games (3D Tic-Tac-Toe, Reversi, …)
    ▪ Sudoku backtracking generator and solver
    ▪ Statistical simulations (Monte Carlo for Go)
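Alpha-beta pruning, mentioned above, can be sketched sequentially on an explicit game tree. The code below is a textbook minimax search with alpha-beta cut-offs; the Node type, the alpha_beta function, and the visited_leaves counter are hypothetical names introduced here, and the sketch says nothing about how the library splits the tree across GPU threads.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// A node is a leaf when it has no children; only leaf values are used.
struct Node {
    int value;
    std::vector<Node> children;
};

// Minimax with alpha-beta pruning for a two-player zero-sum game.
// visited_leaves counts evaluated leaves so the pruning effect is visible.
int alpha_beta(const Node& node, bool maximizing, int alpha, int beta,
               int& visited_leaves) {
    assert(alpha < beta);
    if (node.children.empty()) {
        ++visited_leaves;
        return node.value;
    }
    int best = maximizing ? std::numeric_limits<int>::min()
                          : std::numeric_limits<int>::max();
    for (const Node& child : node.children) {
        const int score =
            alpha_beta(child, !maximizing, alpha, beta, visited_leaves);
        if (maximizing) {
            best = std::max(best, score);
            alpha = std::max(alpha, best);
        } else {
            best = std::min(best, score);
            beta = std::min(beta, best);
        }
        if (alpha >= beta) {
            break;  // the opponent would never allow this branch
        }
    }
    return best;
}
```

The initial call passes the smallest and largest int as alpha and beta. On a two-level tree with leaf groups (3, 12, 8), (2, 4, 6), and (14, 5, 2), the search returns 3 and skips two leaves of the second group.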

13
▪ PhysX
  ◦ Real-time physics engine
  ◦ Originally developed by Ageia for its PPU card
    ▪ NVIDIA bought it and re-implemented it for CUDA
  ◦ Most important features
    ▪ Simulation of rigid bodies (collisions, destruction)
    ▪ Cloth and fluid particle systems
▪ APEX
  ◦ Framework built on top of PhysX
  ◦ Designed for easy usage (artists, games, …)
