Using Vector Capabilities of GPUs to Accelerate FFT


1 Using Vector Capabilities of GPUs to Accelerate FFT
Vasily Volkov and Brian Kazian CS 258 Spring 2008

2 Sun Niagara II Specs
8 SPARC cores at 1.4 GHz (up to 8 threads each)
16K Instruction/8K Data Caches
4MB shared L2 Cache
One FPU per core
Four dual-channel FBDIMM Memory Controllers
Theoretical limit of 11 Gflop/s for the 8 FPUs
Extremely large memory bandwidth (60 GB/s)
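
The 11 Gflop/s limit quoted on this slide is consistent with a simple product of FPU count and clock rate. The sketch below reproduces that arithmetic; it assumes one floating-point operation per FPU per cycle, which the slide does not state explicitly, and the function name is illustrative only.

```python
# Peak rate = number of FPUs x clock (GHz) x flops per FPU per cycle.
# Assumption: each FPU retires one floating-point operation per cycle.
def peak_gflops(fpus: int, clock_ghz: float, flops_per_cycle: int = 1) -> float:
    return fpus * clock_ghz * flops_per_cycle

# Figures from the spec slide: 8 cores, one FPU each, 1.4 GHz.
niagara2_peak = peak_gflops(fpus=8, clock_ghz=1.4)
print(f"{niagara2_peak:.1f} Gflop/s")  # prints "11.2 Gflop/s"
```

Under that assumption the product is 11.2 Gflop/s, which the slide rounds to 11.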

3 FFT On Niagara
Decided to install and benchmark with the FFTW library
Very similar in execution to CUFFT
Offers competitive performance on a variety of platforms
Compiled on Niagara II with pthreads enabled
Uses double precision as opposed to G80’s single
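
FFT benchmarks are commonly reported as a nominal flop rate derived from the transform size and the measured time. The slides do not say which flop count was used; the sketch below assumes the widely used 5 N log2 N convention for a complex transform of length N. The function name and the timing value are hypothetical and are not measurements from this presentation.

```python
import math

def fft_gflops(n: int, seconds: float, batch: int = 1) -> float:
    """Nominal Gflop/s for `batch` complex FFTs of length n.

    Assumes the conventional 5 * n * log2(n) operation count per transform.
    """
    flops = 5.0 * n * math.log2(n) * batch
    return flops / seconds / 1e9

# Hypothetical timing, for illustration only: one 1024-point FFT in 10 microseconds.
rate = fft_gflops(n=1024, seconds=10e-6)
print(f"{rate:.2f} Gflop/s")
```

Because the same nominal count applies to single and double precision, a rate computed this way allows a like-for-like comparison between FFTW on Niagara II and CUFFT on G80, although the precision difference noted above still matters when interpreting the numbers.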

4 Single FFT Comparison

5 FFTW with Built-in Threading

6 Batched FFTW

7 Hybrid FFTW

8 Results
Found that the Hybrid gave best results
Tune thread count for problem size
Limited by the number of threads in comparison to CUDA
Issues with data alignment in cache
Not stellar performance out of the box with FFTW
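
The transcript does not define the three FFTW variants. One plausible reading is that built-in threading puts all threads on a single transform, batching gives each thread its own independent transform, and the hybrid runs several multi-threaded transforms side by side. Under that assumption, tuning the thread count amounts to choosing how to split the hardware threads. The sketch below enumerates the candidate splits for the 64 hardware threads implied by the spec slide (8 cores, up to 8 threads each); the helper name is hypothetical.

```python
def hybrid_splits(total_threads: int):
    """Yield (concurrent_ffts, threads_per_fft) pairs that use every hardware thread."""
    for threads_per_fft in range(1, total_threads + 1):
        if total_threads % threads_per_fft == 0:
            yield total_threads // threads_per_fft, threads_per_fft

HW_THREADS = 8 * 8  # 8 cores x up to 8 threads each, per the spec slide
splits = list(hybrid_splits(HW_THREADS))

for concurrent_ffts, threads_per_fft in splits:
    print(f"{concurrent_ffts:2d} FFTs at once, {threads_per_fft:2d} threads each")
```

In this reading, the first split (64 transforms, one thread each) corresponds to pure batching and the last (one transform, 64 threads) to pure built-in threading. A tuning run would time each split for a given problem size and keep the fastest.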

