FFT Accelerator Project Rohit Prakash(2003CS10186) Anand Silodia(2003CS50210) Date : February 23,2007.


1 FFT Accelerator Project Rohit Prakash(2003CS10186) Anand Silodia(2003CS50210) Date : February 23,2007

2 Current Objectives
Validate the number of complex multiplications
Run the code with the Intel compiler and compare the results
  – For a single run
  – For multiple runs
Tabulate all the results
Analyse these using VTune

3 Number of Complex multiplications
Our results
  – (11/4)*n*log4(n) = 8960
Result on net
  – (3/4)*n*log4(n) = 3840
The inner loop is trivial and does not require any “complex multiplications”
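The counts on this slide can be checked with a few lines of arithmetic. The sketch below assumes n = 1024, which is the size for which (3/4)*n*log4(n) gives 3840; the helper names `log4` and `complex_mults` are illustrative and not from the slides. Under that assumption, 8960 corresponds to 7 multiplications per butterfly, while 11 per butterfly would give 14080, so the 11/4 coefficient and the 8960 figure do not agree for this n.

```python
def log4(n):
    """Return log base 4 of n; n must be an exact power of 4."""
    stages = 0
    while n > 1:
        if n % 4:
            raise ValueError("n must be a power of 4")
        n //= 4
        stages += 1
    return stages


def complex_mults(per_butterfly, n):
    """Total multiplications: n/4 butterflies per stage, log4(n) stages."""
    return per_butterfly * (n // 4) * log4(n)


n = 1024  # assumed transform size (not stated on the slide)
for per_butterfly in (3, 7, 11):
    print(per_butterfly, complex_mults(per_butterfly, n))
```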

4 Inner loop of our Algorithm
  T ← A[k+j]
  U ← w*A[k+j+m/4]
  V ← w*w*A[k+j+m/2]
  X ← w*w*w*A[k+j+3*m/4]
  A[k+j] ← T+U+V+X
  A[k+j+m/4] ← T+(i)U-V-(i)X
  A[k+j+m/2] ← T-U+V-X
  A[k+j+3*m/4] ← T-(i)U-V+(i)X
  w ← w*w_m
Total number of multiplications in this loop: 11
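The multiplications by i in this butterfly do not need a general complex multiplication, because i*(a+bi) = -b+ai is only a swap and a sign change. A minimal sketch of that idea, using the same output formulas as the slide; the function names `times_i` and `butterfly4` are hypothetical:

```python
def times_i(z):
    """Multiply z by i using a swap and a negation: i*(a+bj) = -b + aj."""
    return complex(-z.imag, z.real)


def butterfly4(t, u, v, x):
    """Radix-4 butterfly on already-twiddled inputs, signs as on the slide."""
    iu, ix = times_i(u), times_i(x)
    return (
        t + u + v + x,
        t + iu - v - ix,
        t - u + v - x,
        t - iu - v + ix,
    )


print(butterfly4(1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j))
```

With this trick, only the products with powers of w (and the update of w itself) remain as true complex multiplications in the loop.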

5 New Inner loop of our Algorithm
  T ← A[k+j]
  U ← twiddle[k]*A[k+j+m/4]
  V ← twiddle[2*k]*A[k+j+m/2]
  X ← twiddle[3*k]*A[k+j+3*m/4]
  A[k+j] ← T+U+V+X
  A[k+j+m/4] ← T+i*U-V-i*X
  A[k+j+m/2] ← T-U+V-X
  A[k+j+3*m/4] ← T-i*U-V+i*X
Total number of multiplications in this loop: 3
(3/4)*n*log4(n) = 3840
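For illustration, the table-driven loop can be embedded in a complete iterative radix-4 transform. This is a sketch under stated assumptions rather than the authors' code: the input length is a power of 4, the root of unity is taken as e^(+2*pi*i/m) (the sign that matches the butterfly signs on the slide; many libraries use the opposite sign for the forward transform), and the table is indexed as j*(n/m), which is a guess at what the slide's `twiddle[k]` stands for. The names `digit_reverse4`, `fft4` and `naive_dft` are hypothetical.

```python
import cmath


def digit_reverse4(i, digits):
    """Reverse the base-4 digits of i (an O(log n) loop per index)."""
    r = 0
    for _ in range(digits):
        r = (r << 2) | (i & 3)
        i >>= 2
    return r


def fft4(a):
    """Iterative radix-4 transform with a precomputed twiddle table."""
    n = len(a)
    digits = 0
    while 4 ** digits < n:
        digits += 1
    if 4 ** digits != n:
        raise ValueError("length must be a power of 4")
    # Reorder the input by base-4 digit reversal.
    A = [complex(a[digit_reverse4(i, digits)]) for i in range(n)]
    # twiddle[t] = e^(+2*pi*i*t/n); sign convention is an assumption.
    twiddle = [cmath.exp(2j * cmath.pi * t / n) for t in range(n)]
    m = 4
    while m <= n:
        step = n // m   # stride into the table for block size m
        q = m // 4
        for k in range(0, n, m):
            for j in range(q):
                T = A[k + j]
                U = twiddle[j * step] * A[k + j + q]
                V = twiddle[2 * j * step] * A[k + j + 2 * q]
                X = twiddle[3 * j * step] * A[k + j + 3 * q]
                iU = complex(-U.imag, U.real)   # i*U without a multiply
                iX = complex(-X.imag, X.real)   # i*X without a multiply
                A[k + j] = T + U + V + X
                A[k + j + q] = T + iU - V - iX
                A[k + j + 2 * q] = T - U + V - X
                A[k + j + 3 * q] = T - iU - V + iX
        m *= 4
    return A


def naive_dft(a):
    """O(n^2) reference with the same sign convention, for checking."""
    n = len(a)
    return [
        sum(a[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

Each pass through the innermost loop performs exactly three table-driven complex multiplications, which is where the (3/4)*n*log4(n) count comes from.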

6 Stuff we tried
Improved the “bit reversal”
  – Better than last time
Although still inefficient (O(n log n)), it runs faster than the previous implementation
Many faster algorithms still exist
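As a point of reference, a straightforward O(n log n) bit-reversal permutation can be written as below. This is a generic sketch, not the implementation measured in the slides; it assumes a power-of-two length, and the names `bit_reverse` and `bit_reverse_permute` are hypothetical. A radix-4 loop would use the base-4 digit analogue of the same idea.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i, one bit per iteration."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def bit_reverse_permute(a):
    """Permute list a in place into bit-reversed order and return it."""
    n = len(a)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("length must be a power of 2")
    for i in range(n):
        r = bit_reverse(i, bits)
        if i < r:  # swap each pair only once
            a[i], a[r] = a[r], a[i]
    return a


print(bit_reverse_permute(list(range(8))))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The cost is n indices times log2(n) bit operations each, which is the O(n log n) behaviour mentioned above; the faster algorithms avoid recomputing each reversed index from scratch.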

7 System Specifications
Processor: Intel Pentium 4 CPU, 3.00 GHz
Cache size: 1 MB
RAM: 1 GB
Flags supported: sse, sse2

8 Results User time(ms) for 1024 points (single iteration)

9 Results User time(ms) for 1024 points (10 iterations)

10 Results User time(ms) for 4096 points (single iteration)

11 Results User time(ms) for 4096 points (10 iterations)

12 Results User time(ms) for 262144 points (single iteration)

13 Results User time(ms) for 262144 points (10 iterations)
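The measurements above separate user time from real and system time, for one run and for ten. A minimal sketch of collecting the same three figures in-process is shown below; it is not how the slides' numbers were obtained (that is not stated), and the helper name `time_runs` is hypothetical.

```python
import os
import time


def time_runs(func, runs):
    """Call func `runs` times and return real/user/system time in ms."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    for _ in range(runs):
        func()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return {
        "real": (wall_end - wall_start) * 1000.0,
        "user": (cpu_end.user - cpu_start.user) * 1000.0,
        "system": (cpu_end.system - cpu_start.system) * 1000.0,
    }


# Example workload standing in for a transform call (an assumption).
print(time_runs(lambda: sum(i * i for i in range(100000)), 10))
```

For transforms as small as 1024 points, a single run can fall below the timer resolution, which is one reason to compare single-run and multi-run figures.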

14 Analysis
Results are comparable due to the following reasons
  – Change in bit reversal
  – Number of computations
FFTW: compiling option gcc
The code needs to be rewritten to handle an arbitrary number of points

15 Tabular Representation (1024 points)
Time (ms)
Columns: Recursive (single run on icpc), Recursive (single run on g++), Final (single run on icpc), Final (single run on g++), FFTW (single run on icpc), FFTW (single run on g++), Recursive (10 runs on icpc), Recursive (10 runs on g++), Final (10 runs on icpc), Final (10 runs on g++), FFTW (10 runs on icpc), FFTW (10 runs on g++)
Real 1113109 9285610171110
User 461231214621041
System 224455564547

16 Tabular Representation (4096 points)
Time (ms)
Columns: Recursive (single run on icpc), Recursive (single run on g++), Final (single run on icpc), Final (single run on g++), FFTW (single run on icpc), FFTW (single run on g++), Recursive (10 runs on icpc), Recursive (10 runs on g++), Final (10 runs on icpc), Final (10 runs on g++), FFTW (10 runs on icpc), FFTW (10 runs on g++)
Real 1829101311109622112491312
User 102335349021554144
System 433642353656

17 Tabular Representation (262144 points)
Time (ms)
Columns: Recursive (single run on icpc), Recursive (single run on g++), Final (single run on icpc), Final (single run on g++), FFTW (single run on icpc), FFTW (single run on g++), Recursive (10 runs on icpc), Recursive (10 runs on g++), Final (10 runs on icpc), Final (10 runs on g++), FFTW (10 runs on icpc), FFTW (10 runs on g++)
Real 889197110843090879541216525833836601604
User 77918358240260618400204935563811579578
System 111132222522 1138102923 22 1821

18 VTune Analysis
TODO
VTune (not available)

19 Further Improvements
Fast digit reversal
Fast “twiddle compute”
TODO:
  – Comparison with Intel Math Kernel Library
  – Study FFTW implementation
  – VTune analysis
Try Winograd algorithm
Code more efficiently

20 References
Alan H. Karp, “Bit Reversal on Uniprocessors”
Angelo A. Yong, “A better FFT Bit-reversal Algorithm”

21 Thank You

