
1 Parallel Processing (CS 730)
Lecture 7: Shared Memory FFTs*
Jeremy R. Johnson
Wed. Feb. 14, 2001
*Parts of this lecture were derived from Chapter IX in Lipson.

2 Introduction
Objective: To derive and implement a shared-memory parallel program for computing the fast Fourier transform (FFT).
Topics
–Derivation of the FFT
    Recursive version
    Iterative version
–A parallel divide & conquer algorithm using threads
–A parallel loop version using OpenMP
–Obtaining additional parallelism

3 FFT as a Matrix Factorization
Compute y = F_n x, where F_n is the n-point Fourier matrix.
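As a baseline for the factorizations that follow, the product y = F_n x can be computed directly in O(n^2) operations. The sketch below is in Python rather than the MATLAB used on the slides. The names `fourier_matrix` and `dft` are illustrative, and the convention omega_n = exp(-2*pi*i/n) is an assumption chosen to match the iterative code later in the lecture.

```python
import cmath

def fourier_matrix(n):
    """Return the n-point Fourier matrix F_n as a list of rows.

    Assumed convention: entry (j, k) is omega^(j*k), omega = exp(-2*pi*i/n).
    """
    omega = cmath.exp(-2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]

def dft(x):
    """Compute y = F_n x by a direct O(n^2) matrix-vector product."""
    n = len(x)
    F = fourier_matrix(n)
    return [sum(F[j][k] * x[k] for k in range(n)) for j in range(n)]

# A constant vector transforms to a single nonzero entry of size n.
y = dft([1, 1, 1, 1])
```

Every FFT variant in the lecture should agree with this direct product up to rounding error, so it is useful as a reference when testing.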

4 Matrix Factorizations and Algorithms
function y = fft(x)
  % Recursive radix-2 FFT of a row vector x whose length n is a power of 2.
  n = length(x);
  if n == 1
    y = x;
  else
    % [x0 x1] = L^n_2 x
    x0 = x(1:2:n-1);
    x1 = x(2:2:n);
    % [t0 t1] = (I_2 tensor F_m)[x0 x1]
    t0 = fft(x0);
    t1 = fft(x1);
    % w = W_m(omega_n), with omega_n = exp(-2*pi*i/n) as in the iterative versions
    w = exp((-2*pi*i/n)*(0:n/2-1));
    % y = [y0 y1] = (F_2 tensor I_m) T^n_m [t0 t1]
    y0 = t0 + w.*t1;
    y1 = t0 - w.*t1;
    y = [y0 y1];
  end
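The recursive algorithm on this slide can be sketched in Python as follows. The function name `fft_recursive` is illustrative. The sketch assumes the input length is a power of two and the forward-transform convention omega_n = exp(-2*pi*i/n).

```python
import cmath

def fft_recursive(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    m = n // 2
    # [x0 x1] = L^n_2 x: split into even- and odd-indexed entries.
    t0 = fft_recursive(x[0::2])   # (I_2 tensor F_m), first half
    t1 = fft_recursive(x[1::2])   # (I_2 tensor F_m), second half
    # Twiddle factors W_m(omega_n), assuming omega_n = exp(-2*pi*i/n).
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(m)]
    # (F_2 tensor I_m) T^n_m: scale the odd half, then butterfly.
    y0 = [t0[k] + w[k] * t1[k] for k in range(m)]
    y1 = [t0[k] - w[k] * t1[k] for k in range(m)]
    return y0 + y1

y = fft_recursive([1, 2, 3, 4])   # approximately [10, -2+2j, -2, -2-2j]
```

The two recursive calls are independent of each other, which is the property a divide & conquer version using threads would exploit.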

5 Rewrite Rules

6 FFT Variants
Cooley-Tukey
Recursive FFT
Iterative FFT
Vector FFT (Stockham)
Vector FFT (Korn-Lambiotte)
Parallel FFT (Pease)

7 Tensor Permutations
A natural class of permutations compatible with the FFT.
Let σ be a permutation of {1,…,t}.
Mixed-radix counting permutation of vector indices.
Well-known examples are stride permutations and bit reversal.

8 Example (Stride Permutation)
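A minimal Python sketch of a stride permutation, using the meaning of L^n_2 from the recursive FFT code (even-indexed entries first, then odd-indexed entries) and generalizing it to stride s. The function name `stride_permutation` is illustrative and does not come from the slides.

```python
def stride_permutation(x, s):
    """Apply L^n_s to x: gather the entries of x at stride s.

    The output is x[0::s] followed by x[1::s], ..., x[s-1::s].
    Requires that s divides len(x).
    """
    n = len(x)
    if n % s != 0:
        raise ValueError("stride must divide the vector length")
    out = []
    for r in range(s):
        out.extend(x[r::s])
    return out

print(stride_permutation(list(range(8)), 2))   # [0, 2, 4, 6, 1, 3, 5, 7]
print(stride_permutation(list(range(8)), 4))   # [0, 4, 1, 5, 2, 6, 3, 7]
```

Under this definition L^8_4 undoes L^8_2: applying the stride-2 permutation and then the stride-4 permutation returns the original order.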

9 Example (Bit Reversal)
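A minimal Python sketch of the bit-reversal permutation for a vector of length n = 2^t: entry k of the output is the input entry whose t-bit index is k with its bits reversed. The names `bit_reverse_index` and `bit_reversal` are illustrative. The lecture's own `bitreversal` helper is not shown on the slides, so this is only one plausible reading of it.

```python
def bit_reverse_index(k, t):
    """Reverse the t low-order bits of the integer k."""
    r = 0
    for _ in range(t):
        r = (r << 1) | (k & 1)   # move the lowest bit of k onto r
        k >>= 1
    return r

def bit_reversal(x):
    """Permute x (length 2^t) into bit-reversed index order."""
    n = len(x)
    t = n.bit_length() - 1
    if n == 0 or (1 << t) != n:
        raise ValueError("length must be a power of two")
    return [x[bit_reverse_index(k, t)] for k in range(n)]

print(bit_reversal(list(range(8))))   # [0, 4, 2, 6, 1, 5, 3, 7]
```

Bit reversal is its own inverse, so applying it twice restores the original order.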

10 Iterative Cooley-Tukey Algorithm
[Figure: diagram with columns labeled Stage 0, Stage 1, Stage 2, Stage 3, and R]

11 Iterative Cooley-Tukey Algorithm
[Figure: diagram with columns labeled Stage 0, Stage 1, Stage 2, Stage 3, and R]

12 Modified Pease Algorithm

13 Iterative Implementation
function y = ifft2(x)
  % Input: x a column vector of length n. n = 2^t, t an integer, t >= 0.
  % Output: y = F_{2^t} x
  % Algorithm: Iterative.
  % F_{2^t} = { Prod_{c=1}^t (I_{2^{c-1}} tensor F_2 tensor I_{2^{t-c}})
  %             (I_{2^{c-1}} tensor T^{2^{t-c+1}}_{2^{t-c}}) } R^{2^t}
  n = length(x);
  t = ceil(log2(n));
  xt = bitreversal(x);
  yt = xt;   % so that n = 1 (t = 0) returns x unchanged
  for c=t:-1:1
    m = 2^(c-1);
    p = 2^(t-c);
    % W = W_p(omega_{2p})
    W = exp((2*pi*i)/(2*p)*-(0:p-1)');
    % yt = (I_m tensor F_2 tensor I_p)(I_m tensor T^{2p}_p) xt
    for j=0:m-1
      % y^{2p}_{j*2p+1} = (F_2 tensor I_p) T^{2p}_p x^{2p}_{j*2p+1}
      %                 = (F_2 tensor I_p)(I_p direct-sum W) x^{2p}_{j*2p+1}
      xt((j*2+1)*p+1:(j+1)*2*p) = W.* xt((j*2+1)*p+1:(j+1)*2*p);
      yt(j*2*p+1:(j*2+1)*p) = xt(j*2*p+1:(j*2+1)*p) + xt((j*2+1)*p+1:(j+1)*2*p);
      yt((j*2+1)*p+1:(j+1)*2*p) = xt(j*2*p+1:(j*2+1)*p) - xt((j*2+1)*p+1:(j+1)*2*p);
    end
    xt = yt;
  end
  y = yt;
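The iterative algorithm on this slide can be sketched in Python as follows. The names `fft_iterative` and `bit_reversal` are illustrative. The bit-reversal helper is an assumed stand-in for the `bitreversal` function that the slide calls but does not define.

```python
import cmath

def bit_reversal(x):
    """Assumed helper: permute x (length 2^t) into bit-reversed order."""
    n = len(x)
    t = n.bit_length() - 1
    out = []
    for k in range(n):
        r = 0
        for b in range(t):
            r = (r << 1) | ((k >> b) & 1)
        out.append(x[r])
    return out

def fft_iterative(x):
    """Iterative radix-2 FFT: bit reversal followed by t butterfly stages."""
    n = len(x)
    t = n.bit_length() - 1
    if n == 0 or (1 << t) != n:
        raise ValueError("length must be a power of two")
    xt = [complex(v) for v in bit_reversal(x)]
    for c in range(t, 0, -1):
        m = 1 << (c - 1)          # number of blocks in this stage
        p = 1 << (t - c)          # half the block length
        # W = W_p(omega_{2p}), with omega_{2p} = exp(-2*pi*i/(2p))
        W = [cmath.exp(-2j * cmath.pi * k / (2 * p)) for k in range(p)]
        yt = [0j] * n
        for j in range(m):        # blocks are independent of each other
            base = j * 2 * p
            for k in range(p):
                a = xt[base + k]
                b = W[k] * xt[base + p + k]
                yt[base + k] = a + b
                yt[base + p + k] = a - b
        xt = yt
    return xt

y = fft_iterative([1, 2, 3, 4])   # approximately [10, -2+2j, -2, -2-2j]
```

Within a stage, each block j reads and writes only its own 2p entries, so the loop over j is a natural candidate for a parallel loop.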

14 Iterative Implementation
function y = ipfft2(x)
  % In-place Pease FFT algorithm.
  % Input: x a vector of length n. n = 2^t, t an integer, t >= 0.
  % Output: y = F_{2^t} x
  % Algorithm: Conjugated Pease.
  % F_{2^t} = { Prod_{c=1}^t F_2)T_c L^n_{2^c} R^{2^t}
  %
  n = length(x);
  t = ceil(log2(n));
  y = bitreversal(x);
  w = exp(-2*pi*i/n);
  for c=t-1:-1:0
    for r=0:2^(t-1)-1
      % Butterfly r of this stage acts on entries a0 and a1.
      r0 = mod(r,2^c);
      r1 = floor(r/2^c);
      a0 = r0*2^(t-c) + r1;
      a1 = a0 + 2^(t-c-1);
      y0 = y(a0+1);
      y1 = w^(r1*2^c) * y(a1+1);
      y(a0+1) = y0 + y1;
      y(a1+1) = y0 - y1;
    end
  end
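The in-place loop on this slide can be sketched in Python as follows, with 0-based indexing replacing the `+1` offsets. The names `fft_inplace` and `bit_reversal` are illustrative, and the bit-reversal helper is an assumed stand-in for the slide's undefined `bitreversal`.

```python
import cmath

def bit_reversal(x):
    """Assumed helper: permute x (length 2^t) into bit-reversed order."""
    n = len(x)
    t = n.bit_length() - 1
    out = []
    for k in range(n):
        r = 0
        for b in range(t):
            r = (r << 1) | ((k >> b) & 1)
        out.append(x[r])
    return out

def fft_inplace(x):
    """In-place radix-2 FFT with n/2 indexed butterflies per stage."""
    n = len(x)
    t = n.bit_length() - 1
    if n == 0 or (1 << t) != n:
        raise ValueError("length must be a power of two")
    y = [complex(v) for v in bit_reversal(x)]
    w = cmath.exp(-2j * cmath.pi / n)
    for c in range(t - 1, -1, -1):
        for r in range(n // 2):           # butterflies r = 0 .. n/2 - 1
            r0 = r % (1 << c)
            r1 = r >> c
            a0 = r0 * (1 << (t - c)) + r1
            a1 = a0 + (1 << (t - c - 1))
            y0 = y[a0]
            y1 = w ** (r1 << c) * y[a1]   # twiddle factor w^(r1 * 2^c)
            y[a0] = y0 + y1
            y[a1] = y0 - y1
    return y

y = fft_inplace([1, 2, 3, 4])   # approximately [10, -2+2j, -2, -2-2j]
```

For a fixed stage c, the n/2 butterflies touch disjoint pairs (a0, a1), so the iterations of the inner loop over r do not depend on one another and could be divided among threads.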

