A Fast Fourier Transform Compiler Silvio D Carnevali.


1 A Fast Fourier Transform Compiler Silvio D Carnevali

2 Contents FFTW and genfft: an introduction genfft: How it works 1.) DAG Creation 2.) Simplifier 3.) Scheduler 4.) Unparsing Conclusion: similar applications

3 genfft Special-purpose compiler written in Objective Caml Produces DFT subroutines Outputs C code parameterized according to: - Input length - Data type

4 FFTW Collection of “Codelets” Codelets: fragments of C code Generated by genfft plan: optimal composition of codelets  depends on input size and HW  automatically selected by FFTW (FJ98)

5 Performance of FFTW [Two plots: powers of 2; any powers of 2, 3, 5, 7]

6 genfft: creation of the codelet’s DAG Nodes: data types  Encode arithmetic expressions  Use real numbers for C compatibility Generic node = operator Children = operands DAG Algorithm depends on input size
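A minimal sketch of the node structure described on this slide, written in Python for illustration only (genfft has its own data types). The class names `Num`, `Load`, `Op` and the helper `evaluate` are hypothetical. Complex inputs are split into real and imaginary parts, so every node holds a real value, and a generic node is an operator whose children are its operands.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Num:
    """A real-valued numeric constant."""
    value: float


@dataclass(frozen=True)
class Load:
    """A read of one real input, e.g. the real part of X[0]."""
    name: str


@dataclass(frozen=True)
class Op:
    """A generic node: a binary operator applied to its operand children."""
    operator: str  # one of "+", "-", "*"
    operands: Tuple["Node", ...]


Node = Union[Num, Load, Op]


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Evaluate a node given values for the named inputs."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Load):
        return env[node.name]
    left, right = (evaluate(child, env) for child in node.operands)
    if node.operator == "+":
        return left + right
    if node.operator == "-":
        return left - right
    if node.operator == "*":
        return left * right
    raise ValueError(f"unknown operator {node.operator!r}")


# Size-2 DFT on complex inputs, split into real (r) and imaginary (i) parts.
# The same Load objects are shared by several outputs, which makes this a DAG
# rather than a tree.
xr0, xi0, xr1, xi1 = (Load(n) for n in ("xr0", "xi0", "xr1", "xi1"))
size2_dft = {
    "yr0": Op("+", (xr0, xr1)),
    "yi0": Op("+", (xi0, xi1)),
    "yr1": Op("-", (xr0, xr1)),
    "yi1": Op("-", (xi0, xi1)),
}
```

For a different input size the DAG would be built by a different algorithm; this example only hard-codes the size-2 case.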

7 DAG creation Algorithms

8 FT Equation $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ X = input vector Y = FT of X $\omega_n = e^{2\pi\sqrt{-1}/n}$ = primitive n-th root of unity
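As a reference point, the transform can be evaluated directly from its definition. The sketch below assumes the usual forward-DFT convention Y[i] = sum over j of X[j] * w^(-i*j), with w = e^(2*pi*sqrt(-1)/n) a primitive n-th root of unity. The function name `dft` is illustrative. It takes O(n^2) operations and is the baseline that FFT algorithms improve on, not what genfft generates.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex]) -> List[complex]:
    """Direct O(n^2) evaluation of the DFT definition (n must be >= 1)."""
    n = len(x)
    omega = cmath.exp(2j * cmath.pi / n)  # primitive n-th root of unity
    return [
        sum(x[j] * omega ** (-i * j) for j in range(n))
        for i in range(n)
    ]
```

For example, `dft([0, 1, 0, 0])` gives approximately `[1, -1j, -1, 1j]`, the successive powers of the inverse root of unity for n = 4.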

9 genfft: DAG Simplifier Bottom-up traversal of DAG local improvements:  Algebraic transformations (constant folding, +/* simplification)  CSE: eliminate existing + create new ones  DFT-specific improvements

10 Algebraic transformations Simplifies multiplication by 1, 0 or -1 Simplifies addition by 0 Distribution: kx + ky = k(x + y)
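The rules on this slide can be sketched as "smart constructors" that simplify each node as it is rebuilt bottom-up. This is an illustrative Python sketch, not genfft's code: expressions are assumed to be floats (constants), strings (variables), or tuples such as `("+", a, b)`, and the names `add`, `mul`, `neg`, and `simplify` are hypothetical.

```python
def is_const(e):
    return isinstance(e, float)


def is_mul(e):
    return isinstance(e, tuple) and e[0] == "*"


def neg(a):
    if is_const(a):
        return -a                      # constant folding
    if isinstance(a, tuple) and a[0] == "neg":
        return a[1]                    # -(-x) = x
    return ("neg", a)


def mul(a, b):
    if is_const(a) and is_const(b):
        return a * b                   # constant folding
    if is_const(b):
        a, b = b, a                    # keep the constant on the left
    if is_const(a):
        if a == 0.0:
            return 0.0                 # 0 * x = 0
        if a == 1.0:
            return b                   # 1 * x = x
        if a == -1.0:
            return neg(b)              # -1 * x = -x
    return ("*", a, b)


def add(a, b):
    if is_const(a) and is_const(b):
        return a + b                   # constant folding
    if is_const(a) and a == 0.0:
        return b                       # 0 + x = x
    if is_const(b) and b == 0.0:
        return a                       # x + 0 = x
    # Distribution: kx + ky = k(x + y), saving one multiplication.
    if is_mul(a) and is_mul(b) and is_const(a[1]) and a[1] == b[1]:
        return mul(a[1], add(a[2], b[2]))
    return ("+", a, b)


def simplify(e):
    """Bottom-up traversal: simplify the children, then the node itself."""
    if not isinstance(e, tuple):
        return e
    op, *args = e
    args = [simplify(arg) for arg in args]
    if op == "+":
        return add(*args)
    if op == "*":
        return mul(*args)
    if op == "neg":
        return neg(*args)
    raise ValueError(f"unknown operator {op!r}")
```

Because each rule looks only at one node and its immediate children, every improvement is local, which matches the bottom-up traversal described for the simplifier.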

11 DFT-Specific improvements Numeric constants made positive (Local)  Constants: generally k and -k  Reduces number of loads DAG transposition (for Linear Function)  Simplifies DAG, transpose + simplify, transpose + simplify  Reduces number of multiplications only

12 DFT-Specific improvements [Figure: three DAGs over nodes X, Y, A, B with edge weights 5, 4, 3, 2] DAG D -> Simplify -> DAG E -> Transpose -> DAG E^T -> Simplify -> DAG F^T -> Transpose -> DAG F -> Simplify -> DAG E
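Transposition relies on a property of linear networks: reversing every edge and swapping the roles of inputs and outputs yields a network that computes the transposed matrix. The Python sketch below checks this on a small made-up DAG (it is not the DAG from the slide's figure). The edge-list representation and the names `transpose` and `transfer_matrix` are assumptions for illustration, and inputs are assumed to have no incoming edges.

```python
from collections import defaultdict


def transpose(edges):
    """Reverse every weighted edge (src, dst, weight)."""
    return [(dst, src, w) for (src, dst, w) in edges]


def transfer_matrix(edges, inputs, outputs):
    """Return M with M[o][i] = sum over paths i -> o of the edge-weight product."""
    incoming = defaultdict(list)
    for src, dst, w in edges:
        incoming[dst].append((src, w))

    def value(node, source, memo):
        # Value of `node` when `source` is 1 and all other inputs are 0.
        if node in memo:
            return memo[node]
        if node in inputs:
            result = 1.0 if node == source else 0.0
        else:
            result = sum(w * value(src, source, memo) for src, w in incoming[node])
        memo[node] = result
        return result

    return [[value(o, i, {}) for i in inputs] for o in outputs]


# Hypothetical DAG: T = 2X + 3Y, A = 4T, B = 5T + Y.
edges = [
    ("X", "T", 2.0), ("Y", "T", 3.0),
    ("T", "A", 4.0), ("T", "B", 5.0), ("Y", "B", 1.0),
]
forward = transfer_matrix(edges, ["X", "Y"], ["A", "B"])
backward = transfer_matrix(transpose(edges), ["A", "B"], ["X", "Y"])
```

Here `forward` is `[[8, 12], [10, 16]]` and `backward` is its transpose, `[[8, 10], [12, 16]]`. A simplifier that only merges shared work in one direction can therefore find additional savings after the DAG is transposed, which is the motivation for alternating transpose and simplify steps.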

13 genfft: DAG Scheduler Goal: minimize use of registers No instruction scheduling Partitions DAG in 2 recursively  register mapping  Optimal for n = 2^k  Partitioning heuristics Optimality? Not for n != 2^k

14 genfft: Unparsing Schedule unparsed to C Pipeline usage managed by C compiler genfft + C compiler: performance problems  egcs “optimizer”
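Unparsing can be pictured as printing each scheduled expression as a C statement, leaving instruction-level decisions to the C compiler. The Python sketch below is a simplified illustration under assumed conventions: a schedule is a list of `(temporary_name, expression)` pairs, expressions are floats, variable names, or binary tuples such as `("+", a, b)`, and the function names are hypothetical. The C code actually emitted by genfft is laid out differently.

```python
def unparse_expr(e):
    """Render one expression as a C expression string."""
    if isinstance(e, float):
        return repr(e)
    if isinstance(e, str):
        return e
    op, left, right = e
    return f"({unparse_expr(left)} {op} {unparse_expr(right)})"


def unparse(schedule, real_type="double"):
    """Emit one C declaration per scheduled temporary, in schedule order."""
    return "\n".join(
        f"{real_type} {name} = {unparse_expr(expr)};"
        for name, expr in schedule
    )


schedule = [
    ("t1", ("+", "xr0", "xr1")),
    ("t2", ("*", 0.5, "t1")),
]
c_code = unparse(schedule, real_type="float")
```

The `real_type` parameter mirrors the idea that the generated code is parameterized by data type; the order of statements is fixed by the schedule and not changed by the printer.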

15 Conclusion & future work FFTW: The best of the best of the best… Over 100 downloads every week! genfft: specialized for linear functions  Crystallographic FT  FIR & IIR filters  Image processing (JPEG discrete cosine transform)

