FFT VLSI Implementation

1 FFT VLSI Implementation
VLSI Signal Processing
Department of Electrical Engineering, National Taiwan University, An-Yeu Wu (吳安宇)
References:
Shousheng He and Mats Torkelson, "A new approach to pipeline FFT processor," IEEE Proc. of IPPS, 1996.
E. Bidet, D. Castelain, C. Joanblanq, and P. Senn, "A fast single-chip implementation of 8192 complex point FFT," IEEE J. Solid-State Circuits, March 1995.

2 FFT Review
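As a reminder, a standard statement of the N-point DFT that the later slides decompose. The twiddle-factor symbol W_N is assumed to follow the usual convention:

```latex
X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad W_N = e^{-j 2\pi / N}, \qquad k = 0, 1, \dots, N-1
```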

3 Implementation --- Two Extreme Methods
Speed: slow ↔ fast
Area: small ↔ large
Control: complicated ↔ simple

4 Design Considerations
System requirements, e.g., speed, area, power, …
To trade off between these two extremes, we need:
More processing elements (PEs)
Better processing-element utilization rate
Better control scheme

5 FFT Processor --- Block Diagram

6 Some Current Schemes
Radix-2 Multi-path Delay Commutator (N = 16)
Radix-2 Single-path Delay Feedback (N = 16)

7 Some Current Schemes (cont.)
Radix-4 Single-path Delay Feedback (N = 256)
Radix-4 Multi-path Delay Commutator (N = 256)
Radix-4 Single-path Delay Commutator (N = 256)

8 Distinctive Merits of the Above
Delay-feedback architectures are more efficient than delay-commutator architectures in terms of memory utilization.
Radix-4 has higher multiplier utilization; however, radix-2 has simpler butterflies (BF), which are better utilized.
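The memory and multiplier claims can be made concrete with a small calculation. The closed-form counts below are recalled from the comparison table in the He and Torkelson paper cited on slide 1 and should be checked against it. The function name `pipeline_fft_cost` is hypothetical, and N is assumed to be a power of 4.

```python
def pipeline_fft_cost(n):
    """Hardware cost of an n-point pipeline FFT (n assumed to be a power of 4).

    Returns {architecture: (complex multipliers, complex adders, memory words)}.
    Formulas are recalled from the cited paper's comparison table (assumption).
    """
    s = (n.bit_length() - 1) // 2  # log4(n), computed without floating point
    return {
        "R2MDC": (2 * (s - 1), 4 * s, 3 * n // 2 - 2),
        "R2SDF": (2 * (s - 1), 4 * s, n - 1),
        "R4SDF": (s - 1, 8 * s, n - 1),
        "R4MDC": (3 * (s - 1), 8 * s, 5 * n // 2 - 4),
        "R4SDC": (s - 1, 3 * s, 2 * n - 2),
    }


for name, (mul, add, mem) in pipeline_fft_cost(256).items():
    print(f"{name}: {mul} multipliers, {add} adders, {mem} memory words")
```

Under these formulas both delay-feedback variants need N - 1 words of memory, while every delay-commutator variant needs more.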

9 Comparison
Radix / speed: low ↔ high
Control scheme: simple ↔ complex
Processing ability / unit: low ↔ high
Combine the advantages → further decompose the high-radix PE

10 Decomposition Method (1)
Simply "reuse" the repeated micro unit
A radix-4 PE
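One way to picture the reuse idea is a radix-4 PE (a 4-point DFT) built from four copies of the radix-2 butterfly plus a single multiplication by -j. This is a minimal sketch; the helper names `bf2` and `radix4_pe` are hypothetical and do not come from the slides.

```python
def bf2(a, b):
    """Radix-2 butterfly: returns (sum, difference)."""
    return a + b, a - b


def radix4_pe(x0, x1, x2, x3):
    """4-point DFT from four radix-2 butterflies and one -j rotation."""
    # First stage: pair the inputs that are two samples apart.
    a0, a2 = bf2(x0, x2)
    a1, a3 = bf2(x1, x3)
    # Multiplying by -j only swaps real and imaginary parts and flips one sign.
    a3 = complex(a3.imag, -a3.real)
    # Second stage: the same butterfly unit, reused.
    y0, y2 = bf2(a0, a1)
    y1, y3 = bf2(a2, a3)
    return y0, y1, y2, y3


print(radix4_pe(1 + 2j, -3 + 0.5j, 2 - 1j, 0.25 + 4j))
```

The only "multiplier" in this PE is the -j rotation, which needs no real multiplications.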

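11 Decomposition Method (2)
From the algorithm level, applying a three-index map:
$n = \langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \rangle_N$
$k = \langle k_1 + 2 k_2 + 4 k_3 \rangle_N$
where $n_1, n_2 \in \{0, 1\}$ and $n_3 \in \{0, \dots, N/4 - 1\}$
Summation over $n_1$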

12 Decomposition Method (2) (cont.)
Summation over $n_2$
Only real-imaginary swapping and sign inversion are needed (multiplication by -j)
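The two summations can be checked numerically. The sketch below assumes the usual radix-2^2 decimation-in-frequency form that results from this index map: H(k1, k2, n3) = [x(n3) + (-1)^k1 x(n3 + N/2)] + (-j)^(k1 + 2 k2) [x(n3 + N/4) + (-1)^k1 x(n3 + 3N/4)], followed by the twiddle W_N^(n3 (k1 + 2 k2)) and an N/4-point DFT over n3. The function name `fft_radix22` is hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def fft_radix22(x):
    """Recursive radix-2^2 DIF FFT; len(x) is assumed to be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    q = n // 4
    out = [0j] * n
    for k1 in (0, 1):
        for k2 in (0, 1):
            sub = []
            for n3 in range(q):
                # Summation over n1: two radix-2 butterflies.
                b0 = x[n3] + (-1) ** k1 * x[n3 + 2 * q]
                b1 = x[n3 + q] + (-1) ** k1 * x[n3 + 3 * q]
                # Summation over n2: trivial factor (-j)^(k1 + 2*k2).
                h = b0 + (-1j) ** (k1 + 2 * k2) * b1
                # Non-trivial twiddle W_N^(n3 * (k1 + 2*k2)).
                tw = cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / n)
                sub.append(h * tw)
            # The remaining sum over n3 is an N/4-point DFT.
            for k3, v in enumerate(fft_radix22(sub)):
                out[k1 + 2 * k2 + 4 * k3] = v
    return out


print(fft_radix22([1, 0, 0, 0, 0, 0, 0, 0]))
```

Only the `tw` factor requires a full complex multiplication; the factor inside `h` takes values in {1, -1, -j, j}.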

13 Graphical Explanation (N=16)
Trivial multiplication

14 Graphical Explanation (cont.)
The equations are equivalent to the operations below

15 Circuit of BF2I
[Figure: BF2I butterfly circuit. Signal labels: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2), Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2). Operating phases: first N/2 cycles, second N/2 cycles.]

16 Circuit of BF2II
[Figure: BF2II butterfly circuit. Signal labels: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2), Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2). Extra logic: swap Re and Im, and sign inversion.]

17 Radix-2^2 Single-path Delay Feedback
FFT architecture using the above technique, for N=256
Compared with the original architecture, for N=256
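A behavioural sketch of such a pipeline may help. Each butterfly has a feedback FIFO: it stores inputs for d cycles, then for the next d cycles it outputs sums and feeds differences back. This is not cycle-accurate hardware. For simplicity each stage's latency is dropped, the -j rotation is applied between the two butterflies rather than inside BF2II, and outputs are reordered at the end. The names `sdf_butterfly` and `r22sdf_fft` are hypothetical, and N is assumed to be a power of 4.

```python
import cmath
from collections import deque


def sdf_butterfly(samples, d):
    """One delay-feedback butterfly with a d-word feedback FIFO."""
    fifo = deque([0j] * d)
    out = []
    for t, x in enumerate(list(samples) + [0j] * d):  # d extra cycles flush the FIFO
        head = fifo.popleft()
        if (t // d) % 2 == 0:   # first d cycles: store input, drain old differences
            fifo.append(x)
            out.append(head)
        else:                   # next d cycles: output sum, feed difference back
            fifo.append(head - x)
            out.append(head + x)
    return out[d:]              # drop the d-cycle latency


def r22sdf_fft(x):
    """Radix-2^2 SDF pipeline model; len(x) is assumed to be a power of 4."""
    n = len(x)
    data = [complex(v) for v in x]
    m = n
    while m >= 4:
        q = m // 4
        data = sdf_butterfly(data, 2 * q)                       # BF2I
        # Trivial -j rotation (swap Re/Im, flip sign) on the last quarter of each block.
        data = [complex(v.imag, -v.real) if p % m >= 3 * q else v
                for p, v in enumerate(data)]
        data = sdf_butterfly(data, q)                           # BF2II
        # Twiddle W_m^(n3 * (k1 + 2*k2)); the quarters hold k1 + 2*k2 = 0, 2, 1, 3.
        data = [v * cmath.exp(-2j * cmath.pi * (p % q) * (0, 2, 1, 3)[(p % m) // q] / m)
                for p, v in enumerate(data)]
        m = q
    bits = n.bit_length() - 1
    out = [0j] * n
    for p, v in enumerate(data):                                # undo bit-reversed order
        out[int(format(p, f"0{bits}b")[::-1], 2)] = v
    return out


print(r22sdf_fft([1, 0, 0, 0]))
```

In this model the FIFO lengths are N/2, N/4, ..., 1, which add up to N - 1 words. The last pass through the twiddle step multiplies by 1, so the final stage needs no multiplier.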

18 Structural Advantages
Radix-2^2 has the same multiplicative complexity as radix-4 but still retains the radix-2 BF structure
Only every other stage has non-trivial multiplication
Control is simple: a synchronization controller and an address counter for W^n

19 Conclusions
FFT applications: radar signal processing, fast convolution, spectrum estimation, OFDM-based modulation/demodulation
Efficient VLSI architectures (parallel processing) are required for real-time processing.
However, most systems still employ DSP processors (e.g., TI C3x/C5x) for the computations, running fast algorithms such as DIT and DIF FFT.
VLIW (Very Long Instruction Word) processors (TI C6x) need new programming skills to utilize the two parallel MAC units.
