
1 Cost-Effective Pipeline FFT/IFFT VLSI Architecture for DVB-H System
Chin-Teng Lin and Yuan-Chu Yu
Presented by: Yuan-Chu Yu
Department of Electrical and Control Engineering, National Chiao Tung University, Taiwan

2 NST2007-2 Outline
- Introduction
- Radix-4² and Radix-4³ based FFT/IFFT Algorithms
- Pipeline 4096-Point R4²SDF and R4³SDF based FFT/IFFT VLSI Architecture
- Radix-4 Butterfly
- Memory Structure
- Constant Multiplier
- Eight-Folded Complex Multiplier
- Comparison Results
- Conclusion

3 Introduction
- OFDM modulation: low receiver complexity and high performance on highly dispersive channels
- Handheld consumer products need a high-throughput, low-power and hardware-efficient FFT/IFFT processor
- The 4K mode of the Digital Video Broadcasting - Handheld (DVB-H) system requires a 4096-point FFT/IFFT processor
- Pipeline architecture: regularity, lower operating frequency and high throughput
- Multipath delay commutator (MDC) architecture [4]
- Single-path delay feedback (SDF) architecture [4, 5, 6]: low hardware cost and high cost-efficiency, achieved with tightly scheduled arithmetic operations


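5 Radix-4² FFT/IFFT based Algorithms (1)
- The FFT of the N-point input x[n] is defined as $X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}$ for $k = 0, 1, \dots, N-1$, where $W_N = e^{-j 2\pi / N}$.
- Applying a 3-D linear index map to n and k, where 0 ≤ n1, n2, k1, k2 ≤ 3, gives the common factor algorithm (CFA) form.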
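The index-map and CFA equations of this slide are not present in the transcript. The block below is a sketch of one standard choice that is consistent with the stated index ranges and with the butterfly and constant-multiplier labels of the following slide. The index names $n_3$, $k_3$ and the exact form of the map are assumptions, not a copy of the authors' equations.

```latex
% Assumed 3-D linear index map (n_3, k_3 are illustrative names)
\begin{align}
n &= \tfrac{N}{4} n_1 + \tfrac{N}{16} n_2 + n_3, &
k &= k_1 + 4 k_2 + 16 k_3, &
0 &\le n_3, k_3 \le \tfrac{N}{16} - 1 .
\end{align}

% Substituting into X[k] = \sum_n x[n] W_N^{nk} and dropping factors equal to 1:
\begin{equation}
X[k_1 + 4k_2 + 16k_3] =
\sum_{n_3=0}^{N/16-1}
\Bigg\{
  \underbrace{\sum_{n_2=0}^{3}
  \Bigg[
    \underbrace{\sum_{n_1=0}^{3}
      x\!\left[\tfrac{N}{4} n_1 + \tfrac{N}{16} n_2 + n_3\right] W_4^{n_1 k_1}
    }_{\text{first butterfly}}
    \;\underbrace{W_{16}^{n_2 k_1}}_{\text{constant multiplier}}
  \Bigg] W_4^{n_2 k_2}}_{\text{second butterfly}}
  \; W_N^{n_3 (k_1 + 4 k_2)}
\Bigg\} W_{N/16}^{n_3 k_3}
\end{equation}
```

Under this assumed map, the only general complex multiplication per group of two butterfly stages is by $W_N^{n_3(k_1+4k_2)}$; the outer sum over $n_3$ is a DFT of length N/16.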

6 Radix-4² FFT/IFFT based Algorithms (2)
- Labeled terms of the CFA form: first butterfly structure, constant multiplier, second butterfly structure
- The CFA procedure is applied recursively to the remaining FFTs of length N/16
- Multiplicative complexity as low as that of the radix-16 algorithm
- Hardware cost as low as that of the radix-4 algorithm
- The IFFT uses the same radix-4 butterfly structure with only some sign inversions
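The two-butterfly decomposition can be checked numerically. The following is a minimal sketch, assuming the index map n = (N/4)n1 + (N/16)n2 + n3 and k = k1 + 4k2 + 16k3 (an assumption, since the slide equations are not in the transcript); the function names are hypothetical and the code is a plain software model, not the hardware data path.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct O(N^2) DFT, used as reference and for the length-N/16 sub-DFTs."""
    N = len(x)
    return [sum(x[n] * W(N, n * k) for n in range(N)) for k in range(N)]

def radix4x4_dft(x):
    """DFT via one radix-4^2 step under the assumed 3-D index map."""
    N = len(x)
    assert N % 16 == 0
    M = N // 16
    X = [0j] * N
    for k1 in range(4):
        for k2 in range(4):
            column = []
            for n3 in range(M):
                acc = 0j
                for n2 in range(4):
                    # First butterfly: 4-point DFT over n1 (only +-1, +-j factors).
                    bf1 = sum(x[(N // 4) * n1 + M * n2 + n3] * W(4, n1 * k1)
                              for n1 in range(4))
                    # Constant multiplier W_16^(n2*k1), then second butterfly over n2.
                    acc += bf1 * W(16, n2 * k1) * W(4, n2 * k2)
                # General complex multiplier before the remaining length-N/16 DFT.
                column.append(acc * W(N, n3 * (k1 + 4 * k2)))
            sub = dft(column)  # remaining FFT of length N/16
            for k3 in range(M):
                X[k1 + 4 * k2 + 16 * k3] = sub[k3]
    return X
```

In a full radix-4² FFT the call to `dft(column)` would itself be replaced by the same step applied recursively.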

7 Radix-4³ FFT/IFFT based Algorithms (1)
- Applying a 4-D linear index map to n and k, where 0 ≤ n1, n2, n3, k1, k2, k3 ≤ 3, gives the common factor algorithm (CFA) form.
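The equations of this slide are likewise missing from the transcript. A sketch of a 4-D map consistent with the stated index ranges is given below; the names $n_4$, $k_4$ and the exact form are assumptions rather than the authors' own notation.

```latex
% Assumed 4-D linear index map (n_4, k_4 are illustrative names)
\begin{align}
n &= \tfrac{N}{4} n_1 + \tfrac{N}{16} n_2 + \tfrac{N}{64} n_3 + n_4, &
k &= k_1 + 4 k_2 + 16 k_3 + 64 k_4, &
0 &\le n_4, k_4 \le \tfrac{N}{64} - 1 .
\end{align}

% Twiddle factorization of W_N^{nk} after dropping factors equal to 1:
\begin{equation}
W_N^{nk} =
\underbrace{W_4^{n_1 k_1}}_{\text{first butterfly}}
\;\underbrace{W_{16}^{n_2 k_1}}_{\text{constant mult.}}
\;\underbrace{W_4^{n_2 k_2}}_{\text{second butterfly}}
\;\underbrace{W_{64}^{n_3 (k_1 + 4 k_2)}}_{\text{constant mult.}}
\;\underbrace{W_4^{n_3 k_3}}_{\text{third butterfly}}
\;\underbrace{W_N^{n_4 (k_1 + 4 k_2 + 16 k_3)}}_{\text{complex mult.}}
\; W_{N/64}^{n_4 k_4}
\end{equation}
```

With N = 4096 = 64², this factorization is applied twice, which agrees with the four constant multipliers and one complex multiplier listed for the R4³SDF design.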

8 Radix-4³ FFT/IFFT based Algorithms (2)
- Labeled terms of the CFA form: first butterfly structure, second butterfly structure, third butterfly structure, constant multipliers
- The CFA procedure is applied recursively to the remaining FFTs of length N/64
- Multiplicative complexity as low as that of the radix-64 algorithm
- Hardware cost as low as that of the radix-4 algorithm
- The IFFT uses the same radix-4 butterfly structure with only some sign inversions


10 The Proposed SDF based Architecture
- R4²SDF: 6 radix-4 butterfly stages, 4095-word shift register, 3 constant multipliers and 2 complex multipliers
- R4³SDF: 6 radix-4 butterfly stages, 4095-word shift register, 4 constant multipliers and 1 complex multiplier
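The resource counts above can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that are not stated on the slide: each radix-4 SDF stage with delay D holds 3·D words of feedback memory, and a radix-4² (radix-4³) group contains one (two) constant multipliers with a complex multiplier between consecutive groups. All function names are hypothetical.

```python
def ilog(n, base):
    """Exact integer logarithm; n must be a power of base."""
    count = 0
    while n > 1:
        assert n % base == 0, "n must be a power of base"
        n //= base
        count += 1
    return count

def sdf_delay_lines(n_points, radix=4):
    """Feedback-memory words per stage: (radix-1) delay lines of length N/radix^s."""
    sizes = []
    delay = n_points // radix
    while delay >= 1:
        sizes.append((radix - 1) * delay)
        delay //= radix
    return sizes

def multiplier_counts(n_points, group_radix, const_per_group):
    """Constant multipliers inside each group, complex multipliers between groups."""
    groups = ilog(n_points, group_radix)
    return {"constant": groups * const_per_group, "complex": groups - 1}

stages = sdf_delay_lines(4096)               # one entry per radix-4 butterfly stage
total_words = sum(stages)                    # total shift-register words
r42 = multiplier_counts(4096, 16, 1)         # R4^2 SDF
r43 = multiplier_counts(4096, 64, 2)         # R4^3 SDF
```

For N = 4096 the six stages sum to N - 1 = 4095 words, matching the shift-register size quoted for both designs.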

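11 Architecture Analysis

Multiplication complexity in 4096-point FFT/IFFT computation:

|                  | R2²SDF [5] | R4²SDF | R4³SDF |
|------------------|------------|--------|--------|
| Multiplication # | 13996      | 7425   | 3969   |
| Normalized Ratio | 1          | 0.53   | 0.26   |

Hardware requirement in 4096-point FFT/IFFT computation:

|                   | R2²SDF [5] | R4²SDF | R4³SDF |
|-------------------|------------|--------|--------|
| Butterfly Stages  | 12         | 6      | 6      |
| Shift Register    | 4095       | 4095   | 4095   |
| Constant Mul.     | 0          | 3      | 4      |
| Complex Mul.      | 5          | 2      | 1      |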

12 Radix-4 Butterfly
- Butterfly hardware cost: four four-input complex adders, no multiplier
- SDF architecture:
  - Fully pipelined with high utilization
  - Highly regular
  - Highly efficient memory structure
  - Simpler routing
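A multiplier-free radix-4 butterfly can be sketched as follows. It is a software illustration, not the authors' circuit: multiplication by ±j is modeled as a swap of real and imaginary parts with one sign change, and the `inverse` flag shows that the IFFT differs only in those signs. The function names are hypothetical.

```python
def mul_neg_j(z):
    """Multiply by -j without a multiplier: (re, im) -> (im, -re)."""
    return complex(z.imag, -z.real)

def mul_pos_j(z):
    """Multiply by +j without a multiplier: (re, im) -> (-im, re)."""
    return complex(-z.imag, z.real)

def radix4_butterfly(x0, x1, x2, x3, inverse=False):
    """Unscaled 4-point DFT (or inverse DFT) built from four 4-input adders."""
    w1 = mul_pos_j if inverse else mul_neg_j   # W_4^1 (forward: -j)
    w3 = mul_neg_j if inverse else mul_pos_j   # W_4^3 (forward: +j)
    y0 = x0 + x1 + x2 + x3
    y1 = x0 + w1(x1) - x2 + w3(x3)
    y2 = x0 - x1 + x2 - x3
    y3 = x0 + w3(x1) - x2 + w1(x3)
    return y0, y1, y2, y3

fwd = radix4_butterfly(1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j)
# fwd == (10, -2+2j, -2, -2-2j), the 4-point DFT of [1, 2, 3, 4]
back = tuple(v / 4 for v in radix4_butterfly(*fwd, inverse=True))
# back == (1, 2, 3, 4) after the 1/4 scaling of the inverse transform
```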

13 Memory Structure and Timing Sequence
Four modes in the butterfly:
1. Mode 0~2: data reordering
2. Mode 3: radix-4 FFT/IFFT computation
Delay feedback memory:
1. Mode 0~2: store the serial data input and push out the FFT/IFFT results
2. Mode 3: store the FFT/IFFT results and push out the data output
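The four-mode schedule can be illustrated with a small behavioral model of one radix-4 SDF stage. This is a sketch under assumptions: the stage has delay D, the feedback memory is modeled as three D-word arrays, twiddle multiplication after the stage is omitted, and the names `sdf_stage` and `butterfly4` are hypothetical.

```python
def butterfly4(a, b, c, d):
    """Forward radix-4 butterfly using only additions and +-j swaps."""
    nj = lambda z: complex(z.imag, -z.real)   # multiply by -j
    pj = lambda z: complex(-z.imag, z.real)   # multiply by +j
    return (a + b + c + d,
            a + nj(b) - c + pj(d),
            a - b + c - d,
            a + pj(b) - c + nj(d))

def sdf_stage(stream, delay):
    """Process a serial stream, one sample in and one sample out per cycle."""
    mem = [[0j] * delay for _ in range(3)]    # 3*delay words of feedback memory
    out = []
    for t, x in enumerate(stream):
        mode = (t // delay) % 4
        i = t % delay
        if mode < 3:
            # Modes 0~2: push out a stored result, store the incoming sample.
            out.append(mem[mode][i])
            mem[mode][i] = x
        else:
            # Mode 3: compute, output the first result, store the other three.
            y0, y1, y2, y3 = butterfly4(mem[0][i], mem[1][i], mem[2][i], x)
            out.append(y0)
            mem[0][i], mem[1][i], mem[2][i] = y1, y2, y3
    return out

D = 2
x = [complex(n, -n) for n in range(4 * D)]
y = sdf_stage(x + [0j] * (3 * D), D)          # 3*D extra cycles flush the memory
```

In this model the first valid output appears after 3·D cycles, and the memory is busy in every cycle, which is the 100 % memory utilization usually associated with SDF designs.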

14 Constant Multiplier
Retrenched constant multiplier:
1. Shifters and adders
2. Complex conjugate symmetry rule: 83%
3. Sub-expression elimination algorithm [8]: 20% for shifters, 50% for adders
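The shifters-and-adders idea can be sketched with a single coefficient. The example below multiplies an integer by 1/√2 ≈ 181/256 using only shifts and additions; the choice of coefficient, the 8-bit fraction and the function names are assumptions for illustration, not values taken from the slides.

```python
def mul_inv_sqrt2(x):
    """Approximate x / sqrt(2) as (x * 181) >> 8 with shifts and adds only.

    181 = 0b10110101 = 128 + 32 + 16 + 4 + 1, so four adders are needed.
    """
    return ((x << 7) + (x << 5) + (x << 4) + (x << 2) + x) >> 8

def mul_w8(re, im):
    """Multiply (re + j*im) by W_8^1 = (1 - j)/sqrt(2) with two constant multiplies."""
    return mul_inv_sqrt2(re + im), mul_inv_sqrt2(im - re)

scaled = mul_inv_sqrt2(1000)   # 707, compared with 1000 / sqrt(2) = 707.1...
```

Sharing the same shifted terms between the real and imaginary paths is the kind of saving that sub-expression elimination targets.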

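15 Eight-Folded Complex Multiplier
Retrenched coefficient ROM size:
1. Complex conjugate symmetry rule
2. Sub-expression elimination [8]

| H         | Addr. Mode | ROM Addr. | Data Mode | ROM data |
|-----------|------------|-----------|-----------|----------|
| 0~511     | 0          |           | 0         | a+jb     |
| 512~1023  | 1          | H[9:0]    | 1         | b+ja     |
| 1024~1535 | 0          |           | 2         | -b-ja    |
| 1536~2047 | 1          | H[9:0]    | 3         | -a+jb    |
| 2048~2559 | 0          |           | 4         | -a-jb    |
| 2560~3071 | 1          | H[9:0]    | 5         | -b-ja    |
| 3072~3583 | 0          |           | 6         | b-ja     |
| 3584~4095 | 1          | H[9:0]    | 7         | a-jb     |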
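The eight-fold symmetry behind the reduced ROM can be sketched in software. The model below stores cosine and sine for only the first eighth of the circle and rebuilds every twiddle factor by swapping and negating the stored pair. The ROM layout (N/8 + 1 entries holding cos and sin) and the sign convention are assumptions and do not reproduce the address and data modes of the table above; the function names are hypothetical.

```python
import cmath
import math

def build_rom(n_points):
    """Store (cos, sin) of 2*pi*m/N for m = 0 .. N/8 only."""
    eighth = n_points // 8
    return [(math.cos(2 * math.pi * m / n_points),
             math.sin(2 * math.pi * m / n_points)) for m in range(eighth + 1)]

def twiddle(rom, n_points, k):
    """Rebuild W_N^k = cos(t) - j*sin(t), t = 2*pi*k/N, from the 1/8-size ROM."""
    eighth = n_points // 8
    octant, r = divmod(k % n_points, eighth)
    # Even octants read the ROM forwards, odd octants read it backwards.
    a, b = rom[r] if octant % 2 == 0 else rom[eighth - r]
    # (cos t, sin t) expressed with the stored pair (a, b), one case per octant.
    cos_t, sin_t = [(a, b), (b, a), (-b, a), (-a, b),
                    (-a, -b), (-b, -a), (b, -a), (a, -b)][octant]
    return complex(cos_t, -sin_t)

N = 4096
rom = build_rom(N)                 # 513 entries instead of 4096
w = twiddle(rom, N, 1000)          # matches exp(-j*2*pi*1000/4096)
```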


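17 Hardware Cost Comparisons
Area conversion [5][9]: a complex multiplier and a complex memory word are counted as 50 and 1.3 complex adders, respectively.

| Pipeline Architecture | Mult. Complexity | Complex Mult. # | Complex Adders #  | Complex Mem. # | Area Index (4096 Points) |
|-----------------------|------------------|-----------------|-------------------|----------------|--------------------------|
| R2SDF                 | Radix-2          | log₂N - 2       | 2 log₂N           | N - 1          | 5847.5                   |
| R4SDF                 | Radix-4          | log₄N - 1       | 8 log₄N           | N - 1          | 5621.5                   |
| R8SDF                 | Radix-8          | log₈N - 1       | (24+2T) log₈N     | N - 1          | 5609.5                   |
| R2²SDF                | Radix-2²         | log₄N - 1       | 4 log₄N           | N - 1          | 5597.5                   |
| R2³SDF                | Radix-2³         | 2(log₈N - 1)    | 6 log₈N           | N - 1          | 5647.5                   |
| R2MDC                 | Radix-2          | log₂N - 2       | 2 log₂N           | 1.5N - 2       | 8508.6                   |
| R2²MDC                | Radix-2²         | log₂N - 2       | 2 log₂N           | 1.5N - 2       | 8508.6                   |
| R4MDC                 | Radix-4          | 3 log₄N - 3     | 4 log₂N           | 2.5N - 4       | 14104.8                  |
| R8MDC                 | Radix-8          | 7 log₈N - 7     | (24+2T) log₈N     | 4.5N - 8       | 30664.4                  |
| R4²SDF                | Radix-4²         | log₁₆N - 1      | (16+T) log₁₆N     | N - 1          | 5470.5                   |
| R4³SDF                | Radix-4³         | log₆₄N - 1      | (24+2T) log₆₄N    | N - 1          | 5429.5                   |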
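The area index appears to be a weighted sum of the three resource counts using the stated conversion factors. The sketch below recomputes it for the rows whose adder count does not involve T, since T is not defined in the transcript; the function name `area_index` is hypothetical.

```python
def area_index(complex_mults, complex_adders, memory_words):
    """Area in units of one complex adder: 50 per multiplier, 1.3 per memory word."""
    return 50 * complex_mults + complex_adders + 1.3 * memory_words

N = 4096
LOG2, LOG4, LOG8 = 12, 6, 4   # log2(4096), log4(4096), log8(4096)

rows = {
    "R2SDF":  area_index(LOG2 - 2,       2 * LOG2, N - 1),
    "R4SDF":  area_index(LOG4 - 1,       8 * LOG4, N - 1),
    "R2^2SDF": area_index(LOG4 - 1,      4 * LOG4, N - 1),
    "R2^3SDF": area_index(2 * (LOG8 - 1), 6 * LOG8, N - 1),
    "R2MDC":  area_index(LOG2 - 2,       2 * LOG2, 1.5 * N - 2),
    "R4MDC":  area_index(3 * LOG4 - 3,   4 * LOG2, 2.5 * N - 4),
}
for name, value in rows.items():
    print(f"{name}: {value:.1f}")
```

The memory term dominates every row, which is why the SDF designs with N - 1 words sit well below the MDC designs.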

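18 Hardware Utilization Rate Comparisons

| Pipeline Architecture | Utilization Rate of Complex Mult. | Utilization Rate of Complex Adders | Utilization Rate of Complex Mem. |
|-----------------------|-----------------------------------|------------------------------------|----------------------------------|
| R2SDF                 | 50 %                              | 50 %                               | 100 %                            |
| R4SDF                 | 75 %                              | 25 %                               | 100 %                            |
| R8SDF                 | 87.5 %                            | 12.5 %                             | 100 %                            |
| R2²SDF                | 75 %                              | 50 %                               | 100 %                            |
| R2³SDF                | 87.5 %                            | 50 %                               | 100 %                            |
| R2MDC                 | 50 %                              | 50 %                               | 50 %                             |
| R2²MDC                | 37.5 %                            | 50 %                               | 50 %                             |
| R4MDC                 | 25 %                              | 25 %                               | 25 %                             |
| R8MDC                 | 12.5 %                            | 12.5 %                             | 12.5 %                           |
| R4²SDF                | 87.5 %                            | 56.25 %                            | 100 %                            |
| R4³SDF                | 96.9 %                            | 60.42 %                            | 100 %                            |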


20 Conclusion
- The proposed R4²SDF and R4³SDF designs are highly cost-effective:
  - Multiplicative complexity as low as that of the radix-16 and radix-64 algorithms
  - Lower hardware cost (smaller chip area)
  - Higher hardware utilization rate
- The R4³SDF design outperforms R4²SDF and the other pipeline architectures for a 4096-point FFT/IFFT processor.

21 References
[1] ETSI, "Digital Video Broadcasting (DVB): Transmission System for Handheld Terminals (DVB-H)," ETSI EN302304.
[2] R. K. Kolagotla, J. Fridman, M. M. Hoffiman, W. C. Anderson, B. C. Aldrich, D. B. Witt, M. S. Allen, R. R. Dunton and L. A. Booth, "A 333-MHz dual-MAC DSP architecture for next-generation wireless application," IEEE Inter. Conf. on Acou., Speech, and Signal Proc., vol. 2, pp. 1013-1016, May 2001.
[3] W. Li and L. Wanhammar, "A pipeline FFT processor," in Proc. IEEE Workshop on Signal Processing Systems, 1999, pp. 654-662.
[4] S. He and M. Torkelson, "Designing pipeline FFT processor for OFDM (de)modulation," in Proc. URSI Int. Symp. Signals, Syst., Electron., pp. 257-262, 1998.
[5] Wei-Hsin Chang and Truong Nguyen, "An OFDM-specified lossless FFT architecture," IEEE Trans. on Circuits and Systems I, vol. 53, issue 6, pp. 1235-1243, June 2006.
[6] W. C. Yeh and C. W. Jen, "High-speed and low-power split-radix FFT," IEEE Trans. on Signal Processing, vol. 51, no. 3, pp. 864-874, Mar. 2003.
[7] C. S. Burrus, "Index mapping for multidimensional formulation of the DFT and convolution," IEEE Trans. Acoust., Speech, Signal Processing, ASSP-25(3): 239-242, June 1977.
[8] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
[9] T. Sansaloni, A. Perez-Pascual, V. Torres and J. Valls, "Efficient pipeline FFT processors for wireless LAN MIMO-OFDM systems," Electronics Letters, vol. 41, no. 19, Sep. 2005.

22 Contact Information
Address: No. 12, Innovation 1st Rd., Science-Based Industrial Park, Elan Microelectronics Corporation, Hsinchu City, Taiwan 308 R.O.C.
E-mail: vincent_yu@emc.com.tw
Thanks for your attention!

