Channel Equalization in MIMO Downlink and ASIP Architectures Predrag Radosavljevic Rice University March 29, 2004.

1 Channel Equalization in MIMO Downlink and ASIP Architectures Predrag Radosavljevic Rice University March 29, 2004

2 Wireless System Downlink transmission in MIMO wireless system Physical layer of the mobile handset Linear channel equalization Hardware implementation using ASIP architectures

3 Motivation MIMO Downlink and Equalization MIMO: high data rate and high spectral efficiency Interference from each antenna that introduces MAI DS-CDMA signals in multipath environment – user orthogonality is destroyed which causes ISI Solution: powerful channel equalization to mitigate ISI and MAI in order to restore user’s orthogonality Chip level channel equalization based on iterative CG and adaptive LMS algorithms

4 Motivation ASIP Hardware Implementation Future generations of mobile handsets: high speed, flexibility and low power Traditional approaches: ASIC and DSP processors ASIC: No flexibility: Family of ASICs are needed High probability of design errors, high design cost DSP: Not optimized for a given application Often limited instruction and data level parallelism ASIP: Tradeoff between efficiency of ASICs and flexibility of DSPs

5 Thesis Contributions Channel equalization in broad range of environments 16-bit fixed point implementation Flexible ASIP architecture design Same hardware - different equalization (slow/fast fading, CG/LMS) Extension of ASIP instruction set with application-specific operations Customized architecture: Real-time requirements for 1xEV-DV standard (1.2288 Mc/s) Reasonable clock frequency (up to 150MHz) and power dissipation Automatic hardware design: from C to gate level Hardware synthesis for FPGA and CMOS libraries
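
As a rough check on the real-time requirement above, the cycle budget per chip follows from dividing the clock frequency by the chip rate. The sketch below uses only the two figures quoted on the slide (1.2288 Mc/s and 150 MHz); the function name is illustrative and not taken from the presentation.

```python
# Cycle budget per received chip, from the figures quoted on the slide.
CHIP_RATE_HZ = 1.2288e6   # 1xEV-DV chip rate
CLOCK_HZ = 150e6          # upper clock frequency mentioned on the slide


def cycles_per_chip(clock_hz: float, chip_rate_hz: float) -> float:
    """Return how many clock cycles are available for each incoming chip."""
    return clock_hz / chip_rate_hz


budget = cycles_per_chip(CLOCK_HZ, CHIP_RATE_HZ)
print(f"{budget:.1f} cycles per chip")
```

At the maximum clock this leaves roughly 122 cycles for all processing associated with one chip, which is why instruction and data level parallelism matter for the design.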

6 Outline Data model Channel equalization ASIP hardware implementations Conclusions and future work

7 Data Model: Transmission Side Alternating symbols over transmit antennas Spreading: orthogonality between users Scrambling: Reduction of inter-cell interference Transmission over multipath correlated channels
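
The spreading and scrambling steps can be illustrated with a minimal sketch. It assumes Walsh-Hadamard spreading codes of length 4, a random ±1 scrambling sequence, and an ideal (distortion-free) channel; none of these specifics come from the slides, and the multi-antenna symbol alternation is left out.

```python
import random


def walsh_codes(n):
    """Return n Walsh-Hadamard codes of length n (n must be a power of two)."""
    codes = [[1]]
    while len(codes) < n:
        codes = [row + row for row in codes] + \
                [row + [-x for x in row] for row in codes]
    return codes


def spread(symbols, code):
    """Repeat every symbol over the chips of its spreading code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each block of chips with the code to recover one symbol."""
    n = len(code)
    return [sum(chips[i + k] * code[k] for k in range(n)) / n
            for i in range(0, len(chips), n)]


codes = walsh_codes(4)
user_a = [1, -1, 1]    # illustrative BPSK symbols for two users
user_b = [-1, -1, 1]

# Sum of both users' chip streams, then scramble with a cell-specific sequence.
combined = [a + b for a, b in zip(spread(user_a, codes[1]),
                                  spread(user_b, codes[2]))]
rng = random.Random(0)
scramble = [rng.choice((-1, 1)) for _ in combined]
tx = [c * s for c, s in zip(combined, scramble)]

# Ideal channel: descrambling and despreading separate the users exactly.
rx = [c * s for c, s in zip(tx, scramble)]
recovered_a = despread(rx, codes[1])
recovered_b = despread(rx, codes[2])
print(recovered_a, recovered_b)
```

With a multipath channel in between, the received chips are no longer aligned with the codes and the clean separation shown here is lost, which is the problem the equalizer addresses.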

8 Receiver Implementations RAKE Receiver, Multiuser Detector, Kalman filter, LMMSE equalization RAKE: Deteriorated performance in highly loaded system Not appropriate for MIMO environments Multiuser Detectors: High computational complexity Limited knowledge about the activity of other users Kalman filter: Optimal solution in the sense of MSE Prohibitive complexity in MIMO environments

9 LMMSE Equalization Lower complexity in comparison with other receivers Independent of the number of users Iterative Solutions Good performance in highly scattered environments LMMSE Receiver

10 LMMSE Equalization Linear system to be solved: Covariance: block Toeplitz and positive definite A and B: Toeplitz Hermitian matrices C: Toeplitz matrix

11 LMMSE Approaches LMMSE solution: Cholesky decomposition More complex hardware primitives Conjugate Gradient (CG) Iterative solution, fast convergence Block algorithm – modifications for fast fading channels Least Mean Square (LMS) Adaptive algorithm Sensitivity to learning step

12 Equalization in Time-Varying Channels Spatially correlated, frequency selective (multipaths), fading channels Data-rate: 1.2288MChips/sec Antenna correlation: Base Station: 50.18% Mobile: 43.99%

13 Channel Equalization: CG Algorithm N samples: 4096 in slow fading channels
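
The slide's own CG formulation is not reproduced in the transcript. As a generic illustration, the textbook conjugate gradient iteration for a Hermitian positive definite system R w = h can be sketched as follows. The 3x3 matrix and right-hand side are made-up example values, not the covariance or channel estimate from the presentation.

```python
def conjugate_gradient(R, h, max_iter=None, tol=1e-24):
    """Solve R w = h for Hermitian positive definite R (textbook CG)."""
    n = len(h)
    max_iter = max_iter or 2 * n
    w = [0j] * n
    r = [complex(v) for v in h]          # residual h - R w for w = 0
    p = list(r)                          # first search direction
    rs_old = sum(abs(v) ** 2 for v in r)
    for _ in range(max_iter):
        if rs_old <= tol:                # residual energy small enough
            break
        Rp = [sum(R[i][j] * p[j] for j in range(n)) for i in range(n)]
        # p^H R p is real for Hermitian R
        alpha = rs_old / sum(p[i].conjugate() * Rp[i] for i in range(n)).real
        w = [w[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Rp[i] for i in range(n)]
        rs_new = sum(abs(v) ** 2 for v in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return w


# Illustrative Hermitian Toeplitz, diagonally dominant (hence positive definite).
R = [[4, 1 + 1j, 0.5],
     [1 - 1j, 4, 1 + 1j],
     [0.5, 1 - 1j, 4]]
h = [1 + 0j, 0.5j, -0.25 + 0j]

w = conjugate_gradient(R, h)
residual = max(abs(sum(R[i][j] * w[j] for j in range(3)) - h[i])
               for i in range(3))
print(residual)
```

Each iteration needs only one matrix-vector product and a few inner products, which is part of why CG maps onto simpler hardware primitives than a Cholesky decomposition.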

14 CG Equalization in Veh. A 30km/h Sliding Window (SW) approach Faster variations: more frequent update of filter coefficients

15 CG Equalization: Velocity of 120km/h Multiple sub-blocks instead of two blocks Partial channel estimation for each sub-block Apply weights for global channel estimation: Weights are adjusted according to the channel variations The faster the channel fades, the faster the weight coefficients drop to 0
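
The weighting rule itself is not given in the transcript. The sketch below is one hypothetical way to combine per-sub-block channel estimates: weights decay geometrically with the age of the sub-block and are normalized to sum to one. Both the geometric decay and the normalization are assumptions made for illustration, as is the function name.

```python
def weighted_channel_estimate(sub_estimates, decay):
    """Combine partial estimates (oldest first) with geometric weights.

    decay in [0, 1]: 1.0 gives a plain average (slow fading); smaller
    values make the weights of older sub-blocks drop toward 0 faster.
    """
    count = len(sub_estimates)
    weights = [decay ** (count - 1 - i) for i in range(count)]
    total = sum(weights)
    taps = len(sub_estimates[0])
    return [sum(w * est[k] for w, est in zip(weights, sub_estimates)) / total
            for k in range(taps)]


# Three illustrative two-tap partial estimates, oldest first.
partial = [[1.0 + 0j, 0.2j], [0.8 + 0j, 0.4j], [0.6 + 0j, 0.6j]]

slow = weighted_channel_estimate(partial, decay=1.0)   # equal weights
fast = weighted_channel_estimate(partial, decay=0.0)   # newest block only
print(slow, fast)
```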

16 Architectural Alternative: LMS Equalization Adaptive LMS:
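
The update equation on this slide did not survive in the transcript. For orientation, the standard complex LMS recursion (output y = w^H x, error e = d - y, update w ← w + μ x e*) is sketched below on a noise-free toy identification problem. The step size, tap count, and QPSK-like training chips are illustrative assumptions, not the presentation's settings.

```python
import random


def lms_identify(w_true, mu=0.05, steps=2000, seed=1):
    """Adapt w with complex LMS so that w^H x tracks w_true^H x."""
    rng = random.Random(seed)
    taps = len(w_true)
    w = [0j] * taps
    x = [0j] * taps                      # tapped delay line, newest first
    for _ in range(steps):
        chip = complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
        x = [chip] + x[:-1]
        d = sum(wt.conjugate() * xk for wt, xk in zip(w_true, x))  # desired
        y = sum(wk.conjugate() * xk for wk, xk in zip(w, x))       # output
        e = d - y
        # Learning step mu trades convergence speed against stability.
        w = [wk + mu * xk * e.conjugate() for wk, xk in zip(w, x)]
    return w


w_true = [0.9 + 0.1j, -0.3 + 0.2j, 0.1 - 0.05j]   # made-up target taps
w_hat = lms_identify(w_true)
max_error = max(abs(a - b) for a, b in zip(w_hat, w_true))
print(max_error)
```

Choosing μ too large makes the recursion diverge, and too small makes it slow to track the channel, which is the sensitivity to the learning step noted earlier in the deck.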

17 Performance: Slow Fading Environments From 32-bit floating to 16-bit fixed point Control of quantization error Pedestrian A – 3km/h Pedestrian B – 10km/h
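
To make the floating-point to fixed-point step concrete, here is a minimal sketch of 16-bit quantization with saturation. It assumes a Q15 format (1 sign bit, 15 fractional bits); the slides state only that the implementation is 16-bit fixed point, so the exact format is an assumption.

```python
Q15_SCALE = 1 << 15          # assumed Q15: 15 fractional bits
Q15_MIN, Q15_MAX = -32768, 32767


def to_q15(x: float) -> int:
    """Quantize x in [-1, 1) to a 16-bit integer, saturating on overflow."""
    scaled = int(round(x * Q15_SCALE))
    return max(Q15_MIN, min(Q15_MAX, scaled))


def from_q15(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / Q15_SCALE


value = 0.3
quantized = to_q15(value)
error = abs(from_q15(quantized) - value)
print(quantized, error)      # error is at most half an LSB (2**-16)
```

Controlling the quantization error then comes down to scaling intermediate results so they use the available range without saturating.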

18 Performance: Vehicular A 30km/h CG with sliding window (CG-SW): Improvement in comparison with basic CG

19 CG–SW Approach: Fixed Point 32-bit floating point and 16-bit fixed point About 1 % BER difference Vehicular A – 30km/h

20 Performance: Velocity of 120km/h CG with sliding window and weights averaging CG-SW-WA with different numbers of sub-blocks Performance improvement if weights are applied Pedestrian A – 120km/h Vehicular A – 120km/h

21 Computational Complexity Number of operations per chip in 1 second CG filter update is less complex Reason: block-level filter update algorithm

22 Directions for Architecture Implementation Equalization in different environments Block CG, adaptive LMS for slow fading environments Modifications of CG for fast fading channels Different computational complexity and amount of parallelism Flexible hardware for different equalizations and CG modifications Programmable architecture Application specific

23 ASIP Architecture for Equalization: Required Features Flexible architecture able to operate in different channel environments Slow/fast fading Low/high scattering Architecture customization Implementation of application-specific operations Instruction and data level parallelism Fast execution of complex algorithms Automatic hardware-software co-design Fast processor design starting from C/C++ code of application

24 ASIP Architecture Based on TTA Flexible architecture No limitations to add new FUs, buses, registers Customizable architecture Implementation of Special Function Units (SFUs) Instruction and data level parallelism VLIW architecture principle Efficient and parallel data flow Fast processor design Automatic search for best processor VHDL processor representation

25 General Structure of TTA Transport of operands triggers the appropriate operation as a side effect Only one instruction: “move” instruction 32-bit architecture
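
The idea that a data transport triggers the operation can be shown with a toy model. This is not the MOVE framework: the unit, port, and register names below are hypothetical and the program is a plain list of (source, destination) moves.

```python
class AddUnit:
    """Toy function unit: writing the trigger port fires the addition."""

    def __init__(self):
        self.operand = 0
        self.result = 0

    def write(self, port, value):
        if port == "operand":
            self.operand = value                 # plain storage, no operation
        elif port == "trigger":
            self.result = self.operand + value   # side effect of the move
        else:
            raise ValueError(f"unknown port {port}")


def run(moves, registers, units):
    """Execute a list of (source, destination) moves in order."""
    for src, dst in moves:
        if isinstance(src, int):                 # immediate value
            value = src
        elif "." in src:                         # unit output, e.g. add.result
            unit, port = src.split(".")
            value = getattr(units[unit], port)
        else:                                    # general-purpose register
            value = registers[src]
        if "." in dst:
            unit, port = dst.split(".")
            units[unit].write(port, value)
        else:
            registers[dst] = value


registers = {"r1": 5, "r2": 7, "r3": 0}
units = {"add": AddUnit()}
program = [("r1", "add.operand"),   # r1 -> add operand port
           ("r2", "add.trigger"),   # r2 -> add trigger port, fires the add
           ("add.result", "r3")]    # result -> r3
run(program, registers, units)
print(registers["r3"])
```

In a real TTA several such moves are scheduled on parallel buses in the same instruction word, which is where the number of buses becomes a design parameter.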

26 TTA Design Flow: MOVE Tool Design space exploration for optimal architecture

27 Customization of ASIP Implementation of application specific operations User-defined Special Function Units (SFUs) Sacrificing architecture generality for optimization and performance improvement Designed SFUs: Real multiplication with shifting ability Complex multiplication with shifting Sub-word arithmetic operations Sign-test and add/subtract

28 SFU: Complex Multiplication Reduction of data transports between FUs Fewer buses and smaller interconnection network Smaller instruction word Instruction and data parallelism is placed inside CXMUL
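
The arithmetic that a complex-multiply-with-shift unit performs can be sketched in a few lines. The internal structure of CXMUL is not described in the transcript, so this models only the result: four real multiplications, two additions, and an arithmetic right shift. The Q15 operand format and the function name are assumptions.

```python
def cxmul_shift(a, b, shift):
    """Multiply complex integers a = (re, im) and b = (re, im), then shift.

    Python's >> on negative integers is an arithmetic shift (rounds toward
    negative infinity), as a hardware arithmetic shifter would.
    """
    a_re, a_im = a
    b_re, b_im = b
    real = a_re * b_re - a_im * b_im
    imag = a_re * b_im + a_im * b_re
    return real >> shift, imag >> shift


# (0.5 + 0.25j) * (0.25 - 0.5j) = 0.25 - 0.1875j, with operands in assumed Q15.
product = cxmul_shift((16384, 8192), (8192, -16384), 15)
print(product)   # (8192, -6144), i.e. 0.25 and -0.1875 in Q15
```

Done with separate multiplier and adder units, the same computation would need many operand and result transports over the buses; folding it into one unit keeps those transports internal.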

29 Performance Improvement with SFUs Bus reduction of 50% Instruction word length reduction of about 50%

30 TTA Processors for MIMO Equalization 1. Two co-processors (CG equalization) Co-processor for updating equalizer coefficients Co-processor for filtering and user detection 2. Single processor for all parts of equalization algorithm (CG/LMS equalization) Identical architectures for slow and fast fading environments

31 Single Processor vs. Two Coprocessors Single processor Smaller area and power dissipation Higher clock frequency

32 Processor Flexibility Identical customized processor for broad range of channel environments Identical processor for LMS and CG equalization

33 Example of Designed Processor Coprocessor for CG filter update

34 Hardware synthesis design flow MOVEGen: generates VHDL representation of processor core Xilinx tools for fast FPGA prototyping Mentor Graphics tools for CMOS gate level design

35 VHDL Template of TTA Processor Automatic VHDL generation of processor core, control and interconnection FUs, SFUs, peripherals: pre-designed or defined by user

36 MoveProc Synthesis on Xilinx FPGA CG/LMS equalizer including user detection; no SFUs; 32 buses. Xilinx FPGA part: XC2V8000. Slices: 38,757 out of 46,592; BRAMs: 148 out of 168; IOBs: 263 out of 1108; MULT18x18s: 24 out of 168

37 MoveProc Synthesis on Xilinx FPGA Customized CG/LMS equalizer including user detection; with SFUs; 16 buses. Xilinx FPGA part: XC2V6000. Slices: 21,126 out of 33,792; BRAMs: 107 out of 144; IOBs: 229 out of 1104; MULT18x18s: 11 out of 144
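
The two synthesis results are easier to compare as utilization percentages. The sketch below uses only the counts quoted on the two FPGA slides; the dictionary layout and labels are illustrative. Note that the two designs target different parts, so absolute resource counts are the fairer comparison.

```python
# (used, available) per resource, copied from the two synthesis slides.
no_sfu = {"slices": (38757, 46592), "brams": (148, 168),
          "iobs": (263, 1108), "mult18x18": (24, 168)}      # XC2V8000, 32 buses
with_sfu = {"slices": (21126, 33792), "brams": (107, 144),
            "iobs": (229, 1104), "mult18x18": (11, 144)}    # XC2V6000, 16 buses


def utilization(design):
    """Return the used fraction of each resource, in percent."""
    return {name: 100.0 * used / avail for name, (used, avail) in design.items()}


def reduction(before, after, resource):
    """Percentage drop in absolute resource count from before to after."""
    return 100.0 * (1 - after[resource][0] / before[resource][0])


util_no_sfu = utilization(no_sfu)
util_with_sfu = utilization(with_sfu)
slice_saving = reduction(no_sfu, with_sfu, "slices")
print(f"slices: {util_no_sfu['slices']:.1f}% vs {util_with_sfu['slices']:.1f}%")
print(f"absolute slice count reduced by {slice_saving:.1f}%")
```

On these numbers the customized processor with SFUs uses about 45% fewer slices than the version without them.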

38 Gate Level CMOS Synthesis Mentor Graphics Tools 0.5 µm CMOS library Customized CG/LMS equalizer including user detection (with SFUs) Synthesis estimate of processor core: 182,887 gates

39 Conclusions Equalization algorithms for broad range of channel environments Slow fading: CG/LMS Fast fading: Modifications of basic CG equalization ASIP architecture design based on TTA Same architecture – different equalization algorithms Optimization with application-specific operations Reasonable frequency and power dissipation for 3GPP data rate Fast processor design VHDL representation of optimal processor FPGA synthesis and CMOS gate level synthesis

40 Future Work Processor layout synthesis IC Station software tool from Mentor Graphics Precise timing, area, and power analysis Implementation of hybrid word length Reduced precision for filter application part Implementation on C5x DSP for comparison

41 Acknowledgements Thanks to: Professor Cavallaro Dr. De Baynast Professor Aazhang Dr. Dabak Dr. Sabharwal Texas Instruments Nokia

