Efficient VLSI architectures for baseband signal processing in wireless base-station receivers Sridhar Rajagopal, Srikrishna Bhashyam, Joseph R. Cavallaro,


1 Efficient VLSI architectures for baseband signal processing in wireless base-station receivers. Sridhar Rajagopal, Srikrishna Bhashyam, Joseph R. Cavallaro, and Behnaam Aazhang. This work is supported by Nokia, TI, TATP and NSF.

2 Introduction: Real-time VLSI architecture for multiuser channel estimation. Multiuser channel estimation is usually neglected: – high computational complexity makes DSP implementations infeasible – single-user sliding correlator structures are used instead. Iterative fixed-point algorithm developed. Area-time tradeoffs presented: – area-constrained, time-constrained, area-time efficient.

3 Baseband signal processing [Figure: base-station receiver block diagram. Labels: Multiple Users, Antenna, Channel estimation (Training, Tracking), Detection, Decoding, Detected Bits.]

4 Channel estimation [Figure: User 1 and User 2 transmitting to the Base Station over a direct path and a reflected path, with noise + MAI.] Channel estimation compensates for unknown fading amplitudes and asynchronous delays.

5 Need for multiuser channel estimation: Detector performance depends on the accuracy of the channel estimator. Multiuser channel estimation: – jointly estimates parameters for all users – gives better performance than single-user estimates.

6 Computing multiuser channel estimates: Estimates are computed by sending a training sequence of known bits to the receiver. When no training sequence is present, detected bits can be used to update the estimates in a decision-feedback mode for tracking. The importance of multiuser estimation is usually neglected. Its complexity may exceed that of the detector.

7 Multiuser Channel Estimation Algorithm: b = {+1, -1}: training/tracking bits. r = 8-bit integer (complex): received signal. N = spreading gain (typically fixed, e.g. 32). K = number of users (variable, <= N). A = maximum-likelihood channel estimate.

8 Implementation complexity: Matrix inversions (size 32x32) per window. Unable to meet real-time requirements on DSPs [Asilomar’99]. VLSI fixed-point architectures for matrix inversions: – difficult to design, finite-precision problems. Typically, simpler single-user sliding correlator structures are used.

9 Outline: What is multiuser channel estimation? Need for multiuser channel estimation. Implementation problems. Algorithm enhancements. VLSI architectures: – area-constrained, time-constrained, area-time efficient. Conclusions.

10 Iterative scheme for channel estimation: Bit-streaming: suitable for tracking (window length L). Method of gradient descent. Stable convergence behavior. Simple fixed-point VLSI architecture.
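The gradient-descent idea on this slide can be sketched as follows. This is only an illustration under stated assumptions: the transcript does not preserve the slide's equations, so the update rule A <- A - mu * (R_bb A - R_br), the function names, and the use of floating-point real values are assumptions made here, chosen to match the Mult, Subtract, >>, Subtract datapath named on the later architecture slides. The point is that the estimate is reached without a matrix inversion.

```python
def matmul(X, Y):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def gradient_step(A, R_bb, R_br, mu):
    """One assumed iteration: A <- A - mu * (R_bb A - R_br)."""
    P = matmul(R_bb, A)
    return [[a - mu * (p - c) for a, p, c in zip(row_a, row_p, row_c)]
            for row_a, row_p, row_c in zip(A, P, R_br)]


def estimate(R_bb, R_br, mu, iterations):
    """Iterate from an all-zero estimate; no matrix inversion is used."""
    A = [[0.0] * len(R_br[0]) for _ in R_br]
    for _ in range(iterations):
        A = gradient_step(A, R_bb, R_br, mu)
    return A
```

With a sufficiently small step size mu, the iterate approaches the solution of R_bb A = R_br, which is what an explicit inversion would compute.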

11 Simulations - Static multipath channel: SINR = 0 dB, paths = 3, preamble = 150, spreading gain N = 31, users K = 15.

12 Rayleigh fading channel with tracking: Doppler = 10 km/h.

13 Outline: What is multiuser channel estimation? Need for multiuser channel estimation. Implementation problems. Algorithm enhancements. VLSI architectures: – area-constrained, time-constrained, area-time efficient. Conclusions.

14 Area-Time Tradeoffs: Design for K = 32 users and spreading gain N = 32. Target = 128 Kbps (about 4000 cycles per bit at 500 MHz). Assume single-cycle addition/multiplication. Area-constrained architecture: pico-cells/fewer users. Time-constrained architecture: maximum data rates. Area-time efficient architecture: real-time.
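The cycle budget quoted on this slide follows from dividing the clock rate by the target bit rate. A minimal sketch, with a hypothetical helper name:

```python
def cycles_per_bit(clock_hz, bit_rate_bps):
    """Clock cycles available to process one bit at the target data rate."""
    return clock_hz / bit_rate_bps


# 500 MHz clock and 128 Kbps target, as on the slide.
budget = cycles_per_bit(500e6, 128e3)
print(budget)  # 3906.25, which the slide rounds to 4000 cycles
```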

15 Task decomposition: channel estimation [Figure: data flow along a TIME axis. Inputs: Pilot Bits or Detected Bits (MUX) giving b(2K,1) and b0(2K,1); Pilot or Data (MUX) giving r(N,8) and r0(N,8); tracking window of length L. Correlation Matrices (per bit): R_bb O(2K^2, 8) and R_br O(2KN, 8). Iterate: A O(4K^2 N, 8). Output: Channel Estimate to Detector.]

16 Architecture design: auto-correlation: b = {+1,-1}. Multiplication is an XNOR operation. Matrix updated using XNOR gates. Auto-correlation matrix implemented as up/down counters.
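A small software model can make the XNOR and counter idea concrete. It is a sketch, not the hardware design: the encoding (bit 1 for +1, bit 0 for -1) and the function names are assumptions, and the model updates every entry rather than exploiting the symmetry of the matrix.

```python
def xnor(x, y):
    """XNOR of two single bits, where 1 encodes +1 and 0 encodes -1."""
    return 1 - (x ^ y)


def update_autocorr(counters, bits):
    """Add one outer product b b^T to the counter array, in place.

    An XNOR output of 1 means the product b_i * b_j is +1, so the
    counter counts up; an output of 0 means -1, so it counts down.
    """
    n = len(bits)
    for i in range(n):
        for j in range(n):
            counters[i][j] += 1 if xnor(bits[i], bits[j]) else -1
    return counters
```

Diagonal entries always count up, because any bit XNORed with itself gives 1.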

17 Architecture design: cross-correlation: b = {+1,-1}, r = 8-bit integer vector (complex). Multiplications reduce to additions/subtractions. Matrix (complex) can be updated with 8-bit adders. Cross-correlation matrix stored as RAM.
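Because each bit is +1 or -1, multiplying it by a received sample only selects between adding and subtracting that sample. The sketch below models this; the bit encoding (1 for +1, 0 for -1), the function name, and the use of Python complex numbers in place of 8-bit complex integers are assumptions for illustration.

```python
def update_crosscorr(R_br, bits, r):
    """Accumulate b * r into R_br in place using only add/subtract.

    bits[i] selects add (1, meaning +1) or subtract (0, meaning -1)
    of the received sample r[j] for entry R_br[i][j].
    """
    for i, bit in enumerate(bits):
        for j, sample in enumerate(r):
            if bit:
                R_br[i][j] += sample
            else:
                R_br[i][j] -= sample
    return R_br
```

For example, starting from a zero matrix, bits `[1, 0]` and samples `[3+2j, -1-4j]` give a first row equal to the samples and a second row equal to their negation.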

18 Architecture design: channel estimate: A = 8-bit integer matrix (complex). µ << 1: truncated multiplication [Schulte’93]. Matrix-matrix (real-complex) multiplication of integers. Forms the bottleneck (8-bit multipliers). Concentrate on multiplication for area-time tradeoffs!
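One way to read the small step size is as a power of two, so that scaling by µ becomes a right shift that discards low-order bits. This is an assumption made here for illustration (the transcript does not state the value of µ or how it is applied), and the helper names are hypothetical. The sketch works on a single real integer entry.

```python
def scale_by_mu(value, shift):
    """Scale an integer by mu = 2**-shift with an arithmetic right shift.

    Python's >> on ints rounds toward negative infinity, so the
    low-order bits of the result are simply dropped.
    """
    return value >> shift


def fixed_point_step(a, product, target, shift):
    """Integer update a - mu * (product - target) for one matrix entry."""
    return a - scale_by_mu(product - target, shift)
```

For instance, `scale_by_mu(200, 4)` is 12, since 200 / 16 = 12.5 and the fractional part is dropped.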

19 Area-Constrained Architecture [Figure: block diagram with inputs b0 and b and output Channel Estimate.]

20 Area-constrained Architecture: Hardware Requirements

21 Time-constrained Architecture [Figure: block diagram. Blocks: b*b^T, b0*b0^T, MUX, R_bb, R_br, A, Mult, Subtract, >>, Subtract. Signals: b, b0, r, r0, Channel Estimate. Bus widths shown: 2K*1, K(2K-1)*1, 2K^2*8, 2KN*16, 2KN*8, N*8.]

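22 Auto-correlation Update in Parallel [Figure: an array of XNORs forms the products a·b, a·c, a·d, b·c, b·d, c·d from the bits a, b, c, d of b (2K). The result bb^T (K*{2K-1}*1) drives an array of counters holding R_bb (2K^2*8): each counter R_bb(i,j) takes bb^T(i,j) on its U/D# input, and each diagonal counter R_bb(i,i) has its U/D# input tied to 1.]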

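23 Cross-Correlation Update in Parallel [Figure: bits a, b, c, d of b (2K*1) and the received vector r (N*8) update R_br (2KN*8). Each entry R_br(i,j) uses an 8-bit adder that adds or subtracts r(j), with the 1-bit b(i) driving its Add/Sub# input.]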

24 Time-constrained Architecture: Hardware Requirements

25 Area-Time efficient architecture design: Area-constrained architecture: – minimize area: single 8-bit multiplier – 4K^2 N cycles (131,072 cycles, about 128K; 3.81 Kbps). Time-constrained architecture: – minimize time: 4K^2 N 8-bit multipliers – log2(2K) cycles (6 cycles; 83.33 Mbps). Aim: to meet real-time requirements with minimum area overhead. Different parallelism levels for multipliers.
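The cycle counts and data rates on this slide can be checked with a few lines of arithmetic, using K = N = 32 and the 500 MHz clock from the design targets. The last function is an assumption added here, not a figure from the slides: it supposes the 4K^2 N multiplications divide evenly over P multipliers and ignores addition and control overhead.

```python
import math

CLOCK_HZ = 500e6  # clock rate from the design target


def serial_cycles(K, N):
    """Cycles with a single multiplier: one multiplication per cycle."""
    return 4 * K * K * N


def parallel_cycles(K):
    """Cycles with 4K^2 N multipliers, as quoted on the slide: log2(2K)."""
    return math.ceil(math.log2(2 * K))


def data_rate_bps(cycles):
    """Bits per second when one bit takes the given number of cycles."""
    return CLOCK_HZ / cycles


def min_multipliers(K, N, cycle_budget):
    """Assumed ideal speedup: multipliers needed to fit the cycle budget."""
    return math.ceil(serial_cycles(K, N) / cycle_budget)


K = N = 32
print(serial_cycles(K, N), round(data_rate_bps(serial_cycles(K, N)) / 1e3, 2))
print(parallel_cycles(K), round(data_rate_bps(parallel_cycles(K)) / 1e6, 2))
print(min_multipliers(K, N, 4000))
```

Under the ideal-speedup assumption, roughly 33 multipliers would fit the 4000-cycle budget, which suggests why an intermediate level of parallelism is enough for real-time operation.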

26 Area-Time Efficient Architecture [Figure: block diagram. Blocks: b*b^T, b0*b0^T, MUX, DEMUX, Counters, Store, Load, R_bb, R_br, A(i), A(i-1), Mult, Adder, Subtract, >>, Subtract. Signals: b, b0, r, r0, Channel Estimate. Bus widths shown: 2K*1, 2K*8, N*8, 1*16, 1*8, 1*1.]

27 Area-Time Efficient Architecture: Hardware Requirements

28 Outline: What is multiuser channel estimation? Need for multiuser channel estimation. Implementation problems. Algorithm enhancements. VLSI architectures: – area-constrained, time-constrained, area-time efficient. Conclusions.

29 Comparisons: DSPs unable to exploit bit-level parallelism. Inefficient storage of bits. Replacing multiplications by additions/subtractions.

30 Scalability of Architectures with K: Disadvantages of VLSI architectures: design for the maximum number of users in the system. If there are fewer users, – turn off functional units to reduce power – reconfigure hardware for higher data rates (FPGA) – Dr. Cavallaro, I don’t know how to handle this question properly – We never designed an architecture/algorithm for a dynamically varying number of users (though we had started on it) – What should be included in future work? – Please give suggestions!

31 Conclusions: Real-time VLSI architecture for multiuser channel estimation. Iterative fixed-point algorithm developed to avoid matrix inversions. Area-time tradeoffs presented: – area-constrained, time-constrained, area-time efficient. VLSI architectures exploit bit-level computations and parallelism to meet real-time requirements.

