
1 Efficient VLSI architectures for baseband signal processing in wireless base-station receivers
Sridhar Rajagopal, Srikrishna Bhashyam, Joseph R. Cavallaro, and Behnaam Aazhang
This work is supported by Nokia, TI, TATP and NSF

2 Introduction
- A real-time VLSI architecture for channel estimation
  - Usually neglected, but has high computational complexity
  - Current DSP solutions do not meet real-time requirements
- Iterative fixed-point algorithm developed
- Area-time tradeoffs presented
  - Area-constrained, time-constrained, area-time efficient

3 Outline
- What is multiuser channel estimation?
- Need for multiuser channel estimation
- Implementation problems
- Algorithm enhancements
- VLSI architectures
  - Area-constrained, time-constrained, area-time efficient
- Conclusions

4 Evolution of mobile communications
- First generation: voice
- Second/current generation: voice + low-rate data (9.6 Kbps)
- Third generation: voice + high-rate data (2 Mbps / 384 Kbps / 128 Kbps) + multimedia

5 Channel estimation
[Figure: User 1 and User 2 transmit to the base station. Labels: direct path, reflected path, noise + MAI.]

6 Need for channel estimation
- To compensate for unknown fading amplitudes and asynchronous delays
- Detector performance depends on the accuracy of the channel estimator
- Multiuser channel estimation
  - Jointly estimates parameters for all users
  - Better performance than single-user estimates

7 Computing channel estimates
- Computed by sending a training sequence of known bits to the receiver
- When the training sequence is absent, detected bits can be used to update the estimates in a decision-feedback mode for tracking
- Importance is usually neglected
- Complexity may exceed that of the detector

8 Baseband signal processing
[Figure: block diagram of the base-station receiver. Labels: multiple users, antenna, channel estimation (training, tracking), detection, decoding, detected bits.]

9 Multiuser Channel Estimation Algorithm
- b = {+1, -1}: training/tracking bits
- r = 8-bit integer (complex): received signal
- N = spreading gain (typically fixed, e.g. 32)
- K = number of users (variable, <= N)
- A = maximum likelihood channel estimate
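
The equations for this slide are not in the transcript. As a hedged sketch only, one common least-squares form that is consistent with the symbols above is shown below. It assumes a linear model with additive Gaussian noise, a bit vector b of length 2K (the size given on the task-decomposition slide), and a window of L bits; the noise term and the hat notation are introduced here for illustration and are not taken from the slides.

```latex
\mathbf{r}_i = \mathbf{A}\,\mathbf{b}_i + \boldsymbol{\eta}_i, \qquad
\mathbf{R}_{bb} = \sum_{i=1}^{L} \mathbf{b}_i \mathbf{b}_i^{T}, \qquad
\mathbf{R}_{br} = \sum_{i=1}^{L} \mathbf{b}_i \mathbf{r}_i^{H}, \qquad
\hat{\mathbf{A}}^{H} = \mathbf{R}_{bb}^{-1}\,\mathbf{R}_{br}
```

Under these assumptions, R_bb is a real 2K x 2K matrix and R_br is a complex 2K x N matrix, so the direct solution requires inverting R_bb once per window.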

10 Implementation complexity
- Matrix inversions (size 32x32) per window
  - Unable to meet real-time requirements on DSPs [Asilomar’99]
- VLSI fixed-point architectures for matrix inversions
  - Difficult to design, finite-precision problems
- Typically, simpler single-user sliding-correlator structures are used

12 Iterative scheme for channel estimation
- Bit-streaming: suitable for tracking
- Method of gradient descent
- Stable convergence behavior
- Simple fixed-point VLSI architecture
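
The gradient-descent idea can be sketched in a few lines of Python. This is a floating-point illustration under stated assumptions, not the fixed-point design from the talk: it assumes the estimate X solves the linear system Rbb·X = Rbr and applies the update X ← X − µ(Rbb·X − Rbr), which needs only multiplications and additions. The function names, the toy matrices, and the step size are hypothetical.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def gradient_step(X, Rbb, Rbr, mu):
    """One descent step: X <- X - mu * (Rbb @ X - Rbr)."""
    RX = matmul(Rbb, X)
    return [[X[i][j] - mu * (RX[i][j] - Rbr[i][j])
             for j in range(len(X[0]))]
            for i in range(len(X))]


# Toy problem (assumed values): a small symmetric positive-definite Rbb
# and a known solution, so the iteration can be checked.
Rbb = [[4.0, 1.0], [1.0, 3.0]]
X_true = [[1.0, -2.0], [0.5, 3.0]]
Rbr = matmul(Rbb, X_true)

X = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):
    X = gradient_step(X, Rbb, Rbr, mu=0.2)

max_err = max(abs(X[i][j] - X_true[i][j])
              for i in range(2) for j in range(2))
```

For a symmetric positive-definite Rbb, this iteration converges when µ is below 2 divided by the largest eigenvalue of Rbb. Because each new bit changes Rbb and Rbr only slightly, the previous estimate is a natural starting point for the next step.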

13 Simulations - Static multipath channel
[Figure: comparison of bit error rates (BER). x-axis: signal-to-noise ratio (SNR), ticks 4 to 12; y-axis: BER, ticks -3 to -1. Curves: MF, ActMF, ML, ActML. Complexity labels: O(K^2 N) and O(K^3 + K^2 N).]
Parameters: SINR = 0 dB, paths = 3, preamble L = 150, spreading N = 31, users K = 15

14 Fading channel with tracking
[Figure: BER versus SNR. x-axis: SNR, ticks 4 to 12; y-axis: BER, ticks -3 to -1. Curves: MF - Static, MF - Tracking, ML - Static, ML - Tracking.]
Doppler = 10 Kmph

16 Area-Time Tradeoffs
- Design for 32 users (K) and a spreading code (N) of length 32
- Target data rate = 128 Kbps (4000 cycles at 500 MHz)
- Area-constrained architecture: pico-cells or fewer users
- Time-constrained architecture: maximum data rates
- Area-time efficient architecture: real-time
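
The cycle budget quoted on this slide can be checked with a short calculation. The sketch assumes 1 Kbps = 1000 bits per second and one processing window per bit; the constant names are illustrative.

```python
CLOCK_HZ = 500_000_000      # 500 MHz clock, from the slide
DATA_RATE_BPS = 128_000     # 128 Kbps target, assuming 1 Kbps = 1000 bps

# Clock cycles available to process each received bit in real time.
cycles_per_bit = CLOCK_HZ / DATA_RATE_BPS
print(cycles_per_bit)       # 3906.25
```

The result, roughly 3900 cycles per bit, matches the figure of about 4000 cycles given on the slide.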

17 Task decomposition: channel estimation
[Figure: data flow over time. A MUX selects pilot bits or detected bits (data) as input; the correlation matrices are updated per bit over a tracking window L; the channel estimate A is iterated and passed to the detector.]
Annotations: b (2K, 1), b0 (2K, 1), r (N, 8), r0 (N, 8); Rbb O(2K^2, 8); Rbr O(2KN, 8); A O(4K^2 N, 8)

18 Architecture design: auto-correlation
- b = {+1, -1}
- Multiplication is an XNOR operation
- Matrix updated using XNOR gates
- Auto-correlation matrix implemented as UP/DOWN counters
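
A small software model illustrates why XNOR gates and up/down counters are enough here. The sketch assumes the encoding 1 for +1 and 0 for -1, and shows only the accumulation of a new bit vector; the function names are hypothetical, and counter width and overflow are not modeled.

```python
def xnor(x, y):
    """XNOR of two single bits (each 0 or 1)."""
    return 1 - (x ^ y)


def update_rbb(counters, bits):
    """Add the outer product b*b^T to the counters, one up/down step each.

    Bits use the encoding 1 -> +1 and 0 -> -1 (an assumption of this sketch).
    """
    for i, bi in enumerate(bits):
        for j, bj in enumerate(bits):
            # XNOR = 1 means the +/-1 product is +1: count up; else count down.
            counters[i][j] += 1 if xnor(bi, bj) else -1
    return counters


bits = [1, 0, 1, 1]                         # represents b = [+1, -1, +1, +1]
signs = [2 * x - 1 for x in bits]           # the same vector in +/-1 form
rbb = update_rbb([[0] * 4 for _ in range(4)], bits)

# Reference computed with ordinary multiplication for comparison.
reference = [[si * sj for sj in signs] for si in signs]
```

Here `rbb` equals `reference`, so the XNOR-driven counters reproduce the outer product of the ±1 vector without any multiplier.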

19 Architecture design: cross-correlation
- b = {+1, -1}, r = 8-bit integer vector (complex)
- Multiplications reduce to additions/subtractions
- Matrix (complex) can be updated with 8-bit adders
- Cross-correlation matrix stored in RAM
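
The reduction from multiplication to addition and subtraction can be sketched as follows. The example assumes the encoding 1 for +1 and 0 for -1, uses Python's built-in complex type in place of 8-bit fixed-point values, and does not model word length or saturation; `update_rbr` is a hypothetical name.

```python
def update_rbr(rbr, bits, r):
    """Accumulate b * r into rbr using only add/subtract.

    Each bit selects whether the received sample is added (+1)
    or subtracted (-1); no multiplier is needed.
    """
    for i, bit in enumerate(bits):
        for j, sample in enumerate(r):
            if bit:
                rbr[i][j] += sample
            else:
                rbr[i][j] -= sample
    return rbr


bits = [1, 0]                                          # b = [+1, -1]
r = [complex(3, 4), complex(-2, -1), complex(0, 7)]    # toy received samples
rbr = update_rbr([[0j] * len(r) for _ in bits], bits, r)
```

Row 0 of `rbr` holds the samples unchanged and row 1 holds their negatives, which is what multiplying by +1 and -1 would produce.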

20 Architecture design: channel estimate
- A = 8-bit integer matrix (complex)
- µ << 1: truncated multiplication [Schulte’93]
- Matrix-matrix (real-complex) multiplication of integers
- Forms the bottleneck (8-bit multipliers)
- Concentrate on multiplication for area-time tradeoffs!
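
The architecture diagrams in this deck include a shift (>>) block, which suggests that scaling by a small step size can be done without a multiplier. The sketch below assumes µ is a power of two, µ = 2^-shift, so the scaling becomes an arithmetic right shift on integers. It is not a model of the truncated multiplier cited as [Schulte’93], and `scaled_update` is a hypothetical name.

```python
def scaled_update(a, gradient, shift):
    """Compute a - mu * gradient with mu = 2**-shift, using a right shift.

    Python's >> on negative integers is an arithmetic shift that rounds
    toward negative infinity, like a sign-extending hardware shifter.
    """
    return a - (gradient >> shift)


a_new = scaled_update(100, 64, 4)     # 64 >> 4 == 4, so the result is 96
a_neg = scaled_update(100, -65, 4)    # -65 >> 4 == -5, so the result is 105
```

The second call shows the rounding behavior to keep in mind: a negative gradient is rounded away from zero by the shift, which introduces a small bias that a fixed-point design would need to account for.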

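21 Area-Constrained Architecture
[Figure: datapath. Blocks: MUX, DEMUX, up/down (U/D) counter with enable (EN) for Rbb, add/subtract unit for Rbr, MAC, subtractor, shifter (>>), load/store of A and Anew. Inputs: b, b0, r, r0; indices i, j. Width labels: 16, 8, 1.]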

22 Area-constrained Architecture: Hardware Requirements

23 Time-constrained Architecture
[Figure: parallel datapath. Blocks: MUXes selecting b or b0 and r or r0, outer products b*bT and b0*b0T, Rbb, Rbr, multiplier, subtractors, shifter (>>), channel estimate A. Width labels: 2K*1, K(2K-1)*1, N*8, 2K^2*8, 2KN*8, 2KN*16.]

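24 Auto-correlation Update in Parallel
[Figure: the elements a, b, c, d of b (2K) feed an array of XNORs that forms the pairwise products a·b, a·c, a·d, b·c, b·d, c·d of bbT (K*{2K-1}*1). Each bbT(i,j) drives the U/D# input of a counter in the array of counters holding Rbb (2K^2*8); counters are shown for Rbb(i,j) and Rbb(i,i).]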

25 Cross-Correlation Update in Parallel
[Figure: each bit b(i) of b (2K*1) drives the Add/Sub# control of an adder that updates Rbr(i,j) with r(j) from r (N*8); Rbr is 2KN*8. Width labels: 8, 1.]

26 Time-constrained Architecture: Hardware Requirements

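27 Area-Time Efficient Architecture
[Figure: datapath. Blocks: MUXes selecting b or b0 and r or r0, b*bT and b0*b0T, counters for Rbb, adder for Rbr, multiplier, subtractor, shifter (>>), DEMUX, load/store of A and Anew. Width labels: 2K*1, 2K*8, N*8, 1*16, 1*8, 1*1.]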

28 Area-Time Efficient Architecture: Hardware Requirements

30 Comparisons
- DSPs unable to exploit bit-level parallelism
- Inefficient storage of bits
- Replacing multiplications by additions/subtractions

31 Conclusions
- Real-time VLSI architecture for multiuser channel estimation
- Iterative fixed-point algorithm developed to avoid matrix inversions
- Area-time tradeoffs presented
  - Area-constrained, time-constrained, area-time efficient
- VLSI architectures exploit bit-level computations and parallelism to meet real-time requirements

