Implementing Multiuser Channel Estimation and Detection for W-CDMA Sridhar Rajagopal, Srikrishna Bhashyam, Joseph R. Cavallaro and Behnaam Aazhang Rice.


1 Implementing Multiuser Channel Estimation and Detection for W-CDMA
Sridhar Rajagopal, Srikrishna Bhashyam, Joseph R. Cavallaro and Behnaam Aazhang
Rice University
{sridhar,skrishna,cavallar,aaz}@rice.edu
This work is supported by Nokia, Texas Instruments, the Texas Advanced Technology Program and NSF.

2 Organization
Joint Estimation & Detection
An Implementation-Friendly Scheme
Simulations
Architectural Features
–Task Partitioning
–Area-Time Tradeoffs
Conclusions
Future Work

3 Base-Station with MUD
[Figure: base-station receiver block diagram. Blocks: antenna (multiple users), demodulator, pilot/data MUX, channel estimation, multiuser detection, decoder, delay, decision feedback. Output: detected bits. Signals labeled d and b.]

4 Joint Estimation & Detection
Jointly estimate the channel response and detect all users' bits.
Shown to have better performance as well as reduced computational complexity.
Maximum Likelihood Based Channel Estimation
–[C. Sengupta et al.: PIMRC’1998, WCNC’1999]
Differencing Multistage Detection based on Parallel Interference Cancellation
–[G. Xu et al.: SPIE’1999]

5 Computations Involved
Model
Compute Correlation Matrices
[Figure labels: r_i, received bits of spreading length N for K users; b_i and b_{i-1}, bits of K async. users aligned at times i and i-1; axes: time, delay]

6 Multishot Detection
Solve for the channel estimate, A_i

7 Differencing Multistage Detection
Stage 0 [Matched Filter Detector]
Stage 1 [to build differencing vector]
Successive Stages
S = diag(A^H A)
y - soft decision
d - detected bits (hard decision)
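The stage equations themselves are not reproduced in this transcript. As a minimal sketch, assuming the usual parallel interference cancellation recursion with H = A^H A and S = diag(H), and real-valued quantities for brevity, the differencing form can be written next to the plain form as below. The function names (`cancel`, `pic_detect`, `differencing_detect`) are hypothetical and not taken from the slides.

```python
def sign(v):
    """Hard decision: map a soft value to +1 or -1 (ties go to +1)."""
    return 1 if v >= 0 else -1


def cancel(y, H, vec, cols=None):
    """Return y - (H - S) @ vec with S = diag(H), using only the columns in cols."""
    n = len(y)
    cols = range(n) if cols is None else cols
    return [y[i] - sum(H[i][j] * vec[j] for j in cols if j != i) for i in range(n)]


def pic_detect(y, H, stages):
    """Plain multistage PIC: every stage cancels the full previous decision vector."""
    d = [sign(v) for v in y]                      # stage 0: matched filter decisions
    for _ in range(stages):
        d = [sign(v) for v in cancel(y, H, d)]
    return d


def differencing_detect(y, H, stages):
    """Differencing PIC: after stage 1, only bits that changed are cancelled."""
    d_prev = [sign(v) for v in y]                 # stage 0
    if stages == 0:
        return d_prev
    soft = cancel(y, H, d_prev)                   # stage 1 builds the first soft vector
    d = [sign(v) for v in soft]
    for _ in range(stages - 1):
        x = [a - b for a, b in zip(d, d_prev)]    # differencing vector: entries 0 or +/-2
        changed = [j for j, v in enumerate(x) if v != 0]
        soft = cancel(soft, H, x, changed)        # update the previous soft decision
        d_prev, d = d, [sign(v) for v in soft]
    return d
```

Under these assumptions both routines produce the same decisions; the differencing form does less work per stage as the decisions settle, because most entries of the differencing vector become zero.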

8 Structure of A^H A
Not difficult to compute A^H A
Block Bi-Diagonal Matrix: Use Structure

9 Drawbacks
Matrix Inversion/Decomposition Needed
Result not available till end of computation
–Delay before Detection
Difficult for Tracking
Higher Precision Needed
–Floating Point Units
Larger Memory Requirements
–Storage of elements to compute inverse
–Float = 32 bits / Input accuracy = 12-14 bits
SLOW! Difficult to meet Real-Time
–[S. Rajagopal et al.: TI DSPFest’1999]

10 Proposed Base-Station
No Multiuser Detection
TI's Wireless Basestation (http://www.ti.com/sc/docs/psheets/diagrams/basestat.htm)

11 New Scheme
Iterative Method to find the Channel Estimates
–[S. Bhashyam et al.: WCNC’2000 (submitted)]
Can be easily adapted to Tracking for Fading Channels
Fixed Point Implementation
Estimates ready for detection Immediately
Simpler Hardware and Software
–Computation Savings only Per Bit

12 Iterative Scheme
Tracking
–Slow Fading: Large Window L
–Fast Fading: Smaller Window L
Method of Steepest Descent
Stable convergence behavior
μ fixed: Bit-by-Bit update
Matches Closely to the Scheme with Inversions
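The update equation is not reproduced in this transcript. A minimal sketch of a bit-by-bit steepest-descent estimate, assuming the update A <- A - μ(R_bb A - R_br) with correlations averaged over a sliding window of L bits, real-valued and noise-free for brevity, might look like this. The names `descent_step`, `track_channel` and `simulate`, the matrix orientation (A stored as 2K x N) and all parameter values are illustrative assumptions.

```python
import random
from collections import deque


def descent_step(A, R_bb, R_br, mu):
    """One steepest-descent step with fixed step size mu: A <- A - mu * (R_bb A - R_br)."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j] - mu * (sum(R_bb[i][k] * A[k][j] for k in range(rows)) - R_br[i][j])
             for j in range(cols)] for i in range(rows)]


def track_channel(b_seq, r_seq, window, mu):
    """Update the correlations and take one descent step per received bit."""
    dim, n = len(b_seq[0]), len(r_seq[0])
    S_bb = [[0.0] * dim for _ in range(dim)]   # windowed sum of b b^T
    S_br = [[0.0] * n for _ in range(dim)]     # windowed sum of b r^T
    A = [[0.0] * n for _ in range(dim)]
    held = deque()
    for b, r in zip(b_seq, r_seq):
        updates = [(b, r, 1.0)]                # newest bit enters the window
        held.append((b, r))
        if len(held) > window:                 # oldest bit leaves the window
            updates.append(held.popleft() + (-1.0,))
        for u, v, s in updates:
            for i in range(dim):
                for k in range(dim):
                    S_bb[i][k] += s * u[i] * u[k]
                for j in range(n):
                    S_br[i][j] += s * u[i] * v[j]
        count = len(held)
        R_bb = [[x / count for x in row] for row in S_bb]
        R_br = [[x / count for x in row] for row in S_br]
        A = descent_step(A, R_bb, R_br, mu)
    return A


def simulate(K=2, N=4, num_bits=600, seed=1):
    """Hypothetical noise-free test data: r_i = A_true^T [b_{i-1}; b_i]."""
    rng = random.Random(seed)
    A_true = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(2 * K)]
    bits = [[rng.choice((-1, 1)) for _ in range(K)] for _ in range(num_bits + 1)]
    b_seq = [bits[i - 1] + bits[i] for i in range(1, num_bits + 1)]
    r_seq = [[sum(A_true[k][n] * b[k] for k in range(2 * K)) for n in range(N)]
             for b in b_seq]
    return A_true, b_seq, r_seq
```

A shorter window forgets old bits sooner, which is the trade-off the slide describes for fast versus slow fading. In this sketch μ must stay below 2 divided by the largest eigenvalue of R_bb for the iteration to remain stable.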

13 Simulations - AWGN Channel
Detection Window = 12
SINR = 0
Paths = 3
Preamble = 150
10000 bits/user
MF – Matched Filter
ML – Maximum Likelihood
ACT – using inversion

14 Fading Channel with Tracking
Doppler = 10 Hz, 1000 Bits, 15 users, 3 Paths

15 DSP Implementation
C6201 Texas Instruments
–Fixed Point Processor
–200 MHz
32-bit VLIW Architecture
8 Functional Units
–2 Multipliers
–4 Adders
–2 Load/Store
TI C Compiler

16 Simulation Work in Progress!
Why better?
–Fixed Point Implementation: Faster on DSPs
–Higher Clock Speeds / Faster Multiplications
–More SIMD Parallelism due to smaller wordlength
–Software Code Simpler to write
–Smaller Program Size
Problems
–Input Bit Precision Analysis
–Overflows

17 Task-Partitioning the Algorithm
[Figure: base-station receiver block diagram, as on slide 3. Blocks: antenna (multiple users), demodulator, pilot/data MUX, channel estimation, multiuser detection, decoder, delay, decision feedback. Output: detected bits. Signals labeled d and b.]

18 Task Decomposition
[Figure: task decomposition along a time axis, citing S. Das et al.: Asilomar’99. Stages: Correlation Matrices (Per Bit), Iterate, Matrix Products, Multistage Detection (Per Window), grouped into Blocks I-IV and into Task A (Channel Estimation) and Task B (Multistage Detection).]
Operation counts shown in the figure:
–R_br[R], R_br[I]: O(KN) each
–R_bb: O(K^2)
–A[R], A[I]: O(K^2 N) each
–A_0^H A_0, A_0^H A_1, A_1^H A_1: O(K^2 N) each
–A^H r: O(KND)
–Multistage Detection: O(DK^2 M)
Inputs: pilot and data bits b (via MUX), detected bits d (via MUX)

19 Channel Estimation Architecture
Detection Architecture
–One version already ready
–[G. Xu: Master’s Thesis, 1999]
Advantages over DSP Implementation:
–Optimal Memory Utilization
–Custom Blocks for exploiting available pipelining and parallelism
–Parts could be mapped to FPGA / Reconfigurable logic
–Shows theoretical bounds for maximum achievable Data Rates
–Shows how tasks could be split among different processors

20 Block Diagram
[Figure: channel estimation block diagram with REAL and IMAG datapaths over a window. Each block shows the no. of “operations” in it.]
Shared blocks: b0 b0’ (2K^2), bb’ (2K^2), MUX (2K^2), Inverter (2K^2), R_bb (2K^2)
REAL path: MUX (2K), MUX (N), Inverter (2K), R_br[R] (KN), Multiplier (2K^2 N), A_tmp[R] >> (4K^2), A[R] (KN)
IMAG path: MUX (2K), MUX (N), Inverter (2K), R_br[I] (KN), Multiplier (2K^2 N), A_tmp >> (4K^2), A[I] (KN)
Inputs: b, b0 (2K), r[R], r[I], r0 (N)

21 Channel Estimation
[Figure: REAL datapath of the channel estimation block diagram over a window. Each block shows the no. of “operations” in it.]
Blocks: b0 b0’ (2K^2), bb’ (2K^2), MUX (2K), MUX (N), MUX (2K^2), Inverter (2K^2), Inverter (2K), R_bb (2K^2), R_br[R] (KN), Multiplier (2K^2 N), A_tmp[R] >> (4K^2), A[R] (KN)
Inputs: b, b0 (2K), r[R], r0 (N)

22 Auto-correlation Structure
[Figure blocks: b0 b0’ (2K^2), bb’ (2K^2), MUX (2K^2), Inverter (2K^2), R_bb (2K^2)]
b, b0 are 1-bit
Subtraction by using inverter
R_bb using a Counter
Fully Parallel: 2K^2 elements, O(1) Time
Pipelined [with LOAD]: 2K elements, O(K) Time
Serial [with LOAD]: 1 element, O(2K^2) Time
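Because the bits are 1-bit quantities, each product in b b' reduces to an equality test, and each entry of R_bb can be kept in an up/down counter. The sketch below assumes a sliding window in which the newest bit vector b is added and the oldest one b0 is removed, with bits stored as 0/1 (1 for +1, 0 for -1). The function names are hypothetical, and this is a software illustration rather than the hardware design on the slide.

```python
def xnor_pm1(a, b):
    """Product of two +/-1 symbols stored as bits: +1 if the bits match, else -1."""
    return 1 if a == b else -1


def slide_rbb(R_bb, b_new, b_old):
    """Slide the window by one bit time: add b_new b_new', remove b_old b_old'."""
    n = len(b_new)
    for i in range(n):
        for j in range(n):
            R_bb[i][j] += xnor_pm1(b_new[i], b_new[j])   # counter counts up or down
            R_bb[i][j] -= xnor_pm1(b_old[i], b_old[j])   # subtraction via inverted term


def direct_rbb(window):
    """Reference: recompute R_bb from scratch over every bit vector in the window."""
    n = len(window[0])
    R = [[0] * n for _ in range(n)]
    for b in window:
        for i in range(n):
            for j in range(n):
                R[i][j] += xnor_pm1(b[i], b[j])
    return R
```

The incremental update touches each of the (2K)^2 entries once per bit, independent of the window length, which is what makes the counter-based structure attractive.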

23 Cross-Correlation Structure
[Figure blocks: MUX (2K), MUX (N), R_br[R] (KN), Inverter (2K)]
r is 8-bit, b is 1-bit
R_br using 8-bit Adders
Based on sign of b
Fully Parallel: KN elements, O(1) Time
Pipelined: N elements, O(K) Time
Serial: 1 element, O(KN) Time
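Since b is a single bit, multiplying r by b needs no multiplier: the sign of b selects between adding and subtracting r. A minimal sketch of that idea for one datapath (real part only), assuming a sliding window and bits stored as 0/1 (1 for +1, 0 for -1), follows. The function names are hypothetical, and Python integers stand in for the 8-bit adders, so word-length overflow is not modeled.

```python
def slide_rbr(R_br, b_new, r_new, b_old, r_old):
    """Slide the window by one bit time: add b_new r_new', remove b_old r_old'."""
    for k in range(len(b_new)):
        for n in range(len(r_new)):
            # The sign of b picks add or subtract; no multiplication is needed.
            R_br[k][n] += r_new[n] if b_new[k] else -r_new[n]
            R_br[k][n] -= r_old[n] if b_old[k] else -r_old[n]


def direct_rbr(b_window, r_window):
    """Reference: recompute R_br from scratch over the whole window."""
    rows, cols = len(b_window[0]), len(r_window[0])
    R = [[0] * cols for _ in range(rows)]
    for b, r in zip(b_window, r_window):
        for k in range(rows):
            for n in range(cols):
                R[k][n] += r[n] if b[k] else -r[n]
    return R
```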

24 Iterative Update Structure
[Figure blocks: R_bb (2K^2), R_br[R] (KN), Multiplier (2K^2 N), A_tmp[R] >> (4K^2), A[R] (KN); REAL datapath]
8-bit Multipliers
16-bit Adders for Multiplier
8-bit Adders for A
Parallel: KN elements, O(K) Time
Pipelined: N elements, O(K^2) Time
Serial: 1 element, O(K^2 N) Time
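The ">>" on the A_tmp block suggests that the step size is applied as a right shift. The sketch below assumes exactly that reading: an integer update A <- A - ((R_bb A - R_br) >> shift), so that μ = 2^(-shift). The function name `fixed_point_step` and the example values are hypothetical, and Python's unbounded integers do not model the 8-bit and 16-bit word lengths listed on the slide.

```python
def fixed_point_step(A, R_bb, R_br, shift):
    """Integer steepest-descent step with the step size applied as a right shift."""
    rows, cols = len(A), len(A[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Multiply-accumulate R_bb A, then subtract R_br to form A_tmp.
            a_tmp = sum(R_bb[i][k] * A[k][j] for k in range(rows)) - R_br[i][j]
            # Python's >> is an arithmetic shift that floors toward minus infinity.
            row.append(A[i][j] - (a_tmp >> shift))
        out.append(row)
    return out


# Small illustration: the exact solution of R_bb A = R_br here is [[2], [-1]].
R_bb = [[4, 0], [0, 4]]
R_br = [[8], [-4]]
A = [[0], [0]]
for _ in range(3):
    A = fixed_point_step(A, R_bb, R_br, shift=3)
```

In this example the first entry reaches 2, while the second stays at 0 because its gradient of 4 is truncated to zero by the shift. This is one way the input precision concerns listed earlier in the deck can show up in a fixed-point update.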

25 Elements in each block
Example: N = 32, L = 100, K = 32
Fully Parallel Solution: 4K Multipliers, 12K Adders: O(32) Time
Pipelined Solution: 100 Multipliers, 300 Adders: O(1K) Time

26 Conclusions
Iterative Scheme for Joint Estimation & Detection
No loss in algorithm performance
Suitable for Hardware Implementation
–On DSPs, FPGAs and ASICs
Supports Tracking for Fading Channels
Fixed Point Implementation
Feasible ASIC architecture
–To exploit available pipelining and parallelism
Multiuser Channel Estimation and Detection algorithms POSSIBLE to IMPLEMENT for W-CDMA.

27 Future Work
MS
–Extend Architecture to Long Codes
–Task Partition the algorithm on the Sundance Multi-DSP/FPGA board to achieve real-time
Post-MS
–Downlink Architectures to Min. Power Consumption/Area
–Implementing Coding/Decoding Blocks and integrate RENE’

28 EXTRA SLIDES

29 Data Rates Achieved Assuming Channel Estimation Real-Time

30 Fading Channel SNR = 10 dB, Doppler = 10 Hz, 1000 Bits

