RICE UNIVERSITY “Joint” architecture & algorithm designs for baseband signal processing Sridhar Rajagopal and Joseph R. Cavallaro Rice Center for Multimedia.


1 RICE UNIVERSITY “Joint” architecture & algorithm designs for baseband signal processing Sridhar Rajagopal and Joseph R. Cavallaro Rice Center for Multimedia Communications This work has been supported by Nokia, TI, TATP and NSF

2 Single-slide version of my talk
[Diagram: timeline running from Distant Past through Recent Past to Recent and Near Future. Labels: Algorithms, DSP, VLSI, FPGA, IMAGINE; multiuser channel estimation, multiuser detection; task-partitioning, parallelism, pipelining; conventional arithmetic, on-line arithmetic; instruction set extensions, co-processor support, functional unit design and usage.]

3 Contents
- Algorithms for channel estimation and detection
- Conventional and on-line arithmetic designs
- Programmable architecture design using the IMAGINE simulator

4 Estimation - detection algorithms?
- Sophisticated, computationally complex algorithms proposed for 3G - 4G standards
- Typically need complex operations, huge matrix sizes, matrix inversions
- Difficult for hardware implementation and for real-time performance

5 Multiuser channel estimation algorithm
- (symbol missing) = {+1, -1}: training/tracking bits
- (symbol missing) = 8-bit integer (complex): received signal
- N = spreading gain (typically fixed, e.g. 32)
- K = number of users (variable, <= N)
- (symbol missing) = maximum likelihood channel estimate

6 Iterative scheme for channel estimation
- Bit-streaming: suitable for tracking (window length L)
- Method of gradient descent
- Stable convergence behavior
- Simple fixed-point VLSI architecture [ASAP 2000]
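The slide's equations did not survive in this transcript, so the following is only a minimal sketch of a gradient-descent least-squares channel estimate, assuming a simplified synchronous model in which each received vector equals A times the bit vector plus noise. The function name `estimate_channel`, its parameters, and the fixed step size are illustrative assumptions, not the authors' formulation.

```python
def estimate_channel(bits, received, step=0.5, iterations=50):
    """Gradient-descent least-squares channel estimate (illustrative sketch).

    bits:     L training vectors, each holding K values in {+1, -1}
    received: L received vectors, each holding N complex samples
    Returns an N x K matrix A (list of rows) that approximately minimizes
    sum_i ||received[i] - A * bits[i]||^2.
    """
    window, users, gain = len(bits), len(bits[0]), len(received[0])

    # Correlations averaged over the window. Because every bit is +1 or -1,
    # each product below is really just an addition or a subtraction.
    r_bb = [[sum(b[j] * b[k] for b in bits) / window for k in range(users)]
            for j in range(users)]
    r_br = [[sum(r[n] * b[k] for r, b in zip(received, bits)) / window
             for k in range(users)] for n in range(gain)]

    estimate = [[0j] * users for _ in range(gain)]
    for _ in range(iterations):
        for n in range(gain):
            row = estimate[n]
            # Gradient of the squared error for this row: row * R_bb - R_br[n]
            grad = [sum(row[j] * r_bb[j][k] for j in range(users)) - r_br[n][k]
                    for k in range(users)]
            estimate[n] = [row[k] - step * grad[k] for k in range(users)]
    return estimate
```

The iteration avoids an explicit matrix inversion; it converges only when the step size is small relative to the largest eigenvalue of the bit correlation matrix (below 2 divided by that eigenvalue), which is an assumption the caller has to satisfy.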

7 Comparisons
- DSPs unable to exploit bit-level parallelism
- Inefficient storage of bits
- Replacing multiplications by additions/subtractions

8 Multiuser detection innovations
- Developed a simple architecture for asynchronous multiuser detection for CDMA [ +, x ]
- Bit-streaming
  - reduced latency
  - eliminates window edge computations
  - lower memory requirements
- Pipelined stages
  - higher throughput (with more hardware)

9 Block Pipelined Detector
Variable latency [worst case (1st bit): D*latency per bit]. 2 extra edge bit computations per stage.
[Timing diagram: matched filter (MF) and PIC stages processing blocks of bits 2-11 and 12-21 along a TIME axis.]

10 Bit-streaming multiuser detection
Savings in memory by D 2

11 Pipelining the multiuser detector
[Timing diagram: Matched Filter (causal), PIC - Stage 1, PIC - Stage 2, PIC - Stage 3 along a TIME axis.]
Latency = 2*latency per bit (D/2 speedup over block); eliminated edge bit computations. [ISCAS 2001]
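The matched-filter and PIC stages named on these slides can be sketched in software as follows. This is a hedged illustration that assumes a synchronous, single-bit model with a known channel matrix; the slides describe an asynchronous, bit-streaming hardware design, which this sketch does not reproduce. The names `matched_filter`, `pic_detect` and `hard_decision` are hypothetical.

```python
def hard_decision(value):
    """Map a soft value to a bit in {+1, -1}."""
    return 1 if value >= 0 else -1


def matched_filter(channel, received):
    """Correlate the received vector with each user's channel column."""
    users = len(channel[0])
    return [sum((channel[n][k].conjugate() * received[n]).real
                for n in range(len(received)))
            for k in range(users)]


def pic_detect(channel, received, stages=3):
    """Matched filter followed by parallel interference cancellation stages."""
    gain, users = len(channel), len(channel[0])
    # Cross-correlation between the channel columns of every pair of users.
    corr = [[sum((channel[n][j].conjugate() * channel[n][k]).real
                 for n in range(gain)) for k in range(users)]
            for j in range(users)]
    soft = matched_filter(channel, received)
    decisions = [hard_decision(v) for v in soft]
    for _ in range(stages):
        # Every user subtracts the interference implied by the other users'
        # previous decisions; all users update in parallel within a stage.
        decisions = [hard_decision(soft[k] - sum(corr[k][j] * decisions[j]
                                                 for j in range(users) if j != k))
                     for k in range(users)]
    return decisions
```

For example, with `channel = [[3, 1], [3, 1], [3, 1], [3, -1]]` and `received = [2, 2, 2, 4]` (a strong user sending +1 and a weak user sending -1), the matched filter alone decides +1 for both users, while one cancellation stage recovers the weak user's -1.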

12 Contents
- Algorithms for channel estimation and detection
- Conventional and on-line arithmetic designs
- Programmable architecture design using the IMAGINE simulator

13 Matched filter with conventional arithmetic
T ~ log(N) * log(d), where N is the dot product size and d is the precision

14 Conventional MF using CSAs (carry-save adders)
T ~ a + log(d+c), where a and c are constants
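The idea behind the carry-save matched filter can be modeled in a few lines: each 3:2 compression step produces a sum word and a carry word without propagating carries, and only one carry-propagate addition is needed at the very end. The sketch below is a software model with hypothetical names (`carry_save_add`, `csa_dot_product`); it accumulates the terms sequentially rather than in the tree a hardware design would use, so it illustrates the arithmetic identity, not the timing formula.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: reduce three integers to a (sum, carry) pair.

    The pair satisfies sum + carry == a + b + c, and computing it needs
    no carry propagation across bit positions.
    """
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return partial_sum, carry


def csa_dot_product(code, samples):
    """Dot product of a +1/-1 spreading code with integer samples."""
    # Multiplying by +1 or -1 is only a conditional negation.
    terms = [s if c > 0 else -s for c, s in zip(code, samples)]
    acc_sum, acc_carry = 0, 0
    for term in terms:
        acc_sum, acc_carry = carry_save_add(acc_sum, acc_carry, term)
    # The single carry-propagate addition happens here.
    return acc_sum + acc_carry
```

Python integers behave like unbounded two's-complement values under bitwise operators, so the identity also holds for negative samples; a fixed-width hardware implementation would additionally need sign extension.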

15 Key concept in on-line arithmetic
- Conventional detection: high-precision operations (8-32 bits) followed by testing for sign.
- Actual detection depends only on the most significant digits (1-3 bits).
- Use most-significant-digit-first (MSDF) computation to find the sign and avoid computing the successive digits. [Arith-15]
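A minimal sketch of why MSDF digits allow early sign detection, assuming a signed-digit representation whose digits lie in -(r-1)..(r-1) for radix r: all digits after the first nonzero one together weigh less than that digit, so the sign is known as soon as it arrives. The function names are hypothetical, and the sketch covers only the sign decision, not the on-line adders that would generate the digit stream.

```python
def msdf_sign(digits, radix=2):
    """Return (sign, digits_examined) for a signed-digit number.

    digits are given most significant first, each in -(radix-1)..(radix-1).
    The scan stops at the first nonzero digit, because the remaining digits
    cannot outweigh it.
    """
    for count, digit in enumerate(digits, start=1):
        if abs(digit) >= radix:
            raise ValueError("digit out of range for this radix")
        if digit != 0:
            return (1 if digit > 0 else -1), count
    return 0, len(digits)


def signed_digit_value(digits, radix=2):
    """Full-precision value, scaled by radix**len(digits), for comparison."""
    value = 0
    for digit in digits:
        value = value * radix + digit
    return value
```

For instance, `msdf_sign([0, 1, -1, -1])` stops after two digits and reports a positive sign, which agrees with the full value 4 - 2 - 1 = 1.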

16 Comparisons of arithmetic schemes

17 Using on-line arithmetic for detection
[Plot axes: Received Signal Amplitude (Normalized); Time taken for addition (Normalized). Diagram labels: Channel, -1, +1.]

18 Equations
- Probability of error for optimal BPSK detection (equation missing)
- Probability of error for on-line BPSK detection (equation missing)
- r = radix of the number system
- p = number of digits
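The slide's equations are not recoverable from this transcript. As a reference point only, the sketch below evaluates the standard textbook error probability for optimal coherent BPSK over an additive white Gaussian noise channel, Q(sqrt(2*Eb/N0)). The AWGN assumption and the function names are mine, and the on-line expression in terms of r and p is deliberately not reconstructed.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bpsk_error_probability(ebn0_db):
    """Bit error probability of optimal BPSK detection in AWGN.

    ebn0_db is the ratio of bit energy to noise spectral density in dB.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return q_function(math.sqrt(2.0 * ebn0))
```

Any reduced-precision detector can be compared against this curve as the baseline it should approach as the number of digits grows.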

19 Probability of error using on-line

20 On-line MF implementation
T ~ c, where c is a constant

21 Throughput comparisons

22 Area comparisons

23 Implementing higher modulation schemes

24 Conclusions on arithmetic schemes
- CSAs better than straightforward implementation
  - 1.35 - 1.6X speedup for 8-32 bit precision
  - 1.64 - 1.14X less area
- With reduced-precision computations, on-line is still better
  - 1.67 - 2.12X speedup over CSA
  - 0.64 - 12.73X less area over CSA

25 Contents
- Algorithms for channel estimation and detection
- Conventional and on-line arithmetic designs
- Programmable architecture design using the IMAGINE simulator

26 A programmable architecture?
- Flexibility in the algorithm requirements
  - channel dependent computations
  - changing algorithms on-the-fly
  - seamless switching between wireless LAN and wideband CDMA -- RENE.
- Simulator needed to test performance of algorithms
  - extensions/modifications for critical operations

27 Algorithms needed for 3G base-band base-station implementation
- Wireless LAN
  - Equalization
  - FFT
  - Viterbi decoding
- W-CDMA
  - Channel estimation
  - Multiuser detection
  - Viterbi/Turbo decoding
- If you felt that life was too easy
  - Multiple antennas
  - Long spreading codes
  - Space-Time codes

28 The IMAGINE architecture and simulator
- IMAGINE is a media signal processor, built at Stanford.
- Many common workload features
- Good starting point to explore.
- Local expertise - Dr. Scott Rixner ( rixner@rice.edu )

29 IMAGINE architecture
- Great for media processing algorithms
  - 1024 pt FFT in 7.4 µs on a 500 MHz processor with an 8-cluster (48 units) configuration
  - 3.8W of power
- Great for parallel, vector and streaming computations
- Performance of, and extensions for, sequential computation kernels such as Viterbi traceback need to be investigated.

30 Conclusions
- Algorithm steps for designing communication systems
  - Design hardware-efficient versions
  - Fixed-point implementation
  - DSP implementation - bottlenecks
  - Task partitioning, pipelining, parallelism
  - Computer arithmetic ideas -- VLSI
  - Integration into a programmable processor

