TigerSHARC CLU Closer look at the XCORRS M. Smith, University of Calgary, Canada


1 TigerSHARC CLU Closer look at the XCORRS M. Smith, University of Calgary, Canada smithmr@ucalgary.ca

2

3 The practice
Suppose we have the vector of in-phase and out-of-phase data gathered over an antenna, from a satellite for example. Gain issues make it x16:
-16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc.
Question: if the original data from the satellite had this form
-1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j,
how is the satellite data delayed?
FOR THIS EXAMPLE: 0, 3, 6, 9, 12, etc.
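The delay question on this slide can be checked numerically. The following is a minimal sketch, assuming a circular correlation over 12 samples and multiplication by the complex conjugate of the reference; the function name `circular_correlation` is illustrative and not from the lecture.

```python
def circular_correlation(received, reference):
    """Correlate `received` against `reference` at every circular shift."""
    n = len(reference)
    return [
        sum(received[(i + shift) % n] * reference[i].conjugate() for i in range(n))
        for shift in range(n)
    ]

# Original satellite pattern (period 3) and the x16 version seen at the antenna.
reference = [-1 - 1j, 1 + 1j, 1 + 1j] * 4
received = [16 * v for v in reference]

corr = circular_correlation(received, reference)
peak = max(abs(c) for c in corr)
peak_shifts = [s for s, c in enumerate(corr) if abs(c) == peak]
print(peak_shifts)
```

Because the pattern repeats every three samples, the peak appears at every third shift, which is the ambiguity the slide points out.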

4 Tackle the issue with FIR
First: modify the correlation function to handle complex values. Ignore that issue for the moment.
Imagine 1024 data points + 1024 PRN values. Need to do 1024 FIRs, each of 1024 taps.
We know how to optimize to do 2 taps every cycle (one in X and one in Y).
Cycle time is 1024 * 512 cycles = about 1 ms at 500 MHz.
XCORRS can do 8 * 16 taps each cycle in each compute block: 128 times faster.
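The timing estimate can be reproduced with a few lines of arithmetic. This is a sketch that takes the slide's figures at face value (2 taps per cycle for the optimized FIR, 8 * 16 taps per cycle per compute block for XCORRS, two compute blocks); the variable names are illustrative.

```python
points = 1024                 # data points, and also PRN values
clock_hz = 500e6              # 500 MHz core clock, from the slide

# Optimized FIR: 1024 FIRs of 1024 taps at 2 taps per cycle.
fir_taps_per_cycle = 2
fir_cycles = points * points // fir_taps_per_cycle
fir_seconds = fir_cycles / clock_hz

# XCORRS: 8 * 16 taps per cycle in each of the two compute blocks.
xcorrs_taps_per_cycle = 8 * 16 * 2
speedup = xcorrs_taps_per_cycle // fir_taps_per_cycle

print(fir_cycles, fir_seconds, speedup)
```

With these numbers the FIR approach needs 524288 cycles, just over 1 ms, and the ratio of taps per cycle works out to 128.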

5 Where does the CLU fit in?

6 XCORRS definition

7 THEORY
Mathematical definition
Uses registers TR, D, C, and something called CUT

8 Satellite data
Quad fetch brings in 8 complex values, 8 bits each.
Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, ……….

9 PRN code: 2-bit complex number
Seems strange to have two dummy bits, but it actually makes sense.
PRN: -1 - j, 1 + j, 1 + j, -1 - j, 1 + j, 1 + j, ……….
+1 and -1 are associated with the PSK (more next lecture).
Problem: BINARY means 1 and 0, so how do we represent +1 and -1?

10 PRN

11 The value 0x3 goes in as C15 and C16
Binary 0011: C15 = -1 - j, C16 = +1 + j
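The 0x3 example suggests one way the 2-bit codes could be unpacked. The sketch below assumes that a set bit means -1 and a clear bit means +1, that the lower-indexed PRN value sits in the lower bit pair, and that bit 0 of each pair is the real part; the last assumption cannot be checked from this example, and `decode_prn` is a hypothetical helper, not a TigerSHARC routine.

```python
def decode_prn(word, count):
    """Unpack `count` 2-bit PRN codes from `word`, lowest bit pair first."""
    values = []
    for k in range(count):
        code = (word >> (2 * k)) & 0b11
        real = -1 if code & 0b01 else 1   # assumed: bit 0 of the pair is the real part
        imag = -1 if code & 0b10 else 1   # assumed: bit 1 of the pair is the imaginary part
        values.append(complex(real, imag))
    return values

# 0x3 = binary 0011: the low pair 11 is taken as C15, the next pair 00 as C16.
c15, c16 = decode_prn(0x3, 2)
print(c15, c16)
```

Under these assumptions the decoded pair agrees with the slide: C15 = -1 - j and C16 = +1 + j.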

12 Loading the THR registers

13 Standard XCORRS instruction
Lower 46 bits of THR1:0
R7:4
TR0, TR1, TR2, ……. TR15

14 TR15:0 = XCORRS(R7:4, THR3:0)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
………..
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
64 taps each cycle, on both X and Y compute blocks if set up properly: 128 taps each cycle. These are "complex taps", compared to 2 real taps / cycle after Lab 3.
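The tap list above follows a simple index rule: in TR[k], data value D[i] is paired with PRN value C[i + 15 - k]. The following is a software sketch of that rule only, read off the slide; it does not model the real instruction's number formats, saturation, or any conjugation, and `xcorrs_model` is an illustrative name.

```python
def xcorrs_model(tr, d, c):
    """Accumulate 16 correlation sums: TR[k] += sum of D[i] * C[i + 15 - k]."""
    assert len(tr) == 16 and len(d) == 8 and len(c) >= 23
    for k in range(16):
        for i in range(8):
            tr[k] += d[i] * c[i + 15 - k]
    return tr

# Period-3 test patterns from the slides: D0..D7 and C0..C22.
data_pattern = [-1 + 0j, 1 + 0j, 1 + 0j]
prn_pattern = [-1 - 1j, 1 + 1j, 1 + 1j]
d = [data_pattern[i % 3] for i in range(8)]
c = [prn_pattern[j % 3] for j in range(23)]

tr = xcorrs_model([0j] * 16, d, c)
peaks = [k for k in range(16) if tr[k].real == 8]
print(peaks)
```

With these inputs the largest real component (8, one per tap) shows up in every third TR register, consistent with a pattern that repeats every three samples.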

15 TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
………..
TR14 += D7 * C8 + D6 * C7 (2 taps)
TR15 += D7 * C7 (1 tap)

16 TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (7 taps)
………..
TR7 += D7 * C15 (1 tap)
TR8 += 0 (0 taps)
………..
TR15 += 0 (0 taps)

17 TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)
TR0 += 0 (0 taps)
TR1 += D0 * C14 (1 tap)
………..
TR7 += D6 * C14 + D5 * C13 + … (7 taps)
TR8 += D7 * C14 + D6 * C13 + … (8 taps)
………..
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
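The three CUT examples (-7, -15, +15) are consistent with one rule: a negative CUT of -n drops every product whose PRN index is below n, and a positive CUT of +n drops every product whose PRN index is n or above. The sketch below counts surviving taps under that rule. The rule is inferred from the tap lists on these slides, not taken from the processor documentation, and `taps_kept` is an illustrative name.

```python
def taps_kept(cut):
    """Count surviving taps in TR0..TR15 under the CUT rule inferred from the slides."""
    counts = []
    for k in range(16):
        kept = 0
        for i in range(8):
            j = i + 15 - k                 # PRN index paired with D[i] in TR[k]
            if cut < 0 and j < -cut:
                continue                   # negative CUT: drop PRN indices below |cut|
            if cut > 0 and j >= cut:
                continue                   # positive CUT: drop PRN indices at or above cut
            kept += 1
        counts.append(kept)
    return counts

for cut in (0, -7, -15, 15):
    print(cut, taps_kept(cut))
```

The printed counts can be compared line by line with the "taps" annotations on the three CUT slides.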

18

19 TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (7 taps)
………..
TR7 += D7 * C15 (1 tap)
TR8 += 0 (0 taps)
………..
TR15 += 0 (0 taps)

20

21 TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
………..
TR14 += D7 * C8 + D6 * C7 (2 taps)
TR15 += D7 * C7 (1 tap)

22

23 TR15:0 = XCORRS(R7:4, THR3:0)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
………..
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
64 taps each cycle, on both X and Y compute blocks if set up properly: 128 taps each cycle. These are "complex taps", compared to 2 real taps / cycle after Lab 3.

24

25 Problem at this point: THR3:2 is empty.
Need to bring in more PRN values.

26 TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)
TR0 += 0 (0 taps)
TR1 += D0 * C14 (1 tap)
………..
TR7 += D6 * C14 + D5 * C13 + … (7 taps)
TR8 += D7 * C14 + D6 * C13 + … (8 taps)
………..
TR15 += D7 * C7 + D6 * C6 + … (8 taps)

27

28 Final Result
Maximum correlation occurs every 3 shifts, which is what we expect.
Is it the correct result?

29 Correlation: result expected
In step: -1 + 0j, 1 + 0j, 1 + 0j, … (16 times) with -1 - j, 1 + j, 1 + j, … (16 times)
(-1 * -1) + (1 * 1) + (1 * 1) + … = 48 = 0x30 (real component)
Out of step: -1 + 0j, 1 + 0j, 1 + 0j, … (16 times) with 1 + j, 1 + j, -1 - j, … (16 times)
(-1 * 1) + (1 * 1) + (1 * -1) + … = -16 = -0x10 = 0xFFF0
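Both expected values can be confirmed with a short calculation. This is a sketch using a plain product sum without conjugation; since the data here has a zero imaginary part, the real component is the same either way. The 16-bit mask used to show 0xFFF0 is an assumption about how the result is displayed.

```python
data = [-1 + 0j, 1 + 0j, 1 + 0j] * 16
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j] * 16
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j] * 16

def real_correlation(d, c):
    """Real component of the sum of element-wise products."""
    return int(sum(x * y for x, y in zip(d, c)).real)

in_step = real_correlation(data, prn_in_step)
out_of_step = real_correlation(data, prn_out_of_step)

# Show the values as 16-bit two's-complement hex, as on the slide.
print(hex(in_step & 0xFFFF), hex(out_of_step & 0xFFFF))
```

Each group of three products contributes +3 when in step and -1 when out of step, so 16 groups give 48 and -16.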

30 Final Result
1) Now have correlation values for 16 shifts in TR registers: store to external memory. Repeat for all other necessary shifts and find the maximum.
2) Now make parallel in SISD mode
3) Now make parallel in SIMD mode

31 Take home Quiz 4
Old requirement: Do Lab 4 with FFT and XCORRS
Write tests and demonstrate XCORRS used for correlation
a) Not parallel instruction format, but in a loop
b) Now do in optimized SISD mode
c) Now do in optimized SIMD mode

