Understanding the TigerSHARC ALU pipeline: Determining the speed of one stage of IIR filter

1 Understanding the TigerSHARC ALU pipeline: Determining the speed of one stage of IIR filter

2 Speed IIR -- stage 1, M. Smith, ECE, University of Calgary, Canada, 2 / 30, 6/18/2015
Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines
If these pipelines stall, then the processor speed goes down
Need to understand how the ALU pipeline works
- Learn to use the pipeline viewer
May be a different answer for floating point and integer operations

3 Register File and COMPUTE Units

4 Simple Example IIR -- Biquad
For (Stages = 0 to 3) Do
- S0 = Xin * H5 + S2 * H3 + S1 * H4
- Yout = S0 * H0 + S1 * H1 + S2 * H2
- S2 = S1
- S1 = S0
(Diagram: delay line S0, S1, S2)
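The four update equations on this slide can be sketched in C++ as a reference to test the assembly version against. This is a minimal sketch, not the author's code: the names BiquadState and biquadStage are hypothetical, the coefficient array h simply follows the slide's H0..H5 numbering, and the slide does not say how the four stages are chained, so only one pass through the loop body is shown.

```cpp
#include <array>
#include <cassert>

// Delay-line state carried between calls: S1 and S2 from the slide.
// S0 is only a temporary inside one pass. (Hypothetical type name.)
struct BiquadState {
    float s1 = 0.0f;
    float s2 = 0.0f;
};

// One pass through the slide's loop body.
// h[0]..h[5] correspond to H0..H5 in the slide's numbering.
float biquadStage(float xin, const std::array<float, 6>& h, BiquadState& st) {
    // S0 = Xin * H5 + S2 * H3 + S1 * H4
    float s0 = xin * h[5] + st.s2 * h[3] + st.s1 * h[4];
    // Yout = S0 * H0 + S1 * H1 + S2 * H2
    float yout = s0 * h[0] + st.s1 * h[1] + st.s2 * h[2];
    // Shift the delay line: S2 = S1, then S1 = S0
    st.s2 = st.s1;
    st.s1 = s0;
    return yout;
}
```

Calling biquadStage repeatedly with the same BiquadState object carries S1 and S2 from one sample to the next.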

5 Set up the tests. We want to make sure the answer stays correct as the code changes

6 Step 1 -- Stub plus return value
Build an assembly language stub for float iirASM(void);
Make it return the floating point value 40.5 to show that we can return a float
J8 is an INTEGER register, so how can we return 40.5?
ANSWER: WE DON'T
We return the “bit pattern” for 40.5, which is an “INTEGER”
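To see what "returning the bit pattern" means, the sketch below reinterprets a float as a 32-bit integer on the host. It assumes the host uses 32-bit IEEE-754 single precision floats; the helper names floatBits and bitsToFloat are hypothetical and are not part of any TigerSHARC tool.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the 32 bits that make up a float, as an unsigned integer.
// memcpy copies the bytes without converting the value.
std::uint32_t floatBits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Inverse operation: rebuild the float from its bit pattern.
float bitsToFloat(std::uint32_t bits) {
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under the IEEE-754 assumption, floatBits(40.5f) is 0x42220000: sign 0, biased exponent 132 (2^5), mantissa 1.265625. That integer is what the stub has to place in the return register.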

7 Code does not work when passing back floats with J8 register

8 Code does work when using XR8 register – NOTE NOT XFR8

9 Step 2 -- Using C++ code as comments, set up the coefficients
XFR0 = 0.0;; Does not exist
XR0 = 0.0;; DOES EXIST
Bit-patterns require integer registers
Leave what you wanted to do behind as comments

11 Modify C++ code so that it can be translated into assembly code
Can only have 1 instruction per line
Code must execute sequentially, so remember the ;;
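One way to picture "1 instruction per line" is to rewrite the biquad equations so that every C++ statement performs a single copy, multiply or add, each of which could then become one assembly line ending in ;;. The sketch below is an illustration under assumptions: the names StageState, stageOneOpPerLine and temp are hypothetical, and the real register allocation is not shown in the transcript.

```cpp
#include <cassert>

// Delay-line values S1 and S2 from the biquad slide (hypothetical type name).
struct StageState {
    float s1;
    float s2;
};

// Biquad loop body written as one operation per statement.
// h[0]..h[5] stand for H0..H5.
float stageOneOpPerLine(float xin, const float h[6], StageState& st) {
    float s0 = xin;              // S0 = Xin
    s0 = s0 * h[5];              // S0 = S0 * H5
    float temp = st.s2 * h[3];   // temp = S2 * H3
    s0 = s0 + temp;              // S0 = S0 + temp
    temp = st.s1 * h[4];         // temp = S1 * H4
    s0 = s0 + temp;              // S0 = S0 + temp

    float yout = s0 * h[0];      // Yout = S0 * H0
    temp = st.s1 * h[1];         // temp = S1 * H1
    yout = yout + temp;          // Yout = Yout + temp
    temp = st.s2 * h[2];         // temp = S2 * H2
    yout = yout + temp;          // Yout = Yout + temp

    st.s2 = st.s1;               // S2 = S1
    st.s1 = s0;                  // S1 = S0
    return yout;
}
```

Written this way, the data dependencies between consecutive statements are explicit, which is where pipeline stalls are worth looking for.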

12 Start with S0 = Xin instruction
Can’t use XFR8 = XFR6 to copy a register

13 Since XFR8 = XFR6 is not allowed
Try XR8 = R6;
- SIMD: Single Instruction Multiple Data
- R6 means move XR6 and YR6 (multiple data move described in 1 instruction)
Try XR8 = XR6

14 Some operations are FLOAT operations and must have XFR on the left side of the equation BUT only R on the right
Some operations are SISD operations and must have XR on both sides of the equation (or just R on both sides of the equation, making them SIMD X and Y with garbage happening on Y)
Personally, I think all these problems are “assembler” issues and could be made consistent
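The difference between an X-only move (XR8 = XR6) and a SIMD move written with plain R registers (R8 = R6) can be modelled on the host. This is a toy model under assumptions, not a description of the hardware: the types ComputeBlock and Core and the functions moveX and moveXY are hypothetical, and each compute block is modelled simply as 32 integer registers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model: one compute block holds 32 registers of 32 bits each.
struct ComputeBlock {
    std::array<std::uint32_t, 32> r{};
};

// Toy model of the two compute blocks, X and Y.
struct Core {
    ComputeBlock x;
    ComputeBlock y;
};

// Models "XR8 = XR6": a SISD move that touches compute block X only.
void moveX(Core& c, int dst, int src) {
    c.x.r[dst] = c.x.r[src];
}

// Models "R8 = R6": the same move happens in X and in Y (SIMD),
// so whatever is in YR6 also lands in YR8.
void moveXY(Core& c, int dst, int src) {
    c.x.r[dst] = c.x.r[src];
    c.y.r[dst] = c.y.r[src];
}
```

In this model the "garbage happening on Y" from the slide is simply the unintended copy of YR6 into YR8 when the SIMD form is used.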

15 Disconnect from target and go to simulator

16 Activate Simulator

17 Rebuild the project and set breakpoints at start and end of ASM code

18 Activate the pipeline viewer

19 Adjust the pipeline window so you can see all the instruction pipeline stages

20 PIPELINE STAGES
See page 8-34 of Processor manual
Instruction fetch: F1, F2, F3 and F4
- Fetch Unit Pipe, memory driven
- 128 bits fetched; may make up 1, 2, 3, or 4 instructions (or parts of a couple of instructions)
- Instructions go into the IAB, instruction alignment buffer
Integer ALU pipe: PD, D, I and A

21 PIPELINE STAGES
See page 8-34 of Processor manual
10 pipeline stages, but may be completely desynchronized (happen semi-independently)
Instruction fetch: F1, F2, F3 and F4
Integer ALU: PreDecode, Decode, Integer, Access
Compute Block: EX1 and EX2

22 PIPELINE STAGES
See page 8-34 of Processor manual
Instruction fetch: F1, F2, F3 and F4
- Fetch Unit Pipe
- Memory driven, not instruction driven
- 128 bits fetched; may make up 1, 2, 3, or 4 instruction lines (or parts of a couple of instruction lines)
- Instruction fetched into IAB, instruction alignment buffer

23 PIPELINE STAGES
See page 8-34 of Processor manual
Integer ALU pipe: PD, D, I and A
- PreDecode: the next COMPLETE instruction line (1, 2, 3 or 4 instructions) is fetched from the IAB
- Decode: different instructions are dispatched to different execution units (J-IALU, K-IALU, Compute Blocks)
- Data memory access starts in the Integer stage
- A stands for Access stage
- Results are not available until the EX2 stage, but (by register forwarding) can sometimes be accessed earlier

24 PIPELINE STAGES
See page 8-34 of Processor manual
Compute Block: EX1 and EX2
- Result is always written to the target register on the rising edge of CCLK after stage EX2
- The following is guaranteed for R2 = R0 + R1; R6 = R2 * R3;;
- The new R2 is written at the end of the instruction; the R2 used in the multiply is the value from the beginning of the instruction
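The guarantee on this slide (both operations in one instruction line read the register values from the start of the line) can be modelled on the host and contrasted with two separate instruction lines. This is a sketch under assumptions: the names Regs, executeLine and executeSequential are hypothetical, integers stand in for register contents, and only registers R0..R7 are modelled.

```cpp
#include <array>
#include <cassert>

// Toy register file: index i stands for register Ri.
using Regs = std::array<int, 8>;

// Models the single instruction line  R2 = R0 + R1; R6 = R2 * R3;;
// All sources are read from the state at the beginning of the line,
// and all results are written at the end.
Regs executeLine(const Regs& before) {
    Regs after = before;
    after[2] = before[0] + before[1];  // new R2, visible only afterwards
    after[6] = before[2] * before[3];  // uses the OLD R2
    return after;
}

// For contrast, models two instruction lines:
//   R2 = R0 + R1;;
//   R6 = R2 * R3;;
// Here the multiply sees the updated R2.
Regs executeSequential(Regs r) {
    r[2] = r[0] + r[1];
    r[6] = r[2] * r[3];
    return r;
}
```

With R0 = 1, R1 = 2, R2 = 10 and R3 = 3, the single-line form leaves R6 = 30 (old R2 times R3), while the two-line form leaves R6 = 9.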

25 Only interested in later stages of the pipeline. Adjust properties

26 Run the code until the first ASM breakpoint: note cycle number 39830
Then run again until the second ASM breakpoint is reached
Calculate execution time

28 Pipeline viewer says 26 cycles, but what do we expect? 8 cycles

29 Pipeline viewer says 26 cycles, but what do we expect? -- 21 13 cycles expected
Where are the extra cycles coming from, and how easy is it to code in such a way that the extra cycles can be removed?
ANSWER: Fairly straightforward in idea, can be difficult in practice

30 Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines
If these pipelines stall, then the processor speed goes down
Need to understand how the ALU pipeline works
- Learn to use the pipeline viewer
May be a different answer for floating point and integer operations

