The planned and expected


1 The planned and expected
The program sequencer controls program flow and provides the address of the next instruction to be executed.
Function calls

2 Tackled today
Program sequencer
  Linear flow of instructions (last lecture)
  Jumps: software loops (last lecture)
  Loops: hardware loops (last lecture)
  Subroutines
  Interrupts and exceptions (next lecture)
  Idle (next lecture)
11/30/2018 Program sequencer, Copyright M. Smith, ECE, University of Calgary, Canada

3 Example code
Look at moving elements from array fooHere[ ] to farAway[ ] using various instruction modes:
  Straight-line coding
  In a loop (please make sure that you understand the terminology; this is an exam question)
    Software loop (last lecture)
    Hardware loop (last lecture)
  In a subroutine
  Via an interrupt (next lecture)

4 Linear program flow
Program flow on the chip is mainly linear: the processor fetches and executes program instructions sequentially.
Non-sequential structures (instructions and supporting registers) direct the processor to execute an instruction that is not at the next sequential address.

5 Jump instruction
Both JUMP and CALL instructions transfer program flow to another memory location.
The difference between JUMP and CALL is that the CALL automatically loads the return address into the RETS register. The return address is the next sequential address after the CALL instruction.
JUMPs can be conditional (depending on the CC bit in the ASTAT register). Conditional JUMP instructions use static branch prediction to reduce the branch latency caused by the length of the Blackfin instruction pipeline.
  What does "static" branch prediction mean?
  What is "dynamic" branch prediction?
When possible the assembler will use the short relative jump. The target instruction must be within ... to ... bytes of the current instruction.
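One way to approach the two questions on this slide is a small simulation. The sketch below is not Blackfin-specific: it assumes that "static" means a fixed prediction chosen before the program runs, and it models "dynamic" prediction with a generic 2-bit saturating counter that adapts at run time. The function names and the starting counter state are hypothetical choices made for illustration.

```c
#include <assert.h>

/* Static prediction: every execution of the branch is predicted the same
   fixed way (predict_taken), decided before the program runs. */
unsigned static_mispredictions(const int *taken, unsigned n, int predict_taken)
{
    assert(taken != 0);
    unsigned misses = 0;
    for (unsigned i = 0; i < n; i++) {
        if ((taken[i] != 0) != (predict_taken != 0))
            misses++;
    }
    return misses;
}

/* Dynamic prediction: a 2-bit saturating counter (0..3) learns from the
   branch history; it predicts "taken" when the counter is 2 or 3. */
unsigned dynamic_mispredictions(const int *taken, unsigned n)
{
    assert(taken != 0);
    unsigned state = 1;            /* arbitrary start: weakly "not taken" */
    unsigned misses = 0;
    for (unsigned i = 0; i < n; i++) {
        int predicted = (state >= 2);
        int actual = (taken[i] != 0);
        if (predicted != actual)
            misses++;
        if (actual && state < 3)
            state++;               /* move toward "taken" */
        if (!actual && state > 0)
            state--;               /* move toward "not taken" */
    }
    return misses;
}
```

For a branch history such as taken, taken, taken, taken, not taken (a typical loop-closing branch), a fixed "taken" prediction misses once, a fixed "not taken" prediction misses four times, and this particular counter misses twice.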

6 Efficiency of Standard software Loop
Suppose we go round the loop N times
  2 loop control instructions outside of loop + 4 * N loop control instructions inside the loop
  2 * N "useful instructions" inside loop + 4 useful set up instructions

Loop efficiency = (4 + 2 * N) / (4 + 2 * N + 2 + 4 * N) * 100%
If N is large: (2 * N) / (6 * N) * 100% = 33%

    .extern _fooHere;                   // extern long fooHere[5];
    .extern _farAway;                   // extern long farAway[5];
    P0.H = _fooHere; P0.L = _fooHere;   // long *pt0; pt0 = fooHere;
    P1.H = _farAway; P1.L = _farAway;   // long *pt1; pt1 = farAway;
    R1 = 0;                             // int num = 0;
    R2 = 5;
LOOP:
    CC = R2 <= R1;                      // for ( /* empty */ ; num < 5; num++) {
    IF CC JUMP LOOP_END;
    R0 = [P0++];                        //     *pt1++ = *pt0++;
    [P1++] = R0;
    R1 += 1;
    JUMP LOOP;                          // }
LOOP_END:
    // outside loop

WARNING: LOOP_END is an instruction that IS NOT EXECUTED INSIDE THE SOFTWARE LOOP
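The instruction counts on this slide can be checked numerically. The following is a minimal sketch that assumes exactly those counts (4 set-up instructions, 2 useful instructions per pass, 2 control instructions outside the loop and 4 per pass); the function name is hypothetical.

```c
#include <assert.h>

/* Percentage of executed instructions that do useful work in the
   software loop, using the counts from the slide:
     useful  = 4 set-up + 2 per pass
     control = 2 outside the loop + 4 per pass */
double software_loop_efficiency(unsigned long n)
{
    assert(n > 0);
    double useful  = 4.0 + 2.0 * (double)n;
    double control = 2.0 + 4.0 * (double)n;
    return 100.0 * useful / (useful + control);
}
```

With N = 5, as in the example arrays, 14 of the 36 executed instructions are useful (about 39%); as N grows the value approaches 33%.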

7 Blackfin Hardware Loops
Blackfin supports a mechanism for zero-overhead looping.
Common design decision: the two inner-most loops are the most often executed, so make those the most efficient.
The program sequencer contains TWO loop units, each containing three registers:
  Loop Top registers: LT0, LT1
  Loop Bottom registers: LB0, LB1
  Loop Count registers: LC0, LC1

8 Efficiency of Hardware Loop
Suppose we go round the loop N times
  2 loop control instructions outside of loop + 0 loop control instructions inside the loop (there are some pipeline overhead issues on leaving the loop)
  2 * N "useful instructions" inside loop + 4 useful set up instructions

Loop efficiency = (4 + 2 * N) / (4 + 2 * N + 2) * 100%
If N is large: (2 * N) / (2 * N) * 100% = 100%

    .extern _fooHere;                           // extern long fooHere[5];
    .extern _farAway;                           // extern long farAway[5];
    P0.H = _fooHere; P0.L = _fooHere;           // long *pt0; pt0 = fooHere;
    P1.H = _farAway; P1.L = _farAway;           // long *pt1; pt1 = farAway;
    P2 = 5;                                     // int num = 0;
    LSETUP(LOOP_START, LOOP_END) LC1 = P2;      // for ( /* empty */ ; num < 5; num++) {
LOOP_START:
    R0 = [P0++];                                //     *pt1++ = *pt0++;
LOOP_END:
    [P1++] = R0;                                // }
OUTSIDE_LOOP:

WARNING: LOOP_END is an instruction that IS EXECUTED INSIDE THE HARDWARE LOOP
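To make the roles of the loop top, loop bottom and loop count registers concrete, here is a toy C model of a single loop unit driving the two-instruction copy loop from this slide. It is a simplified sketch, not a description of the real hardware: it ignores the pipeline overhead on leaving the loop, it assumes the count is at least 1, and the names `loop_unit` and `hardware_loop_copy` are hypothetical.

```c
#include <assert.h>

/* Toy model of one loop unit: top, bottom and count. */
struct loop_unit {
    int lt;            /* index of the first instruction in the loop (LT) */
    int lb;            /* index of the last instruction in the loop (LB)  */
    unsigned long lc;  /* number of passes still to run (LC)              */
};

/* Copies n longs with a 2-instruction loop body and returns how many
   instructions were executed inside the loop. */
unsigned long hardware_loop_copy(const long *src, long *dst, unsigned long n)
{
    assert(n > 0);
    struct loop_unit u = { 0, 1, n };  /* LSETUP(LOOP_START, LOOP_END) LC = n */
    unsigned long executed = 0;
    long r0 = 0;
    int pc = u.lt;

    for (;;) {
        if (pc == 0) {
            r0 = *src++;               /* LOOP_START: R0 = [P0++]; */
        } else {
            *dst++ = r0;               /* LOOP_END:   [P1++] = R0; */
        }
        executed++;

        if (pc == u.lb) {              /* sequencer check at the loop bottom */
            u.lc--;
            if (u.lc == 0)
                break;                 /* fall through to OUTSIDE_LOOP */
            pc = u.lt;                 /* back to the top, no extra instruction */
        } else {
            pc++;
        }
    }
    return executed;
}
```

For five elements the model executes exactly 10 instructions inside the loop, all of them useful, and the instruction at LOOP_END runs on every pass, which matches the warning on the slide.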

9 Subroutine Calls
Both JUMP and CALL instructions transfer program flow to another memory location.
The difference between JUMP and CALL is that the CALL automatically loads the return address into the RETS register. The return address is the next sequential address after the CALL instruction.
This means that a function call from inside a function call has the potential of destroying the RETS register and causing your program to malfunction.
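The hazard described on this slide can be illustrated with a toy C model of a processor that has a single return-address register. This is only a sketch: the addresses 100 and 200 are arbitrary, and the structure and function names are hypothetical. The "save" path models storing RETS on the stack at entry and restoring it before returning, which is what LINK and UNLINK do.

```c
#include <assert.h>

/* Toy model: one return-address register and a small stack. */
struct cpu {
    int rets;       /* models RETS: return address of the most recent CALL */
    int stack[8];
    int sp;
};

/* CALL overwrites RETS with the address that follows the CALL. */
static void call(struct cpu *c, int return_address)
{
    c->rets = return_address;
}

/* Returns the address that the outer routine would return to.
   save_rets != 0 models saving RETS on the stack around the nested call. */
int outer_return_address(int save_rets)
{
    struct cpu c = { 0, {0}, 0 };

    call(&c, 100);                 /* main calls outer; outer must return to 100 */

    if (save_rets)
        c.stack[c.sp++] = c.rets;  /* outer saves RETS on entry */

    call(&c, 200);                 /* outer calls inner; RETS now holds 200 */
    /* inner returns to 200, which is inside outer; RETS still holds 200 */

    if (save_rets)
        c.rets = c.stack[--c.sp];  /* outer restores RETS before returning */

    return c.rets;
}
```

Without the save, the outer routine "returns" to address 200 (inside itself) instead of 100, which is the malfunction the slide warns about.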

10 Routine to move arrays

    void MoveArrays(long *, long *, long);

    void MoveArrays(long *pt1, long *pt2, long num) {
        for (int count = 0; count < num; count++) {
            *pt2++ = *pt1++;
        }
    }

11 Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.

    .global _MoveArrays;        // void MoveArrays(long *, long *, long);
_MoveArrays:                    // void MoveArrays(long *pt1, long *pt2, long num) {
    LINK 0;                     // Permitted since LEAF
                                //     for (int count = 0; count < num; count++) {
                                //         *pt2++ = *pt1++;
                                //     }
    P0 = [FP + 4];              // Return address (RETS) saved by LINK
    UNLINK;
_MoveArrays.END:
    JUMP (P0);                  // }

12 Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
The three parameters arrive in R0, R1 and R2.

    .global _MoveArrays;        // void MoveArrays(long *, long *, long);
_MoveArrays:                    // void MoveArrays(long *pt1, long *pt2, long num) {
    LINK 0;                     // Permitted since LEAF
    P0_pt1 = R0;                // Can't use R0 as pointer
    P1_pt2 = R1;                // Can't use R1 as pointer
                                //     for (int count = 0; count < num; count++) {
                                //         *pt2++ = *pt1++;
                                //     }
    P0 = [FP + 4];              // Return address (RETS) saved by LINK
    UNLINK;
_MoveArrays.END:
    JUMP (P0);                  // }

13 Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
The three parameters arrive in R0, R1 and R2.

    .global _MoveArrays;                        // void MoveArrays(long *, long *, long);
_MoveArrays:                                    // void MoveArrays(long *pt1, long *pt2, long num) {
    LINK 0;                                     // Permitted since LEAF
    P0_pt1 = R0;                                // Can't use R0 as pointer
    P1_pt2 = R1;                                // Can't use R1 as pointer
    LSETUP(MA_LSTART, MA_LEND) LC1 = R2;        //     for (int count = 0; count < num; count++) {
MA_LSTART:
    R0 = [P0++];                                //         *pt2++ = *pt1++;
MA_LEND:
    [P1++] = R0;                                //     }
    P0 = [FP + 4];                              // Return address (RETS) saved by LINK
    UNLINK;
_MoveArrays.END:
    JUMP (P0);                                  // }

It makes sense to be able to use R2 (INPAR3) to set the LC1 counter, but it is illegal.

14 Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
The three parameters arrive in R0, R1 and R2.

    .global _MoveArrays;                        // void MoveArrays(long *, long *, long);
_MoveArrays:                                    // void MoveArrays(long *pt1, long *pt2, long num) {
    LINK 0;                                     // Permitted since LEAF
    P0_pt1 = R0;                                // Can't use R0 as pointer
    P1_pt2 = R1;                                // Can't use R1 as pointer
    P2_counter = R2;                            // Can't use R2 for an LSETUP instruction
    LSETUP(MA_LSTART, MA_LEND) LC1 = P2;        //     for (int count = 0; count < num; count++) {
MA_LSTART:
    R0 = [P0++];                                //         *pt2++ = *pt1++;
MA_LEND:
    [P1++] = R0;                                //     }
    P0 = [FP + 4];                              // Return address (RETS) saved by LINK
    UNLINK;
_MoveArrays.END:
    JUMP (P0);                                  // }

15 Main code and function

Main code:

    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

    char * Mymain(void) {
        FillArrayMethod(fooHere, 5);
        MoveArrays(fooHere, farAway, 5);
        return (char *) farAway;    // cast needed: farAway is an array of long
    }

Function:

    void MoveArrays(long *, long *, long);

    void MoveArrays(long *pt1, long *pt2, long num) {
        for (int count = 0; count < num; count++) {
            *pt2++ = *pt1++;
        }
    }

16 Do easy bits first, optimize later

    .extern _fooHere, _farAway;     // extern long fooHere[5], farAway[5];
                                    // extern "C" void FillArrayMethod(long *, long);
                                    // extern "C" void MoveArrays(long *, long *, long);
    .global _Mymain;                // extern "C" char * Mymain(void);
_Mymain:                            // char * Mymain(void) {
    LINK 16;
    .extern _FillArrayMethod;
    CALL _FillArrayMethod;          //     FillArrayMethod(fooHere, 5);
    .extern _MoveArrays;
    CALL _MoveArrays;               //     MoveArrays(fooHere, farAway, 5);
                                    //     return farAway;
    P0 = [FP + 4];                  // Return address (RETS) saved by LINK
    UNLINK;
_Mymain.end:
    JUMP (P0);                      // }

17 Do a little more difficult next

    .extern _fooHere, _farAway;     // extern long fooHere[5], farAway[5];
                                    // extern "C" void FillArrayMethod(long *, long);
                                    // extern "C" void MoveArrays(long *, long *, long);
    .global _Mymain;                // extern "C" char * Mymain(void);
_Mymain:                            // char * Mymain(void) {
    LINK 16;
    .extern _FillArrayMethod;
    CALL _FillArrayMethod;          //     FillArrayMethod(fooHere, 5);
    .extern _MoveArrays;
    CALL _MoveArrays;               //     MoveArrays(fooHere, farAway, 5);
    P0.H = _farAway;                //     return farAway;
    P0.L = _farAway;
    R0 = P0;                        // Return value goes in R0
    P0 = [FP + 4];                  // Return address (RETS) saved by LINK
    UNLINK;
_Mymain.end:
    JUMP (P0);                      // }

I KNOW what happens when I put an EXTERNAL address into a P0 register, but I DON'T KNOW what happens with an R register and an EXTERNAL address, so don't ask the question!

18 Handle -- FillArrayMethod (fooHere, 5);
The two parameters are passed in R0 and R1.

    .extern _fooHere, _farAway;     // extern long fooHere[5], farAway[5];
                                    // extern "C" void FillArrayMethod(long *, long);
                                    // extern "C" void MoveArrays(long *, long *, long);
    .global _Mymain;                // extern "C" char * Mymain(void);
_Mymain:                            // char * Mymain(void) {
    LINK 16;
    .extern _FillArrayMethod;
    R1 = 5;                         // Set OUTPAR2 = 5
    P0.H = _fooHere;
    P0.L = _fooHere;
    R0 = P0;                        // Set OUTPAR1 = &fooHere[0]
    CALL _FillArrayMethod;          //     FillArrayMethod(fooHere, 5);
    .extern _MoveArrays;
    CALL _MoveArrays;               //     MoveArrays(fooHere, farAway, 5);
    P0.H = _farAway;                //     return farAway;
    P0.L = _farAway;
    R0 = P0;
    P0 = [FP + 4];                  // Return address (RETS) saved by LINK
    UNLINK;
_Mymain.end:
    JUMP (P0);                      // }

19 Handle -- MoveArrays(fooHere, farAway, 5); THE WRONG WAY
The three parameters are passed in R0, R1 and R2.

    .extern _fooHere, _farAway;     // extern long fooHere[5], farAway[5];
                                    // extern "C" void FillArrayMethod(long *, long);
                                    // extern "C" void MoveArrays(long *, long *, long);
    .global _Mymain;                // extern "C" char * Mymain(void);
_Mymain:                            // char * Mymain(void) {
    LINK 16;
    .extern _FillArrayMethod;
    R1 = 5;                         // Set OUTPAR2 = 5
    P0.H = _fooHere;
    P0.L = _fooHere;
    R0 = P0;                        // Set OUTPAR1 = &fooHere[0]
    CALL _FillArrayMethod;          //     FillArrayMethod(fooHere, 5);
    .extern _MoveArrays;
    R2 = R1;                        // Value needed for R2 is in R1, so re-use R1: THE WRONG WAY
    P0.H = _farAway;
    P0.L = _farAway;
    R1 = P0;
    CALL _MoveArrays;               //     MoveArrays(fooHere, farAway, 5);
    P0.H = _farAway;                //     return farAway;
    P0.L = _farAway;
    R0 = P0;
    P0 = [FP + 4];                  // Return address (RETS) saved by LINK
    UNLINK;
_Mymain.end:
    JUMP (P0);                      // }

20 Handle -- MoveArrays(fooHere, farAway, 5); THE CORRECT WAY
The three parameters are passed in R0, R1 and R2.

_Mymain:                            // char * Mymain(void) {
    LINK 16;
    .extern _FillArrayMethod;
    R1 = 5;                         // Set OUTPAR2 = 5
    P0.H = _fooHere;
    P0.L = _fooHere;
    R0 = P0;                        // Set OUTPAR1 = &fooHere[0]
    CALL _FillArrayMethod;          //     FillArrayMethod(fooHere, 5);
    .extern _MoveArrays;
    R2 = 5;                         // Do not re-use R1
    P0.H = _farAway;
    P0.L = _farAway;
    R1 = P0;
    P0.H = _fooHere;                // Reload R0 as well, since R0 is also volatile
    P0.L = _fooHere;
    R0 = P0;
    CALL _MoveArrays;               //     MoveArrays(fooHere, farAway, 5);
    P0.H = _farAway;                //     return farAway;
    P0.L = _farAway;
    R0 = P0;
    P0 = [FP + 4];                  // Return address (RETS) saved by LINK
    UNLINK;
_Mymain.end:
    JUMP (P0);                      // }

Value that was in R1 might be destroyed by CALL _FillArrayMethod
Value that was in R0 might be destroyed by CALL _FillArrayMethod
Value that was in P0 might be destroyed by CALL _FillArrayMethod
These ARE VOLATILE REGISTERS
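The difference between the wrong way and the correct way can be shown with a toy register file in C. This is a sketch under stated assumptions: the callee stand-in simply overwrites R0 and R1 with an arbitrary value (-1) to represent "might be destroyed", and all names here are hypothetical.

```c
#include <assert.h>

/* Toy register file holding only the registers used on the slide. */
struct regs {
    long r0, r1, r2;
};

/* Stand-in for a callee that is free to overwrite volatile registers.
   The value -1 is arbitrary and only marks the registers as destroyed. */
static void fill_array_method(struct regs *r)
{
    r->r0 = -1;
    r->r1 = -1;
}

/* Returns the value that reaches the third parameter (R2) of MoveArrays. */
long third_parameter(int reuse_r1)
{
    struct regs r = { 0, 0, 0 };

    r.r1 = 5;                  /* R1 = 5: second parameter of FillArrayMethod */
    fill_array_method(&r);     /* CALL _FillArrayMethod */

    if (reuse_r1)
        r.r2 = r.r1;           /* wrong way: R2 = R1 */
    else
        r.r2 = 5;              /* correct way: R2 = 5 */

    return r.r2;
}
```

Re-using R1 after the call passes whatever the callee left behind; reloading the constant always passes 5.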

21 Optimizing code
Try and see what the C++ compiler would do with the code.
  Make sure that the optimizer does not optimize the code to nothing: use the arrays in something else.
Really keen: IPA (interprocedural analysis).
  The linker does something special to code in different files if this is activated.

22 Example: good mid-term example
Bring your answer to tutorial tomorrow.

    ulong WaitTill(ushort PFsignal, const ushort high_low) {
        ushort counter = 0;
        ushort PFmask;
        ushort PFsignalwanted;
        ushort PFvalue;
        ushort quit = 0;

        PFmask = PFsignal;
        if (high_low == HIGH) PFsignalwanted = PFsignal;
        else PFsignalwanted = 0;

        PFvalue = ReadPFASM( );
        PFvalue = PFvalue & PFmask;
        if (PFvalue == PFsignalwanted) quit = 1;

        while (quit != 1) {
            UseFixedTimeASM(0x4000);
            PFvalue = ReadPFASM( );
            PFvalue = PFvalue & PFmask;                 // assumed: repeat the test made before the loop
            if (PFvalue == PFsignalwanted) quit = 1;
            counter++;                                  // assumed: count the fixed-time delays
        }
        return counter;                                 // assumed: callers use the result as a time
    }

    temperature4 CalculateTemperatureASM(ushort, ushort, const ushort);
    ulong WaitTillASM(ushort, const ushort);
    #define HIGH 1
    #define LOW 0

    temperature4 MyMainASM(void) {
        ulong time1, time2;
        temperature4 temperature;
        SetUpPFLinesASM(true);
        WaitTillASM(0x4, HIGH);             // PF11 after shifting
        time1 = WaitTillASM(0x4, LOW);
        time2 = WaitTillASM(0x4, HIGH);
        temperature = CalculateTemperatureASM(time2, time1, CELSIUS);
        return temperature;
    }

There might be a "better" way of doing this test. If so, then use it.

23 Tackled today
Program sequencer
  Linear flow of instructions (last lecture)
  Jumps: software loops (last lecture)
  Loops: hardware loops (last lecture)
  Subroutines
  Interrupts and exceptions (Friday)
  Idle (Friday)

24 Information taken from Analog Devices On-line Manuals with permission
Information furnished by Analog Devices is believed to be accurate and reliable. However, Analog Devices assumes no responsibility for its use or for any infringement of any patent or other rights of any third party which may result from its use. No license is granted by implication or otherwise under any patent or patent right of Analog Devices. Copyright © Analog Devices, Inc. All rights reserved.

