The planned and expected


The program sequencer controls program flow and provides the next instruction to be executed. This lecture covers the planned and expected case: function calls.

Tackled today
Program sequencer
- Linear flow of instructions (last lecture)
- Jumps: software loops (last lecture)
- Loops: hardware loops (last lecture)
- Subroutines
- Interrupts and exceptions (next lecture)
- Idle (next lecture)
11/30/2018, Program sequencer, Copyright M. Smith, ECE, University of Calgary, Canada

Example code
Look at moving elements from array fooHere[ ] to farAway[ ] using various instruction modes:
- Straight line coding
- In a loop (please make sure that you understand the terminology: exam question)
  - Software loop (last lecture)
  - Hardware loop (last lecture)
- In a subroutine
- Via an interrupt (next lecture)

Linear program flow
Program flow on the chip is mainly linear: the processor fetches and executes program instructions sequentially.
Non-sequential structures (instructions and supporting registers) direct the processor to execute an instruction that is not at the next sequential address.

Jump instruction
- Both JUMP and CALL instructions transfer program flow to another memory location.
- The difference between JUMP and CALL is that CALL automatically loads the return address into the RETS register. The return address is the next sequential address after the CALL instruction.
- JUMPs can be conditional (depending on the CC bit in the ASTAT register).
- Conditional JUMP instructions use static branch prediction to reduce the branch latency caused by the length of the Blackfin instruction pipeline. What does "static" branch prediction mean? What is "dynamic" branch prediction?
- When possible, the assembler will use the short relative jump. The target instruction must be within -4096 to +4094 bytes of the current instruction.
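The short-jump range quoted above can be checked with a few lines of C++. This is only a sketch: fitsShortJump and the two constants are hypothetical names, not part of any Analog Devices tool. It relies on the range from the slide and on the fact that Blackfin instructions sit on even addresses.

```cpp
#include <cassert>

// Range quoted on the slide for the short relative JUMP, in bytes.
constexpr long kShortJumpMin = -4096;
constexpr long kShortJumpMax = 4094;

// Hypothetical helper: true when a target that is offsetBytes away from the
// current instruction could be reached by the short relative jump.
// Instructions sit on even addresses, so an odd offset is never valid.
bool fitsShortJump(long offsetBytes) {
    const bool even = (offsetBytes % 2) == 0;
    return even && offsetBytes >= kShortJumpMin && offsetBytes <= kShortJumpMax;
}

// Checks the two end points of the range given on the slide.
void checkSlideLimits() {
    assert(fitsShortJump(kShortJumpMin));
    assert(fitsShortJump(kShortJumpMax));
}
```

A target outside this window needs a longer form of the jump, which is why the assembler only picks the short form "when possible".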

Efficiency of Standard software Loop
Suppose we go round the loop N times:
- 2 loop control instructions outside of the loop + 4 * N loop control instructions inside the loop
- 2 * N "useful instructions" inside the loop + 4 useful set up instructions

\[ \text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + (2 + 4N)} \times 100\% \]

If N is large:

\[ \frac{2N}{6N} \times 100\% \approx 33\% \]

C++ version:
    extern long fooHere[5];
    extern long farAway[5];
    long *pt0; pt0 = fooHere;
    long *pt1; pt1 = farAway;
    int num = 0;
    for ( /* empty */; num < 5; num++) {
        *pt1++ = *pt0++;
    }

Blackfin assembly version:
    .extern _fooHere;
    .extern _farAway;
    P0.H = _fooHere; P0.L = _fooHere;    // set up (useful)
    P1.H = _farAway; P1.L = _farAway;    // set up (useful)
    R1 = 0;                              // loop control
    R2 = 5;                              // loop control
    LOOP:
        CC = R2 <= R1;                   // loop control
        IF CC JUMP LOOP_END;             // loop control
        R0 = [P0++];                     // useful
        [P1++] = R0;                     // useful
        R1 += 1;                         // loop control
        JUMP LOOP;                       // loop control
    LOOP_END:
        // outside loop

WARNING: LOOP_END is an instruction that IS NOT EXECUTED INSIDE THE SOFTWARE LOOP
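To make the counting concrete, here is a small C++ sketch that walks through the software loop and tallies instructions in the slide's categories. The struct and function names are hypothetical. One detail it exposes: the exit test (CC = ..., IF CC JUMP) runs once more than the loop body, so the walked count is two control instructions higher than the slide's 2 + 4 * N. Both versions still tend to 33% for large N.

```cpp
#include <cassert>

struct LoopCount {
    long setup;     // pointer set-up instructions (counted as useful)
    long useful;    // instructions that move data
    long control;   // instructions that only manage the loop
};

// Walks through the slide's software loop for n passes and counts every
// assembly instruction in the category the slide gives it.
LoopCount countSoftwareLoop(long n) {
    assert(n >= 0);
    LoopCount c{4, 0, 2};   // P0.H, P0.L, P1.H, P1.L; then R1 = 0, R2 = n
    long r1 = 0;
    const long r2 = n;
    while (true) {
        c.control += 2;     // CC = R2 <= R1;  IF CC JUMP LOOP_END;
        if (r2 <= r1) break;
        c.useful += 2;      // R0 = [P0++];  [P1++] = R0;
        r1 += 1;
        c.control += 2;     // R1 += 1;  JUMP LOOP;
    }
    return c;
}

// Percentage of executed instructions that do useful work.
double efficiencyPercent(const LoopCount& c) {
    const double work = static_cast<double>(c.setup + c.useful);
    return 100.0 * work / (work + static_cast<double>(c.control));
}

// The closed form from the slide, for comparison.
double slideFormulaPercent(long n) {
    const double work = 4.0 + 2.0 * static_cast<double>(n);
    return 100.0 * work / (work + 2.0 + 4.0 * static_cast<double>(n));
}
```

For N = 5 the slide's formula gives 14/36, about 38.9%, while the walked count gives 14/38, about 36.8%.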

Blackfin Hardware Loops
- Blackfin supports a mechanism for zero-overhead looping.
- Common design decision: the two innermost loops are the most often executed, so make those the most efficient.
- The program sequencer contains TWO loop units, each containing three registers:
  - Loop Top registers: LT0, LT1
  - Loop Bottom registers: LB0, LB1
  - Loop Count registers: LC0, LC1

Efficiency of Hardware Loop
Suppose we go round the loop N times:
- 2 loop control instructions outside of the loop + 0 loop control instructions inside the loop (there are some pipeline overhead issues on leaving the loop)
- 2 * N "useful instructions" inside the loop + 4 useful set up instructions

\[ \text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + 2} \times 100\% \]

If N is large:

\[ \frac{2N}{2N} \times 100\% = 100\% \]

C++ version:
    extern long fooHere[5];
    extern long farAway[5];
    long *pt0; pt0 = fooHere;
    long *pt1; pt1 = farAway;
    int num = 0;
    for ( /* empty */; num < 5; num++) {
        *pt1++ = *pt0++;
    }

Blackfin assembly version:
    .extern _fooHere;
    .extern _farAway;
    P0.H = _fooHere; P0.L = _fooHere;          // set up (useful)
    P1.H = _farAway; P1.L = _farAway;          // set up (useful)
    P2 = 5;                                    // loop control
    LSETUP(LOOP_START, LOOP_END) LC1 = P2;     // loop control
    LOOP_START: R0 = [P0++];                   // useful
    LOOP_END:   [P1++] = R0;                   // useful
    OUTSIDE_LOOP:

WARNING: LOOP_END is an instruction that IS EXECUTED INSIDE THE HARDWARE LOOP
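The warning above is easier to see with a toy model of one loop unit. The C++ sketch below is a simplified assumption about how the sequencer behaves, not a description of the real hardware: after the instruction at the loop bottom has executed, the count is decremented and, if it is not yet zero, execution continues at the loop top. LoopUnit and runHardwareLoop are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for one Blackfin loop unit (LT, LB, LC).
struct LoopUnit {
    std::size_t top;      // index of the first instruction in the loop (LT)
    std::size_t bottom;   // index of the last instruction in the loop (LB)
    long count;           // passes still to run (LC)
};

// Copies n elements the way the slide's hardware loop does and returns how
// many loop-body instructions were executed.
long runHardwareLoop(const std::vector<long>& src, std::vector<long>& dst, long n) {
    assert(n > 0);
    assert(static_cast<std::size_t>(n) <= src.size());
    assert(static_cast<std::size_t>(n) <= dst.size());
    LoopUnit unit{0, 1, n};                    // LSETUP(LOOP_START, LOOP_END) LC = n
    std::size_t p0 = 0, p1 = 0;                // read and write positions (P0, P1)
    long r0 = 0;
    long executed = 0;
    std::size_t pc = unit.top;
    while (true) {
        if (pc == unit.top) r0 = src[p0++];    // LOOP_START: R0 = [P0++];
        else                dst[p1++] = r0;    // LOOP_END:   [P1++] = R0;
        ++executed;
        if (pc == unit.bottom) {               // sequencer work, no instruction spent
            if (--unit.count == 0) break;
            pc = unit.top;
        } else {
            ++pc;
        }
    }
    return executed;
}
```

With n = 5 the model executes exactly 2 * 5 body instructions and no loop control instructions inside the loop, which matches the 2 * N term in the efficiency formula.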

Subroutine Calls
- Both JUMP and CALL instructions transfer program flow to another memory location.
- The difference between JUMP and CALL is that CALL automatically loads the return address into the RETS register. The return address is the next sequential address after the CALL instruction.
- This means that a function call from inside a function call has the potential of destroying the RETS register and causing your program to malfunction.
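The RETS problem can be sketched in C++ with a toy machine that has a single return register. Everything here (Machine, call, the addresses 100 and 200) is made up for illustration. The "with save" variant mimics what LINK and UNLINK do with RETS on the stack.

```cpp
#include <cassert>
#include <vector>

// Toy model: one return register (RETS) plus a stack in memory.
struct Machine {
    int rets = 0;               // return address written by CALL
    std::vector<int> stack;     // where LINK can save RETS
};

// CALL writes the return address into RETS, overwriting what was there.
void call(Machine& m, int returnAddress) { m.rets = returnAddress; }

// Non-leaf routine that calls another routine WITHOUT saving RETS.
// Returns the address it would go back to.
int outerWithoutSave(Machine& m) {
    call(m, 200);               // nested CALL: RETS now holds 200
    return m.rets;              // outer routine returns through RETS
}

// Same routine, but RETS is saved on entry and restored before returning.
int outerWithSave(Machine& m) {
    m.stack.push_back(m.rets);  // what LINK does with RETS on entry
    call(m, 200);               // nested CALL overwrites RETS
    assert(!m.stack.empty());
    m.rets = m.stack.back();    // what UNLINK restores
    m.stack.pop_back();
    return m.rets;
}
```

If the caller's return address was 100, the first routine comes back with 200 (the wrong place) and the second with 100.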

Routine to move arrays
    void MoveArrays(long *, long *, long);

    void MoveArrays(long *pt1, long *pt2, long num) {
        for (int count = 0; count < num; count++) {
            *pt2++ = *pt1++;
        }
    }

Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.

C++ prototype:
    void MoveArrays(long *, long *, long);

Assembly skeleton, with the matching C++ as comments:
    .global _MoveArrays;
    _MoveArrays:                // void MoveArrays(long *pt1, long *pt2, long num) {
        LINK 0;                 // Permitted since LEAF
                                //   for (int count = 0; count < num; count++) {
                                //     *pt2++ = *pt1++;
                                //   }
        P0 = [FP + 4];          // return address saved by LINK
        UNLINK;
    _MoveArrays.END:
        JUMP (P0);              // }

Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
Incoming parameters: pt1 in R0, pt2 in R1, num in R2.

C++ prototype:
    void MoveArrays(long *, long *, long);

Assembly, with the matching C++ as comments:
    .global _MoveArrays;
    _MoveArrays:                // void MoveArrays(long *pt1, long *pt2, long num) {
        LINK 0;                 // Permitted since LEAF
        P0_pt1 = R0;            // Can't use R0 as pointer
        P1_pt2 = R1;            // Can't use R1 as pointer
                                //   for (int count = 0; count < num; count++) {
                                //     *pt2++ = *pt1++;
                                //   }
        P0 = [FP + 4];          // return address saved by LINK
        UNLINK;
    _MoveArrays.END:
        JUMP (P0);              // }

Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
Incoming parameters: pt1 in R0, pt2 in R1, num in R2.

C++ prototype:
    void MoveArrays(long *, long *, long);

Assembly, with the matching C++ as comments:
    .global _MoveArrays;
    _MoveArrays:                                 // void MoveArrays(long *pt1, long *pt2, long num) {
        LINK 0;                                  // Permitted since LEAF
        P0_pt1 = R0;                             // Can't use R0 as pointer
        P1_pt2 = R1;                             // Can't use R1 as pointer
        LSETUP(MA_LSTART, MA_LEND) LC1 = R2;     //   for (int count = 0; count < num; count++) {
    MA_LSTART: R0 = [P0++];                      //     *pt2++ = *pt1++;
    MA_LEND:   [P1++] = R0;                      //   }
        P0 = [FP + 4];                           // return address saved by LINK
        UNLINK;
    _MoveArrays.END:
        JUMP (P0);                               // }

It makes sense to be able to use R2 (INPAR3) to set the LC1 counter, but it is illegal.

Leaf routine to move arrays
Leaf routines are "guaranteed" NOT to call another routine.
Incoming parameters: pt1 in R0, pt2 in R1, num in R2.

C++ prototype:
    void MoveArrays(long *, long *, long);

Assembly, with the matching C++ as comments:
    .global _MoveArrays;
    _MoveArrays:                                 // void MoveArrays(long *pt1, long *pt2, long num) {
        LINK 0;                                  // Permitted since LEAF
        P0_pt1 = R0;                             // Can't use R0 as pointer
        P1_pt2 = R1;                             // Can't use R1 as pointer
        P2_counter = R2;                         // Can't use R2 for an LSETUP instruction
        LSETUP(MA_LSTART, MA_LEND) LC1 = P2;     //   for (int count = 0; count < num; count++) {
    MA_LSTART: R0 = [P0++];                      //     *pt2++ = *pt1++;
    MA_LEND:   [P1++] = R0;                      //   }
        P0 = [FP + 4];                           // return address saved by LINK
        UNLINK;
    _MoveArrays.END:
        JUMP (P0);                               // }

Main code and function
    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

    char * Mymain(void) {
        FillArrayMethod(fooHere, 5);
        MoveArrays(fooHere, farAway, 5);
        return (char *) farAway;
    }

    void MoveArrays(long *pt1, long *pt2, long num) {
        for (int count = 0; count < num; count++) {
            *pt2++ = *pt1++;
        }
    }

Do easy bits first, optimize later

C++ declarations:
    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

Assembly, with the matching C++ as comments:
    .extern _fooHere, _farAway;
    .global _Mymain;
    _Mymain:                           // char * Mymain(void) {
        LINK 16;
        .extern _FillArrayMethod;
        CALL _FillArrayMethod;         //   FillArrayMethod(fooHere, 5);
        .extern _MoveArrays;
        CALL _MoveArrays;              //   MoveArrays(fooHere, farAway, 5);
        P0 = [FP + 4];
        UNLINK;
    _Mymain.end:
        JUMP (P0);                     //   return (char *) farAway; }

Do a little more difficult next

C++ declarations:
    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

Assembly, with the matching C++ as comments:
    .extern _fooHere, _farAway;
    .global _Mymain;
    _Mymain:                           // char * Mymain(void) {
        LINK 16;
        .extern _FillArrayMethod;
        CALL _FillArrayMethod;         //   FillArrayMethod(fooHere, 5);
        .extern _MoveArrays;
        CALL _MoveArrays;              //   MoveArrays(fooHere, farAway, 5);
        P0.H = _farAway;
        P0.L = _farAway;
        R0 = P0;                       //   return value goes in R0
        P0 = [FP + 4];
        UNLINK;
    _Mymain.end:
        JUMP (P0);                     //   return (char *) farAway; }

I KNOW what happens when I put an EXTERNAL address into a P0 register, but I DON'T KNOW what happens with an R register and an EXTERNAL address, so don't ask the question!

Handle -- FillArrayMethod (fooHere, 5);

C++ declarations:
    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

Assembly, with the matching C++ as comments:
    .extern _fooHere, _farAway;
    .global _Mymain;
    _Mymain:                           // char * Mymain(void) {
        LINK 16;
        .extern _FillArrayMethod;
        R1 = 5;                        //   Set OUTPAR2 = 5
        P0.H = _fooHere;
        P0.L = _fooHere;
        R0 = P0;                       //   Set OUTPAR1 = &fooHere[0]
        CALL _FillArrayMethod;         //   FillArrayMethod(fooHere, 5);   uses R0, R1
        .extern _MoveArrays;
        CALL _MoveArrays;              //   MoveArrays(fooHere, farAway, 5);
        P0.H = _farAway;
        P0.L = _farAway;
        R0 = P0;
        P0 = [FP + 4];
        UNLINK;
    _Mymain.end:
        JUMP (P0);                     //   return (char *) farAway; }

Handle -- MoveArrays(fooHere, farAway, 5); THE WRONG WAY

C++ declarations:
    extern long fooHere[5], farAway[5];
    extern "C" void FillArrayMethod(long *, long);
    extern "C" void MoveArrays(long *, long *, long);
    extern "C" char * Mymain(void);

Assembly, with the matching C++ as comments:
    .extern _fooHere, _farAway;
    .global _Mymain;
    _Mymain:                           // char * Mymain(void) {
        LINK 16;
        .extern _FillArrayMethod;
        R1 = 5;                        //   Set OUTPAR2 = 5
        P0.H = _fooHere;
        P0.L = _fooHere;
        R0 = P0;                       //   Set OUTPAR1 = &fooHere[0]
        CALL _FillArrayMethod;         //   FillArrayMethod(fooHere, 5);   uses R0, R1
        .extern _MoveArrays;
        R2 = R1;                       //   THE WRONG WAY: value needed for R2 is in R1
        P0.H = _farAway;
        P0.L = _farAway;
        R1 = P0;                       //   THE WRONG WAY: re-use R0 for &fooHere[0]
        CALL _MoveArrays;              //   MoveArrays(fooHere, farAway, 5);   uses R0, R1, R2
        P0.H = _farAway;
        P0.L = _farAway;
        R0 = P0;
        P0 = [FP + 4];
        UNLINK;
    _Mymain.end:
        JUMP (P0);                     //   return (char *) farAway; }

Handle -- MoveArrays(fooHere, farAway, 5); THE CORRECT WAY

Assembly, with the matching C++ as comments:
    _Mymain:                           // char * Mymain(void) {
        LINK 16;
        .extern _FillArrayMethod;
        R1 = 5;                        //   Set OUTPAR2 = 5
        P0.H = _fooHere;
        P0.L = _fooHere;
        R0 = P0;                       //   Set OUTPAR1 = &fooHere[0]
        CALL _FillArrayMethod;         //   FillArrayMethod(fooHere, 5);   uses R0, R1
        .extern _MoveArrays;
        R2 = 5;                        //   reload: R1 might have been destroyed
        P0.H = _farAway;
        P0.L = _farAway;
        R1 = P0;                       //   reload: P0 might have been destroyed
        P0.H = _fooHere;
        P0.L = _fooHere;
        R0 = P0;                       //   reload: R0 might have been destroyed
        CALL _MoveArrays;              //   MoveArrays(fooHere, farAway, 5);   uses R0, R1, R2
        P0.H = _farAway;
        P0.L = _farAway;
        R0 = P0;
        P0 = [FP + 4];
        UNLINK;
    _Mymain.end:
        JUMP (P0);                     //   return (char *) farAway; }

- The value that was in R1 might be destroyed by CALL _FillArrayMethod.
- The value that was in R0 might be destroyed by CALL _FillArrayMethod.
- The value that was in P0 might be destroyed by CALL _FillArrayMethod.
These ARE VOLATILE REGISTERS.
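The difference between the two approaches can be sketched in C++ with a toy register file. This is an illustration under stated assumptions: the addresses 0x1000 and 0x2000, the junk value and all function names are made up, and the stand-in callee simply overwrites every volatile register it is allowed to overwrite.

```cpp
#include <cassert>

// Toy register file with only the registers used on the slide.
struct Regs { long r0 = 0, r1 = 0, r2 = 0, p0 = 0; };

constexpr long kFooHere = 0x1000;  // made-up address standing in for &fooHere[0]
constexpr long kFarAway = 0x2000;  // made-up address standing in for &farAway[0]
constexpr long kJunk    = -1;      // what a callee may leave behind

// Stand-in for CALL _FillArrayMethod: a callee is free to overwrite R0, R1, P0.
void callFillArrayMethod(Regs& r) { r.r0 = kJunk; r.r1 = kJunk; r.p0 = kJunk; }

// True when the outgoing parameters match MoveArrays(fooHere, farAway, 5).
bool paramsOk(const Regs& r) {
    return r.r0 == kFooHere && r.r1 == kFarAway && r.r2 == 5;
}

bool wrongWay() {
    Regs r;
    r.r1 = 5; r.p0 = kFooHere; r.r0 = r.p0;
    callFillArrayMethod(r);
    r.r2 = r.r1;                     // hopes R1 still holds 5
    r.p0 = kFarAway; r.r1 = r.p0;    // hopes R0 still holds &fooHere[0]
    return paramsOk(r);
}

bool correctWay() {
    Regs r;
    r.r1 = 5; r.p0 = kFooHere; r.r0 = r.p0;
    callFillArrayMethod(r);
    r.r2 = 5;                        // reload every outgoing parameter
    r.p0 = kFarAway; r.r1 = r.p0;
    r.p0 = kFooHere; r.r0 = r.p0;
    return paramsOk(r);
}

// Runs both variants against the clobbering callee.
void demonstrate() {
    assert(!wrongWay());
    assert(correctWay());
}
```

The wrong way only works by luck, when the callee happens to leave the registers alone. The correct way does not depend on what the callee does.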

Optimizing code
- Try it and see what the C++ compiler would do with the code.
- Make sure that the optimizer does not optimize the code to nothing: use the arrays in something else.
- Really keen: IPA (interprocedural analysis). The linker does something special to code in different files if this is activated.

Example: a good mid-term example
Bring your answer to the tutorial tomorrow.

Declarations:
    temperature4 CalculateTemperatureASM(ushort, ushort, const ushort);
    ulong WaitTillASM(ushort, const ushort);
    #define HIGH 1
    #define LOW 0

Main routine:
    temperature4 MyMainASM(void) {
        ulong time1, time2;
        temperature4 temperature;
        SetUpPFLinesASM(true);
        WaitTillASM(0x4, HIGH);            // PF11 after shifting
        time1 = WaitTillASM(0x4, LOW);
        time2 = WaitTillASM(0x4, HIGH);
        temperature = CalculateTemperatureASM(time2, time1, CELSIUS);
        return temperature;
    }

C++ version of WaitTill:
    ulong WaitTill(ushort PFsignal, const ushort high_low) {
        ushort counter = 0;
        ushort PFmask;
        ushort PFsignalwanted;
        ushort PFvalue;
        ushort quit = 0;

        PFmask = PFsignal;
        if (high_low == HIGH) PFsignalwanted = PFsignal;
        else PFsignalwanted = 0;

        PFvalue = ReadPFASM( );
        PFvalue = PFvalue & PFmask;
        if (PFvalue == PFsignalwanted) quit = 1;

        while (quit != 1) {
            UseFixedTimeASM(0x4000);
            PFvalue = ReadPFASM( );
        }
    }

There might be a "better" way of doing this test. If so, then use it.
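As printed, the WaitTill listing never re-tests the pin inside the loop and never returns the counter. The C++ sketch below is one possible completion, not the lecturer's answer. It assumes the routine should count the fixed-time waits and return that count, and it folds the test into the loop condition. ReadPFASM and UseFixedTimeASM are replaced by hypothetical desktop stand-ins so that the sketch can run without the hardware.

```cpp
#include <cassert>

typedef unsigned short ushort;
typedef unsigned long  ulong;

const ushort HIGH = 1;
const ushort LOW  = 0;

// Hypothetical stand-ins for the lecture's assembly routines: the fake PF
// line reads low three times and goes high on the fourth read.
static int pfReads = 0;
ushort ReadPFASM() { return (++pfReads >= 4) ? 0x4 : 0x0; }
void UseFixedTimeASM(ulong /*ticks*/) {}

// Waits until the masked PF bits match the wanted level and returns the
// number of fixed-time waits that were needed (an assumed meaning for the
// return value, based on how MyMainASM uses it as a time).
ulong WaitTill(ushort PFsignal, const ushort high_low) {
    assert(PFsignal != 0);                       // an empty mask can never go high
    const ushort PFmask = PFsignal;
    const ushort PFsignalwanted = (high_low == HIGH) ? PFsignal : 0;
    ulong counter = 0;
    while ((ReadPFASM() & PFmask) != PFsignalwanted) {
        UseFixedTimeASM(0x4000);
        counter++;
    }
    return counter;
}
```

With the stand-in pin, waiting for HIGH takes three waits, while waiting for LOW returns immediately because the first read is already low.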

Tackled today
Program sequencer
- Linear flow of instructions (last lecture)
- Jumps: software loops (last lecture)
- Loops: hardware loops (last lecture)
- Subroutines
- Interrupts and exceptions (Friday)
- Idle (Friday)

Information taken from Analog Devices On-line Manuals with permission: http://www.analog.com/processors/resources/technicalLibrary/manuals/
Information furnished by Analog Devices is believed to be accurate and reliable. However, Analog Devices assumes no responsibility for its use or for any infringement of any patent or other rights of any third party which may result from its use. No license is granted by implication or otherwise under any patent or patent right of Analog Devices. Copyright © Analog Devices, Inc. All rights reserved.