Moving Arrays -- 1
- Completion of ideas needed for a general and complete program
- Final concepts needed for the final
- Review for the final: loop efficiency

Tackled today
- Declaring and initializing arrays off the stack: review and a little bit of new material
  - Useful for background DMA tasks
  - Useful for minimizing total memory used in a non-general program
- Declaring arrays and variables on the stack: review and a little bit of new material
  - Re-entrant and thread-safe code
- Demonstrating memory-to-memory DMA

DMA , Copyright M. Smith, ECE, University of Calgary, Canada 1/1/2019

Declaring fixed arrays in memory: not on the stack

    short foo_startarray[40];
    short far_finalarray[40];

    void HalfWaveRectifyASM( ) {
        // Take the signal from foo_startarray[ ] and rectify the signal
        // Half wave rectify: if > 0 keep the same; if < 0 make zero
        // Full wave rectify: if > 0 keep the same; if < 0 then abs value
        // Rectify foo_startarray[ ] and place result in far_finalarray[ ]
        for (int count = 0; count < 40; count++) {
            if (foo_startarray[count] < 0) far_finalarray[count] = 0;
            else far_finalarray[count] = foo_startarray[count];
        }
    }

The program code is the same, but the data part is not
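
The comments on this slide describe full-wave rectification as well, but only the half-wave loop is shown. A minimal C sketch of both follows, assuming caller-supplied arrays instead of the fixed 40-element globals. The names `half_wave_rectify` and `full_wave_rectify` are illustrative and are not taken from the course code.

```c
#include <stddef.h>

/* Half-wave rectify: negative samples become zero, others are copied. */
void half_wave_rectify(const short *in, short *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (in[i] < 0) ? 0 : in[i];
    }
}

/* Full-wave rectify: negative samples are replaced by their absolute value.
 * Assumption: inputs never equal SHRT_MIN, whose negation does not fit
 * in a short. */
void full_wave_rectify(const short *in, short *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (in[i] < 0) ? (short)(-in[i]) : in[i];
    }
}
```

Passing the arrays as pointers keeps the same loop usable for arrays declared in fixed memory or on the stack.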

Attempt 1
- .section data1: tells the linker to place this data in memory-map location data1
- .align 4: the processor works best when things start on a boundary between groups of 4 bytes
- [N * 2]: we need N short ints, and the processor works with addresses in bytes, so we need N * 2 bytes

We said “wrong approach”: look at memory
- 20 bytes (16 bits each) for N short values in C++ = N * 2 bytes

We said “correct approach NOT what I expected”
- ASM array with space for N long ints:   .var arrayASM[N];
- ASM array with space for N short ints:  .var arrayASM[N / 2];
- ASM array with space for N chars:       .var arrayASM[N / 4];
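
The three declarations follow one rule if we assume, as the slide implies, that each `.var` element reserves one 32-bit word. The C sketch below shows only the size arithmetic. `words_needed` is a hypothetical helper, and it rounds up so that a count that is not a multiple of the packing factor still gets enough space (the plain N / 2 on the slide would come up short for odd N).

```c
#include <stddef.h>

/* Assumption taken from the slide: one .var element holds 4 bytes. */
#define BYTES_PER_VAR_ELEMENT 4u

/* Number of 4-byte elements needed to hold `count` items of
 * `item_bytes` bytes each, rounded up to a whole element. */
size_t words_needed(size_t count, size_t item_bytes)
{
    size_t total_bytes = count * item_bytes;
    return (total_bytes + BYTES_PER_VAR_ELEMENT - 1u) / BYTES_PER_VAR_ELEMENT;
}
```

For 40 items this gives 40 elements for 4-byte values, 20 for 2-byte shorts and 10 for chars, matching N, N / 2 and N / 4. The assembler manual remains the authority on what `.var` actually reserves.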

Better answer is “Look at the assembler manual”

Improving what we did before
Big warning: external array initialization occurs on “reload” and NOT on “restart”.
Understanding why this is true, and why it is a problem, will solve many issues when programming.
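
One plausible reading of the warning: initial values are written into memory when the program image is loaded, so a restart that only resets the program counter leaves behind whatever the previous run wrote. The C sketch below is a defensive workaround under that assumption, at the cost of a second copy of the defaults. `work_array`, `DEFAULTS` and `reset_work_array` are illustrative names only.

```c
#include <string.h>

#define ARRAY_LEN 4

/* Initialized when the image is loaded; a previous run may have changed it. */
short work_array[ARRAY_LEN] = { 1, 2, 3, 4 };

/* Read-only copy of the intended starting values. */
static const short DEFAULTS[ARRAY_LEN] = { 1, 2, 3, 4 };

/* Call at the top of main() so every start sees the same data. */
void reset_work_array(void)
{
    memcpy(work_array, DEFAULTS, sizeof work_array);
}
```

Calling `reset_work_array()` first makes the program behave the same after a reload or a restart.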

When DMA might be useful -- Video manipulation
Program
- Wait for picture 1 to come in: video in
- Process picture 1: lots of mathematics perhaps
- Wait for picture 1 to be transmitted: video out
Spending a lot of time waiting rather than doing

When DMA might be useful -- Double Buffering
Program
- Wait for picture 2 memory to fill: video in
- Picture 3 comes into memory: background DMA
- Process picture 2: place into picture 0 location
- Picture 4 comes into memory: background DMA
- Process picture 3: place into picture 1 location
- Transmit picture 0: background DMA
- Picture 0 comes into memory: background DMA
- Process picture 4: place into picture 2 location
- Transmit picture 1: background DMA
- Picture 1 comes into memory: background DMA
- Process picture 0: place into picture 3 location
- Transmit picture 2: background DMA
- Picture 2 comes into memory: background DMA
- Process picture 1: place into picture 4 location
- Transmit picture 3: background DMA
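
The schedule above rotates five picture buffers through four roles. The C sketch below reproduces the buffer numbers from the slide as modular arithmetic on a step counter. The `BufferRoles` type and `roles_for_step` function are hypothetical names, and step 0 is taken to be the first step after the initial wait for picture 2.

```c
#define NUM_BUFFERS 5

/* Buffer roles for one step of the schedule on the slide. */
typedef struct {
    int incoming;   /* buffer being filled by background DMA */
    int processing; /* buffer the core reads from */
    int result;     /* buffer the core writes the processed picture into */
    int outgoing;   /* buffer being transmitted by DMA, or -1 if none yet */
} BufferRoles;

/* Map a step number to the buffers used in that step. */
BufferRoles roles_for_step(unsigned step)
{
    BufferRoles r;
    r.incoming   = (int)((step + 3u) % NUM_BUFFERS);
    r.processing = (int)((step + 2u) % NUM_BUFFERS);
    r.result     = (int)(step % NUM_BUFFERS);
    r.outgoing   = (step == 0u) ? -1 : (int)((step - 1u) % NUM_BUFFERS);
    return r;
}
```

At step 2, for example, picture 0 is arriving, picture 4 is processed into location 2 and picture 1 is transmitted, as on the slide.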

We are only going to look at a simple DMA task: normal code

    P0   address of start_array[0];
    P1   address of final_array[0];
    R0   max-value needed to transfer
    R1   how many values already transferred

        R1 = 0;
    LOOP:
        CC = R0 <= R1;
        IF CC JUMP DONE;
        R2 = [P0++];
        [P1++] = R2;
        R1 += 1;            // count the value just moved
        JUMP LOOP;
    DONE:
        Do something else

VERY BIG PIPELINE LATENCY ISSUES: MANY INTERNAL PROCESSOR STALLS WHILE WAITING FOR R2 TO BE READ, STORED and then TRANSMITTED

We are only going to look at a simple DMA task: DMA version

    DMA_source_address_register        address of start_array[0];
    DMA_destination_address_register   address of final_array[0];
    DMA_max_count_register             max-value needed to transfer
    DMA_count_register                 how many values already transferred

    Software loop                      DMA replacement
        R1 = 0;
    LOOP:
        CC = R0 <= R1;
        IF CC JUMP DONE;               DMA_enable = true
        R2 = [P0++];                   DMA transfer happens in background
        [P1++] = R2;                   Minimized pipeline issues
        R1 += 1;
        JUMP LOOP;
    DONE:
        Do something else              Do something else
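
To make the four register roles concrete, here is a software model in C. It is not real hardware and does not use the processor's actual DMA register names. `DmaModel`, `dma_start` and `dma_tick` are hypothetical, and each call to `dma_tick` stands in for one transfer that the hardware would perform in the background.

```c
#include <stdbool.h>
#include <stddef.h>

/* Software model of the registers named on the slide (not real hardware). */
typedef struct {
    const short *source;      /* DMA_source_address_register */
    short       *destination; /* DMA_destination_address_register */
    size_t       max_count;   /* DMA_max_count_register */
    size_t       count;       /* DMA_count_register */
    bool         enable;      /* DMA_enable */
} DmaModel;

/* Set up a transfer of n values and start it. */
void dma_start(DmaModel *dma, const short *src, short *dst, size_t n)
{
    dma->source = src;
    dma->destination = dst;
    dma->max_count = n;
    dma->count = 0;
    dma->enable = true;
}

/* Move one value. Returns true while the transfer is still in progress. */
bool dma_tick(DmaModel *dma)
{
    if (!dma->enable) {
        return false;
    }
    if (dma->count < dma->max_count) {
        dma->destination[dma->count] = dma->source[dma->count];
        dma->count++;
    }
    if (dma->count >= dma->max_count) {
        dma->enable = false; /* transfer complete */
    }
    return dma->enable;
}
```

In the model the core still has to call `dma_tick`. The point of real DMA is that the core is free to "do something else" while the count register advances on its own.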

Write some tests so we know how to proceed -- Test 1
- Internal memory test: arrays on the stack

Write some tests so we know how to proceed -- Test 2
- External memory test: arrays in external SDRAM
- SDRAM -- MANY MEGS AVAILABLE
- Addresses hard-coded

Write some tests so we know how to proceed -- Test 3
- Most probable way to use DMA: store in SLOW external memory
- Move to FAST internal memory to process, then put back into external SDRAM
- Addresses hard-coded

Some results (cycles; code details later)

    Transfer                                 Debug Mode   Release Mode
    L1 -> L1                                       8748            625
    L1 -> L1 DMA                                   6579           6477
    SDRAM -> SDRAM                                39132          28200
    SDRAM -> SDRAM DMA                            12175          12090
    SDRAM -> L1 DMA                                5265           4836
    SDRAM -> L1 DMA, L1 -> SDRAM DMA               9792           9276
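
One way to read these results is as ratios between the software loop and the DMA version of the same move. The sketch below is just that division; `speedup` is a hypothetical helper, and the ratios say nothing about what the core could do in parallel while a DMA transfer runs.

```c
/* Ratio of baseline cycles to candidate cycles.
 * Values above 1 mean the candidate is faster than the baseline. */
double speedup(unsigned long baseline_cycles, unsigned long candidate_cycles)
{
    return (double)baseline_cycles / (double)candidate_cycles;
}
```

In Release Mode, `speedup(28200, 12090)` is about 2.3 for SDRAM to SDRAM, while `speedup(625, 6477)` is about 0.1 for L1 to L1. On these measurements DMA pays off for slow external memory and costs time for fast internal memory.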

Memory to memory move: Debug Code

Review for final
A) What happened here?
B) What happened here?
C) What happened here?
D) Why did this happen?
E) What happened here?
F) Determine loop efficiency
   - in terms of instructions
   - in terms of cycles / read_write op

Answer questions A, B, C, D, E

Review for final -- Worksheet
F) Determine loop efficiency in terms of cycles / read_write op
internal memory -> internal memory
- Size was ?
- Useful reads ?
- Useful writes ?
- Cycles as measured: ?
- ? cycles / useful mem op. Why not an exact number?
- Instructions in loop? ?
- Total # of reads / writes: ? / loop
- ? reads / writes: around ? cycles

Review for final
F) Determine loop efficiency in terms of cycles / read_write op
internal memory -> internal memory
- Size was 300
- Useful reads 300
- Useful writes 300
- Cycles as measured: 8748
- 8748 / 600 = 14.58 cycles / useful mem op. Why not an exact number?
- Instructions in loop? 19
- Total # of reads / writes: 9 / loop
- 9 * 300 = 2700 reads / writes: around 3 cycles each
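
Both figures on this slide come from the same division with a different denominator. A small C sketch of that arithmetic; `cycles_per_op` is an illustrative name.

```c
/* Average cycles per memory operation: measured cycles / operation count. */
double cycles_per_op(unsigned long cycles, unsigned long ops)
{
    return (double)cycles / (double)ops;
}
```

`cycles_per_op(8748, 600)` counts only the 600 useful reads and writes and gives 14.58, while `cycles_per_op(8748, 9 * 300)` counts every memory access in the debug loop and gives 3.24, the "around 3 cycles" figure.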

Review for final -- Worksheet
F) Determine loop efficiency in terms of cycles / read_write op
SDRAM -> SDRAM
- Size was ?
- Useful reads ?
- Useful writes ?
- Cycles as measured: ?
- ? cycles / useful mem op. Why not an exact number?
- Instructions in loop? ?
- Total # of reads / writes: ? / loop
- ? reads / writes: around ? cycles

Review for final
F) Determine loop efficiency in terms of cycles / read_write op
SDRAM (external) -> SDRAM memory
- Useful reads / writes: 300 each
- Cycles as measured: 39132
- 39132 / 600 = 65.22 cycles / useful mem op. Why not an exact number?
- Instructions in loop? 19
- Total # of reads / writes: 9 / loop
  - 7 * 300 = 2100 reads / writes internal
  - 2 * 300 = 600 reads / writes external
- Time for external r/w = 39132 - 2100 * 3, about 33000 cycles
- 33000 / 600 = 55 cycles per external access
- Roughly 18 times slower than the 3-cycle internal access
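
The external-access estimate subtracts the cost of the internal accesses and spreads what is left over the external ones. A C sketch of that step follows, with `external_cycles_per_access` as a hypothetical helper. The 3-cycle internal cost is the rough figure from the internal-memory measurement, not a datasheet value.

```c
/* Estimated cycles per external access:
 * (total cycles - internal accesses * internal cost) / external accesses. */
double external_cycles_per_access(unsigned long total_cycles,
                                  unsigned long internal_accesses,
                                  double internal_cost,
                                  unsigned long external_accesses)
{
    double internal_cycles = (double)internal_accesses * internal_cost;
    return ((double)total_cycles - internal_cycles) / (double)external_accesses;
}
```

With the measured values, `external_cycles_per_access(39132, 2100, 3.0, 600)` is (39132 - 6300) / 600 = 54.72, or about 55 cycles per external access.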

Memory to memory move: Release Mode

Review for final
A) What happened here?
B) What happened here?
C) What happened here?
D) Why did this happen inside the loop?
E) What happened here?
F) Determine loop efficiency
   - in terms of instructions
   - in terms of cycles / read_write op

Answer questions A, B, C, D, E

F) Determine loop efficiency in terms of cycles / read_write op
internal memory -> internal memory
- Size was 300
- Useful reads 300
- Useful writes 300
- Cycles as measured: 625
- 625 / 600 = 1.04 cycles / useful mem op. Why not an exact number?
- Instructions in loop? 4
- WE WOULD EXPECT 4 * 300 = 1200 cycles!!!! Where did the difference go?

Worksheet
F) Determine loop efficiency in terms of cycles / read_write op
SDRAM -> SDRAM
- Size was ?
- Useful reads ?
- Useful writes ?
- Cycles as measured: ?
- ? / ? = ?
- SDRAM access ? cycles
- L1 memory 1 cycle
- Would it make sense to process in L1 memory?

F) Determine loop efficiency in terms of cycles / read_write op
SDRAM -> SDRAM
- Size was 300
- Useful reads 300
- Useful writes 300
- Cycles as measured: 28200
- 28200 / 600 = 47
- SDRAM access 47 cycles
- L1 memory 1 cycle
- It would make sense to process in L1 memory, so move SDRAM to L1 to process

Worksheet
F) Determine loop efficiency in terms of cycles / read_write op
SDRAM -> internal memory
- Size was ?
- Useful reads ?
- Useful writes ?
- Cycles as measured: ?
- 300 of those are L1 writes, leaving ?
- ? / ? = ?
- SDRAM read before: ? cycles
- SDRAM read now: ? cycles
- L1 -> L1: ? cycle
- It would make sense to process in L1 memory, so move SDRAM to L1 to process
- Loads of overhead in SDRAM to SDRAM

F) Determine loop efficiency in terms of cycles / read_write op
SDRAM -> internal memory
- Size was 300
- Useful reads 300
- Useful writes 300
- Cycles as measured: 4836
- 300 of those are L1 writes, leaving about 4500
- 4500 / 300 = 15
- SDRAM read before: 47 cycles
- SDRAM read now: 15 cycles
- L1 -> L1: 1 cycle
- It would make sense to process in L1 memory, so move SDRAM to L1 to process
- Loads of overhead in SDRAM to SDRAM

Tackled today
- Review of handling external arrays from assembly code
  - Arrays declared in another file
  - Arrays declared in this file -- NEW
    - Needed for arrays used by ISRs
  - Arrays declared on the stack
  - Pointers passed as parameters to a subroutine
  - Can’t use arrays on the stack when used by an ISR

Information taken from Analog Devices On-line Manuals with permission
http://www.analog.com/processors/resources/technicalLibrary/manuals/
Information furnished by Analog Devices is believed to be accurate and reliable. However, Analog Devices assumes no responsibility for its use or for any infringement of any patent or other rights of any third party which may result from its use. No license is granted by implication or otherwise under any patent or patent right of Analog Devices.
Copyright Analog Devices, Inc. All rights reserved.