Process for changing “C-based” design to SHARC assembler ADDITIONAL EXAMPLE M. R. Smith, Electrical and Computer Engineering University of Calgary, Canada.


1 Process for changing “C-based” design to SHARC assembler -- ADDITIONAL EXAMPLE
  M. R. Smith, Electrical and Computer Engineering, University of Calgary, Canada
  smithmr@ucalgary.ca

2 6/2/2015 ENEL515 -- Translating “C-based” design to 21061 code Copyright smithmr@ucalgary.ca 2 / 20
  To be tackled today
  Need to set up a review process to look for, and remove, common errors when writing assembly code
  Process to translate a “C” program involving arrays into SHARC code
  Comparison of timings for non-optimized code, optimized code, hardware loops, and super-scalar architecture

3 Code review sheet -- PSP
  Need to identify common errors -- CODE REVIEW
  Constructs to link to “C”
    Are all declarations at the start of the subroutine? -- #define etc.
    CONSTANTS, variables, FunctionNames, EXPORT leading underscores, .segment declarations
  Assembly syntax
    Self-documenting code, clanguage_register_defines.I
    Missing semicolons -- CODE REVIEW
    Conditional delayed branching properly handled -- DESIGN REVIEW
  Load/store architecture -- DESIGN REVIEW
    Can’t do R1 = R2 + 4. Becomes temp = 4; R1 = R2 + temp;
  Register operations, volatile, order of I and M registers -- CODE REVIEW
  What is your favourite time-wasting error?

4 Simpler example of array handling
    void MakeRamp( float re_array[ ], int num ) {
        int count;
        for (count = 0; count < num; count++) {
            re_array[count] = count;
        }
    }
  THINGS TO WORRY ABOUT DURING TRANSLATION
  Prologue, Epilogue -- REVIEW
  How to handle LOAD/STORE architecture
  How to handle the for-loop
  How to handle the = count operation (int to float conversion)
  How to handle stepping through the array -- post modify
  How to handle parameter passing

5 Step 1 -- int to float conversion
  Int to float conversion must be handled by YOU
    void MakeRamp( float re_array[ ], int num ) {
        int count;
        for (count = 0; count < num; count++) {
            re_array[count] = (float) count;
        }
    }
  THINGS TO WORRY ABOUT DURING TRANSLATION
  Prologue, Epilogue -- REVIEW
  How to handle LOAD/STORE architecture
  How to handle the for-loop
  How to handle the = count operation (int to float)
  How to handle stepping through the array -- post modify
  How to handle parameter passing

6 Watch for SHARC assembler nastiness
  The code F2 = dm(I1,1) disassembles as R2 = dm(I1,1)
    MEANING no special instruction is needed, as F2 and R2 are the same register. Translation is handled by the assembler.
  F2 = 1.0 is translated as R2 = bit pattern for 1.0
    MEANING no special instruction is needed, as F2 and R2 are the same register. Translation is handled by the assembler.
  NASTY SIDE EFFECT
    F2 = 1 is translated as R2 = bit pattern for (integer) 1 and is NOT TRANSLATED as R2 = bit pattern for (float) 1, so you get the effect of F2 = about 1.4 * 10^-45 -- which is not what you intended.
    Make sure that you always add the “.0” to floating-point constants.
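
The side effect on this slide can be reproduced on a desktop compiler. The following is a minimal sketch, assuming a host where float is a 32-bit IEEE-754 value; the helper names bits_to_float and float_to_bits are made up for illustration and are not part of any SHARC toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float. This models what happens when
   an integer literal is loaded into a register that is later read as F2.
   Assumption: float is a 32-bit IEEE-754 single on this host. */
float bits_to_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Return the raw 32-bit pattern stored in a float. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under these assumptions, float_to_bits(1.0f) gives 0x3F800000, while bits_to_float(1) gives the smallest positive denormal (about 1.4e-45), which is the value the slide warns about.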

7 Step 2 -- Convert to use local pointers (in scope)
  Use a local pointer set to the pointer value passed on the stack
    void MakeRamp( float *re_array, int num ) {    // re_array: NOT A USEABLE POINTER
        int count;
        dm float *arraypt = re_array;
        for (count = 0; count < num; count++) {
            *arraypt = (float) count;
            arraypt++;
        }
    }
  THINGS TO WORRY ABOUT DURING TRANSLATION
  Prologue, Epilogue -- REVIEW
  How to handle LOAD/STORE architecture
  How to handle the for-loop
  How to handle stepping through the array -- post modify
  How to handle parameter passing

8 Step 3 -- load-store architecture
  Use register variables and a scratch register
    void MakeRamp( register float *re_array, register int num ) {
        register int count = GARBAGE;
        register float scratch = GARBAGE;
        register dm float *arraypt = re_array;
        for (count = 0; count < num; count++) {
            scratch = (float) count;        // *arraypt = (float) count
            *arraypt = scratch;
            arraypt++;
        }
    }
  THINGS TO WORRY ABOUT DURING TRANSLATION
  Prologue, Epilogue -- REVIEW
  How to handle LOAD/STORE architecture
  How to handle the for-loop
  How to handle parameter passing

9 Step 4 -- convert the for-loop
    void MakeRamp( register float *re_array, register int num ) {
        register int count = GARBAGE;
        register float scratch = GARBAGE;
        register dm float *arraypt = re_array;
        count = 0;
        while (count < num) {
            scratch = (float) count;
            *arraypt = scratch;
            arraypt++;
            count = count + 1;
        }
    }
  THINGS TO WORRY ABOUT DURING TRANSLATION
  Prologue, Epilogue -- REVIEW
  How to handle the for-loop -- 68K like -- NOT OPTIMIZED
  How to handle parameter passing
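
Each rewriting step so far should leave the behaviour of MakeRamp unchanged, and that is easy to check on a host compiler before any assembly is written. The sketch below compares the original indexed form with the Step 4 form. It assumes portable C, so the SHARC-specific dm qualifier and the GARBAGE initializers are dropped, and the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Starting point: array indexing, with the int-to-float cast made explicit. */
void make_ramp_indexed(float re_array[], int num) {
    int count;
    assert(re_array != NULL || num <= 0);
    for (count = 0; count < num; count++) {
        re_array[count] = (float) count;
    }
}

/* Step 4 form: local pointer, scratch variable, while-loop. */
void make_ramp_while(float *re_array, int num) {
    int count;
    float scratch;
    float *arraypt = re_array;
    assert(re_array != NULL || num <= 0);
    count = 0;
    while (count < num) {
        scratch = (float) count;   /* convert in a "register" first */
        *arraypt = scratch;        /* then store (load/store style)  */
        arraypt++;
        count = count + 1;
    }
}
```

Filling two arrays of the same length, one with each function, and comparing them element by element gives a quick regression check for the later translation steps.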

10 Step 5 -- Prologue -- which registers?
    void MakeRamp( register float *re_array,        // INPAR1 (R4) -- NOW SEE WHY INPAR1 NOT POINTER
                   register int num ) {             // INPAR2 (R8)
        register int count = GARBAGE;               // scratchR1
        register float scratch = GARBAGE;           // scratchF2 (not F1)
        register dm float *arraypt = re_array;      // scratchDMpt
        count = 0;
        while (count < num) {
            scratch = (float) count;
            *arraypt = scratch;
            arraypt++;
            count = count + 1;
        }
    }
  Prologue -- leaf routine -- no stack changes
  Epilogue -- since leaf routine -- standard 5 lines
  How to handle parameter passing

11 Step 6 -- Handle loop -- Part 1
    // void MakeRamp( register float *re_array, register int num ) {
    #define numINPAR2 INPAR2
    #define countR1 scratchR1         // register int count = GARBAGE;
        countR1 = 0;                  // count = 0;
    MR_WHILE:                         // while (count < num) {
        ????                          //     Loop body
        countR1 = countR1 + 1;        //     count = count + 1;
        JUMP(PC, MR_WHILE) (DB);      // }
        nop;
        nop;
                                      // } end MakeRamp()

12 Step 7 -- Handle loop -- Part 2
    // void MakeRamp( register float *re_array, register int num ) {
    #define numINPAR2 INPAR2
    #define countR1 scratchR1             // register int count;
        countR1 = 0;                      // count = 0;
    MR_WHILE:
        COMP(countR1, numINPAR2);         // while (count < num) {
        if GE JUMP(PC, MR_ENDLOOP) (DB);  //     leave the loop when count >= num
        nop;
        nop;
        ????                              //     Loop body
        countR1 = countR1 + 1;            //     count = count + 1;
        JUMP(PC, MR_WHILE) (DB);          // }
        nop;
        nop;
    MR_ENDLOOP:
        5 magic lines of code for “C” return    // }

13 Reminder of what we are trying to do!
    void MakeRamp( register float *re_array, register int num ) {
        register int count;
        register float scratch, *arraypt = re_array;
        for (count = 0; count < num; count++) {
            scratch = (float) count;
            *arraypt = scratch;
            arraypt++;
        }
    }

14 Step 8 -- handle loop body
    // void MakeRamp( register float *re_array, register int num ) {
    .segment seg_pmco;
    .global _MakeRamp;
    _MakeRamp:
    #define re_arrayINPAR1 INPAR1
                                      // register int count;
    #define tempF2 scratchF2          // register float temp = GARBAGE;
    #define arraypt scratchDMpt       // *arraypt = GARBAGE;
        arraypt = re_arrayINPAR1;     // *arraypt = re_array;
                                      // for (count = 0; count < num; count++) {
        tempF2 = FLOAT countR1;       //     temp = (float) count;
        dm(arraypt, 1) = tempF2;      //     *arraypt = temp;
                                      //     arraypt++;
                                      // }
                                      // }

15 Final “C” code translation
  Code as directly translated
  Possible optimization
    Decide whether it is worth the effort of optimizing
  Optimized
    Don’t do it in quizzes and labs for this course unless asked
    Very easy to get it wrong

16
    #define re_arrayINPAR1 INPAR1
    #define numINPAR2 INPAR2
    .global _MakeRamp;
    _MakeRamp:
    #define countR1 scratchR1
    #define arraypt scratchDMpt
        countR1 = 0;
        arraypt = re_arrayINPAR1;
    MR_WHILE:
        COMP(countR1, numINPAR2);
        if GE JUMP(PC, MR_ENDLOOP) (DB);
        nop;
        nop;
    #define tempF2 scratchF2
        tempF2 = FLOAT countR1;
        dm(arraypt, 1) = tempF2;
        countR1 = countR1 + 1;
        JUMP(PC, MR_WHILE) (DB);
        nop;
        nop;
    MR_ENDLOOP:
        5 magic lines of code for “C” return
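
The control flow of the directly translated routine can be modelled in C with labels and goto, which makes the loop exit condition easy to test on a host machine. This is only a sketch: it is not SHARC code, the delay-slot nops have no C equivalent and are left out, and the function name make_ramp_goto is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* C model of the directly translated loop, roughly one statement per
   assembly instruction (delay-slot nops omitted). */
void make_ramp_goto(float *re_array, int num) {
    int count;        /* plays the role of countR1     */
    float temp;       /* plays the role of tempF2      */
    float *arraypt;   /* plays the role of scratchDMpt */

    assert(re_array != NULL || num <= 0);

    count = 0;
    arraypt = re_array;
MR_WHILE:
    if (count >= num) goto MR_ENDLOOP;  /* COMP, then conditional JUMP     */
    temp = (float) count;               /* tempF2 = FLOAT countR1          */
    *arraypt++ = temp;                  /* dm(arraypt, 1) = tempF2         */
    count = count + 1;                  /* countR1 = countR1 + 1           */
    goto MR_WHILE;                      /* JUMP(PC, MR_WHILE)              */
MR_ENDLOOP:
    return;                             /* stands in for the 5 return lines */
}
```

Calling it with num smaller than the array length and checking that the element just past the ramp is untouched confirms that the loop stops after exactly num stores.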

17 Final “C” code translation
  Code as directly translated (7 + num * 10 instructions)
  Possible optimization -- worth the effort?
    Best case would be (7 + num * 6 instructions)
  Optimized
    Don’t do it in quizzes and labs for this course unless asked
    Very easy to get it wrong
  Improved algorithm using DSP architecture
    Hardware loop capability (8 + num * 2 instructions)
    Activate super-scalar capability (7 + num * 1 instructions)

18
    #define re_arrayINPAR1 INPAR1
    #define numINPAR2 INPAR2
    .global _MakeRamp;
    _MakeRamp:
    #define countR1 scratchR1
    #define arraypt scratchDMpt
        countR1 = 0;                        // CAN’T BE MOVED
        arraypt = re_arrayINPAR1;           // CAN’T BE MOVED
    MR_WHILE:
        COMP(countR1, numINPAR2);
        if GE JUMP(PC, MR_ENDLOOP) (DB);
        nop;
    #define tempF2 scratchF2
        tempF2 = FLOAT countR1;             // fills the second delay slot
        JUMP(PC, MR_WHILE) (DB);
        dm(arraypt, 1) = tempF2;            // fills the first delay slot
        countR1 = countR1 + 1;              // fills the second delay slot
    MR_ENDLOOP:
        5 magic lines of code for “C” return

19 Final “C” code translation
  Code as directly translated (7 + num * 10 instructions)
  Possible optimization -- worth the effort?
    Best case would be (7 + num * 6 instructions)
    Actual optimized was (7 + num * 7 instructions)
  Optimized
    Don’t do it in quizzes and labs for this course unless asked
    Very easy to get it wrong
  Improved algorithm using DSP architecture
    Hardware loop capability (8 + num * 2 instructions)
    Activate super-scalar capability (7 + num * 1 instructions)
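
To see how quickly these variants diverge, the instruction counts quoted on this slide can be written as small functions of num. The formulas are taken directly from the slide; the function names are illustrative, and the counts are instructions executed, not measured cycle times.

```c
#include <assert.h>

/* Instruction counts from the slide, as functions of the array length. */
long cost_direct(long num)         { assert(num >= 0); return 7 + num * 10; }
long cost_best_case(long num)      { assert(num >= 0); return 7 + num * 6;  }
long cost_hand_optimized(long num) { assert(num >= 0); return 7 + num * 7;  }
long cost_hardware_loop(long num)  { assert(num >= 0); return 8 + num * 2;  }
long cost_superscalar(long num)    { assert(num >= 0); return 7 + num * 1;  }
```

For num = 100 this gives 1007 instructions for the direct translation, 707 for the hand-optimized version, 208 with a hardware loop, and 107 with the super-scalar form, so the architectural features pay off far more than hand-scheduling the delay slots.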

20 Tackled today
  Need to set up a review process to look for, and remove, common errors when writing assembly code
  Process to translate a “C” program involving arrays into SHARC code
  Comparison of timings for non-optimized code, optimized code, hardware loops, and super-scalar architecture

