Chapter 12 C and Assembly Interface

Presentation transcript:

DSP C5000 Chapter 12 C and Assembly Interface
Copyright © 2003 Texas Instruments. All rights reserved.

Using C and assembly C54x C55x

Objectives – C54
- Understand the C Environment
- Run the compiler
- Describe how to Mix C and Assembly Language

C Run-time Environment

Main.C:
    int func(int,int,int,int,int);
    int y = 0;

    void main (void)
    {
        y = func(1,2,3,4,5);
    }

Func.C:
    int func(int a,int b,int c,int d,int e)
    {
        return(a + b + c + d + e);
    }

Sections created:
    y (global)      .bss
    0 (init val)    .cinit
    code            .text

What other sections get created?...

C sections

    Section Name   Used for             Type of Memory
    .text          code                 Program ROM
    .cinit         global inits         Program ROM
    .bss           variables            Data RAM
    .stack         for SP               Data RAM
    vectors        vectors              Program ROM (0xFF80)
    .const         const int x=25;      Data ROM
    .switch        for case stmts       Program ROM
    .sysmem        heap, dynamic mem    Data RAM

CCS Compile & Link Process

Build flow (diagram): file1.c -> Compiler (with Optimizer) -> file.asm -> Assembler -> file.obj -> VL -> file.out, file.map
Other items in the diagram: file2.asm, Lnk.rcp, Run-time Library (rts.lib)

Build configurations:
- Debug: Symbolic Debug, Level 1 Optimization
- Full opt: Level 3 Optimization

-g and -o3 compete with each other. -g is used primarily for debug: the compiler moves very little code, which makes debugging easier. However, if you want the fastest code, use -o3 alone: lots of code motion and much better performance. Using -o3 hurts the debug effort, but you should have used -g alone during debug before you flipped the -o3 switch anyway. So, the moral of the story is: use -g first and debug. Then, when everything looks good, remove -g and use -o3. If you use them together, -g will turn off some of the optimizations which -o3 would have done.

Initializing the C Environment ...

Boot.c in rts.lib, _c_int00:
- Initialize global and static variables
- Initialize C environment variables
- Setup stack (SP)
- Call _main

On reset, how do you tell the CPU to begin execution at _c_int00?

    ;cvectors.asm
            .ref    _c_int00
            .sect   "vectors"
    rsv:    B       _c_int00

All symbols accessed by C require an underscore.

Run-time Environment

    STx Bits   Name                    Presumed Value   Modified by C?
    ARP        Auxiliary Reg Ptr       0                Yes
    ASM        ACC shift mode          -                Yes
    BRAF       Block Rpt Active Flag   -                No
    C          Carry bit               -                Yes
    C16        Dual 16-bit math        0                No
    CMPT       Compatibility mode      0                No
    CPL        Compiler mode           1                No
    FRCT       Fractional mode         0                No
    OVA/B      ACC Overflow flags      -                Yes
    OVM        Overflow mode           0                *
    SXM        Sign-extension mode     -                Yes
    SMUL       Saturate/multiply       -                *
    TC         Test Control flag       -                Yes

    -  no presumed value listed
    *  for intrinsics only

If the user modifies a "presumed value", this value must be restored by the function.

Writing Func.ASM

Main.C (prototypes the called function, then calls it):
    int func(int,int,int);
    int y = 0;

    void main (void)
    {
        y = func(1,2,3);
    }

Func.C:
    int func(int a,int b,int c)
    {
        return(a + b + c);
    }

How are the parameters passed to func( ) ?

Parameter Passing

    y = func(1,2,3);

- Argument 1 is passed in the A accumulator
- Arguments 2, 3, ... are passed in reverse order via the stack
- PC is placed on the stack
- Return value is placed in the A accumulator
- Save on entry (SOE) registers: the child must save them if used
- Context save/restore: PSHM AR6, POPM AR6

Registers shown in the diagram: A (arg1, ret value), B, AR0 to AR5.

Stack on entry to func (diagram):
    SP ->   PC
            arg2 = 2
            arg3 = 3

Arguments on the stack can be accessed using compiler mode (CPL=1):
    Ex: LD *SP(1),B     ;arg2 loaded to accumulator B, i.e. *(SP + 1)

Func.ASM

    ; Entry
            .def _func              ; declare func as global
    _func:                          ; define entry point (label)
            ;push SOE registers     ; save SOE registers
    ; Algorithm: return(a + b + c); result is placed in the return reg (A)
            ADD *SP(1),A            ;a + b
            ADD *SP(2),A            ;+ c
    ; Exit
            ;pop SOE registers      ; restore SOE registers
            RET                     ; return to calling routine

Stack during func (diagram): SP -> PC, 2, 3.

With maximum optimization, func is deleted and main simply does: ST #6,*(y)

Accessing MMRs from C

Using pointers to access Memory-Mapped Registers:
- Declare the necessary MMR component:
      volatile unsigned int *SWWSR = (volatile unsigned int *) 0x28;
- Read and write to the register as desired:
      *SWWSR = 0x8244;

Volatile modifier:
- Especially important when using the optimizer
- Tells the compiler to always recheck actual memory whenever encountered
- Otherwise, the optimizer might register-base the value, or eliminate the construct

The regs54xx.h header file includes most MMR definitions.
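The volatile pointer idiom on this slide can be tried on a host machine by pointing it at an ordinary variable instead of address 0x28. The following is a minimal sketch under that assumption; `fake_swwsr`, `reg_write`, `reg_read` and `set_wait_states` are hypothetical names used only for illustration and are not part of regs54xx.h.

```c
#include <stdint.h>

/* Stand-in for the hardware register (assumption for host testing).
   On the C54x the pointer would instead be initialized with
   (volatile unsigned int *) 0x28, as on the slide. */
static volatile uint16_t fake_swwsr;

/* The volatile qualifier forces a real store on every call. */
static void reg_write(volatile uint16_t *reg, uint16_t value)
{
    *reg = value;
}

/* The volatile qualifier forces a real load on every call. */
static uint16_t reg_read(volatile uint16_t *reg)
{
    return *reg;
}

/* Write a wait-state value, then read it back from the "register". */
uint16_t set_wait_states(uint16_t value)
{
    volatile uint16_t *SWWSR = &fake_swwsr;
    reg_write(SWWSR, value);
    return reg_read(SWWSR);
}
```

Calling `set_wait_states(0x8244)` mirrors the `*SWWSR = 0x8244;` line from the slide; because every access goes through a volatile pointer, the compiler may not cache the value in a register or drop the store.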

Interrupts in C

Interrupt Service Routine
- C function to run when interrupt occurs
- All necessary context save/restore performed automatically

Interrupt Initialization Code
- Should be called prior to run-time process
- Interrupt status may be modified during run-time

Interrupt Vector Table
- Written in ASM

Writing ISRs in C

    int x[100];
    int *p = x;

    main() { … }

    interrupt void name(void)
    {
        static int y = 0;
        y += 1;
        if (y < 100)
            *p++ = port0001;
        else
            asm(" intr 17 ");
    }

Slide annotations:
- Global variables allow sharing of data between main functions & ISR
- interrupt: keyword
- name: name of ISR function
- Void input and return values
- Locals are lost across calls
- Statics persist across calls
- ISRs should not include calls
- Return is with enable (RETE)
- Some compiler options delete "dead" code

Initializing Interrupts in C

Setup pointers to IMR & IFR. Initialize IMR, IFR, INTM:
    volatile unsigned int *IMR = (volatile unsigned int *) 0x0000;
    volatile unsigned int *IFR = (volatile unsigned int *) 0x0001;
    *IFR = 0xFFFF;
    *IMR = 0xFFFF;
    asm(" RSBX INTM ");

Create Vector Table:
        .sect ".vectors"
        …
        B _ISR1
        nop

Compiled ISR Sequence:
- I$$SAVE performs context save (from RTS.LIB)
- ISR function runs
- I$$RESTORE performs context restore (RTS.LIB)
- RETE - Return with Enable

Numerical Types in C

Q15 math in C is accomplished by shifting the result:

      xxxx xxxx xxxx xxxx                        16-bit int
    * yyyy yyyy yyyy yyyy                        16-bit int
      zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz    32-bit prod

    z = (int)(((long)x * (long)y) >> 15);    z(Q15)
    z = x * y;                               z(Q0)

- an integer is defined as the low portion of the accumulator
- short, char, etc., occupy the full 16 bits of memory
- float operations are supported via rts.lib (multicycle)

From the ANSI C perspective, x*y is NOT guaranteed to be 32 bits, hence the casts to long.
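The Q15 expression on this slide can be checked on a host compiler with fixed-width types. This is a sketch, not TI library code: `q15_mul` is a hypothetical helper name, and `int16_t`/`int32_t` stand in for the 16-bit `int` and 32-bit `long` of the C54x.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15.
   Both operands are widened to 32 bits before the multiply so the full
   Q30 product is kept; shifting right by 15 drops the extra fractional
   bits and returns the result to Q15.
   Note: -1.0 * -1.0 (0x8000 * 0x8000) does not fit in Q15 and is not
   saturated here. */
int16_t q15_mul(int16_t x, int16_t y)
{
    int32_t prod = (int32_t)x * (int32_t)y;  /* 32-bit Q30 product */
    return (int16_t)(prod >> 15);            /* back to Q15 */
}
```

For example, 0.5 in Q15 is 0x4000 (16384), and `q15_mul(16384, 16384)` gives 8192 (0x2000), which is 0.25 in Q15.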

C Optimization Levels

Level 0:
- allocates variables to registers
- simplifies expressions
- eliminates unused code

Level 1 ("0" + ...):
- removes unused assignments and common expressions
- single function (local) optimizations

Level 2 ("1" + ...):
- performs loop optimizations/unrolling
- multi-function (global) optimizations

Level 3 ("2" + ...):
- removes unused functions
- in-lines calls to small functions
- can perform multi-file optimizations using project mode (assertions)
- other options available with Level 3

Optimization levels are set via CCS build options.

Other C Stuff...

In-Line Assembly - can disrupt C
    asm(" IDLE 1");

Intrinsics - ASM instructions in C - see C Compiler guide
    #include <intrindefs.h>
    y = _smacr(x1, x2, x3);

Data/Program Sections
    #pragma DATA_SECTION(y,"Var");
    int y = 0;

Volatile Keyword - compiler may remove code without volatile keyword
    volatile unsigned int *ctrl;
    while (*ctrl != 0xFF);

VL allows you to easily change the default stack and heap sizes.

These are other items that some might find useful.

Watch out for in-line assembly. It's OK for setup code like turning interrupts on and off. However, the compiler will optimize and move code around and simply COPY the asm statements into the assembly file, NOT caring a bit about WHERE they are placed.

Intrinsics are handy if an assembly instruction performs a task that is difficult to do in C (for example, a saturated add).

If you don't want all your global variables to end up in .bss or all your code in .text, pragmas can help place code/data sections EXACTLY where you want them. This is analogous to .usect and .sect in assembly.

The compiler's optimizer may strip out variables and code chunks if it cannot see a variable changing. For example, if you have a variable tied to something in hardware (like a timer), C can't SEE it change. In the example above, if ctrl doesn't change inside the loop, C thinks it's an infinite loop and removes the variable and the associated code. Using the volatile keyword says to the compiler "don't mess with this variable - I know what I'm doing…"

The interrupt keyword automatically tells the compiler to generate context save/restore. It will save only those registers used in the ISR. However, if a CALL to another function occurs inside the ISR, C will perform a full context save/restore. So, don't make a CALL inside an ISR!

If students want more info on pragmas and intrinsics, simply pull up the C compiler guide and go through the lists.

LAB11A - Mixing C and ASM
1. Review the given file: MAIN11A.C
2. Modify block FIR routine to be C callable
3. Create a Visual Linker recipe
4. Build, profile and verify operations
Time: 75 minutes

Lab11b demonstrates calling a DSPLIB function from C. If you have time, run the lab.

MAIN11A.C - Solution

    #define RESULTS 185
    #define TAPS 16

    // Initialize Coefficient Table
    #pragma DATA_SECTION (a,"coeffs");
    int a[TAPS] = {0x7FC, 0x7FD, 0x7FE, 0x7FF, 0x800, 0x801, 0x802, 0x803,
                   0x803, 0x802, 0x801, 0x800, 0x7FF, 0x7FE, 0x7FD, 0x7FC};

    // Specify specific address for the result: y
    #pragma DATA_SECTION (y,"yloc");
    int y[RESULTS];

    // include initialized x array
    #include "in11.h"

    extern void fir(int taps,int results,int *y);

    main()
    {
        // set wait states to zero using in-line assembly
        asm(" STM #0,SWWSR");

        // call assembly FIR routine
        fir(TAPS,RESULTS,y);
    }

LAB11A.ASM - Solution

    ; allocate label definition here
            .mmregs
            .def    _fir
            .ref    _a,_x
    ; allocate initialized data sections here
    ; only the first 8 values are used in Labs 2a and 3a
    ; allocate code section here
            .sect   "code"
    _fir:   STLM    A,BK            ;load BK with TAPS (16)
            LD      *SP(1),A        ;load parameter RESULTS into A
            SUB     #1,A            ;subtract 1 from the number
            STLM    A,BRC           ;load BRC with RESULTS-1 (184)
            MVDK    *SP(2),*(AR1)   ;load ARn with &y
            RSBX    CPL             ;turn off Compiler Mode
            LD      #0,DP           ;set SST bit (saturate on store)
            ORM     #1,@PMST
            SSBX    CPL             ;turn on Compiler Mode

LAB11A.ASM - Solution (continued)

            SSBX    FRCT                ;set FRCT bit (fractional mode)
            RSBX    OVM                 ;clr OVM bit (overflow mode)
            SSBX    SXM                 ;set SXM bit (sign extension)
            STM     #1,AR0
            STM     #_a,AR2             ;setup ARs for MAC
            STM     #_x,AR3
            RPTB    done-1
            MPY     *AR2+0%,*AR3+,A     ;1st product
            RPT     #14                 ;mult/acc 15 terms
            MAC     *AR2+0%,*AR3+,A
            MAR     *+AR3(-15)
            STH     A,*AR1+             ;store result
    done:   RSBX    FRCT
            RET                         ;return
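When verifying the lab output, a plain C reference model of the same block FIR can be handy for comparison. The sketch below is an assumption-laden illustration, not part of the lab files: `fir_ref` is a hypothetical name, the data is treated as Q15, the fractional-mode shift plus the high-word store is modelled as a right shift by 15, and the saturate-on-store behaviour enabled in the assembly is deliberately not modelled.

```c
#include <stdint.h>

/* Reference block FIR in Q15: y[n] = sum over k of a[k] * x[n + k].
   x must hold at least (results + taps - 1) samples.
   A 64-bit accumulator stands in for the DSP's wide accumulator.
   Assumption: no saturation is applied on the final store. */
void fir_ref(const int16_t *a, const int16_t *x, int16_t *y,
             int taps, int results)
{
    for (int n = 0; n < results; n++) {
        int64_t acc = 0;
        for (int k = 0; k < taps; k++) {
            acc += (int32_t)a[k] * (int32_t)x[n + k];
        }
        /* FRCT doubles each product and STH keeps bits 31..16,
           which together amount to dropping 15 fractional bits. */
        y[n] = (int16_t)(acc >> 15);
    }
}
```

With the lab's sizes this would be called as `fir_ref(a, x, y_ref, 16, 185)` and `y_ref` compared against the `y` array produced by the assembly routine; small differences can appear wherever the hardware saturates.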

MAIN11B.C - Solution

    /* include header files */
    #include "math.h"
    #include "tms320.h"
    #include "dsplib.h"

    /* Define Sample and Tap sizes for function */
    #define RESULTS 200
    #define NH 16
    #define NX 200

    short i;

    /* Initialize Coefficient Table ... MUST BE ALIGNED IN MEMORY */
    #pragma DATA_SECTION (h,"coeffs");
    DATA h[NH] = {0x7FC, 0x7FD, 0x7FE, 0x7FF, 0x800, 0x801, 0x802, 0x803,
                  0x803, 0x802, 0x801, 0x800, 0x7FF, 0x7FE, 0x7FD, 0x7FC};

    /* Specify specific address for the result array r */
    #pragma DATA_SECTION (r,"yloc");
    DATA r[RESULTS];

    /* include initialized x array */
    #pragma DATA_SECTION (x,"xloc");
    #include "in11b.h"

MAIN11B.C - Solution (continued)

    /* Setup delay buffer ... MUST BE ALIGNED IN MEMORY */
    #pragma DATA_SECTION (db,"delaybuff")
    DATA db[NH];

    /* Setup indirected delay buffer pointer */
    DATA *dbptr = &db[0];

    main()
    {
        /* set wait states to zero using in-line assembly */
        asm(" STM #0,SWWSR");

        /* clear delay buffer */
        for (i=0; i<NH;i++) db[i] = 0;

        /* call DSPLIB BLOCK FIR routine */
        /* x        pointer to data
           h        pointer to aligned coeffs
           &dbptr   delay buffer
           NH       # of coeffs
           NX       # of data samples
           oflag    overflow error flag */
        fir(x, h, r, &dbptr, NH, NX);
    }

Follow on Activities for C54x
Laboratory 9 for the TMS320C5416 DSK compares the performance of functions written in C code and assembly language and monitors the execution speed. It allows the following question to be answered: does assembly language always offer an advantage in performance over C code?

Objectives – C55
- Understand the C Environment
- Setting compiler options
- Describe how to Mix C and Assembly

C Run-time Environment

Main.C:
    int func(int,int,int,int,int);
    int y = 0;

    void main (void)
    {
        y = func(1,2,3,4,5);
    }

Func.C:
    int func(int a,int b,int c,int d,int e)
    {
        return(a + b + c + d + e);
    }

Sections created:
    y (global)      .bss
    0 (init val)    .cinit
    code            .text

How are these sections linked?

C Linker Command File

    MEMORY
    {
        VECS:   org = 0xFFFF00, len = 00100h
        EPROM:  org = 0xFF0000, len = 0FF00h
        SARAM:  org = 0x008000, len = 08000h
        DARAM:  org = 0x002000, len = 02000h
        CROM:   org = 0xFE0000, len = 10000h
    }

    SECTIONS
    {
        vectors:    > VECS      ;vector table
        .text:      > EPROM     ;code
        .cinit:     > CROM      ;global inits
        .bss:       > SARAM     ;global vars
        .stack:     > DARAM     ;for SP
        .sysstack:  > DARAM     ;for SSP
        .const:     > CROM      ;constants
        .sysmem:    > SARAM     ;for heap
        .switch:    > CROM      ;for case stmts
    }

Run-time Environment (STx_55)

    ST0_55      Name                    Compiler Expects   Compiler Modifies
    ACOV[0–3]   Overflow detection      -                  Yes
    C           Carry                   -                  Yes
    TC[1–2]     Test control            -                  Yes
    DP[7–15]    Local data page reg     -                  No

    ST1_55      Name                    Compiler Expects   Compiler Modifies
    BRAF        Block-repeat active     -                  No
    CPL         Compiler mode           1                  No
    INTM        Interrupt mode          -                  No
    M40         32/40-bit (D unit)      0                  No
    SATD        Saturation (D unit)     0                  Yes
    SXMD        Sign-exten. (D unit)    1                  No
    C16         Dual 16-bit math        -                  No
    FRCT        Fractional mode         0                  Yes
    C54CM       Compatibility mode      0                  No
    ASM         Accumulator shift       -                  Yes

    -  no expected value listed

Run-time Environment (STx_55)

    ST2_55      Name                    Compiler Expects   Compiler Modifies
    ARMS        AR mode                 1                  No
    RDM         Rounding mode           0                  No
    CDPLC       CDP linear/circular     0                  No
    AR[0–7]LC   AR[0–7] lin/circular    0                  No

    ST3_55      Name                    Compiler Expects   Compiler Modifies
    MPNMC       MPNMC mode              -                  No
    SATA        Saturation (A unit)     0                  Yes
    SMUL        Sat on multiply         0                  Yes
    SST         Sat on store            -                  No

    -  no expected value listed

If the user modifies a "presumed value", this value must be restored by the function.

Initializing the C Environment ...

Boot.c in rts55.lib, _c_int00:
- Initialize global and static variables
- Initialize C environment variables to expected values
- Setup stacks (SP & SSP)
- Call _main

On reset, how do you tell the CPU to begin execution at _c_int00?

    ;cvectors.asm
            .ref    _c_int00
            .sect   "vectors"
    rsv:    .ivec   _c_int00

All C symbols accessed in assembly require a leading underscore.

Objectives
- Understand the C Environment
- Setting compiler options
- Describe how to Mix C and Assembly

Setting Up a C Project
In addition to what has been done before, you need to:
- Add any necessary include files (via your source code)
- Add the run-time support library (rts55.lib)
- Add the C source files

Selecting the Build/Optimization Options
- Select: Project -> Build Options -> Compiler Tab
- Choose the necessary compiler options
- Build and benchmark.

-g and -o3 compete with each other. -g is used primarily for debug: the compiler moves very little code, which makes debugging easier. However, if you want the fastest code, use -o3 alone: lots of code motion and much better performance. Using -o3 hurts the debug effort, but you should have used -g alone during debug before you flipped the -o3 switch anyway. So, the moral of the story is: use -g first and debug. Then, when everything looks good, remove -g and use -o3. If you use them together, -g will turn off some of the optimizations which -o3 would have done.

Objectives
- Understand the C Environment
- Setting compiler options
- Describe how to Mix C and Assembly Language

Creating a C-Callable Assembly Routine

Main.C (prototypes the called function, then calls it):
    int func(int,int,int,int,int);
    int y = 0;

    void main (void)
    {
        y = func(1,2,3,4,5);
    }

Func.C:
    int func(int a,int b,int c,int d,int e)
    {
        return(a + b + c + d + e);
    }

How are the parameters passed to func( ) ?

With -o3 enabled, func is deleted and main simply does: MOV #15,*(#_y)

Parameter Passing Conventions

    y = func(1,2,3,4,5);

The compiler will scan the parameters (from left to right) and place them into the following registers (from left to right):

    16-bit Integers    Pointers    Longs
    T0-1, AR0-4        XAR0-4      AC0-2

The registers are filled in the order shown:
- T0 gets the first 16-bit int, T1 gets the 2nd, etc.
- XAR0 gets the first pointer if available, etc.
- 32-bit values are passed in AC0-2

All parameters that don't find a home in a register are placed on the stack (SP).

Don't get hung up on the "left to right" thing. It's really simple. The compiler will scan the parameters (from left to right) and place the first parameter into the register shown (from left to right). This means that if the first parameter is an int, it goes into T0. If it is a pointer, it goes into XAR0. If it is a long, it goes into AC0. Then, after the first parameter has been placed, the compiler grabs the second parameter and places it into the first home it can find going "left to right" in the table again. If the FIRST parameter was a POINTER, XAR0 has been used. If the SECOND parameter is an INT, it goes into T0 (because T0 has not been used yet). This all can be a bit confusing, but if you follow the language in the first bullet above, it says it all.

So, which registers will contain the parameters listed above?

Exercise/Recommendations

    var = func(0x98765432,1,2,3,4,5,a,x,y,z);

a, x, y, and z are pointers. Where are these parameters placed by the compiler?

    T0   = 1
    T1   = 2
    XAR0 = 3
    XAR1 = 4
    XAR2 = 5
    XAR3 = a
    XAR4 = x
    AC0  = 0x98765432

    Stack (parameters are placed on the stack as shown):
        SP ->   y    (lo)
                z    (hi)

The registers list (T0…AC0) is in the same order as on the previous slide. The compiler will attempt to place the first parameter into T0, then T1, etc., all the way to AC2. I chose to put the long (0x98765432) first in the list to make this whole point.

Recommend:
- Pass pointers first
- Pass most-used parameters first
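The placement rule from these two slides can be mimicked with a few lines of host C, which makes it easy to try other argument lists. This is only a sketch of the rule as the slides state it, not the compiler's actual allocator: `ParamKind` and `assign_params` are hypothetical names, and it assumes that 16-bit ints and pointers compete for the same AR0-4/XAR0-4 slots, as the exercise answer implies.

```c
/* Kinds of parameter distinguished by the slide's table. */
typedef enum { P_INT16, P_PTR, P_LONG } ParamKind;

/* For each parameter (scanned left to right), store the name of the
   register it would be passed in, or "stack" if none is free.
   Assumption: ARn and XARn are the same physical slot, so an int in
   AR0 also consumes XAR0. */
void assign_params(const ParamKind *kinds, int n, const char **where)
{
    static const char *const t_regs[]   = { "T0", "T1" };
    static const char *const ar_regs[]  = { "AR0", "AR1", "AR2", "AR3", "AR4" };
    static const char *const xar_regs[] = { "XAR0", "XAR1", "XAR2", "XAR3", "XAR4" };
    static const char *const ac_regs[]  = { "AC0", "AC1", "AC2" };
    int next_t = 0, next_ar = 0, next_ac = 0;

    for (int i = 0; i < n; i++) {
        where[i] = "stack";                       /* default: no register left */
        switch (kinds[i]) {
        case P_INT16:                             /* T0-1 first, then AR0-4 */
            if (next_t < 2)       where[i] = t_regs[next_t++];
            else if (next_ar < 5) where[i] = ar_regs[next_ar++];
            break;
        case P_PTR:                               /* next free XAR0-4 */
            if (next_ar < 5)      where[i] = xar_regs[next_ar++];
            break;
        case P_LONG:                              /* next free AC0-2 */
            if (next_ac < 3)      where[i] = ac_regs[next_ac++];
            break;
        }
    }
}
```

Feeding it the exercise's list (one long, five ints, four pointers) reproduces the placement above, with the three ints after T0 and T1 landing in the AR0 to AR2 slots and the last two pointers going to the stack.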

Calling Convention, Accessing the Stack

So, who's responsible for saving registers? The parent or child?

    Parent - SOC (save on call):    T0-1   XAR0-4   AC0-3
    Child  - SOE (save on entry):   T2-3   XAR5-7

If the child routine modifies the SOE registers, it must preserve their contents.

Return values from a function are placed here:
    16/32-bit integer:    T0/AC0
    Data/Func pointer:    XAR0/AC0

Accessing parameters on the stack:
    PSH AR6,AR7
    MOV *SP(#3),AR6
    MOV *SP(#4),AR7

Stack after the PSH (diagram):
    SP ->   AR6
            AR7
            PC
            &y
            &z
            parent stk

Remember Func.C ?

Main.C:
    int func(int,int,int,int,int);
    int y = 0;

    void main (void)
    {
        y = func(1,2,3,4,5);
    }

Func.C:
    int func(int a,int b,int c,int d,int e)
    {
        return(a + b + c + d + e);
    }

Now let's write Func.C in assembly...

Writing Func.ASM

    ; Entry
            .def _func              ; declare func as global
    _func:                          ; define entry point (label)
            ;push SOE registers     ; save SOE registers
            ;adjust STx_55 values
    ; Algorithm: return(a + b + c + d + e); result is placed in the return reg (T0)
            ADD T1,T0               ;a + b
            ADD AR0,T0              ;+ c
            ADD AR1,T0              ;+ d
            ADD AR2,T0              ;+ e
    ; Exit
            ;restore STx_55 values
            ;pop SOE registers      ; restore SOE registers
            RET                     ; return to calling routine

Stack during func (diagram): SP -> PC.

Other C Stuff...

In-Line Assembly - can disrupt C
    asm(" IDLE");

Intrinsics - ASM instructions in C - see C Compiler guide
    #include <intrindefs.h>
    y = _sadd(x1, x2);

Data/Code Sections
    #pragma DATA_SECTION(y,"Var");
    int y = 0;

Volatile Keyword - compiler may remove code without the volatile keyword
    volatile unsigned int *ctrl;
    while (*ctrl != 0xFF);

Interrupt Keyword - context save/restore
    interrupt void my_ISR (void)

Linker Command Options - stack, sysstack and heap sizes can be changed via CCS
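To make the saturated-add idea concrete, the behaviour can be spelled out in portable C. The function below is an illustration of what a 16-bit saturating add computes, written for a host compiler; `sat_add16` is a hypothetical name and this is not the TI `_sadd` intrinsic itself, which maps to a single DSP instruction.

```c
#include <stdint.h>

/* 16-bit saturating add: the sum is formed in 32 bits and then clamped
   to the int16_t range instead of wrapping around. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;

    if (sum > INT16_MAX) return INT16_MAX;   /* clamp positive overflow */
    if (sum < INT16_MIN) return INT16_MIN;   /* clamp negative overflow */
    return (int16_t)sum;
}
```

The compare-and-clamp sequence is exactly the kind of code that is awkward in C but cheap in hardware, which is why an intrinsic is attractive here.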

LAB12A - Mixing C and ASM
Exercise for C55
1. Review the given file: MAIN12A.C
2. Modify block FIR routine to be C callable
3. Review/modify given linker command file
4. Compile, link, benchmark and verify operations
Time: 90 minutes

LAB12A – Using the C Compiler/Optimizer
1. Review the given file: lab12a.c
2. Run the compiler, profile the results. Result? 54K cycles.
3. How do you use compiler options to decrease the benchmark to ~2000 cycles?
Time: 30 minutes

LAB12B – Mixing C and Assembly
1. Review the given file: main12b.c
2. Modify block FIR routine to be C callable
3. Compile, link, benchmark and verify operations
Time: 90 minutes

main12b.C - Solution

    /* Define Sample, Taps, Counts for function */
    #define SAMPS 200
    #define TAPS 16
    #define BLKRPT_CNT (SAMPS-TAPS)/2   /* generate 186 results */
    #define RPT_CNT TAPS-3              /* modify based on your asm routine */

    /* Initialize Coefficient Table */
    int a0[TAPS] = { 0x7FC, 0x7FD, 0x7FE, 0x7FF, 0x800, 0x801, 0x802, 0x803,
                     0x803, 0x802, 0x801, 0x800, 0x7FF, 0x7FE, 0x7FD, 0x7FC};

    /* Specify specific address for the result: y */
    #pragma DATA_SECTION (y0,"yloc");
    int y0[200];

    /* include initialized x array */
    #include "in12.h"

    extern void fir(int *x0,int *a0,int *y0,int taps,int blkrpt,int rpt);

    main()
    {
        /* call assembly FIR routine */
        fir(x0,a0,y0,TAPS,BLKRPT_CNT,RPT_CNT);
        for (;;);
    }

lab12b.asm - Solution

            .cpl_on
            .arms_on
            .c54cm_off
            .def    _fir

    ; Parameter pass looks like this:
    ;   XAR0 = x0
    ;   XAR1 = a0
    ;   XAR2 = y0
    ;   T0   = TAPS
    ;   T1   = BLKRPT_CNT
    ;   XAR3 = RPT_CNT

            .text
    _fir:   PSH     mmap(@ST1)          ;context save of affected
            PSH     mmap(@ST2)          ;status registers
            BSET    FRCT                ;turn on multiplier shift
            BSET    M40                 ;turn on 40 bit math
            BSET    SXMD                ;turn on sign extension (C default)
            BCLR    C54CM               ;go to C55 native mode (C default)
            MOV     T1,BRC0             ;block repeat count
            MOV     XAR0,XAR4           ;set upper bits of XAR4
            MOV     XAR1,XCDP           ;pointer for coefficients
            MOV     #0, CDP             ;zero lower bits of XCDP
            MOV     AR1,mmap(@BSAC)     ;buffer start address
            MOV     T0,mmap(@BKC)       ;buffer size

lab12b.asm - Solution (continued)

            ADD     #1,AR4              ;pointer setup for x1
            MOV     AR3,T0              ;set up T0 for ptr wrap
            MOV     AR3,CSR             ;Computed single repeat value
            BSET    CDPLC               ;turn on circ addressing for CDP
            RPTBLOCAL end
            MPY     *AR0+,*CDP+,AC0     ;AC0 gets 1st product
         :: MPY     *AR4+,*CDP+,AC1     ;AC1 gets 2nd product
         || RPT     CSR                 ;RPT in parallel with MPYs
            MAC     *AR0+,*CDP+,AC0     ;form 14 results
         :: MAC     *AR4+,*CDP+,AC1
            MAC     *(AR0-T0),*CDP+,AC0
         :: MAC     *(AR4-T0),*CDP+,AC1 ;form last result and wrap pointers
    end:    MOV     pair(hi(AC0)),dbl(*AR2+)    ;store AC0 & AC1 results
            POP     mmap(@ST2)          ;restore status registers
            POP     mmap(@ST1)
            RET