2 The DSP Software Challenge
[Slide graphic labels: hardware capability; application complexity; "just ship it!!"; TIME-TO-MARKET PRESSURE; %; HW; SW; t; increased cost/risk; insufficient re-use; software: the critical factor]

Software has become increasingly important to TI as part of an overall DSP solution for our customers. Here’s why....
- DSP hardware capability increasing at a dramatic pace
  - more MIPS, less power, greater integration, etc.
  - no end in sight; nothing but “green lights” ahead
- complexity of applications targeted at today’s DSPs rising at comparable rates
  - yesterday’s 1000-line assembly programs becoming 100,000 lines of C code
  - can we develop SW fast enough to harness HW capability? “red light”
- software now dominates the overall engineering costs of DSP product development
  - estimates range as high as 80%; 4 out of 5 developers on a typical project
  - putting a chip and some memory on a board is “easy”
  - if you’re late to market, it’s probably because of a SW (not HW) problem
- insufficient re-use of pre-fabricated software components
  - most designers “re-invent the wheel” from one application to the next
- no relief from management regarding timely product delivery
  - market windows actually shrinking, despite increased SW cost/complexity
- software has emerged as the critical factor
  - if you fail, it’s probably because software overwhelmed you
  - but software is also where you differentiate and innovate; the key to success
So let’s look at how TI has responded to the challenge of DSP software....
3 Elevating The Platform
[Slide graphic labels: EDN Innovation of the Year; eXpressDSP; integrated development tools; real-time software foundation; standards for interoperability; network of third-party partners; target program; application frameworks; Code Composer Studio™; alg; plug-in; TMS320 DSP Algorithm Standard; program build; program debug; real-time analysis; RTDX™; DSP/BIOS™; drivers; comm; host computer; TMS320 DSP]

With the fall '99 announcement of our eXpressDSP Real-Time Software Technology, TI has once again raised the bar and set new standards across the industry....
- create a discontinuity from the SW environment of the past
  - basic (command-line) tools for building and debugging target DSP programs
- eXpressDSP introduces four essential ingredients into today’s DSP SW environment
  - ingredients not necessarily found in our competitors’ offerings
- (1) integrated development environment, known as Code Composer Studio
  - world-class HLL program generation tools; industry’s leading debugger
  - intuitive, easy-to-learn visual environment; open and extensible
- (2) real-time software foundation, which is DSP/BIOS
  - not enough to have world-class tools, if we just “re-invent the wheel”
  - BIOS comprises essential target software content common to all applications
  - enables real-time program analysis through real-time host link (RTDX)
- (3) standards for application interoperability, such as the TMS320 DSP Algorithm Standard
  - enables easy integration of independent (third-party) software components
  - analogous to “building codes” that standardize wiring, plumbing, etc.
  - potential to add unlimited value to the BIOS software foundation
- (4) thriving network of third-party partners that build upon this infrastructure
  - over 400 third parties; larger than our two biggest competitors combined
  - add plug-ins and supplementary tools to (1)
  - extend (2) with platform-specific drivers and/or network communications modules
  - flesh out (3) with 100s of compliant algs that interoperate
  - skeletal application frameworks that build upon (1), (2), and (3)
- won “EDN Innovation of the Year Award” in 2000
All that you see here is available today, ready to address your needs through modular application software solutions that leverage the efforts of others to the greatest extent possible. But first, let’s consider the alternative....
4 Grow Your Own ...
- too costly to develop
- too costly to enhance
- too costly to maintain
[Slide graphic: the application growing in stages: alg; app + alg; app + algA + algB + ...; app + sched + algAn + algBn + ...; app + sched + I/O + algAn + algBn + ...; app + sched + I/O + comm + algAn + algBn + ...; hardware labels: DSP, DSP, GPP]

Developing DSP application software entirely from scratch, while seemingly simple at first, can lead you down a slippery slope from which there is no escape....
- (1) execute a single algorithm, written in software, on suitable DSP hardware
  - possibly leverage 3rd-party IP, but with some NRE for customization
- (2) encapsulate algorithm in an application program performing other system functions
  - HW initialization, data I/O, control, etc.
- (3) new requirements; need to add a second DSP algorithm to the mix
  - incompatible assumptions with first alg; more NRE to “bring it in line”
  - individual algorithms often assume they have “run of the house”
- (4) using a more powerful DSP; capable of supporting a multi-channel application
  - need to add a “home-brew” scheduler to serve as a traffic cop
  - more NRE in the algs to support multi-channel operation (e.g., re-entrancy)
- (5) multi-channel systems utilize sophisticated peripheral devices for multi-channel I/O
  - need to extend our home-brew scheduler to drive I/O devices as well
  - I/O devices are complex, subtle, and continually being “improved” in new DSPs
- (6) suddenly there are multiple processors in the system, DSPs as well as a master GPP
  - more work in the scheduler to support DSP-DSP communication/synchronization
  - master-slave GPP-DSP communication involves integration with GPP/OS
- here’s the slippery slope....
  - application grows ever-larger; “house-of-cards” foundation; ticking time-bomb
  - software costs escalate out of control, especially after initial development
  - can’t rapidly accommodate new hardware technologies and/or market needs
Using eXpressDSP, you can get a handle on the hidden cost of home-grown DSP software and, quite literally, do more with less....
5 ... Or Leverage Others
- more time to innovate
- less time to integrate
- ability to differentiate
[Slide graphic labels: Modular Application Software Solutions; CUSTOMER; application blueprints; off-the-shelf algorithms (VALUE-WEB); DSP/BIOS™ real-time kernel (FOUNDATION); eXpressDSP™ infrastructure (BACKPLANE); footnote: § some programming required]

With eXpressDSP, not only can you work smarter; by leveraging the efforts of others you can actually avoid work altogether and spend your precious engineering resources on more productive endeavors....
- (1) eXpressDSP provides a standard “backplane” for target software
  - ensure interoperability, ease-of-integration, etc.
  - leverage a mature set of standards already in use
- (2) DSP/BIOS kernel implements core functions used in virtually all applications
  - task scheduling, device I/O, memory mgmt, communication, etc.
  - leverage technology in deployment for a decade within 1000s of designs
- (3) eXpressDSP value web offers extensive catalog of off-the-shelf algorithms
  - beyond “make vs buy” decision for most customers
  - leverage IP of the industry’s largest network of third-party partners
- (4) TI provides extensive programming examples that serve as application blueprints
  - program templates, architectural design patterns, software recipes, copyware
  - leverage our own expertise in DSP system development
- result is M.A.S.S. in which you, TI, and our value web all contribute software content
  - not necessarily a turnkey solution; not necessarily a “software chipset”
  - “some programming required”; expect the customer to enhance/extend
  - eXpressDSP also provides the industry’s best programming tools and IDE (CCS)
- the bottom line
  - more time for creative engineering work; “express” yourself with eXpressDSP
  - less time integrating disparate software elements; major cost factor today
  - ability to add unique, differentiated, competitive value to your product
Having seen the big picture of eXpressDSP, let’s turn our attention to the real-time software foundation provided by the DSP/BIOS kernel....
6 TMS320 Software Foundation
[Slide graphic labels: target programs; DSP/BIOS Kernel Interface; DSP/BIOS Kernel Modules; TMS320 DSP Platform; extensible; scalable; C5000; C6000; C2000]
- library of essential application services
- manages threads, memory, I/O, timers, ...
- support for C5000, C6000, C2000 families
- consumes minimal MIPS & memory
- integrated real-time analysis tools
- royalty-free with every TMS320 DSP
- robust, field-tested, industry-proven
- used in 1000s of active DSP designs

When you think of targeting an application program to a TMS320 DSP, that platform is more than just silicon; it also incorporates a set of scalable, extensible software modules that comprise the DSP/BIOS kernel....
- module functions invoked from target application through programmatic interface (API)
- don’t “re-invent the wheel”; build “skyscrapers” on the foundation we already provide
- some important things to know about the DSP/BIOS kernel
  - application-agnostic; essential functions common to virtually all DSP applications
  - manages key resources of target platform (CPU, memory, peripherals)
  - not a generic kernel for MCUs; features designed for unique needs of DSP
  - supports all 5000, 6000, 2000 devices, including new 55x and 64x architectures
  - packaged as relocatable, re-entrant SW library (scalability)
  - only the modules you need are present in the target; minimal memory footprint
  - many modules written in assembler; reduces MIPS consumption
  - rule of thumb is < 1K (2K) words on 5000 (6000); less than 0.1 MIPS in many apps
  - not “new bits”; some modules have been in deployment for over 10 years
  - part of the CCS product bundle; tightly integrated with other development tools
  - real-time analysis tools; software equivalent of a HW logic analyzer
  - run-time (deployment) license included in the price of every TMS320 DSP
  - the “floor” which we and our third parties presume is present (extensibility)
  - 1000s of new designs each year; used by 80% of our customers; “critical mass”
We mentioned Code Composer Studio; let’s take a closer look at how target programs utilizing DSP/BIOS are developed within this integrated environment....
7 Programming With DSP/BIOS
[Slide graphic labels: HOST DEVELOPMENT COMPUTER; Code Composer Studio; BUILD: program sources, kernel APIs, C- and ASM-callable functions; CONFIGURATION: interactive configuration tool, kernel modules; DEBUG: kernel-aware debug support; VISUALIZATION: on-the-fly program analysis; executable image; target application program; DSP/BIOS Kernel Interface; RTDX; real-time capture; multiple threads; hardware abstraction; JTAG EMULATION; TARGET TMS320 DSP HARDWARE]

Code Composer Studio integrates all of the hosted tools needed to develop and deploy a target application program that utilizes the DSP/BIOS kernel as part of its execution (or run-time) environment....
- DSP/BIOS is essentially a library of functions callable from C (or assembly language)
  - prepare your source files in CCS; include BIOS API headers; embed function calls
  - dozens of callable target functions, organized into semi-independent modules
  - programs are built (compiled/assembled) in the usual fashion
- interactive tool enables configuration of BIOS modules, tailored to your application
  - select only the modules you need
  - set module-specific parameters that control run-time behavior
  - pre-create kernel data objects (in static systems only) to save memory
  - output a library of kernel code/data that is linked into the program in the usual manner
  - resulting executable file contains BIOS code/data within its image
- load, execute, and test programs using JTAG-based emulation and the CCS debugger
  - additional features for displaying status of kernel modules at breakpoints
  - (task-aware debugging will be supported in a future CCS release)
- perform “on-the-fly” (real-time) analysis of the target program, without halting execution
  - DSP/BIOS contains functions for capturing information about the running program
  - captured information uploaded to host using RTDX protocol over physical JTAG
  - >100 kbps RTDX bandwidth (single voice channel); roadmap to 10s of Mbytes/s
  - CCS contains visual tools for displaying program info; SW “logic analyzer”
  - unique feature of CCS/BIOS; important for finding “nasty glitches”
  - “if you can’t see the problem, you can’t fix it”
If you want to know more about DSP/BIOS, we’ll shortly be giving you pointers to in-depth technical information about the product; but first, let’s address the “bread-and-butter” of your application: DSP algorithms....
8 Mass-Market Algorithms
[Slide chart: number of compliant algorithms; axis ticks 300, 600, 900]
- catalog of standard, compliant algorithms
- voice, telephony, video, imaging, audio, ...
- multiple suppliers: over 50 third parties
- follow uniform set of rules and guidelines
- simplifies benchmarking of alternatives
- support for C5000, C6000, C2000 families
- deployable in static or dynamic systems
- E-commerce channel for “one-stop” shop
- http://dspvillage.ti.com

The TMS320 DSP Algorithm Standard (like DSP/BIOS, another essential ingredient of eXpressDSP) enables our third-party partners to deliver sophisticated signal processing algorithms to the broader marketplace....
- rather than “home-grown”, we want to drive a “supermarket” mentality
  - lots of inventory; lots of suppliers; easy-to-buy; easy-to-use
- key metrics (as of 01/01), giving evidence of “critical mass”
  - ~300 algorithms already stamped “eXpressDSP Compliant”
  - even more algorithms in backlog, awaiting compliance testing
  - over 700 compliant algorithms by the end of this year
- some important things to know about the Alg Std and our third-party value web
  - “catalog algorithms”; not unlike “catalog DSPs”
  - addressing a variety of apps; < non-telecom is the fastest-growing area >
  - multiple suppliers give the customer more choice, security, value, etc.
  - all compliant algs follow rules/guidelines formally prescribed by TI in the stds doc
  - this facilitates “apples-to-apples” comparison; also speeds integration; no surprises
  - alg std contains generic rules, plus rules specific to 5000, 6000, 2000 ISAs
  - alg std comprehends spectrum of target systems; static/dynamic, 1/n channels, etc.
  - streamlined mass-market E-commerce through //dspvillage.com
  - we strongly encourage you to visit this web site <and will do so again later!!>
Before we demonstrate for you the kind of algorithm interoperability made possible by the TMS320 DSP Algorithm Standard, let’s understand a little more about the standard itself....
9 DSP Algorithm Standard
[Slide graphic labels: ALGORITHM PRODUCERS (write once, deploy widely); Common Programmatic Interface; ALGORITHM CONSUMERS (ease-of-integration); static: alg 1, chan 1; dynamic: alg n, chan n; Resource Management Framework(s)]
Rules & Guidelines:
- uniform naming conventions
- register usage requirements
- data addressing modes
- re-entrant, relocatable code
- memory allocation policies
- access to HW peripherals
- minimizing interrupt latency
- performance characterization

To best appreciate the power of the TMS320 DSP Algorithm Standard, we must consider the perspective of algorithm producers as well as algorithm consumers....
- the perspective of algorithm producers
  - they want to see the same work used in many, many application environments
  - many are “smaller” companies; can’t afford to engage in per-customer NRE
- the perspective of algorithm consumers
  - they want to easily integrate a third-party algorithm into their application
  - there is quite a spectrum of possibilities; static/dynamic, single/multiple channels
- the producer and consumer are separated in time and space; minimal interaction
- the alg std dictates rules/guidelines for producers that simplify re-use by consumers
  - relatively few producers; many, many consumers
  - additional “work” imposed on producers offset by more consumers, less support
- some rationalization for these rules; many are SW “common sense”; <not complete set>
  - 1) consistency, no conflicts with other software
  - 2) consistency, no conflicts, works with standard C code
  - 3) consistency, no conflicts, works with standard C code
  - 4) single/multi-channel, flexible use of program memory
  - 5) a common framework for requesting memory; on-chip, off-chip, scratch, etc.
  - 6) generally disallowed in algs; keeps the alg independent of the platform
  - 7) can impact real-time performance of the system as a whole
  - 8) facilitates apples-to-apples comparison
- the alg std also dictates a common programmatic interface implemented by all algs
- the consumer (system integrator) implements a “resource framework” as needed
  - examples covering static to dynamic, with variations as well
Here too, we can provide you with additional technical information on the Algorithm Standard once <!!!> you have found compliant algorithms that meet your needs....
10 Points To Remember
- don’t re-invent the wheel: build upon the DSP/BIOS foundation designed & optimized for DSP applications
- shop our value web: take advantage of our extensive catalog of compliant DSP algorithms
- innovate and differentiate: join the 1000s of active DSP customers already using eXpressDSP
[Slide graphic labels: CUSTOMER; VALUE-WEB; FOUNDATION; BACKPLANE]

To summarize, here are some key points we’d like you to remember as you consider using TMS320 DSPs in your next application....
< what more can I say; just read the slide!!! >
So let’s get started with eXpressDSP....
11 Let’s Get Started
- visit http://dspvillage.ti.com
  - app notes, bulletins, FAQs, discussion groups, ...
  - register at TI&ME for personalized content
- get first-hand experience with DSP/BIOS
  - enroll in our hands-on, one-day training course
  - prototype your application using our DSP Starter Kit
- explore the world of compliant DSP algorithms
  - query our on-line database of third-party products
  - download the Algorithm Standard Developer’s Kit

Here are a number of actions you can take to learn more about eXpressDSP....
- visit our on-line DSP village; lots of technical content; be sure to register
- take a BIOS workshop; touch the product; pick the instructor’s brain
- or play with BIOS on your own; prototype your app; weigh it / time it yourself
- if appropriate, see whether a compliant algorithm already exists for your app
  - if so < and only then!> you can download more info about the alg std
< always give customer a choice; and make sure ‘no’ isn’t one of them!!! >
< use this slide as a ‘presumptive close’ >
< assign “action items” for later followup >
< be careful with letting the customer “go it alone”, without some prior study / support >
I know you still may want to investigate this further, but let me ask you this right now: “Do you see any reason why you would not be leveraging eXpressDSP in your next TMS320 design?”
13 TMS320™ DSP Algorithm Standard (XDAIS)
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Introduction topics:
- eXpressDSP
- Algorithms in Applications
- Non-standard Algorithms
- Connecting Vendors & Users
- Benefits of the Standard
- Requirements of a Standard
Questions:
- What is the benefit of the standard?
- What makes a good standard?
14 eXpressDSP™: The DSP Software Solution Set
- Code Composer Studio™ IDE: powerful, integrated development tools
- DSP/BIOS: real-time software foundation
- TMS320™ DSP Algorithm Standard: standards for application interoperability and reuse
- TI DSP Third-Party Network: software and support
15 Elements of eXpressDSP™
[Slide graphic labels]
- Host Tools (Host Computer): IDE; Program Build; Program Debug; Data Visualization; Host APIs; Plug-in Tools; Analysis; ADC Config
- Target Content (TMS320™ DSP): Your Application; TMS320™ DSP Algorithm Standard; DSP/BIOS; RTDX; Real-Time Analysis
- Host-to-target link: JTAG
16 Problems with Non-Standardized Algorithms
Today it’s difficult to integrate real-time algorithms from more than a single source because of a lack of standards.
- Integration times are extended
- Debugging is tricky (what’s that black box doing?)
- It’s difficult or impossible to compare similar algorithms
- It’s difficult or impossible to rapidly prototype a system
[Slide graphic: an Application containing several Alg blocks on a TMS320 DSP]
17 TI Enhances Vendor / User Process
[Slide graphic labels: ALGORITHM PRODUCERS (Algorithm; write once, deploy widely); TEXAS INSTRUMENTS (TMS320™ DSP Algorithm Standard Specification); SYSTEM INTEGRATORS (Application; ease of integration)]
TMS320™ DSP Algorithm Standard Specification:
- Rules & Guidelines
  - Programming rules
  - Algorithm packaging
  - Algorithm performance
- DSP platform
  - C5000
  - C6000
18 Benefits of the TI DSP Algorithm Standard
- An application can use algorithms from multiple vendors
  - for users: allows greater selection based on system needs: power, size, cost, quality, etc.
  - for vendors: levels the playing field
- An algorithm can be inserted into practically any application
  - for vendors: larger potential market
  - for users: yields larger number of algorithms available
- The same code can be used in static or dynamic systems
  - for vendors: more reuse potential
  - for users: more reliability
- Algorithms are distributed in binary form
  - for vendors: Intellectual Property (IP) protection
  - for users: “black box” simplicity
19 Requirements of a Successful Standard
For a DSP Algorithm Standard to be successful, it must:
- Be easy to adhere to
- Be measurable/verifiable, so that an algorithm's conformance can be checked
- Enable host tools to simplify:
  - Configuration
  - Performance modeling
  - Standard conformance
  - Debugging
- Incur little or no overhead
- Quantify the algorithm’s memory, latency, and speed
TI’s eXpressDSP Algorithm Interface Specification (XDAIS) meets all these requirements.
20 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Memory Types topics:
- Algorithm Memory Types
- Scratch vs. Persistent
- Controlling Memory Sharing
- Static Shared Memory
Questions:
- What kinds of memory can algorithms specify?
- How do I minimize memory usage?
- What system options do I have?
21 Types of Memory Needed by Algorithms
- Stack
  - Local variables; managed by algorithm
- Managed by Application “Framework”:
  - Heap
    - Contains algorithm objects and variable-length buffers
    - Read/Write data
    - May be allocated and freed at run-time (dynamic systems)
  - Scratch memory
    - Undefined pre- and post-condition of data in buffer
  - Persistent memory
    - Pre-condition(t): data in buffer = post-condition(t - 1)
- Static Data
  - Data allocated at link time; shared by all instances
22 Space Inefficient Memory Allocation
[Slide graphic: Algorithm A (Scratch A, Persistent A) and Algorithm B (Scratch B, Persistent B) each get their own regions of Physical Memory: Scratch B, Persistent B, Scratch A, Persistent A]
May be OK for speed-optimized systems, but may pose problems for systems where minimum memory usage is desired...
24 Examples of Scratch RAM Management...
[Slide graphic: algorithms A to F mapped onto a shared Scratch RAM region]
- A, B, and C are sequential to each other.
- D & E are parallel to A, B, or C, but sequential to each other.
- A-E have enough space to all run in parallel. F needs all the scratch, so A-E are all deactivated to make room for F.
Scratch management is entirely at the discretion of the application.
The algorithm is not perturbed by the implementation choices selected.
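The memory saved by sharing scratch can be estimated with simple arithmetic. The sketch below is only an illustration: the `AlgMem` structure and both function names are hypothetical (not part of any TI API), sizes are in words, and the shared case assumes the algorithms never run concurrently, so one scratch region sized for the largest request is enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-algorithm memory needs, in words (not a TI type). */
typedef struct {
    size_t scratch;     /* contents undefined before and after each run */
    size_t persistent;  /* contents must survive between runs */
} AlgMem;

/* Space-inefficient layout: every algorithm owns its scratch outright. */
size_t footprint_private(const AlgMem *algs, size_t n)
{
    assert(algs != NULL || n == 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += algs[i].scratch + algs[i].persistent;
    }
    return total;
}

/* Shared layout: one scratch region sized for the largest request.
 * Valid only when the algorithms run strictly one after another. */
size_t footprint_shared(const AlgMem *algs, size_t n)
{
    assert(algs != NULL || n == 0);
    size_t persistent = 0;
    size_t max_scratch = 0;
    for (size_t i = 0; i < n; i++) {
        persistent += algs[i].persistent;
        if (algs[i].scratch > max_scratch) {
            max_scratch = algs[i].scratch;
        }
    }
    return persistent + max_scratch;
}
```

For two algorithms needing 512 and 1024 words of scratch plus 64 and 128 words of persistent memory, the private layout costs 1728 words and the shared layout 1216.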
25 Shared Scratch Memory Synchronization
Inhibit preemption when running code that accesses shared memory.
- Assign concurrent processes to the same priority = automatic FIFO
- Otherwise, any number of desired methods can be considered:
  - Disable interrupts: HWI_disable / HWI_enable
  - Disable scheduler: SWI_disable / SWI_enable; TSK_disable / TSK_enable
  - Task semaphores (lock, unlock): SEM_pend / SEM_post
  - Raise priority: SWI_raisepri / SWI_restorepri; TSK_setpri / TSK_setpri
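Whichever method is chosen, the pattern is the same: take the lock, use the shared scratch, release the lock. The sketch below shows that pattern on a host machine under stated assumptions: `ScratchGuard`, `with_scratch`, and the counting lock are hypothetical names, and the callbacks are stand-ins for whichever pair from the slide a target would use (for example SEM_pend / SEM_post), which are deliberately not called here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock policy. On a target, these callbacks might wrap one of
 * the pairs listed on the slide; here they are host-side stand-ins. */
typedef struct {
    void (*lock)(void *ctx);
    void (*unlock)(void *ctx);
    void *ctx;
} ScratchGuard;

typedef void (*ScratchUser)(int *scratch, size_t words, void *arg);

/* Run fn with exclusive access to the shared scratch buffer. */
void with_scratch(const ScratchGuard *g, int *scratch, size_t words,
                  ScratchUser fn, void *arg)
{
    assert(g != NULL && fn != NULL);
    g->lock(g->ctx);
    fn(scratch, words, arg);
    g->unlock(g->ctx);
}

/* Stand-in lock: a depth counter instead of a real semaphore. */
typedef struct { int depth; int max_depth; int entries; } CountingLock;

void counting_lock(void *ctx)
{
    CountingLock *l = (CountingLock *)ctx;
    l->depth++;
    l->entries++;
    if (l->depth > l->max_depth) {
        l->max_depth = l->depth;
    }
}

void counting_unlock(void *ctx)
{
    CountingLock *l = (CountingLock *)ctx;
    l->depth--;
}

/* Example scratch user: fills the buffer and returns a sum through arg. */
void fill_and_sum(int *scratch, size_t words, void *arg)
{
    int sum = 0;
    for (size_t i = 0; i < words; i++) {
        scratch[i] = (int)i;
        sum += scratch[i];
    }
    *(int *)arg = sum;
}
```

Keeping the lock policy behind two callbacks means the algorithm code stays unaware of which synchronization method the application picked, which matches the slide's point that the choice belongs to the application.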
26 Shared Persistent Memory
- Static read-only tables
  - Optimize reuse (e.g., in on-chip memory) by sharing global read-only data for multiple instances of an algorithm
  - Separate object referenced by multiple instances
  - Example: 2 FIR filters with identical, fixed coefficient tables
[Slide graphic: Instance 1 ... Instance n, each with its own instance heap data, all referencing the same static read-only global data]
27 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Memory Setup Sequence topics:
- Memory Setup Model
- Memory Setup Sequence
- IALG Options
- IALG Interface
Questions:
- What are the steps to setting up the memories needed?
- What optional controls are available?
- How do we optimize for static and dynamic systems?
28 Memory Setup Model
- Algorithm
  - Knows memory requirements
  - Requests appropriate resources from Application
  - Request fields: Size; Alignment; Type; Scr/Persist
- Application “Framework”
  - Manages memory requirements
  - Determines what memories are available to which algorithms, and when
  - Record fields: Address; Size; Alignment; Type; Scr/Persist
- Physical Memory Types:
  - External (slow, plentiful, lower cost)
  - Internal (fast, limited, higher cost)
  - SARAM, DARAM
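The division of labor in this model can be sketched as a small request record plus a framework-side allocator. Everything below is a hypothetical illustration, not TI's definitions: the `MemRequest` fields simply mirror the slide (size, alignment, type, scratch/persistent, address), the pools are trivial bump allocators, alignments are assumed to be powers of two, and the fallback-to-external policy is one possible framework choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the request fields named on the slide. */
typedef enum { MEM_EXTERNAL, MEM_INTERNAL } MemSpace;
typedef enum { MEM_SCRATCH, MEM_PERSIST } MemAttr;

typedef struct {
    size_t   size;       /* bytes requested by the algorithm */
    size_t   alignment;  /* power of two; 0 or 1 means no constraint */
    MemSpace space;      /* preferred physical memory type */
    MemAttr  attr;       /* scratch or persistent */
    void    *base;       /* address, filled in by the application framework */
} MemRequest;

/* A trivial bump-pointer pool standing in for one physical memory. */
typedef struct {
    uint8_t *start;
    size_t   capacity;
    size_t   used;
} Pool;

/* Carve an aligned block out of a pool; returns NULL when it does not fit. */
void *pool_alloc(Pool *p, size_t size, size_t alignment)
{
    assert(p != NULL);
    if (alignment < 2) {
        alignment = 1;
    }
    uintptr_t addr = (uintptr_t)(p->start + p->used);
    uintptr_t aligned = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);
    size_t pad = (size_t)(aligned - addr);
    if (p->used + pad > p->capacity) {
        return NULL;
    }
    if (size > p->capacity - p->used - pad) {
        return NULL;
    }
    p->used += pad + size;
    return (void *)aligned;
}

/* Framework policy: try the requested space, fall back to external memory. */
int satisfy(MemRequest *r, Pool *internal, Pool *external)
{
    r->base = NULL;
    if (r->space == MEM_INTERNAL) {
        r->base = pool_alloc(internal, r->size, r->alignment);
    }
    if (r->base == NULL) {
        r->base = pool_alloc(external, r->size, r->alignment);
    }
    return r->base != NULL;
}
```

The algorithm only states what it needs; where the block actually lands (fast internal or slower external memory) stays a framework decision.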
29 Algorithm Memory Interrogation Sequence
- To Alg: How many blocks of memory do you need? algNumAlloc()
  - To App: n
  - App: make 5*n words of memory table (memtab) available
- To Alg: Write the needs of each block to memtab. algAlloc()
  - Alg: writes 4 values describing each block (size, alignment, type, scratch/persistent)
  - App: set aside specified memories, fill in address of blocks in memtab
- To Alg: Here are the memories I got for you. algInitObj()
  - Alg: copy address pointers to my instance structure and set up persistent arrays
- To Alg: Get ready to run; prepare your scratch memory. algActivate()
  - Alg: fill up my scratch memory as desired (e.g., history buffers, etc.)
- App may now call alg processing functions to run its routines…
- To Alg: I need your scratch memory back. algDeactivate()
  - Alg: copy needed scratch values to persistent memory
- To Alg: Update memtab so I know what I can free up. algFree()
  - Alg: update 5*n values in memtab
  - App: de-allocate any desired scratch memories for use by other components
30 IALG Object Creation Sequence Diagram
[Sequence between the Application “Framework” and the Algorithm Module / Algorithm Instance]
- App: call algNumAlloc() to get # of memory reqs
- App: call algAlloc() to get memory requests
- App: malloc()
- App: call algInitObj() to initialize instance object
  - Alg: initialize instance object
- App: call algActivate() to prep instance for use
  - Alg: initialize scratch memory
- App: call algorithm processing methods
  - Alg: process data, return result
- App: call algDeactivate() to prep for mem re-use
  - Alg: save state to persistent memory
- App: call algFree() to retrieve buffer pointers
  - Alg: return all buffers and sizes
- App: free()
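The whole sequence can be walked through on a host with a toy algorithm. This is a minimal sketch under loud assumptions: `MemRec`, `Accum`, and every `accum_*` function are mock names invented for illustration (they only echo the step names on the slides and are not TI's IALG types or signatures), and the "algorithm" is a trivial running sum whose working value lives in scratch while active.

```c
#include <assert.h>
#include <stdlib.h>

/* Mock memory record: illustrative only, not the real IALG definition. */
typedef struct { size_t size; size_t alignment; int space; int attrs; void *base; } MemRec;
enum { SCRATCH = 0, PERSIST = 1 };

typedef struct {
    int  history;  /* persistent state: survives deactivation */
    int *work;     /* scratch: working copy, valid only while active */
} Accum;

int accum_numAlloc(void) { return 2; }

/* Describe each block the algorithm needs. */
int accum_alloc(MemRec *tab)
{
    tab[0] = (MemRec){ sizeof(Accum), sizeof(void *), 0, PERSIST, NULL };
    tab[1] = (MemRec){ sizeof(int),   sizeof(int),    0, SCRATCH, NULL };
    return 2;
}

/* Copy the granted addresses into the instance object. */
Accum *accum_initObj(const MemRec *tab)
{
    Accum *a = (Accum *)tab[0].base;
    a->history = 0;
    a->work = (int *)tab[1].base;
    return a;
}

void accum_activate(Accum *a)       { *a->work = a->history; }
int  accum_process(Accum *a, int x) { *a->work += x; return *a->work; }
void accum_deactivate(Accum *a)     { a->history = *a->work; }

/* Report every block back so the application can release it. */
int accum_free(Accum *a, MemRec *tab)
{
    int n = accum_alloc(tab);
    tab[0].base = a;
    tab[1].base = a->work;
    return n;
}

/* Application "framework" side: drives the sequence with malloc/free. */
int run_lifecycle(const int *in, int count)
{
    int n = accum_numAlloc();
    MemRec *tab = calloc((size_t)n, sizeof *tab);
    assert(tab != NULL);
    n = accum_alloc(tab);
    for (int i = 0; i < n; i++) {
        tab[i].base = malloc(tab[i].size);
        assert(tab[i].base != NULL);
    }
    Accum *a = accum_initObj(tab);

    int result = 0;
    accum_activate(a);
    for (int i = 0; i < count; i++) {
        result = accum_process(a, in[i]);
    }
    accum_deactivate(a);  /* scratch could now be lent to another algorithm */

    n = accum_free(a, tab);
    for (int i = 0; i < n; i++) {
        free(tab[i].base);
    }
    free(tab);
    return result;
}
```

Note that the algorithm never calls `malloc` itself; all allocation happens in `run_lifecycle`, which is the separation of policy the sequence is designed to enforce. A static system could replace the `malloc` calls with fixed buffers without touching the algorithm code.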
32 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Abstract Interface topics:
- IALG Abstract Interface
- Module Interface
- Interface Options
- Naming Rules
Questions:
- How do I access IALG functions?
- How do I access algorithm functions?
- Is there a naming style I can rely upon?
33 IALG Interface
- algNumAlloc: return maximum number of memory requests
- algAlloc: return all memory allocation requests to application
- algInitObj: initialize allocated instance memory
- algActivate: initialize scratch memory from persistent memory
- algMoved: instance memory moved
- algControl: algorithm-specific control operations
- algDeactivate: save state from scratch memory to persistent memory
- algFree: return pointers to all instance memory
IALG is an abstract interface that separates the algorithm from application scheduling and memory management policies.
Compliant algorithms are packaged in modules that include the IALG implementation.
34 Algorithm “Module” Interface
IALG_Fxns:
    Void algActivate(IALG_Handle);
    Int algAlloc(const IALG_Params *,…);
    Int algControl(IALG_Handle, …);
    Int algDeactivate(IALG_Handle);
    Int algFree(IALG_Handle, …);
    Int algInit(IALG_Handle, …);
    Void algMoved(IALG_Handle, …);
    Int algNumAlloc();
IG729_Fxns (IALG_Fxns plus):
    Void decode(IG729_Handle, IG729_Frm …);
    Void encode(IG729_Handle, Int16 *in,…);
    Void …
- Algorithm interfaces are abstract interfaces derived from IALG
- IALG functions provide the methods to create/manage “instance objects”
- Additional module-specific functions are appended to access the algorithms themselves
- Abstract interfaces define a “v-table” for accessing the module’s functions
- Abstract interfaces define module functions as a structure of pointers
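The "structure of pointers" idea can be illustrated with a cut-down v-table. The sketch below is an assumption-laden stand-in, not the real IALG_Fxns layout: `BaseFxns`, `ScaleFxns`, `ScaleObj`, and the vendor name "ACME" are all hypothetical, the base table has only three entries, and the signatures are simplified. What it does show accurately is the technique: the base table comes first in the derived table, and the instance object's first field points at the table.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BaseFxns BaseFxns;

/* Every instance object starts with a pointer to its module's v-table. */
typedef struct BaseObj { const BaseFxns *fxns; } BaseObj;
typedef BaseObj *Handle;

/* Simplified stand-in for the base (IALG-like) function table. */
struct BaseFxns {
    int  (*numAlloc)(void);
    void (*activate)(Handle h);
    void (*deactivate)(Handle h);
};

/* Derived interface: base table first, module-specific functions appended. */
typedef struct ScaleFxns {
    BaseFxns base;
    int (*process)(Handle h, int sample);
} ScaleFxns;

typedef struct ScaleObj {
    BaseObj base;  /* must be first so a ScaleObj* is also a valid Handle */
    int gain;
    int active;
} ScaleObj;

static int  scale_numAlloc(void)       { return 1; }
static void scale_activate(Handle h)   { ((ScaleObj *)h)->active = 1; }
static void scale_deactivate(Handle h) { ((ScaleObj *)h)->active = 0; }
static int  scale_process(Handle h, int sample)
{
    ScaleObj *obj = (ScaleObj *)h;
    assert(obj->active);
    return obj->gain * sample;
}

/* One v-table per module, shared by all of its instances. */
const ScaleFxns SCALE_ACME_FXNS = {
    { scale_numAlloc, scale_activate, scale_deactivate },
    scale_process
};

void scale_init(ScaleObj *obj, int gain)
{
    obj->base.fxns = &SCALE_ACME_FXNS.base;
    obj->gain = gain;
    obj->active = 0;
}

/* Generic framework code: knows only the base part of the table. */
void framework_activate(Handle h)   { h->fxns->activate(h); }
void framework_deactivate(Handle h) { h->fxns->deactivate(h); }

/* Client code that knows the module reaches the appended function. */
int scale_call(Handle h, int sample)
{
    const ScaleFxns *f = (const ScaleFxns *)h->fxns;
    return f->process(h, sample);
}
```

Because the base table sits at offset zero of the derived table, framework code can manage any module through the base functions alone, while module-aware code casts to the derived table to reach the processing functions.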
35 Interface Options
[Slide graphic: the Application reaches the Algorithm through a Standard, Module, or Vendor interface]
- Standard Interface:
  - Abstract template
  - Defined by TI
  - IALG table only
- Module Interface:
  - Required for compliance
  - Defined by vendor
  - IALG + alg fxns
- Vendor Interface:
  - Optional method
  - Defined by vendor
  - e.g., “shortcuts”
36 Naming Rules
- All external identifiers follow the format: MODule_VENdor_xxx
  - example: Line Echo Canceller from Texas Instruments: LEC_TI_run
- Extensions to the library file types define the target architecture:
  - MOD_VEN.a62: 62xx target
  - MOD_VEN.a62e: 62xx target, big endian
  - MOD_VEN.a54f: 54x target, far call/rtn version
  - MOD_VEN.a54n: 54x target, near call/rtn version
- Avoid name space pollution (target symbols, development system files)
- Enable tool support
  - Semantics of operations and object files can be inferred
  - Installation is simplified; generic installation programs can be created
- Supports vendor differentiation: vendor-specific operations can be added
- Simplifies code audits: understand one algorithm and you know them all
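Because the naming format is mechanical, tools can build and check names automatically. The helpers below are a hypothetical sketch (none of these functions come from TI's tools) that composes a `MODULE_VENDOR_op` symbol, composes a `MODULE_VENDOR.ext` library name, and checks an identifier for the expected prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "MODULE_VENDOR_op" into out; 0 on success, -1 if it does not fit. */
int make_symbol(char *out, size_t cap, const char *module,
                const char *vendor, const char *op)
{
    assert(out != NULL && module != NULL && vendor != NULL && op != NULL);
    int n = snprintf(out, cap, "%s_%s_%s", module, vendor, op);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Build "MODULE_VENDOR.ext", where ext encodes the target (e.g. "a62"). */
int make_library_name(char *out, size_t cap, const char *module,
                      const char *vendor, const char *ext)
{
    assert(out != NULL && module != NULL && vendor != NULL && ext != NULL);
    int n = snprintf(out, cap, "%s_%s.%s", module, vendor, ext);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Return 1 when symbol starts with "MODULE_VENDOR_", 0 otherwise. */
int has_module_prefix(const char *symbol, const char *module, const char *vendor)
{
    char prefix[64];
    assert(symbol != NULL && module != NULL && vendor != NULL);
    int n = snprintf(prefix, sizeof prefix, "%s_%s_", module, vendor);
    if (n < 0 || (size_t)n >= sizeof prefix) {
        return 0;
    }
    return strncmp(symbol, prefix, (size_t)n) == 0;
}
```

A check like `has_module_prefix` is the kind of audit the slide alludes to: any external symbol in a vendor's library that lacks the prefix is a candidate for name space pollution.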
37 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Instance Objects topics:
- The Instance Object
- App to Alg Control Flow
- Re-entrancy
- Multiple Instances
Questions:
- How does the application find and interact with the algorithm functions?
- How do we assure no hardware conflicts between algorithms?
- What about the case of re-entrancy or multiple instances of an algorithm?
38 Application to Algorithm Control Interface
[Slide graphic: memory sections and what they hold]
- .bss / stack: handleXY
- .sysmem: instanceXY (*IALG_Fxns, *a, *x, len, ...)
- .bss: globals; vtable “XY” (X_Y_num, X_Y_alloc, X_Y_init, …, X_Y_run)
- .text: num, alloc, …, run (algorithm code)
- .cinit: copy of V table
Instance Object: table of pointers to data structures
- 1: ptr. to v.table (module interface)
- 2-N: alg data arrays and variables
39 Application to Algorithm Chronology
[Slide graphic: .bss / stack: handleXY; .sysmem: instanceXY (*IALG_Fxns, *a, *x, *x_stg), plus buffers a, x, x_stg; .bss: globals, vtable XY; .text: alg code; .cinit: copy of V table]
1. On build: alg code
2. At boot: V.table
3. FIR_TI_Alloc(): mem for inst obj, x, a, x_stg
4. FIR_TI_InitObj(): fill inst.obj & persist
5. FIR_TI_Activate(): fill scratch
6. FIR_TI_Run(): process FIR filter
7. FIR_TI_Deactiv(): x to x_stg, reclaim x
8. FIR_TI_Free: reclaim inst.obj, x_stg, a
40 Re-entrancy & Multiple Instances
[Slide graphic: timeline from high to low priority showing SWI A, SWI B, and IDLE, with Process blocks overlapping]
During this time, both A and B are running the same function. How do we avoid having A’s context overwrite B’s?
- Concurrent running of multiple instances of the same algorithm must be supported.
- Allow repeated entries in a preemptive environment.
- Reentrancy enables multiple-channel (instance) systems.
“Reentrancy is the attribute of a program or routine that allows the same copy of a program or routine to be used concurrently by two or more threads”
41 Multiple Instances of an Algorithm
[Slide graphic: .bss/stack: handleXY1, handleXY2; instanceXY1 and instanceXY2, each with *IALG_Fxns, *a, *x, *x_stg and its own x and x_stg buffers; .bss: globals, vtable XY, a; .text: alg code; .cinit: copy of V table]
- Allocate, Activate as many instances as required
- Uniquely named handles allow control of individual instances of the same algorithm
- All instance objects point to the same v.table
- Constant tables are common
- Scratch can be common or separate as desired
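A small host-side sketch makes the re-entrancy requirement concrete. The names here (`FirInstance`, `fir_init`, `fir_run`, the 3-tap coefficient table) are hypothetical and chosen only for illustration; the point is that all mutable state lives in the instance object, the coefficient table is shared and read-only, and the processing function touches nothing else, so interleaved calls on two channels cannot disturb each other.

```c
#include <assert.h>
#include <stddef.h>

#define TAPS 3

/* Read-only coefficient table shared by every instance (3-tap moving sum). */
static const int FIR_COEFFS[TAPS] = { 1, 1, 1 };

/* Hypothetical instance object: all mutable state lives here, not in globals. */
typedef struct {
    const int *a;   /* shared, read-only coefficients */
    int x[TAPS];    /* per-instance delay line */
} FirInstance;

void fir_init(FirInstance *f)
{
    assert(f != NULL);
    f->a = FIR_COEFFS;
    for (size_t i = 0; i < TAPS; i++) {
        f->x[i] = 0;
    }
}

/* Re-entrant: reads and writes only its arguments and read-only data. */
int fir_run(FirInstance *f, int sample)
{
    assert(f != NULL);
    for (size_t i = TAPS - 1; i > 0; i--) {
        f->x[i] = f->x[i - 1];   /* shift the delay line */
    }
    f->x[0] = sample;
    int acc = 0;
    for (size_t i = 0; i < TAPS; i++) {
        acc += f->a[i] * f->x[i];
    }
    return acc;
}
```

Had the delay line been a file-scope array instead of a field of `FirInstance`, the second channel's samples would have leaked into the first channel's output, which is exactly the context-overwrite problem raised on the re-entrancy slide.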
42 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Algorithm Coding Rules topics:
- Coding Rules
- Threads vs Algorithms
- Object Based Programming
Questions:
- What rules do compliant algorithm functions follow?
- How do algorithms relate to the DSP/BIOS scheduling environment?
- How do the various concepts relate to each other?
43 Algorithm Standard Coding Rules
- General Coding:
  - No self-modifying code
  - C callable
  - Re-entrant
- Processor Access:
  - No direct memory allocation
  - Relocatable data and code
  - No direct peripheral interface
- Application “Framework” manages all hardware resources
- Benefits:
  - No hardware contention
  - Portability to other DSPs
[Slide graphic labels: DSP; A/D; D/A; Alg; Algo Standard; ctrl; status; Application; Core Run-time]
44 Threads vs Algorithms
- A thread may call multiple algorithm instances
- Algorithms are not, and may not use, threads
- Algorithms are “data transducers”, not schedulers
[Slide graphic: Thread “A” and Thread “B” calling algorithm instances G.729 X, G.729 Y, G.168, DTMF, DTMF]
Compliant algorithms are:
- “pure” data transducers with state: not threads
- “black box” components, accessed by v.table
- extensible via vendor interface option
  - Allows for unique methods and creation parameters
  - Users may directly access these features but lose interchangeability
45 Object Based Programming Environment
- Module
  - Smallest logical unit of software
  - Each module has, defined in the module’s header file, a particular:
    - Interface and calling conventions
    - Data structures
- Interface
  - Used by the client to systematically interact with a module
  - Relates a set of constants, types, variables & functions visible to client
- Instance Object
  - Unique set of parameters that define the state of each instance
46 TMS320™ DSP Algorithm Standard
Agenda: Introduction; Memory Types; Memory Setup Sequence; Abstract Interface; Instance Objects; Algorithm Coding Rules; Conclusions
Conclusions topics:
- Value of the Standard
- Algorithm “Package”
- Algorithm Documentation
- Developer Resources
- System Overhead
47 Value of the TMS320™ DSP Algorithm Standard
Off-the-shelf DSP content:
- An application can use algorithms from multiple vendors
- An algorithm can be inserted into practically any application
- The same code can be used in static or dynamic systems
- Algorithms can be distributed in binary form
Faster, easier algorithm integration:
- Be measurable/verifiable, so that an algorithm's conformance can be checked
- Enable host tools to simplify:
  - Configuration
  - Performance modeling
  - Standard conformance
  - Debugging
- Quantify the algorithm’s memory, latency, and speed
- Be easy to adhere to
- Incur little or no overhead
48 Compliant Algorithm “Package”
Compliant algorithms must include:
- Libraries of the code provided
- Header files listing the implemented abstract interfaces
- Documentation defining the algorithm
[Figure labels: TMS320™ DSP Algorithm Standard, LIB, H, DOC]
49 Algorithm Performance Characterization
All algorithms must characterize their:
- Memory requirements
- Execution time
- Interrupt latency
Standard basis for comparison and tradeoffs
50 Algorithm Developer Resources
- Documents
  - Manuals
  - Application notes
- Developers kit
  - Runtime support libraries and all interface headers
  - Example algorithms and applications source code
- Development tools
- Web resource
52 TMS320 DSP Algorithm Standard: Overview & Rationalization
Hello and welcome to the overview and rationalization of the TMS320 DSP Algorithm Standard, the fourth and final section of this online training. My name is Maher Katorgi. I’m a member of the Software Technical Staff supporting Texas Instruments’ eXpressDSP software technology. We're here today to describe the new TMS320 DSP Algorithm Standard and how it will help take algorithm components to the next level.
53 Agenda
- Overview
- Interactions with eXpressDSP Technologies
- Rationalization and Benefits
The agenda for this section consists of three parts. First, we will present an overview of the TMS320 DSP Algorithm Standard. Next, we will highlight the relationship between the algorithm standard and the other tools of the eXpressDSP technology. Finally, we will explain the rationale behind the rules that make up the algorithm standard and highlight their benefits to consumers of algorithms that comply with these rules.
54 TMS320 DSP Algorithm Standard
[Figure labels: ALGORITHM PRODUCERS (write once, deploy widely); Common Programmatic Interface; Rules & Guidelines; Resource Management Framework(s); ALGORITHM CONSUMERS (ease-of-integration); static, dynamic; alg 1 ... alg n; chan 1 ... chan n]
Rules & Guidelines:
- uniform naming conventions
- register usage requirements
- data addressing modes
- re-entrant, relocatable code
- memory allocation policies
- access to HW peripherals
- minimizing interrupt latency
- performance characterization
TI's TMS320 DSP Algorithm Standard is specifically designed to achieve these goals. The standard is made up of over 30 programming rules and several additional guidelines, some of which we'll take a look at in a minute. The rules cover general programming issues that apply to all algorithms, specific rules for each of the TI platforms, specific resource management APIs for handling memory and DMA, and finally algorithm characterization rules.
55 eXpressDSP™ - Technology Interactions
eXpressDSP™: different tools to solve different problems
Logical: Code Composer Studio (get the code to work)
- Single channel, single algorithm
- Single GUI for develop & debug
- Graphical Data Analysis
- Expandable by 3P plug-ins
Temporal: DSP/BIOS II (meet real-time goals)
- Multi-algorithm
- Software scheduling
- Real-time analysis
- Hardware abstraction
Physical: DSP Algorithm Standard (off-the-shelf software)
- Multi-Channel
- Static or dynamic
- Memory and DMA management
- Single or multi-channel
The third dimension is the physical dimension. These are the actual resources like banks of memory and DMA channels, incredibly valuable resources on any DSP system. The DSP Algorithm Standard is specifically tailored to help solve these resource management issues. In particular it helps with both static and dynamic systems, single or multi-channel environments, plus different memory types and DMAs.
Added together, these three dimensions give the user an incredibly powerful development and debug environment not offered by any other solution.
56 Algorithm Standard - Rules & Benefits
Consistency/Ease of Integration
- Hands off certain registers
- Access all data as far data
- Little endian format
- DSP/BIOS name conventions
Portability/Flexibility
- Re-entrant code
- Code must be re-locatable
- No direct access to peripherals
Usability/Interoperability
- Standardized memory management
- Standardized DMA management
Measurability
- Worst case memory usage
- Worst case interrupt latency
- Worst case execution time
Finally, there is measurability. This enables the specifications for algorithms to be previewed and verified. Examples of these are memory requirements, worst case interrupt latency, and worst case execution time. It is these standardized metrics that make it so much easier to compare one algorithm to another.
57 Objective
Explain the rationale behind the rules of the eXpressDSP Algorithm Standard and their benefits to customers of compliant algorithms.
The goal of this session is to explain the rationale behind the rules that make up the eXpressDSP algorithm standard and to highlight their benefits to consumers of algorithms that comply with them.
58 TMS320 DSP Algorithm Standard
Definition: TMS320 DSP Algorithm Standard
A set of rules designed to ensure components interoperate with algorithms from different vendors in virtually any application.
First, I would like to start with a definition of what the eXpressDSP standard really is. Simply put, it is a set of rules and guidelines that allow algorithms from different vendors to play nicely together in virtually any application.
59 Respect C Run-time Conventions
All algorithms must follow the run-time conventions imposed by TI’s implementation of the C programming language.
- Need to avoid having algorithm interfere with application state
- Top-most interface must be “C callable”
- Most DSP systems run in a C environment: common interface language and run-time support libraries used
Benefits
- Ease of Integration
  - Binding algorithms to application
  - Control flow of data between algorithms
  - No run-time surprises
With that I’ll move to the rules.
C is by far the most common interface language in DSP systems. The software sub-systems making up the solution are bound together using C. By ensuring that algorithms are callable from C and follow the C run-time conventions, a system integrator can easily bind algorithms together, control the flow of data between algorithms, and interact with other processors in the system without any concern that algorithms will interfere with application state. This does not mean that algorithms must be written in C, only that they must be callable from C and maintain the C run-time conventions.
60 Algorithms Must be Re-entrant
All algorithms must be re-entrant within a preemptive environment.
- Algorithm code running multiple times simultaneously
  - Multi-channel systems (e.g. servers)
  - Real-time systems with real-time OS
- Tasks are independent of each other and reentrancy must be ensured
- Memory or global variables shared by multiple instances must be protected
Benefits
- Flexibility
  - Optimized program memory usage (e.g. multiple channels will be running same code)
  - Maintains integrity of algorithm instance data
Reentrancy. Any real-time embedded system must address reentrancy. All but the simplest systems will most likely require reentrant code. Reentrancy is crucial to any section of code that may be invoked more than once at the same time (multi-channel systems, for example). Moreover, in a real-time OS each task is independent, so reentrancy becomes a real concern. Any function shared between tasks can be a source of problems, since the scheduler can context switch on a timer tick during the execution of that routine and then schedule another task that invokes the same function. Each instance of execution must have its own set of local variables to avoid corrupting the other instance. In multi-channel systems, a function must be reentrant if more than one channel is to use it.
What are the consumer benefits? They can create multiple algorithm instances running the same code simultaneously: one copy of the code in program memory, with multiple instances using it. All of this happens in real time while the integrity of each instance's data is maintained.
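To make the reentrancy rule concrete, here is a minimal C sketch. It is not taken from the standard: the running-sum "algorithm" and the names `acc_process_shared`, `AccState`, `acc_init`, and `acc_process` are hypothetical, chosen only to contrast shared static state with per-instance state.

```c
#include <assert.h>
#include <stddef.h>

/* NOT re-entrant: the running sum lives in one static variable, so two
 * channels (or a preempting task) share it and corrupt each other. */
static int g_sum = 0;

int acc_process_shared(const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        g_sum += in[i];
    }
    return g_sum;
}

/* Re-entrant: all persistent state lives in a per-instance object that
 * the caller owns and passes in; everything else is a local variable. */
typedef struct {
    int sum; /* running sum for this instance (channel) only */
} AccState;

void acc_init(AccState *s)
{
    assert(s != NULL);
    s->sum = 0;
}

int acc_process(AccState *s, const int *in, size_t n)
{
    assert(s != NULL && in != NULL);
    for (size_t i = 0; i < n; i++) {
        s->sum += in[i];
    }
    return s->sum;
}
```

With two `AccState` objects, two channels can run the same `acc_process` code without seeing each other's sums, which is the "one copy of the code, many instances" benefit described in the notes.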
61 Data & Code Relocatability
All algorithm data (code) references must be fully relocatable. There must be no “hard coded” data (program) memory locations.
- Ability to run algorithm components from any type of memory
- Optimized use of memory resources
- Allows any number of instances without data collisions
Benefits
- Portability
  - Transfer algorithms between systems
- Flexibility
  - Placement of algorithm components anywhere in memory
  - Running algorithms within a range of operating environments
All DSP programmers are aware of the limited on-chip memory available to them. To increase functionality, many DSPs also provide larger amounts of external memory, but not without a penalty: external memory has a significantly longer access time. Since on-chip memory is significantly faster than off-chip memory, algorithms tend to place frequently accessed data in on-chip memory for optimization. It then becomes the system integrator’s duty to determine the optimal location for each algorithm component, and this rule exists to simplify that task.
Similarly, many algorithms have initialization code that is run once during the lifespan of an application. The execution cost of this “run-once” code is usually not a factor, so it may be placed in external memory.
Consumer benefits? Portability, for one. Algorithms can be transferred between systems with different memory configurations, from a 6201 to a 6211 architecture for example. Flexibility is another benefit: it allows optimum use of memory resources and the ability to run in different operating environments, such as with or without cache on the 6211 architecture.
62 No Direct Peripheral Access
Algorithms must never directly access any peripheral device. This includes, but is not limited to, on-chip DMA, timers, I/O devices, and cache control registers.
- Algorithms cannot know what peripherals exist or are available
- Specific resources will vary from system to system
- Multiple algorithms will compete for resources
- Peripherals need to be configured differently for various algos
Benefits
- Interoperability
  - Framework manages resources
  - No resource competition
- Portability
  - Transfer s/w between systems
  - Platform independence
The sole purpose of an algorithm is to process information. This is a fundamental assumption of the eXpressDSP standard. The algorithm is not aware of the peripherals on the system it is running on. The client of that algorithm (the framework) is responsible for fetching the data from the system and explicitly passing it to the algorithm. One might argue that algorithms should be able to acquire data directly from a peripheral device, but with all the variations in chip designs and board layouts, there is no way to guarantee that the algorithm will run in a system with a chip that has the designated peripheral. Even if the designated peripheral does exist, what if two algorithms use DMA? Worse, what if both algorithms require the DMA to be configured differently? The algorithms would not be able to work together.
Benefits to consumers? This rule ensures interoperability of compliant algorithms because algorithms do not compete for peripheral resources. It also drives portability, in that algorithms are independent of the platform they run on and hence can be moved between systems.
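The division of labor described above, where the framework fetches data and the algorithm only processes buffers, might look like the following C sketch. Everything here is an assumption made for illustration: the gain algorithm, the `ReadFxn`/`WriteFxn` driver hooks, `framework_run_once`, and the fake A/D and D/A drivers are hypothetical and are not part of the standard or of any TI driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LEN 4

/* Clamp a 32-bit intermediate result to the 16-bit sample range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Algorithm side: a pure data transducer. It sees only the buffers it
 * is handed and never touches DMA, timers, or I/O registers. */
void gain_apply(const int16_t *in, int16_t *out, size_t n, int16_t gain)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = saturate16((int32_t)in[i] * gain);
    }
}

/* Framework side: hypothetical driver hooks supplied by the system. */
typedef size_t (*ReadFxn)(int16_t *dst, size_t max);
typedef void (*WriteFxn)(const int16_t *src, size_t n);

size_t framework_run_once(ReadFxn rd, WriteFxn wr, int16_t gain)
{
    int16_t in[FRAME_LEN];
    int16_t out[FRAME_LEN];
    size_t n = rd(in, FRAME_LEN);  /* only the framework talks to hardware */
    if (n > FRAME_LEN) n = FRAME_LEN;
    gain_apply(in, out, n, gain);  /* the algorithm only processes data */
    wr(out, n);
    return n;
}

/* Stand-in "peripherals" so the sketch can run on a host machine. */
int16_t fake_dac[FRAME_LEN];

size_t fake_adc_read(int16_t *dst, size_t max)
{
    for (size_t i = 0; i < max; i++) {
        dst[i] = (int16_t)(i + 1);  /* samples 1, 2, 3, ... */
    }
    return max;
}

void fake_dac_write(const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n && i < FRAME_LEN; i++) {
        fake_dac[i] = src[i];
    }
}
```

Because `gain_apply` has no knowledge of where its samples come from, the same function can be moved to a board with a different converter or DMA setup by swapping only the framework's read and write hooks.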
63 Symbol Naming Conventions
All external definitions must be either API references or API and vendor prefixed.
All modules must follow the naming conventions of the DSP/BIOS for those external declarations exposed to the client.
- Algorithms must avoid name space collisions
- Different algorithms may have the same name for data types and functions
- Application cannot resolve multiply-defined symbols
Benefits
- Ease of integration
  - No name space collision
  - Single consistent naming convention
  - Shorter system integration time
- Consistency
  - Enhanced code readability
  - Compliant algorithms intended for use with DSP/BIOS
One of the most frequently encountered problems in integrating multiple algorithms is external identifiers being defined more than once. System integrators waste valuable time waiting for a vendor to rename identifiers within the algorithm to resolve this problem. To avoid such namespace collisions, algorithm providers must prefix all external identifiers in a module with the API name, or with the API and vendor names. This naming convention brings integrators one step closer to a seamless integration. Furthermore, one of the primary benefits of following a single consistent naming convention is that it clarifies the purpose, logic, and information flow within code. System integrators can leverage the naming convention to simplify the integration of multiple algorithms within a system, because it greatly enhances the readability of the code. One may ask why the DSP/BIOS convention was chosen. It was chosen not only because compliant algorithms were intended to be incorporated into systems that use DSP/BIOS, but also because it provides the benefits stated above. Clearly, the benefits to the consumer are consistency and ease of integration.
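As a small illustration of the prefixing idea, consider two vendors that each ship a FIR module. The vendor names ACME and ZETA, and the `scale` function itself, are hypothetical; the sketch only shows how an API-plus-vendor prefix keeps both sets of symbols linkable into one application.

```c
/* Vendor ACME's FIR module (hypothetical). Without the prefix, this
 * function and the one below might both be called FIR_scale, and the
 * linker would report a multiply-defined symbol. */
int FIR_ACME_scale(int x)
{
    return x * 2;
}

/* Vendor ZETA's FIR module (hypothetical), linked into the same image. */
int FIR_ZETA_scale(int x)
{
    return x * 3;
}
```

An integrator can then call either implementation by name, and a reader of the application code can tell at a glance which module and which vendor a symbol belongs to.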
64 Module External References
All undefined references must refer to operations from a subset of C runtime support library functions, DSP/BIOS or other eXpressDSP-compliant modules.
- Algorithms are as compliant as the modules they invoke
- Algorithm must not reference non-compliant modules
Benefits
- Ease of integration
  - DSP/BIOS and C RTS part of CCS
  - Single consistent naming convention
  - Shorter system integration time
- Consistency
  - Enhanced code readability
  - Compliant algorithms intended for use with DSP/BIOS
An algorithm is only as compliant as the functions and modules that it invokes. To remain compliant, it must not reference any non-compliant functions or modules. For example, if an algorithm references a function that is known to be non-reentrant, the algorithm inherits that non-reentrant attribute. The eXpressDSP standard specification documents a list of TI C-language run-time support library functions and DSP/BIOS run-time support library modules that can be safely referenced by a compliant algorithm.
Again in this case, the benefits to consumers are consistency and ease of integration.
65 Abstract Interface Implementation
All algorithms must implement the IALG interface.
- Defines communication protocol between client and algorithm
- Enables client to create, manage and terminate algorithm instances
- Run in virtually any system (preemptive and non-preemptive, static and dynamic)
- Common to all compliant algorithms
Benefits
- Ease of integration
  - Uniform abstract interface
  - Learn once apply many
  - Shorter system integration time
- Interoperability/Consistency
  - Uniform abstract interface
- Flexibility
  - Running algorithms in virtually any execution environment
The abstract algorithm interface (IALG) defines a framework for the creation of algorithm instance objects. All algorithms must implement it so that they can declare their memory resource requirements, which enables the client application to make efficient use of on-chip data memories. The client and algorithm then share a defined communication protocol for creating, managing and terminating an algorithm instance object at run-time. Together, the features of the IALG interface allow clients to manipulate all algorithms uniformly within any system.
What are the benefits to consumers? A uniform communication interface supports interoperability. Ease of integration is another benefit: integrators learn the communication protocol once and apply it to every compliant algorithm they integrate, which leads to shorter integration time. The IALG interface is also flexible in that it allows the client to use the algorithm in virtually any system (i.e., preemptive and non-preemptive, static and dynamic).
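The protocol described above, in which the algorithm states its memory needs and the client does the allocating, can be sketched in C roughly as follows. This is a deliberately simplified stand-in and not TI's actual IALG definition: the types `MemRec` and `AlgFxns`, the counter algorithm, and the `client_create`/`client_delete` helpers are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a memory request record. */
typedef struct {
    size_t size; /* bytes the algorithm asks for */
    void  *base; /* filled in by the client, never by the algorithm */
} MemRec;

/* Simplified stand-in for an algorithm's function table (v-table). */
typedef struct {
    int (*alloc)(MemRec *rec);                 /* algorithm states its needs */
    int (*init)(void *obj, const MemRec *rec); /* algorithm sets up its state */
} AlgFxns;

/* Example algorithm: a counter with one block of instance state. */
typedef struct {
    int count;
} CounterObj;

static int counter_alloc(MemRec *rec)
{
    rec->size = sizeof(CounterObj);
    rec->base = NULL;
    return 1; /* number of memory records filled in */
}

static int counter_init(void *obj, const MemRec *rec)
{
    (void)rec;
    ((CounterObj *)obj)->count = 0;
    return 0; /* 0 means success in this sketch */
}

const AlgFxns COUNTER_FXNS = { counter_alloc, counter_init };

/* Client side: owns every allocation decision. A dynamic system can use
 * malloc as shown; a static system could hand out a fixed buffer. */
void *client_create(const AlgFxns *fxns)
{
    MemRec rec;
    assert(fxns != NULL);
    if (fxns->alloc(&rec) != 1) return NULL;
    rec.base = malloc(rec.size);
    if (rec.base == NULL) return NULL;
    if (fxns->init(rec.base, &rec) != 0) {
        free(rec.base);
        return NULL;
    }
    return rec.base;
}

void client_delete(void *obj)
{
    free(obj);
}
```

The point of the design is that `client_create` works unchanged for any algorithm that supplies a function table of this shape, which is the "learn once, apply many" benefit from the slide.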
66 Abstract Interface Implementation
Each of the IALG methods implemented by an algorithm must be independently relocatable.
- Need for design/run-time creation of algorithm instances
- Ability to relocate algorithm interface methods in memory
- Ability to discard unused functions to reduce code size
- Optimized use of program memory
Benefits
- Flexibility
  - Placement of algorithm components anywhere in memory
  - Support for design/run-time (static/dynamic) integration (SPRA577, SPRA580, SPRA716)
In order to support both design-time and run-time creation of algorithm objects, all methods defined by the IALG interface must be independently relocatable. By simply placing each method in a separate file, or in a separate COFF (Common Object File Format) output section, the linker can be used to optimize the placement of algorithm components in memory or to eliminate methods that are not required for run-time object creation. Both lead to optimum use of memory.
Benefits? Flexibility, the ability to optimize memory usage, and support for both design-time and run-time systems. Application notes demonstrating this flexibility are available on the TI web site, including one on integrating algorithms with zero overhead (SPRA716). Well, zero is not exactly true: one word of memory is required.
67 Algorithm Packaging
Each compliant algorithm must be packaged in an archive which has a name that follows a uniform naming convention.
- Integrate different algorithms without symbol collision
  - Unique archive names between different algorithms and versions of algorithms
- Uniform format for delivering algorithms
  - All algorithm modules built into a single archive
- Use of algorithms in different development platforms (UNIX, Win)
  - Archive names are case sensitive
Benefits
- Consistency
  - Uniform naming convention
- Ease of integration
  - Single consistent naming convention
  - Unique archive names (no symbol collision)
  - Shorter system integration time
- Flexibility
  - Support different development systems
System integrators know the harsh reality of linking an application and discovering that symbols are multiply defined. The same frustration can occur when copying algorithms with the same name from multiple vendors into a single directory, or when a vendor has provided multiple versions of the same algorithm under the same filename. It becomes a hassle for system integrators to rename all these files. This rule addresses the issue by having each vendor organize its filenames, rather than having every system integrator try to keep track of all the algorithms. Since most modules consist of many object files, algorithms must be delivered in a form that can be uniformly integrated into a system. The standard has determined the method of delivery to be an archive. To ensure that no two algorithms have the same archive name, all algorithms must follow the naming convention. To further ensure that the algorithm can be used on both UNIX and Windows development systems, archive names must not be case sensitive.
The consumer benefits in this case are a uniform naming convention for consistency, support for different development systems (UNIX, Windows) for flexibility, and ease of integration with no symbol collision surprises.
68 Performance and Requirements Metrics
All compliant algorithms must characterize:
- Program/heap/static/stack memory requirements
- Worst case interrupt latency
- Typical period and worst case execution for each method
Planning integration of algorithm A vs. B into system
- Assess performance metrics and compatibility with system
- Assess resource requirements and compatibility with system
Benefits
- Measurability
  - Up-front assessment and comparison tool
- Ease of integration
  - Determine algorithm compatibility with system
  - Optimum data/code placement in memory
  - Optimum resource allocation (static, stack, etc.)
  - Optimum scheduling (latency, execution, etc.)
System integrators would have no problem creating systems if DSPs had unlimited resources. Unfortunately, DSPs have a limited amount of program and data memory available. Algorithm providers must therefore characterize the memory requirements of their algorithms. Otherwise, the burden of determining how much memory to allocate for each algorithm falls on the system integrator. While this task is not impossible, it is quite costly and inefficient. Characterizing memory requirements allows system integrators to determine the feasibility of a particular algorithm in their system. Interrupt latency, the maximum time for which interrupts are disabled, must also be documented. It is important that the designer know this time so that the system can compensate for it. The typical period and worst case execution time must be documented as well. This improves the integrator’s ability to integrate, plan and schedule the algorithm without breaking the real-time behavior of the system and risking data loss. Clearly the benefit to the consumer is measurability. All these metrics give the integrator a measure for comparing algorithm A with algorithm B. They also simplify integration by allowing optimum resource management.
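As a rough illustration of how an integrator might use these published metrics, the following C sketch sums the characterized requirements of several algorithms and compares them with a system budget. The structure fields, the `fits_budget` function, and the policy choices are assumptions made for this example: it assumes the algorithms run to completion on one shared stack (so the largest stack requirement is used rather than the sum), and it ignores interrupt latency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Metrics as a vendor might document them (hypothetical layout). */
typedef struct {
    const char *name;
    uint32_t data_bytes;       /* heap plus static data */
    uint32_t stack_bytes;      /* worst-case stack usage */
    uint32_t worst_cycles;     /* worst-case cycles per call */
    uint32_t calls_per_second; /* derived from the algorithm's period */
} AlgMetrics;

typedef struct {
    uint32_t data_bytes;        /* data memory available */
    uint32_t stack_bytes;       /* size of the shared stack */
    uint64_t cycles_per_second; /* CPU cycles available per second */
} SystemBudget;

/* Returns 1 if every algorithm fits within the budget, 0 otherwise. */
int fits_budget(const AlgMetrics *algs, size_t n, const SystemBudget *b)
{
    uint64_t data = 0;
    uint64_t cycles = 0;
    uint32_t stack = 0;

    assert(algs != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        data += algs[i].data_bytes;
        cycles += (uint64_t)algs[i].worst_cycles * algs[i].calls_per_second;
        if (algs[i].stack_bytes > stack) {
            stack = algs[i].stack_bytes; /* shared stack: keep the maximum */
        }
    }
    return data <= b->data_bytes
        && stack <= b->stack_bytes
        && cycles <= b->cycles_per_second;
}
```

In a preemptive system where algorithms run on separate task stacks, the stack check would need to sum the requirements instead; the point is only that standardized metrics make this kind of up-front check possible before any code is integrated.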
69 Summary of Key Benefits
Ease of Integration
- Uniform abstract interface (learn once apply many)
- Single consistent naming convention
- Shorter system integration time
- Determine algorithm compatibility with system
- No run-time surprises
Portability
- Transfer s/w between systems
- Platform independence
Measurability
- Up-front assessment and comparison tool for planning algorithm A vs. B
Flexibility
- Algorithm components anywhere in memory
- Algorithms run in virtually any execution environment
- Design/run-time integration
- Different development systems
- Multi-channel
Consistency
- Uniform naming conventions
- Enhanced code readability
Interoperability
- Uniform abstract interface
- No resource competition
In summary, these are the key benefits, on one slide, that customers gain when they integrate compliant algorithms into their system solutions.
This concludes this presentation session.
70 Further Reading on XDAIS
- Reference Frameworks for eXpressDSP Software: A White Paper (\XDAIS\spra094.pdf)
- Reference Frameworks for eXpressDSP Software: API Reference (\XDAIS\spra147.pdf)
- Using the TMS320 DSP Algorithm Standard in a Static DSP System (\XDAIS\spra577b.pdf)
- Making DSP Algorithms Compliant with the TMS320 DSP Algorithm Standard (\XDAIS\spra579b.pdf)
- Using the TMS320 DSP Algorithm Standard in a Dynamic DSP System (\XDAIS\spra580b.pdf)
- The TMS320 DSP Algorithm Standard (\XDAIS\spra581b.pdf)
- Achieving Zero Overhead With the TMS320 DSP Algorithm Standard IALG Interface (\XDAIS\spra716.pdf)
- Real-Time Analysis in an eXpressDSP-Compliant Algorithm (\XDAIS\spra732.pdf)
- Reference Frameworks for eXpressDSP Software: RF1, A Compact Static System (\XDAIS\spra791b.pdf)
71 Further Reading on XDAIS
- Reference Frameworks for eXpressDSP Software: RF3, A Flexible, Multi-Channel, Multi-Algorithm, Static System (\XDAIS\spra793b.pdf)
- TMS320 DSP Algorithm Standard Rules and Guidelines (\XDAIS\spru352d.pdf)
- TMS320 DSP Algorithm Standard API Reference (\XDAIS\spru360b.pdf)
- TMS320 DSP Algorithm Standard Demonstration Application (\XDAIS\spru361d.pdf)
- TMS320 DSP Algorithm Standard Developer’s Guide (\XDAIS\spru424.pdf)
- TMS320 DSP Algorithm Standard (\XDAIS\spru427.pdf)