4 Machine Independent Assembler Features More related to issues about:Programmer convenienceSoftware environmentCommon examples:LiteralsSymbol-defining statementsExpressionsProgram blocksControl sectionsAssembler directives are widely used to support these features
5 Literals Literal is equivalent to: Why use literals: Define a constant explicitly and assign an address label for itUse the label as the instruction operandWhy use literals:To avoid defining the constant somewhere and making up a label for itInstead, to write the value of a constant operand as a part of the instructionHow to use literals:A literal is identified with the prefix =, followed by a specification of the literal value
11 Object Program Using Literal The same as before
12 Literal vs. Immediate Addressing Same:Operand field contains constant valuesDifference:Immediate addressing: the assembler put the constant value as part of the machine instructionLiteral: the assembler store the constant value elsewhere and put that address as part of the machine instruction
13 Literal PoolAll of the literal operands are gathered together into one or more literal pools.Where is the literal pool:At the end of the object program, generated immediately following the END statementAt the location where the LTORG directive is encounteredTo keep the literal operand close to the instruction that uses it
14 Duplicate Literals Duplicate literals: The same literal used more than once in the programOnly one copy of the specified value needs to be storedFor example, =X’05’ in the example programHow to recognize the duplicate literalsCompare the character strings defining themEasier to implement, but has potential problem (see next)E.g., =X’05’Compare the generated data valueBetter, but will increase the complexity of the assemblerE.g., =C’EOF’ and =X’454F46’
15 Problem of Duplicate-Literal Recognition using Character Strings There may be some literals that have the same name, but different valuesFor example, the literal whose value depends on its location in the programThe value of location counter denoted by *BASE *LDB =*The literal =* repeatedly used in the program has the same name, but different valuesAll this kind of literals have to be stored in the literal pool
16 Implementation of Literal Data structure: a literal table LITTABLiteral nameOperand value and lengthAddressLITTAB is often organized as a hash table, using the literal name or value as the key
17 Implementation of Literal Pass 1As each literal operand is recognizedSearch the LITTAB for the specified literal name or valueIf the literal is already present, no action is neededOtherwise, the literal is added to LITTAB (store the name, value, and length, but not address)As LTORG or END is encounteredScan the LITTABFor each literal with empty address field, assign the address and update the LOCCTR accordingly
18 Implementation of Literal Pass 2As each literal operand is recognizedSearch the LITTAB for the specified literal name or valueIf the literal is found, use the associated address as the operand of the instructionOtherwise, error (should not happen)As LTORG or END is encounteredinsert the data values of the literals in the object programModification record is generated if necessary
19 Symbol-Defining Statements How to define symbols and their valuesAddress labelThe label is the symbol name and the assigned address is its valueFIRST STL RETADRAssembler directive EQUsymbol EQU valueThis statement enters the symbol into SYMTAB and assigns to it the value specifiedThe value can be a constant or an expressionAssembler directive ORGORG value
20 Use of EQUTo improve the program readability, avoid using the magic numbers, make it easier to find and change constant valuesReplace+LDT #4096withMAXLEN EQU+LDT #MAXLENTo define mnemonic names for registersA EQU 0X EQU 1BASE EQU R1COUNT EQU R2
21 Use of ORG Indirect value assignment: ORG value When ORG is encountered, the assembler resets its LOCCTR to the specified valueORG will affect the values of all labels defined until the next ORGIf the previous value of LOCCTR can be automatically remembered, we can return to the normal use of LOCCTR by simply writeORG
22 Example of Using ORG Consider the following data structure SYMBOL: 6 bytesVALUE: 3 bytes (one word)FLAGS: 2 byteswe want to refer to every field of each entry
23 Less readable and meaningful Not Using ORGWe can fetch the VALUE field byLDA VALUE,XX = 0, 11, 22, … for each entryOffsets from STABLess readable and meaningful
24 Restore the LOCCTR to its previous value Using ORGSet the LOCCTR to STABSize of fieldmore meaningfulRestore the LOCCTR to its previous value
25 Forward-Reference Problem Forward reference is not allowed her for EQU and ORG.That is, all terms in the value field must have been defined previously in the program.The reason is that all symbols must have been defined during Pass 1 in a two-pass assembler.AllowedNot allowed
27 Forward-Reference Problem Not allowedNot allowed
28 ExpressionsA single term as an instruction operand can be replaced by an expression.STAB RESBSTAB RESB *100STAB RESB (6+3+2)*MAXENTRIESThe assembler has to evaluate the expression to produce a single operand address or value.Expressions consist ofOperator+,-,*,/ (division is usually defined to produce an integer result)Individual termsConstantsUser-defined symbolsSpecial terms, e.g., *, the current value of LOCCTR
29 Relocation Problem in Expressions Values of terms can beAbsolute (independent of program location)constantsRelative (to the beginning of the program)Address labels* (value of LOCCTR)Expressions can beAbsoluteOnly absolute termsRelative terms in pairs with opposite signs for each pairRelativeAll the relative terms except one can be paired as described in “absolute”. The remaining unpaired relative term must have a positive sign.No relative terms may enter into a multiplication or division operationExpressions that do not meet the conditions of either “absolute” or “relative” should be flagged as errors.
30 Values associated with symbols Absolute ExpressionRelative term or expression implicitly represents (S+r)S: the starting address of the programr: value of the term or expression relative to SFor exampleBUFFER: S+r1BUFEND: S+r2The expression, BUFEND-BUFFER, is absolute.MAXLEN = (S+r2)-(S+r1) = r2-r1 (no S here)MAXLEN means the length of the buffer areaIllegal expressions: BUFEND+BUFFER, 100-BUFFER, 3*BUFFERValues associated with symbols
31 Absolute or RelativeTo determine the type of an expression, we must keep track of the types of all symbols defined in the program.We need a “flag” in the SYMTAB for indication.
32 Program Blocks and Control Sections Although the source program logically contains subroutines, data area, etc, they were assembled into a single block of object code in which the machine instructions and data appeared in the same order as they were in the source program.To provide flexibility:Program blocksSegments of code that are rearranged within a single object program unitControl sectionsSegments of code that are translated into independent object program units
33 Program Blocks As an example, three blocks are used: default: executable instructionsCDATA: all data areas that are less in lengthCBLKS: all data areas that consists of larger blocks of memoryThe assembler directive USE indicates which portions of the source program belong to the various blocks.
34 Program with Multiple Program Blocks At the beginning, the default block is assumed.
35 Program with Multiple Program Blocks Resume the default blockResume the CDATA block
36 Program with Multiple Program Blocks Resume the default blockResume the CDATA block
37 Program BlocksEach program block may actually contain several separate segments of the source program.The assembler will logically rearrange these segments to gather together the pieces of each block.The result is the same as if the programmer had physically rearranged the source statements to group together all the source lines belonging to each block.
38 Why Program Blocks To satisfy the contradictive goals: Separate the program into blocks in a particular orderLarge buffer area is moved to the end of the object programUsing the extended format instructions or base relative mode may be reduced. (lines 15, 35, and 65)Placement of literal pool is easier: simply put them before the large data area, CDATA block. (line 253)Data areas are scatteredProgram readability is better if data areas are placed in the source program close to the statements that reference them.
39 How to Rearrange Codes into Program Blocks Pass 1Maintain a separate LOCCTR for each program blockinitialized to 0 when the block is first begunsaved when switching to another blockrestored when resuming a previous blockAssign to each label an address relative to the start of the block that contains itStore the block name or number in the SYMTAB along with the assigned relative address of the labelIndicate the block length as the latest value of LOCCTR for each block at the end of Pass1Assign to each block a starting address in the object program by concatenating the program blocks in a particular order
40 How to Rearrange Codes into Program Blocks Pass 2Calculate the address for each symbol relative to the start of the object program by addingthe location of the symbol relative to the start of its blockthe assigned starting address of this block
41 Object Program with Multiple Program Blocks Loc/Block0: default1: CDATA2: CBLKSNo block number because MAXLEN is an absolute symbol
45 Example of Address Calculation LDA LENGTHThe value of the operand (LENGTH)Address 0003 relative to Block 1 (CDATA)address =0069 relative to programaddress =0060 relative to PC, in which the address of PC relative to program is =0009
46 Object ProgramIt is not necessary to physically rearrange the generated code in the object program to place the pieces of each program block together.The assembler just simply insert the proper load address in each Text record.
47 Program Blocks Loaded in Memory same orderNot present in object programRearrangement through loading
48 Control Sections A control section is a part of the program that maintains its identity after assemblyis often used for subroutine or other logical subdivision of a programcan be assembled, loaded, and relocated independentlyis more flexible
50 Program LinkingProgram linking is used to link together logically related control sectionsProblem:The assembler does not know where any other control section will be located at execution time.When an instruction needs to refer to instructions or data located in another control section, the assembler is unable to process this reference.The assembler has to generate information for such kind of references, called external references, that will allow the loader to perform the required linking.
51 Program with Multiple Control Sections Implicitly defined as an external symbolFirst control section: COPYDefine external symbolsExternal reference
52 Program with Multiple Control Sections Implicitly defined as an external symbolSecond control section: RDRECExternal reference
53 Program with Multiple Control Sections Implicitly defined as an external symbolThird control section: WRRECExternal reference
54 Assembler Directives for Control Section START:start the first control sectionset program name as the control section namedefine the control section name as an external symbolCSECT:start a new control sectionspecify the control section nameEXTDEF:define external symbolsEXTREF:name symbols defined in other control sections
55 How to Handle External References CLOOP +JSUB RDREC B100000The operand RDREC is an external reference.The assemblerhas no idea where RDREC isinserts an address of zerocan only use extended format to provide enough room (that is, relative addressing for external reference is invalid)passes information to the loader
56 How to Handle External References MAXLEN WORD BUFEND-BUFFERThere are two external references in the expression, BUFEND and BUFFER.The assemblerinserts a value of zeropasses information to the loaderAdd to this data area the address of BUFENDSubtract from this data area the address of BUFFEROn line 107, BUFEND and BUFFER are defined in the same control section and the expression can be calculated immediately.MAXLEN EQU BUFEND-BUFFER
60 How to Handle Control Sections The assemblerprocesses each control section independentlyestablishes a separate LOCCTR (initialized to 0) for each control sectionstores in SYMTAB the control section in which a symbol is definedallow the same symbol to be used in different control sectionsreports an error when attempting to refer to a symbol in another control section, unless the symbol is defined as an external referencegenerates information in the object program for external references
61 New Records for External References gives information about external symbols named by EXTDEFlists symbols used as external references, i.e., symbols named by EXTREF
64 Program RelocationAs well as for program linking, the revised Modification record may still be used to perform program relocation.
65 Expressions in Multiple Control Sections Extended restrictionBoth terms in each pair of an expression must be within the same control sectionLegal: BUFEND-BUFFERIllegal: RDREC-COPYHow to enforce this restrictionWhen an expression involves external references, the assembler cannot determine whether or not the expression is legal.The assembler evaluates all of the terms it can, combines these to form an initial expression value, and generates Modification records.The loader checks the expression for errors and finishes the evaluation.