The Assembly Process Basically why does it all work.

1 The Assembly Process Basically why does it all work

2 The Assembly Process
A computer understands machine code (binary). People (and compilers) write assembly language.
An assembler is a program that translates each instruction to its binary machine code equivalent.
    It is a relatively simple program.
    There is a one-to-one or near one-to-one correspondence between assembly language instructions and machine language instructions.
Assemblers do some code manipulation:
    Synthesis, like MAL to TAL
    Label resolution
    A macro assembler can also process simple macros like puts, or preprocessor directives.
Diagram: assembly source code → assembler → machine code

3 MAL → TAL
MAL is the set of instructions accepted by the assembler.
TAL is a subset of MAL: the instructions that can be directly turned into machine code.
There are many MAL instructions that have no single TAL equivalent.
To determine whether an instruction is a TAL instruction or not, look in appendix C or on the MAL/TAL sheet.
The assembler takes MAL instructions that are not true MIPS instructions and synthesizes each of them from 1 or more MIPS instructions.

4 MAL → TAL
For example,
    mul $8, $17, $20
becomes
    mult $17, $20
    mflo $8
MIPS has 2 registers for results from integer multiplication and division: HI and LO. Each is a 32-bit register.
mult and multu place the least significant 32 bits of the result into LO, and the most significant 32 bits into HI.
Multiplying two 32-bit numbers gives a 64-bit result: \((2^{32} - 1)(2^{32} - 1) = 2^{64} - 2 \cdot 2^{32} + 1\)
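The HI/LO split can be checked with a few lines of Python. This is only a sketch: `mult_hi_lo` is a hypothetical helper that models an unsigned multiply in the style of multu, not part of any assembler.

```python
# Largest unsigned 32-bit operand.
MAX32 = 2**32 - 1

def mult_hi_lo(a, b):
    """Return (HI, LO) for an unsigned 32x32-bit multiply (hypothetical helper)."""
    product = (a & MAX32) * (b & MAX32)
    hi = (product >> 32) & MAX32   # most significant 32 bits
    lo = product & MAX32           # least significant 32 bits
    return hi, lo

# The largest possible product still fits in 64 bits.
assert MAX32 * MAX32 == 2**64 - 2 * 2**32 + 1

hi, lo = mult_hi_lo(MAX32, MAX32)
print(hex(hi), hex(lo))  # 0xfffffffe 0x1
```

For small operands HI is 0, which is why the synthesized mul only needs mflo.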

5 MAL → TAL
mflo, mtlo, mfhi, mthi: Move From LO, Move To LO, Move From HI, Move To HI
Data is moved into or out of register HI or LO.
One operand is needed to tell where the data is coming from or going to.
For division (div or divu):
    HI gets the remainder
    LO gets the quotient
Why aren't these just put in $0-$31 directly?

6 MAL → TAL
TAL has only base displacement addressing.
So this:
    lw $8, label
becomes:
    la $8, label
    lw $8, 0($8)
which becomes:
    lui $8, 0xMSpart of label
    ori $8, $8, 0xLSpart of label
    lw $8, 0($8)
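The "MS part" and "LS part" are just the upper and lower 16 bits of the label's address. A minimal sketch of that split follows; the address 0x00400004 is an assumed example value, and `split_address` and `expand_lw_label` are hypothetical helper names.

```python
def split_address(addr):
    """Split a 32-bit address into the two 16-bit immediates for lui and ori."""
    upper = (addr >> 16) & 0xFFFF   # most significant half, loaded by lui
    lower = addr & 0xFFFF           # least significant half, merged in by ori
    return upper, lower

def expand_lw_label(rt, addr):
    """Expand 'lw $rt, label' into TAL, given the label's (assumed) address."""
    upper, lower = split_address(addr)
    return [
        f"lui ${rt}, 0x{upper:04x}",
        f"ori ${rt}, ${rt}, 0x{lower:04x}",
        f"lw ${rt}, 0(${rt})",
    ]

for line in expand_lw_label(8, 0x00400004):
    print(line)
```

ori is used for the lower half because it zero-extends its immediate, so the upper half set by lui is left untouched.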

7 MAL → TAL
Instructions with immediates are synthesized with other instructions.
So:
    add $sp, $sp, 4
becomes:
    addi $sp, $sp, 4
For TAL:
    add requires 3 operands in registers.
    addi requires 2 operands in registers and one operand that is an immediate.
In MIPS assembly, immediate instructions include: addi, addiu, andi, lui, ori, xori
Why not more?

8 MAL → TAL
TAL implementation of I/O instructions
This:
    putc $18
becomes:
    addi $2, $0, 11     # code for putc
    add  $4, $18, $0    # put character argument in $4
    syscall             # ask operating system to do a function

9 MAL → TAL
    getc $11
becomes:
    addi $2, $0, 12
    syscall
    add  $11, $0, $2

    puts $13
becomes:
    addi $2, $0, 4
    add  $4, $0, $13
    syscall

    done
becomes:
    addi $2, $0, 10
    syscall

10 MAL → TAL
MAL                   TAL
move $4, $3           add $4, $3, $0
add $4, $3, 15        addi $4, $3, 15     # also andi, ori, etc.
mul $8, $9, $10       mult $9, $10        # HI || LO ← product, never overflows
                      mflo $8             # $8 ← LO, ignore HI!
div $8, $9, $10       div $9, $10         # LO ← quotient, HI ← remainder
                      mflo $8
rem $8, $9, $10       div $9, $10
                      mfhi $8
Branches in MAL: bltz, bgez, blez, bgtz, beqz, bnez, blt, bge, bgt, beq, bne
Branches in TAL: bltz, bgez, blez, bgtz, beq, bne

11 MAL → TAL
MAL                   TAL
Branches:
beqz $4, loop         beq $4, $0, loop
blt $4, $5, target    slt $at, $4, $5     # $at is 1 if $4 < $5, 0 otherwise
                      bne $at, $0, target
I/O instructions: put, puts, putc, get, getc, done
    Really "procedure call to OS"
    Assume $2 ← call type
    Assume $4 ← input parameters
putc $12              addi $2, $0, 11     # putc is syscall 11 (see page 262)
                      add $4, $12, $0     # char to putc
                      syscall             # call OS
done                  addi $2, $0, 10     # done is syscall 10
                      syscall
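The MAL-to-TAL synthesis amounts to a small rewrite table. The sketch below covers only a handful of the expansions listed on these slides; `expand` is a hypothetical function, and anything it does not recognize is simply assumed to be TAL already.

```python
def expand(mal):
    """Expand one MAL instruction string into a list of TAL instruction strings."""
    op, *args = mal.replace(",", " ").split()
    if op == "move":
        rd, rs = args
        return [f"add {rd}, {rs}, $0"]
    if op == "mul":
        rd, rs, rt = args
        return [f"mult {rs}, {rt}", f"mflo {rd}"]
    if op == "rem":
        rd, rs, rt = args
        return [f"div {rs}, {rt}", f"mfhi {rd}"]
    if op == "beqz":
        rs, label = args
        return [f"beq {rs}, $0, {label}"]
    if op == "blt":
        rs, rt, label = args
        # $at is the register reserved for the assembler's own use.
        return [f"slt $at, {rs}, {rt}", f"bne $at, $0, {label}"]
    return [mal]  # assumption: everything else is already a TAL instruction

print(expand("mul $8, $9, $10"))
print(expand("blt $4, $5, target"))
```

A real assembler would dispatch on parsed operands rather than strings, but the one-to-many shape of the mapping is the same.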

12 Assembly
The assembler will:
    Assign addresses
    Generate machine code
If necessary, the assembler will:
    Translate (synthesize) from the accepted assembly to the instructions available in the architecture
    Provide macros and other features
    Generate an image of what memory must look like for the program to be executed

13 Assembler
What should the assembler do when it sees a directive?
    .data, .text
    .space, .word, .byte
    org (HC11)
    equ (HC11)
How is the memory image formed?

14 Assembler
Example Data Declaration
The assembler aligns data to word addresses unless told not to. The assembly process is very sequential.
    .data
    a1: .word 3
    a2: .byte '\n'
    a3: .space 5
Address       Contents
0x00001000    0x00000003
0x00001004    0x??????0a
0x00001008    0x????????
0x0000100c    0x????????
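The address column can be reproduced by walking the declarations in order and rounding each start address up to a word boundary. This sketch assumes, as the table does, that every item starts on a word boundary; `align` and `layout` are hypothetical names.

```python
def align(addr, boundary=4):
    """Round addr up to the next multiple of boundary."""
    return (addr + boundary - 1) // boundary * boundary

def layout(items, base):
    """Assign a word-aligned address to each (label, size_in_bytes) item, in order."""
    symbols = {}
    addr = base
    for label, size in items:
        addr = align(addr)       # assumption: every item is word aligned
        symbols[label] = addr
        addr += size
    return symbols

# a1: .word 3 (4 bytes), a2: .byte '\n' (1 byte), a3: .space 5 (5 bytes)
symbols = layout([("a1", 4), ("a2", 1), ("a3", 5)], base=0x00001000)
for label, addr in symbols.items():
    print(label, f"0x{addr:08x}")
```

Because the pass is strictly sequential, each label's address depends only on the declarations before it.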

15 Assembler
Machine code generation from simple instructions:
Assembly language: addi $8, $20, 15 (that is, opcode rt, rs, immediate)
Machine code format, bit 31 down to bit 0: opcode | rs | rt | immediate
    Opcode is 6 bits; addi is defined to be 001000
    Rs, the source register, is 5 bits; the encoding of 20 is 10100
    Rt, the target register, is 5 bits; the encoding of 8 is 01000
    The immediate is 16 bits; the encoding of 15 is 0000000000001111
The 32-bit instruction for addi $8, $20, 15 is:
    001000 10100 01000 0000000000001111
or 0x2288000f
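Packing the fields is a matter of shifting each one to its bit position and OR-ing them together. A minimal sketch, with `encode_i_type` as a hypothetical helper name:

```python
def encode_i_type(opcode, rs, rt, imm):
    """Pack opcode(6) | rs(5) | rt(5) | immediate(16) into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

ADDI = 0b001000

# addi $8, $20, 15: rt is the destination ($8), rs is the source ($20).
word = encode_i_type(ADDI, rs=20, rt=8, imm=15)
print(f"{word:032b}")
print(f"0x{word:08x}")  # 0x2288000f
```

Masking the immediate with 0xFFFF also lets the same helper take negative immediates, which are stored in two's complement.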

16 Instruction Formats
I-Type: instructions with 16-bit immediates
ADDI, ORI, ANDI    OPC:6 | rs1:5 | rd:5     | immediate:16
LW, SW             OPC:6 | rs1:5 | rs2/rd:5 | displacement:16
BNE                OPC:6 | rs1:5 | rs2:5    | distance(instr):16

17 Instruction Formats
J-Type: instructions with a 26-bit immediate
J, JAL             OPC:6 | 26 bits of jump address
R-Type: all other instructions
ADD, AND, OR, JR, JALR, SYSCALL, MULT, MFHI, SLT
                   OPC:6 | rs1:5 | rs2:5 | rd:5 | ALU function:11
(LUI carries a 16-bit immediate, so it is I-Type.)
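The R-Type layout can be packed the same way as any other format. This sketch follows the slide in treating the low 11 bits as a single "ALU function" field; `encode_r_type` is a hypothetical helper name.

```python
def encode_r_type(opcode, rs1, rs2, rd, function):
    """Pack OPC(6) | rs1(5) | rs2(5) | rd(5) | ALU function(11) into a 32-bit word."""
    return (opcode << 26) | (rs1 << 21) | (rs2 << 16) | (rd << 11) | (function & 0x7FF)

# mult $9, $10: opcode 0, function 011000, no destination register.
mult_word = encode_r_type(0, rs1=9, rs2=10, rd=0, function=0b011000)
print(f"0x{mult_word:08x}")  # 0x012a0018

# addu $25, $24, $15: opcode 0, function 100001.
addu_word = encode_r_type(0, rs1=24, rs2=15, rd=25, function=0b100001)
print(f"0x{addu_word:08x}")  # 0x030fc821
```

Since every R-Type instruction here shares opcode 0, the function field is what actually selects the operation.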

18 Assembly Example
        .data
a1:     .word 3
a2:     .word 16:4
a3:     .word 5
        .text
main:   la $6, a2
loop:   lw $7, 4($6)
        mult $9, $10
        b loop
        done

19 Assembly Example
Symbol Table
Symbol    Address
a1        0040 0000
a2        0040 0004
a3        0040 0014
main      0080 0000
loop      0080 0008
Memory map of data section
Address      Contents (hex)    Contents (binary)
0040 0000    0000 0003         0000 0000 0000 0000 0000 0000 0000 0011
0040 0004    0000 0010         0000 0000 0000 0000 0000 0000 0001 0000
0040 0008    0000 0010         0000 0000 0000 0000 0000 0000 0001 0000
0040 000c    0000 0010         0000 0000 0000 0000 0000 0000 0001 0000
0040 0010    0000 0010         0000 0000 0000 0000 0000 0000 0001 0000
0040 0014    0000 0005         0000 0000 0000 0000 0000 0000 0000 0101
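The data half of the symbol table and the memory map fall out of one sequential pass over the .word directives. In this sketch each directive is written as a (label, value, count) tuple, so `.word 16:4` becomes value 16 with count 4; `build_data_image` is a hypothetical name.

```python
def build_data_image(decls, base):
    """Return (symbols, memory) for a list of (label, value, count) .word directives."""
    symbols, memory = {}, {}
    addr = base
    for label, value, count in decls:
        symbols[label] = addr                  # the label names the first word
        for _ in range(count):
            memory[addr] = value & 0xFFFFFFFF  # one 32-bit word per repetition
            addr += 4
    return symbols, memory

symbols, memory = build_data_image(
    [("a1", 3, 1), ("a2", 16, 4), ("a3", 5, 1)],
    base=0x00400000,
)
for addr, word in memory.items():
    print(f"{addr:08x}  {word:08x}")
```

The four copies of 16 are what push a3 out to address 0040 0014.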

20 Assembly Example
Translation to TAL code
        .text
main:   lui $6, 0x0040        # la $6, a2
        ori $6, $6, 0x0004
loop:   lw $7, 4($6)
        mult $9, $10
        beq $0, $0, loop      # b loop
        ori $2, $0, 10        # done
        syscall
Memory map of text section
Address      Contents (hex)    Contents (binary)
0080 0000    3c06 0040         0011 1100 0000 0110 0000 0000 0100 0000    (lui)
0080 0004    34c6 0004         0011 0100 1100 0110 0000 0000 0000 0100    (ori)
0080 0008    8cc7 0004         1000 1100 1100 0111 0000 0000 0000 0100    (lw)
0080 000c    012a 0018         0000 0001 0010 1010 0000 0000 0001 1000    (mult)
0080 0010    1000 fffd         0001 0000 0000 0000 1111 1111 1111 1101    (beq)
0080 0014    3402 000a         0011 0100 0000 0010 0000 0000 0000 1010    (ori)
0080 0018    0000 000c         0000 0000 0000 0000 0000 0000 0000 1100    (sys)

21 Assembly
Branch offset computation.
At execution time:
    PC ← NPC + {sign-extended offset field, 00}
    PC points to the instruction after the beq when the offset is added.
At assembly time:
    Byte offset = target addr - (address of branch + 4)
                = 00800008 - (00800010 + 00000004)
                = FFFFFFF4 (-12)
    Word offset = byte offset / 4 = -3, stored in the 16-bit offset field as FFFD
3 important observations:
    The offset is stored in the instruction as a word offset.
    An offset may be negative.
    The field dedicated to the offset is 16 bits, so the range is limited.
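The assembly-time computation is short enough to write out directly. A sketch, with `branch_offset_field` as a hypothetical helper that also enforces the 16-bit range limit:

```python
def branch_offset_field(branch_addr, target_addr):
    """Return the 16-bit offset field for a branch at branch_addr to target_addr."""
    byte_offset = target_addr - (branch_addr + 4)   # relative to the next instruction
    if byte_offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    word_offset = byte_offset // 4                  # stored as a word offset
    if not -2**15 <= word_offset < 2**15:
        raise ValueError("branch target out of range for a 16-bit offset")
    return word_offset & 0xFFFF                     # two's complement, 16 bits

# beq at 0x00800010 branching back to loop at 0x00800008.
field = branch_offset_field(0x00800010, 0x00800008)
print(f"0x{field:04x}")  # 0xfffd
```

A target that fails the range check cannot be reached by a single branch, so an assembler would need some other instruction sequence for it.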

22 Assembly
Jump target computation.
At execution time:
    PC ← {most significant 4 bits of PC, target field, 00}
At assembly time:
    Take the 32-bit target address.
    Eliminate the least significant 2 bits (since instructions are word aligned).
    Eliminate the most significant 4 bits.
    What remains is 26 bits, and goes in the target field.
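Both directions of the jump computation can be sketched with shifts and masks. The helper names are hypothetical, and the address 0x004003B0 is chosen because its target field is 1048812, the jal operand that appears in the resolved listing later in the deck.

```python
def jump_target_field(target_addr):
    """Assembly time: drop the low 2 bits and the high 4 bits, leaving 26 bits."""
    return (target_addr >> 2) & 0x03FFFFFF

def jump_destination(pc, field):
    """Execution time: {top 4 bits of PC, 26-bit target field, 00}."""
    return (pc & 0xF0000000) | (field << 2)

field = jump_target_field(0x004003B0)
print(field)  # 1048812

# The jump lands back on the original address when the PC shares its top 4 bits.
print(hex(jump_destination(0x00400048, field)))  # 0x4003b0
```

The dropped high 4 bits are why a jump can only reach targets in the same 256 MB region as the PC.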

23 Linking and Loading
Object file
header             start/size of other parts
text               machine language
data               static data: size and initial values
relocation info    instructions and data with absolute addresses
symbol table       addresses of external labels
debugging info

24 Linking and Loading
Linker
    Search libraries
    Read object files
    Relocate code/data
    Resolve external references
Loader
    Create address spaces for text & data
    Copy text & data in memory
    Initialize stack and copy args
    Initialize regs (maybe)
    Initialize other things (OS)
    Jump to startup routine, and then to the address of main

25 Linking and Loading
The data section starts at 0x00400000 for the MIPS RISC processor.
If the source code has
    .data
    a1: .word 15
    a2: .word -2
then the assembler specifies the initial memory configuration as
Address       Contents
0x00400000    0000 0000 0000 0000 0000 0000 0000 1111
0x00400004    1111 1111 1111 1111 1111 1111 1111 1110
Like the data, the code needs to be placed starting at a specific location to make it work.

26 Linking and Loading
Consider the case where the assembly language code is split across 2 files. Each is assembled separately.
File 1:
        .data
a1:     .word 15
a2:     .word -2
        .text
main:   la $t0, a1
        add $t1, $t0, $s3
        jal proc5
        done
File 2:
        .data
a3:     .word 0
        .text
proc5:  lw $t6, a1
        sub $t2, $t0, $s4
        jr $ra

27 Linking and Loading
What happens to… a1, a3, main, proc5, lw, la, jal?

28 Linking and Loading
Problem: there are absolute addresses in the machine code.
Solutions:
    1. Only allow a single source file. Why not?
    2. Allow linking and loading to:
        Relocate pieces of data and code sections
        Finish the machine code where symbols were left undefined
       This basically makes an absolute address a relative address.

29 Linking and Loading
The assembler will:
    Start both data and code sections at address 0, for all files.
    Keep track of the size of every data and code section.
    Keep track of all absolute addresses within the file.
Linking and loading will:
    Assign starting addresses for all data and code sections, based on their sizes. The blocks of data and code go at non-overlapping locations.
    Fix all absolute addresses in the code.
    Place the linked code and data in memory at the location assigned.
    Start it up.
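The division of labour above can be illustrated with a toy linker. This is a deliberately simplified model, not a real object format: each "object" is a hypothetical dict holding a list of words assembled as if the file started at address 0, plus the indices of the words that contain absolute addresses.

```python
def link(objects, start):
    """Place objects back to back from start and patch their absolute addresses."""
    image = {}
    base = start
    for obj in objects:
        words = list(obj["words"])
        for index in obj["relocs"]:
            words[index] += base            # file-relative address -> final address
        for i, word in enumerate(words):
            image[base + 4 * i] = word      # place the patched words in memory
        base += 4 * len(words)              # next object starts where this one ends
    return image

# Hypothetical objects: word 1 of file1 and word 0 of file2 hold addresses
# that were computed relative to the start of their own file.
file1 = {"words": [0, 8, 0], "relocs": [1]}
file2 = {"words": [4, 0], "relocs": [0]}

image = link([file1, file2], start=0x00400000)
for addr, word in image.items():
    print(f"{addr:08x}  {word:08x}")
```

In real MIPS code an absolute address is usually split across instruction fields (for example a lui/ori pair or a jal target), so the patching step is more involved than adding a base to a whole word.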

30 MIPS Example
Code levels of abstraction (from James Larus)
"C" code:
    #include <stdio.h>
    int main (int argc, char *argv[])
    {
        int I;
        int sum = 0;
        for (I = 0; I <= 100; I++)
            sum += I * I;
        printf ("The sum 0..100=%d\n", sum);
    }
The compiler translates this HLL code into the machine's assembly language.

31 MIPS Example
        .text
main:
        subu $sp, 32
        sw   $31, 20($sp)
        sw   $4, 32($sp)
        sw   $0, 24($sp)          # sum = 0
        sw   $0, 28($sp)          # I = 0
loop:
        lw   $14, 28($sp)         # load I
        mul  $15, $14, $14        # I * I
        lw   $24, 24($sp)         # load sum
        addu $25, $24, $15        # sum + I * I
        sw   $25, 24($sp)         # store sum
        addu $8, $14, 1           # I + 1
        sw   $8, 28($sp)          # store I
        ble  $8, 100, loop
        la   $4, str
        lw   $5, 24($sp)
        jal  printf
        move $2, $0
        lw   $31, 20($sp)
        addu $sp, 32
        jr   $31
        .data
str:    .asciiz "The sum 0..100=%d\n"

32 MIPS Assembly Language
Now resolve the labels…
        addiu $sp, $sp, -32
        sw    $ra, 20($sp)
        sw    $a0, 32($sp)
        sw    $a1, 36($sp)
        sw    $0, 24($sp)
        sw    $0, 28($sp)
        lw    $t6, 28($sp)
        lw    $t8, 24($sp)
        multu $t6, $t6
        addiu $t0, $t6, 1
        slti  $at, $t0, 101
        sw    $t0, 28($sp)
        mflo  $t7
        addu  $t9, $t8, $t7
        bne   $at, $0, -9
        sw    $t9, 24($sp)
        lui   $a0, 4096
        lw    $a1, 24($sp)
        jal   1048812
        addiu $a0, $a0, 1072
        lw    $ra, 20($sp)
        addiu $sp, $sp, 32
        jr    $ra
        move  $v0, $0
The assembler then translates this into binary machine code for instructions and data.

33 MIPS Machine language
00100111101111011111111111100000
10101111101111110000000000010100
10101111101001000000000000100000
10101111101001010000000000100100
10101111101000000000000000011000
10101111101000000000000000011100
10001111101011100000000000011100
10001111101110000000000000011000
00000001110011100000000000011001
00100101110010000000000000000001
00101001000000010000000001100101
10101111101010000000000000011100
00000000000000000111100000010010
00000011000011111100100000100001
00010100001000001111111111110111
10101111101110010000000000011000
00111100000001000001000000000000
10001111101001010000000000011000
00001100000100000000000011101100
00100100100001000000010000110000
10001111101111110000000000010100
00100111101111010000000000100000
00000011111000000000000000001000
00000000000000000001000000100001
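Any of these words can be pulled back apart into its fields. The sketch below decodes the first word as an I-Type instruction; `decode_i_type` and `signed16` are hypothetical helper names.

```python
def signed16(value):
    """Interpret a 16-bit field as a two's complement number."""
    return value - 0x10000 if value & 0x8000 else value

def decode_i_type(bits):
    """Split a 32-character binary string into opcode, rs, rt and immediate."""
    word = int(bits, 2)
    return {
        "opcode": word >> 26,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": signed16(word & 0xFFFF),
    }

# First word of the listing: opcode 001001 is addiu, register 29 is $sp.
fields = decode_i_type("00100111101111011111111111100000")
print(fields)  # {'opcode': 9, 'rs': 29, 'rt': 29, 'imm': -32}
```

That is addiu $sp, $sp, -32, the stack-frame allocation at the top of main.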

