Lec 4 Systems Architecture
Systems Architecture Lecture 4: Compilers, Assemblers, Linkers & Loaders
Jeremy R. Johnson, Anatole D. Ruslanov, William M. Mongan
Some material drawn from CMU CSAPP slides: Kesden and Puschel
Introduction
Objective: To introduce the role of compilers, assemblers, linkers, and loaders, and to see what lies underneath a C program: assembly language, machine language, and the executable.
Compilation Process
Below Your Program
Example from a Unix system
Source files: count.c and main.c
Corresponding assembly code: count.s and main.s
Corresponding machine code (object code): count.o and main.o
Library functions: libc.a
Executable file: a.out
Format for a.out and object code: ELF (Executable and Linking Format)
Producing an Executable Program
Example from a Unix system (SGI Challenge running IRIX 6.5)
Compiler: count.c and main.c -> count.s and main.s
  – gcc -S count.c main.c
Assembler: count.s and main.s -> count.o and main.o
  – gcc -c count.s main.s
  – as count.s -o count.o
Linker/Loader: count.o, main.o, and libc.a -> a.out
  – gcc main.o count.o
  – ld main.o count.o -lc (additional libraries are required)
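Source Files
main.c:
    #include <stdio.h>

    int count(int n);   /* defined in count.c */

    int main(void)
    {
        int n, s;

        printf("Enter upper limit: ");
        if (scanf("%d", &n) != 1)   /* stop on invalid input */
            return 1;
        s = count(n);
        printf("Sum of i from 1 to %d = %d\n", n, s);
        return 0;
    }
count.c:
    /* Returns 1 + 2 + ... + n (0 when n < 1). */
    int count(int n)
    {
        int i, s;

        s = 0;
        for (i = 1; i <= n; i++)
            s = s + i;
        return s;
    }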
Assembly Code for MIPS (count.s)
    #.file 1 "count.c"
        .option pic2
        .section .text
        .text
        .align 2
        .globl count
        .ent count
    count:
    .LFB1:
        .frame $fp,48,$31 # vars= 16, regs= 2/0, args= 0, extra= 16
        .mask 0x ,-8
        .fmask 0x ,0
        subu $sp,$sp,48
    .LCFI0:
        sd $fp,40($sp)
    .LCFI1:
        sd $28,32($sp)
    .LCFI2:
        move $fp,$sp
    .LCFI3:
        .set noat
        lui $1,%hi(%neg(%gp_rel(count)))
        addiu $1,$1,%lo(%neg(%gp_rel(count)))
        daddu $gp,$1,$25
        .set at
        sw $4,16($fp)
        sw $0,24($fp)
        li $2,1 # 0x1
        sw $2,20($fp)
    .L3:
        lw $2,20($fp)
        lw $3,16($fp)
        slt $2,$3,$2
        beq $2,$0,.L6
        b .L4
    .L6:
        lw $2,24($fp)
        lw $3,20($fp)
        addu $2,$2,$3
        sw $2,24($fp)
    .L5:
        lw $2,20($fp)
        addu $3,$2,1
        sw $3,20($fp)
        b .L3
    .L4:
        lw $3,24($fp)
        move $2,$3
        b .L2
    .L2:
        move $sp,$fp
        ld $fp,40($sp)
        ld $28,32($sp)
        addu $sp,$sp,48
        j $31
    .LFE1:
        .end count
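To connect the MIPS listing back to count.c, here is a hand-written sketch (not compiler output) of the same loop in C using goto, with labels named after .L3, .L6, and .L4 in the assembly. The name count_goto is hypothetical, and the mapping of stack slots to variables (16($fp) for n, 20($fp) for i, 24($fp) for s) is inferred from the listing rather than stated in the slides.

```c
/* Hand-written C mirror of the control flow in count.s (a sketch, not
 * compiler output). count_goto is a hypothetical name. */
int count_goto(int n)          /* sw $4,16($fp): n saved in the frame   */
{
    int i, s;

    s = 0;                     /* sw $0,24($fp)                         */
    i = 1;                     /* li $2,1 ; sw $2,20($fp)               */
L3:
    if (!(n < i))              /* slt $2,$3,$2 ; beq $2,$0,.L6          */
        goto L6;
    goto L4;                   /* b .L4: leave the loop                 */
L6:
    s = s + i;                 /* lw, lw, addu, sw $2,24($fp)           */
    /* .L5: loop increment */
    i = i + 1;                 /* lw, addu $3,$2,1, sw $3,20($fp)       */
    goto L3;                   /* b .L3                                 */
L4:
    return s;                  /* lw $3,24($fp) ; move $2,$3            */
}
```

Every variable access in the listing goes through the stack frame, which suggests the code was compiled without optimization; an optimizing compiler would likely keep i and s in registers.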
Executable Program for MIPS (a.out)
[Hexadecimal dump of a.out; only fragments survive: f45 4c c e e b b b c c c]
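The surviving fragment "f45 4c" is consistent with the four identification bytes that begin every ELF file: 0x7f followed by the ASCII characters 'E', 'L', 'F'. The following minimal sketch checks a buffer for that signature. The helper name has_elf_magic is hypothetical and not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the four ELF identification bytes
 * 0x7f 'E' 'L' 'F', and 0 otherwise. has_elf_magic is a hypothetical
 * helper name used only for this illustration. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < sizeof magic)
        return 0;
    return memcmp(buf, magic, sizeof magic) == 0;
}
```

To apply it to a file such as a.out, one would read the first four bytes with fread and pass them to the function.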
Assembly Characteristics: Data Types
“Integer” data of 1, 2, or 4 bytes
  – Data values
  – Addresses (untyped pointers)
Floating point data of 4, 8, or 10 bytes
No aggregate types such as arrays or structures
  – Just contiguously allocated bytes in memory
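The point that an array is "just contiguously allocated bytes" can be sketched in C by reading an element through nothing but a base address and a byte offset, which is roughly how assembly code sees it. The helper names element_offset and load_int_at are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Byte offset of element idx in an array whose elements are elem_size
 * bytes each: the address arithmetic behind a[idx]. Hypothetical helper. */
size_t element_offset(size_t idx, size_t elem_size)
{
    return idx * elem_size;
}

/* Reads element idx of an int array using only a base address and a
 * byte offset, ignoring the array type as assembly code would. */
int load_int_at(const int *base, size_t idx)
{
    const unsigned char *bytes = (const unsigned char *)base;
    int value;

    /* Copy sizeof(int) raw bytes starting at the computed offset. */
    memcpy(&value, bytes + element_offset(idx, sizeof(int)), sizeof value);
    return value;
}
```

For an array int a[4] = {10, 20, 30, 40}, calling load_int_at(a, 2) yields the same value as a[2].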
Assembly Characteristics: Operations
Perform arithmetic function on register or memory data
Transfer data between memory and register
  – Load data from memory into register
  – Store register data into memory
Transfer control
  – Unconditional jumps to/from procedures
  – Conditional branches
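As a rough illustration, each kind of operation listed above has a counterpart in a small C function. The name add_into is hypothetical, and the comments name the kind of operation involved, not the exact instructions a particular compiler would emit.

```c
/* Adds x to the int stored at *dest and reports whether the result is
 * negative. add_into is a hypothetical name used for illustration. */
int add_into(int *dest, int x)
{
    int t = *dest;      /* load: memory -> register              */
    t = t + x;          /* arithmetic on register data           */
    *dest = t;          /* store: register -> memory             */
    if (t < 0)          /* conditional branch                    */
        return 1;
    return 0;           /* return: control goes back to caller   */
}
```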
Object Code
Code for sum
    0x401040:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
  – Total of 13 bytes
  – Each instruction 1, 2, or 3 bytes
  – Starts at address 0x401040
Assembler
  – Translates .s into .o
  – Binary encoding of each instruction
  – Nearly-complete image of executable code
  – Missing linkages between code in different files
Linker
  – Resolves references between files
  – Combines with static run-time libraries
      E.g., code for malloc, printf
  – Some libraries are dynamically linked
      Linking occurs when program begins execution
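The slides list only the machine code for sum, not its C source. Because the code adds the two values at 0x8(%ebp) and 0xc(%ebp), a two-argument integer add is a plausible source; the definition below is that assumption, not the author's file. The byte array is copied from the slide, and sum_code_size is a hypothetical helper that confirms the 13-byte total.

```c
#include <stddef.h>

/* Assumed C source for sum: the slides show only its machine code. */
int sum(int x, int y)
{
    return x + y;
}

/* The 13 bytes of object code listed for sum, copied from the slide. */
static const unsigned char sum_code[] = {
    0x55, 0x89, 0xe5, 0x8b, 0x45, 0x0c, 0x03,
    0x45, 0x08, 0x89, 0xec, 0x5d, 0xc3
};

/* Number of bytes in the listed encoding (hypothetical helper). */
size_t sum_code_size(void)
{
    return sizeof sum_code;
}
```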
Disassembling Object Code
Disassembled
    0: 55          push %ebp
    1: 89 e5       mov  %esp,%ebp
    3: 8b 45 0c    mov  0xc(%ebp),%eax
    6: 03 45 08    add  0x8(%ebp),%eax
    9: 89 ec       mov  %ebp,%esp
    b: 5d          pop  %ebp
    c: c3          ret
    d: 8d          lea  0x0(%esi),%esi
Disassembler
    objdump -d p
  – Useful tool for examining object code
  – Analyzes bit pattern of series of instructions
  – Produces approximate rendition of assembly code
  – Can be run on either a.out (complete executable) or .o file
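The first job of a disassembler is to decide where each instruction ends. The sketch below does this for only the handful of IA32 opcodes that appear in sum, using the top two bits of the ModRM byte to tell a register operand from a memory operand with an 8-bit displacement. It is an illustration under those assumptions, not a general x86 decoder and not how objdump is implemented; insn_length and count_instructions are hypothetical names.

```c
#include <stddef.h>

/* Length in bytes of the IA32 instruction at p, given avail readable
 * bytes, for the small opcode subset used by sum. Returns 0 for anything
 * unrecognized or truncated. Illustrative only. */
size_t insn_length(const unsigned char *p, size_t avail)
{
    if (avail == 0)
        return 0;

    switch (p[0]) {
    case 0x55:                      /* push %ebp */
    case 0x5d:                      /* pop %ebp  */
    case 0xc3:                      /* ret       */
        return 1;
    case 0x89:                      /* mov r32 -> r/m32 */
    case 0x8b:                      /* mov r/m32 -> r32 */
    case 0x03: {                    /* add r/m32 -> r32 */
        unsigned mod, rm;

        if (avail < 2)
            return 0;               /* ModRM byte is missing */
        mod = p[1] >> 6;            /* top two bits of ModRM */
        rm = p[1] & 7;              /* low three bits of ModRM */
        if (mod == 3)
            return 2;               /* register operand: opcode + ModRM */
        if (mod == 1 && rm != 4)
            return 3;               /* memory operand, 8-bit displacement */
        return 0;                   /* other addressing forms not handled */
    }
    default:
        return 0;
    }
}

/* Counts the instructions in code[0..len), or returns 0 if an
 * unrecognized or truncated instruction is encountered. */
size_t count_instructions(const unsigned char *code, size_t len)
{
    size_t pos = 0, n = 0;

    while (pos < len) {
        size_t k = insn_length(code + pos, len - pos);
        if (k == 0 || k > len - pos)
            return 0;
        pos += k;
        n++;
    }
    return n;
}
```

Run over the 13 bytes of sum, the walk splits them into 7 instructions of 1, 2, 3, 3, 2, 1, and 1 bytes, matching the offsets 0, 1, 3, 6, 9, b, c in the listing.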
Alternate Disassembly
Object
    0x401040:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
Disassembled
    0x401040: push %ebp
    0x401041: mov  %esp,%ebp
    0x401043: mov  0xc(%ebp),%eax
    0x401046: add  0x8(%ebp),%eax
    0x401049: mov  %ebp,%esp
    0x40104b: pop  %ebp
    0x40104c: ret
    0x40104d: lea  0x0(%esi),%esi
Within gdb Debugger
    gdb p
    disassemble sum
  – Disassemble procedure
    x/13b sum
  – Examine the 13 bytes starting at sum
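Examining raw bytes, as x/13b does, amounts to printing each byte of a memory range in hexadecimal. The sketch below formats a byte buffer in the same "0x55 0x89 ..." style used on the slide. The name format_bytes is hypothetical, and the code only approximates the look of the output; it does not reproduce gdb's behavior.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes the n bytes at src into out as space-separated hex values such
 * as "0x55 0x89". Returns the number of characters written, or -1 if out
 * is too small. format_bytes is a hypothetical helper. */
int format_bytes(const unsigned char *src, size_t n, char *out, size_t outsz)
{
    size_t used = 0;
    size_t i;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        /* Prefix every value except the first with a single space. */
        int w = snprintf(out + used, outsz - used, "%s0x%02x",
                         i ? " " : "", (unsigned)src[i]);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;              /* output would not fit */
        used += (size_t)w;
    }
    return (int)used;
}
```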
What Can be Disassembled?
Anything that can be interpreted as executable code
Disassembler examines bytes and reconstructs assembly source
    % objdump -d WINWORD.EXE

    WINWORD.EXE: file format pei-i386

    No symbols in "WINWORD.EXE".
    Disassembly of section .text:

    :
    : 55 push %ebp
    : 8b ec mov %esp,%ebp
    : 6a ff push $0xffffffff
    : push $0x
    a: 68 91 dc 4c 30 push $0x304cdc91