COMP 2003: Assembly Language and Digital Logic
Chapter 6: Becoming the Machine
Notes by Neil Dickson
This chapter discusses machine code: what all executables are made of. The examples go into great detail, but don’t be frightened: knowing the gory details is much less important than understanding generally how instructions are encoded. The examples are just to give a flavour of the variations that can occur.
Machine Code
The CPU doesn’t understand text
Need a concise way of representing an instruction such that it is easy (fast) for the CPU to determine what to do
This representation is called machine code
Example of Machine Code
address   encoding            source code
000001F4  F7 E3               mul ebx
000001F6  BB 00000000         mov ebx,0
000001FB                  NextPixel:
000001FB  3B D8               cmp ebx,eax
000001FD  73 0C               jae Done
000001FF  C7 04 99 000080FF   mov dword ptr [ecx+ebx*4],00080FFh
00000206  83 C3 01            add ebx,1
00000209  EB F0               jmp NextPixel
0000020B                  Done:
0000020B  C3                  ret
0000020C
Notice that the line labels take up no space. They are just names for addresses.
Example of Machine Code
address   encoding            source code                                  size
000001F4  F7 E3               mul ebx                                      +2
000001F6  BB 00000000         mov ebx,0                                    +5
000001FB                  NextPixel:
000001FB  3B D8               cmp ebx,eax                                  +2
000001FD  73 0C               jae Done                                     +2
000001FF  C7 04 99 000080FF   mov dword ptr [ecx+ebx*4],00080FFh           +7
00000206  83 C3 01            add ebx,1                                    +3
00000209  EB F0               jmp NextPixel                                +2
0000020B                  Done:
0000020B  C3                  ret                                          +1
0000020C
Notice that the increase in address is the size of the instruction.
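The relationship between instruction sizes and addresses can be checked with a few lines of Python. This is only an illustrative sketch: the start address and the eight sizes are taken from the listing, while the function name `instruction_addresses` is made up for this example.

```python
# Start address and per-instruction sizes (the "+n" values) from the listing.
START = 0x000001F4
SIZES = [2, 5, 2, 2, 7, 3, 2, 1]

def instruction_addresses(start, sizes):
    """Return (list of instruction addresses, address just past the last one).

    Hypothetical helper: each address is the previous address plus the
    previous instruction's size.
    """
    addresses = []
    address = start
    for size in sizes:
        addresses.append(address)
        address += size
    return addresses, address

addresses, end = instruction_addresses(START, SIZES)
for a in addresses:
    print(f"{a:08X}")
print(f"end: {end:08X}")
```

Labels such as NextPixel and Done never appear in `SIZES` because they occupy no bytes; they simply share the address of the instruction that follows them.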
x86 Instruction Machine Code
Layout: prefix(es) | REX prefix | opcode | mod-reg-r/m | SIB | offset | immediate
opcode:
- main indication of what the instruction is; looked up in an opcode map
- the only part present in all instructions
- may be multiple bytes, or use reg in mod-reg-r/m if only one operand or an immediate
mod-reg-r/m byte:
- mod (high 2 bits): 0 = r/m is memory & no offset; 1 = memory & 8-bit offset; 2 = memory & 32-bit offset; 3 = r/m is a register
- reg (middle 3 bits): specifies the register (eax=0 to edi=7) used as the register operand
- r/m (low 3 bits): if mod=3, specifies the other register used as an operand, else specifies an addressing register
scale-index-base byte: allows 2 addressing registers; present iff mod≠3 and r/m=4 (esp)
- scale (high 2 bits): power of two by which to multiply the index register (0 = reg; 1 = reg*2; 2 = reg*4; 3 = reg*8)
- index (middle 3 bits): addressing register to be multiplied by 2^scale
- base (low 3 bits): addressing register not to be multiplied
- only esp used for addressing if index=4 (esp) and base=4 (esp)
prefixes: most common prefix is 66h, which changes the operand size from dwords to words
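The bit fields described above can be pulled apart with shifts and masks. The following is a minimal sketch, not a full decoder; the function names `split_modrm`, `split_sib` and `has_sib` are invented for illustration.

```python
def split_modrm(byte):
    """Split a mod-reg-r/m byte into (mod, reg, rm)."""
    mod = (byte >> 6) & 0b11    # high 2 bits
    reg = (byte >> 3) & 0b111   # middle 3 bits
    rm = byte & 0b111           # low 3 bits
    return mod, reg, rm

def split_sib(byte):
    """Split a scale-index-base byte into (scale, index, base)."""
    scale = (byte >> 6) & 0b11  # index register is multiplied by 2**scale
    index = (byte >> 3) & 0b111
    base = byte & 0b111
    return scale, index, base

def has_sib(modrm_byte):
    """A SIB byte follows iff mod != 3 and r/m == 4."""
    mod, _, rm = split_modrm(modrm_byte)
    return mod != 3 and rm == 4

print(split_modrm(0xD8))  # fields of the byte 11 011 000
print(split_sib(0x99))    # fields of the byte 10 011 001
```

Both bytes share the same 2-3-3 bit layout, which is why the two splitting functions are structurally identical.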
Register Numbers
number  32-bit  16-bit  8-bit
0       eax     ax      al
1       ecx     cx      cl
2       edx     dx      dl
3       ebx     bx      bl
4       esp     sp      ah
5       ebp     bp      ch
6       esi     si      dh
7       edi     di      bh
Let’s look back at our example code
Decoding Machine Code
000001FB  3B D8  cmp ebx,eax
opcode map says that 3B = cmp register,dword ptr register/memory, followed by mod-reg-r/m
mod-reg-r/m byte D8 = 11 011 000: mod=11=3=both registers; reg=011=3=ebx; r/m=000=0=eax
Decoding Machine Code
000001F6  BB 00000000  mov ebx,0
opcode map says that BB = mov ebx,constant, followed by a 32-bit constant
0000020B  C3  ret
opcode map says that C3 = ret
Decoding Machine Code
000001FF  C7 04 99 000080FF  mov dword ptr [ecx+ebx*4],00080FFh
opcode map says that C7 = mov dword ptr register/memory,constant, followed by mod-reg-r/m, with a 32-bit constant at the end
mod-reg-r/m byte 04 = 00 000 100: mod=0=no offset; reg is ignored; r/m=4=followed by SIB
SIB byte 99 = 10 011 001: scale=2=index*4; index=3=ebx; base=1=ecx
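As a rough illustration of the SIB decoding for this instruction, the sketch below turns a mod-reg-r/m byte and a SIB byte into the bracketed operand text. It assumes the bytes 04 and 99 worked out from the fields given above, handles only the mod=0 case, and uses a hypothetical function name `sib_operand`.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def sib_operand(modrm, sib):
    """Format a memory operand for mod=0, r/m=4 (SIB follows). Sketch only."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod != 0 or rm != 4:
        raise ValueError("this sketch only handles mod=0 with a SIB byte")
    scale = (sib >> 6) & 0b11
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    if base == 5:
        # With mod=0, base=5 means a 32-bit offset replaces the base register.
        raise ValueError("base=5 with mod=0 is not handled by this sketch")
    parts = [REG32[base]]
    if index != 4:  # index=4 means no index register
        factor = 1 << scale  # 2**scale
        parts.append(REG32[index] if factor == 1 else f"{REG32[index]}*{factor}")
    return "[" + "+".join(parts) + "]"

print(sib_operand(0x04, 0x99))  # [ecx+ebx*4]
```

The reg field of the mod-reg-r/m byte is not used here, matching the note that it is ignored for this opcode.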
Decoding Machine Code
000001F4  F7 E3  mul ebx
opcode map says that F7 = ??? dword ptr register/memory, followed by mod-reg-r/m, where reg specifies the operation (from not, neg, mul, div, ...)
mod-reg-r/m byte E3 = 11 100 011: mod=3=register; reg=4=mul in opcode map; r/m=3=ebx
similar for the add instruction
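The idea that reg selects the operation can be sketched as a table lookup. The tables below follow the standard x86 opcode map rather than anything spelled out in these notes, the add encoding 83 C3 01 is an assumption consistent with the instruction's 3-byte size, and `decode_group` is a hypothetical name.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Operation selected by the reg field for opcode F7 (reg 0/1 are test, which
# also takes an immediate and is left out of this sketch).
GROUP_F7 = {2: "not", 3: "neg", 4: "mul", 5: "imul", 6: "div", 7: "idiv"}
# Operation selected by the reg field for opcode 83 (operand, 8-bit constant).
GROUP_83 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]

def decode_group(opcode, modrm, imm8=None):
    """Decode F7 or 83 with a register operand (mod=3 only). Sketch only."""
    mod = (modrm >> 6) & 0b11
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    if mod != 3:
        raise ValueError("this sketch only handles register operands (mod=3)")
    if opcode == 0xF7:
        if reg not in GROUP_F7:
            raise ValueError("operation not covered by this sketch")
        return f"{GROUP_F7[reg]} {REG32[rm]}"
    if opcode == 0x83:
        if imm8 is None:
            raise ValueError("opcode 83 needs an 8-bit constant")
        value = imm8 - 0x100 if imm8 & 0x80 else imm8  # sign-extend
        return f"{GROUP_83[reg]} {REG32[rm]},{value}"
    raise ValueError("opcode not covered by this sketch")

print(decode_group(0xF7, 0xE3))        # mul ebx
print(decode_group(0x83, 0xC3, 0x01))  # add ebx,1
```

In both cases the register operand comes from r/m, because reg is already used up naming the operation.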
What about jumps and calls?
Opcode indicates that it is a jump or call, and the condition (if it is a conditional jump)
Opcode is followed by a signed constant that is the number to add to eip if the condition is met
i.e. jumps and calls are relative to the following instruction, because eip contains the address of the following instruction
Decoding Machine Code Jumps
000001FD  73 0C  jae Done
opcode map says that 73 = jae LineLabel, followed by the 8-bit signed relative address of LineLabel
000001FF (address of following instruction) + 0C = 0000020B, address of Done
Decoding Machine Code Jumps
00000209  EB F0  jmp NextPixel
opcode map says that EB = jmp LineLabel, followed by the 8-bit signed relative address of LineLabel
0000020B (address of following instruction) + FFFFFFF0 (F0 sign-extended) = 0000020B + (-10h) = 000001FB, address of NextPixel
Note: Jumps beyond -128 bytes or +127 bytes and all calls have a 32-bit relative address instead.
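The two jump calculations can be reproduced with a small sketch. It assumes the short (2-byte) jump form used in the example, with jae at 000001FD and jmp at 00000209; the function names `sign_extend8` and `short_jump_target` are made up for illustration.

```python
def sign_extend8(byte):
    """Interpret an 8-bit value as signed (e.g. 0xF0 becomes -16)."""
    return byte - 0x100 if byte & 0x80 else byte

def short_jump_target(instruction_address, rel8, instruction_size=2):
    """Target of a short jump: address of the following instruction plus
    the sign-extended 8-bit relative address, wrapped to 32 bits."""
    next_address = instruction_address + instruction_size
    return (next_address + sign_extend8(rel8)) & 0xFFFFFFFF

# jae Done: encoded as 73 0C at 000001FD
print(f"{short_jump_target(0x1FD, 0x0C):08X}")  # 0000020B
# jmp NextPixel: encoded as EB F0 at 00000209
print(f"{short_jump_target(0x209, 0xF0):08X}")  # 000001FB
```

Adding the sign-extended value and masking to 32 bits gives the same result as the FFFFFFF0 addition shown above, since the carry out of the top bit is discarded.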