
1 1 Machine-Level Programming II: Basics Comp 21000: Introduction to Computer Organization & Systems Spring 2016 Instructor: John Barr * Modified slides from the book “Computer Systems: a Programmer’s Perspective”, Randy Bryant & David O’Hallaron, 2015

2 2 Machine Programming I: Basics
- History of Intel processors and architectures
- C, assembly, machine code
- Assembly Basics: Registers, operands, move
- Intro to x86-64

3 3 CISC Properties
- Instructions can reference different operand types
  - Immediate, register, memory
- Arithmetic operations can read/write memory
- Memory references can involve complex computation
  - Rb + S*Ri + D
  - Useful for arithmetic expressions, too
- Instructions can have varying lengths
  - IA32 instructions can range from 1 to 15 bytes

4 4 Features of IA32 instructions
- IA32 instructions can be from 1 to 15 bytes long.
  - More commonly used instructions are shorter
  - Instructions with fewer operands are shorter
- Each instruction has an instruction format
  - Each instruction has a unique byte representation
  - e.g., the instruction pushl %ebp has encoding 55
- IA32 started as a 16-bit architecture
  - So IA32 calls a 16-bit piece of data a “word”
  - A 32-bit piece of data is a “double word” or a “long word”
  - A 64-bit piece of data is a “quad word”

5 5 Assembly Programmer’s View (review)
Programmer-Visible State
- PC: Program counter
  - Address of next instruction
  - Called “EIP” (IA32) or “RIP” (x86-64)
- Register file
  - Heavily used program data
  - 16 named locations, 64-bit values (x86-64)
- Condition codes
  - Store status information about most recent arithmetic operation
  - Used for conditional branching
- Memory
  - Byte addressable array
  - Code, user data, (some) OS data
  - Includes stack used to support procedures
[Figure: the CPU (PC, registers, condition codes) and memory (object code, program data, OS data, stack), linked by addresses, data, and instructions.]

6 6 Instruction format
- Assembly language instructions have a very rigid format
- For most instructions the format is
      movl Source, Dest
  - mov: instruction name
  - l: instruction suffix
  - Source: source of data for the instruction (registers/memory)
  - Dest: destination of instruction results (registers/memory)

7 7 Data Representations: IA32 + x86-64
Sizes of C Objects (in Bytes)

C Data Type    Generic 32-bit   Intel IA32   x86-64
unsigned       4                4            4
int            4                4            4
long int       4                4            8
char           1                1            1
short          2                2            2
float          4                4            4
double         8                8            8
long double    8                10/12        16
char *         4                4            8      (or any other pointer)
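
The table can be checked against a particular compiler with sizeof. The following is a minimal sketch, not part of the original slides; the function names print_type_sizes and check_type_sizes are made up for illustration, and the printed values depend on the compiler and target (the x86-64 column is what a typical 64-bit Linux build would be expected to report).

```c
#include <assert.h>
#include <stdio.h>

/* Print the size in bytes of each C type listed in the table.
   Results are implementation-dependent. */
void print_type_sizes(void) {
    printf("unsigned     %zu\n", sizeof(unsigned));
    printf("int          %zu\n", sizeof(int));
    printf("long int     %zu\n", sizeof(long));
    printf("char         %zu\n", sizeof(char));
    printf("short        %zu\n", sizeof(short));
    printf("float        %zu\n", sizeof(float));
    printf("double       %zu\n", sizeof(double));
    printf("long double  %zu\n", sizeof(long double));
    printf("char *       %zu\n", sizeof(char *));
}

/* Relationships that hold in every column of the table
   (and on all mainstream compilers). */
void check_type_sizes(void) {
    assert(sizeof(char) == 1);
    assert(sizeof(unsigned) == sizeof(int));
    assert(sizeof(short) <= sizeof(int));
    assert(sizeof(int) <= sizeof(long));
    assert(sizeof(float) <= sizeof(double));
}
```

Calling print_type_sizes() from main on the machine in question shows which column of the table applies there.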

8 8 Instruction suffix
- Every operation in GAS has a single-character suffix
  - Denotes the size of the operand
  - Example: basic instruction is mov
  - Can move byte (movb), word (movw), double word (movl), and quad word (movq)
- Note that floating point operations have entirely different instructions.

C declaration    Intel data type      GAS suffix   Size (bytes)
char             Byte                 b            1
short            Word                 w            2
int              Double word          l            4
unsigned         Double word          l            4
long int         Quad word            q            8
unsigned long    Quad word            q            8
char *           Quad word            q            8
float            Single precision     s            4
double           Double precision     d            8
long double      Extended precision   t            16

9 9 Registers
- 16 64-bit general purpose registers
  - Programmers/compilers can use these
  - All register names begin with %r
  - Rest of name is historical: from 8086
  - Registers originally had specific purposes
- Few restrictions on use of registers in instructions
  - However, some instructions use fixed registers as source/destination
  - In procedures, the conventions for saving/restoring differ by register: under the System V x86-64 convention, %rbx, %rbp, and %r12 through %r15 are callee-saved, while the other registers (apart from %rsp) are caller-saved
  - Two registers have special purposes in procedures
    - %rbp (frame pointer)
    - %rsp (stack pointer)
  - Will discuss all these later

10 10 Registers
- 16 64-bit general purpose registers
  - The low-order byte can be independently read or written by byte operation instructions.
    - Done for backward compatibility with 8008 and 8080 (1970’s!)
    - When a byte of the register is changed, the rest of the register is unaffected.
  - The low-order 2 bytes (16 bits, i.e., a single word) can be independently read or written by word operation instructions
    - Comes from 8086 16-bit heritage
    - When a word of the register is changed, the rest of the register is unaffected.
  - The low-order 4 bytes (a double word) can be independently read or written by double-word operation instructions
    - Unlike byte and word writes, a 4-byte write sets the upper 4 bytes of the register to zero.
  - See next slide!
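
The effect of partial-register writes can be modeled in C with masks. This is only a sketch: the helper names are hypothetical, and the register is represented as a plain uint64_t. It also models one x86-64 rule worth knowing alongside the byte and word cases: a 4-byte write (for example movl into %eax) clears the upper 4 bytes of the 64-bit register rather than preserving them.

```c
#include <assert.h>
#include <stdint.h>

/* Byte write (e.g. movb into %al): only bits 0-7 change. */
uint64_t write_low_byte(uint64_t reg, uint8_t v) {
    return (reg & ~UINT64_C(0xFF)) | v;
}

/* Word write (e.g. movw into %ax): only bits 0-15 change. */
uint64_t write_low_word(uint64_t reg, uint16_t v) {
    return (reg & ~UINT64_C(0xFFFF)) | v;
}

/* Double-word write (e.g. movl into %eax): the old contents are
   discarded and the upper 32 bits become zero. */
uint64_t write_low_dword(uint64_t reg, uint32_t v) {
    (void)reg; /* previous value does not survive */
    return (uint64_t)v;
}

void check_partial_writes(void) {
    uint64_t rax = UINT64_C(0x1122334455667788);
    assert(write_low_byte(rax, 0xAA) == UINT64_C(0x11223344556677AA));
    assert(write_low_word(rax, 0xBBBB) == UINT64_C(0x112233445566BBBB));
    assert(write_low_dword(rax, 0xCCCCCCCCu) == UINT64_C(0x00000000CCCCCCCC));
}
```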

11 11 x86-64 Integer Registers
- Can reference low-order 4 bytes (also low-order 1 & 2 bytes)

64-bit   low 4 bytes      64-bit   low 4 bytes
%rax     %eax             %r8      %r8d
%rbx     %ebx             %r9      %r9d
%rcx     %ecx             %r10     %r10d
%rdx     %edx             %r11     %r11d
%rsi     %esi             %r12     %r12d
%rdi     %edi             %r13     %r13d
%rsp     %esp             %r14     %r14d
%rbp     %ebp             %r15     %r15d

12 12 History: IA32 Registers

32-bit register   16-bit virtual register   8-bit registers   Origin (mostly obsolete)
%eax              %ax                       %ah, %al          accumulate
%ecx              %cx                       %ch, %cl          counter
%edx              %dx                       %dh, %dl          data
%ebx              %bx                       %bh, %bl          base
%esi              %si                                         source index
%edi              %di                                         destination index
%esp              %sp                                         stack pointer
%ebp              %bp                                         base pointer

- The 16-bit virtual registers (%ax, %cx, %dx, …) exist for backwards compatibility

13 13 Moving Data
- Moving Data: movq Source, Dest
  - Move 8-byte (“quad”) word
  - Lots of these in typical code
- Operand Types
  - Immediate: Constant integer data
    - Example: $0x400, $-533
    - Like C constant, but prefixed with ‘$’
    - Encoded with 1, 2, or 4 bytes
  - Register: One of 16 integer registers (%rax, %rcx, %rdx, %rbx, %rsi, %rdi, %rsp, %rbp, %rN)
    - Example: %rax, %r13
    - But %rsp reserved for special use
    - Others have special uses for particular instructions
  - Memory: 8 consecutive bytes of memory at address given by register
    - Simplest example: (%rax)
    - Various other “address modes”

14 14 movq Operand Combinations

Source   Dest   Src,Dest               C Analog
Imm      Reg    movq $0x4,%rax         temp = 0x4;
Imm      Mem    movq $-147,(%rax)      *p = -147;
Reg      Reg    movq %rax,%rdx         temp2 = temp1;
Reg      Mem    movq %rax,(%rdx)       *p = temp;
Mem      Reg    movq (%rax),%rdx       temp = *p;

- Cannot do memory-memory transfer with a single instruction

15 15 Simple Memory Addressing Modes
Normal          (R)     Mem[Reg[R]]
  - Register R specifies memory address
  - Aha! Pointer dereferencing in C
      movq (%rcx),%rax
Displacement    D(R)    Mem[Reg[R]+D]
  - Register R specifies start of memory region
  - Constant displacement D specifies offset
      movq 8(%rbp),%rdx
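
In C terms, the normal mode corresponds to dereferencing a pointer and the displacement mode to reading at a fixed byte offset from a pointer, as happens for a struct field. The sketch below is illustrative only: struct pair and the two load functions are hypothetical names, and memcpy is used so the byte-offset read is well defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-field record; 'second' sits at a constant offset. */
struct pair {
    long first;
    long second;
};

/* Mem[Reg[R]]: plain dereference, like movq (%rcx),%rax. */
long load_normal(const long *r) {
    return *r;
}

/* Mem[Reg[R] + D]: read a long located d bytes past the address in r,
   like movq 8(%rbp),%rdx when d is 8. */
long load_displacement(const void *r, size_t d) {
    long value;
    memcpy(&value, (const char *)r + d, sizeof value);
    return value;
}

void check_simple_modes(void) {
    struct pair p = {10, 20};
    assert(load_normal(&p.first) == 10);
    assert(load_displacement(&p, offsetof(struct pair, second)) == 20);
}
```

On a typical x86-64 build offsetof(struct pair, second) is 8, so a compiler would be expected to read p.second with a displacement of 8 from the struct's base address.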

16 16 Simple Addressing Modes (cont)
Immediate       $Imm    Imm
  - The value Imm is the value that is used
      movq $4096,%rax
Absolute        Imm     Mem[Imm]
  - No dollar sign before the number
  - The number is the memory address to use
      movq 4096,%rdx
The book has more details on addressing modes!

17 17 mov instructions

Instruction    Effect                 Description
movq S,D       D ← S                  Move quad word
movl S,D       D ← S                  Move double word
movw S,D       D ← S                  Move word
movb S,D       D ← S                  Move byte
movsbl S,D     D ← SignExtend(S)      Move sign-extended byte
movzbl S,D     D ← ZeroExtend(S)      Move zero-extended byte

Notes:
1. A register operand of a byte move must be one of the single-byte registers (e.g., %al, %dh).
2. A register operand of a word move must be one of the 2-byte registers (e.g., %ax).
3. movsbl takes a single-byte source, sign-extends it into the high-order 24 bits, and copies the resulting double word to dest.
4. movzbl takes a single-byte source, fills the high-order 24 bits with 0’s, and copies the resulting double word to dest.

18 18 mov instruction example
Assume that %dh = 8D and %eax = 98765432 at the beginning of each of these instructions

   instruction          result
1. movb %dh, %al        %eax = 9876548D
2. movsbl %dh, %eax     %eax = FFFFFF8D
3. movzbl %dh, %eax     %eax = 0000008D
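
The three results can be reproduced with a small C model. This is a sketch under the slide's starting values (%dh = 0x8D, %eax = 0x98765432); the C functions simply borrow the instruction names and operate on plain integers rather than real registers.

```c
#include <assert.h>
#include <stdint.h>

/* movb src, %al: replace only the low byte of the destination. */
uint32_t movb_to_low_byte(uint32_t dest, uint8_t src) {
    return (dest & 0xFFFFFF00u) | src;
}

/* movsbl: copy the byte and replicate its top bit into the upper 24 bits. */
uint32_t movsbl(uint8_t src) {
    return (src & 0x80u) ? (0xFFFFFF00u | src) : src;
}

/* movzbl: copy the byte and fill the upper 24 bits with zeros. */
uint32_t movzbl(uint8_t src) {
    return (uint32_t)src;
}

void check_mov_example(void) {
    uint8_t dh = 0x8D;
    uint32_t eax = 0x98765432u;
    assert(movb_to_low_byte(eax, dh) == 0x9876548Du);
    assert(movsbl(dh) == 0xFFFFFF8Du);
    assert(movzbl(dh) == 0x0000008Du);
}
```

Because 0x8D has its top bit set, sign extension fills the upper bytes with F's; a source byte such as 0x7D would give 0000007D from both movsbl and movzbl.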

19 19 mov instruction example

   instruction               addressing mode
1. movq $0x4050, %rax        Imm → Reg
2. movq %rbp, %rsp           Reg → Reg
3. movq (%rcx), %rax         Mem → Reg
4. movq $-17, (%rsp)         Imm → Mem
5. movq %rax, -12(%rbp)      Reg → Mem (Displacement)

20 20 Special memory areas: stack and heap
- Stack
  - Space in memory allocated to each running program (called a process)
  - Used to store “temporary” variable values
  - Used to store local variables for functions
- Heap
  - Space in memory allocated to each running program (process)
  - Used to store dynamically allocated variables
[Figure: process memory layout, from high to low addresses: kernel virtual memory; user stack (created at runtime); memory-mapped region for shared libraries; run-time heap (created at runtime by malloc); read/write segment (.data, .bss); read-only segment (.init, .text, .rodata); unused.]

21 21 Example of Simple Addressing Modes

    void swap (long *xp, long *yp)
    {
      long t0 = *xp;
      long t1 = *yp;
      *xp = t1;
      *yp = t0;
    }

    swap:
      movq (%rdi), %rax
      movq (%rsi), %rdx
      movq %rdx, (%rdi)
      movq %rax, (%rsi)
      ret

Note: the assembly code a compiler actually produces is much more complicated; we’re ignoring some code to simplify.

22 22 Understanding Swap ()

    void swap (long *xp, long *yp)
    {
      long t0 = *xp;
      long t1 = *yp;
      *xp = t1;
      *yp = t0;
    }

Register   Value
%rdi       xp
%rsi       yp
%rax       t0
%rdx       t1

    swap:
      movq (%rdi), %rax   # t0 = *xp
      movq (%rsi), %rdx   # t1 = *yp
      movq %rdx, (%rdi)   # *xp = t1
      movq %rax, (%rsi)   # *yp = t0
      ret

23 23 Understanding Swap ()
Initial state, before the first instruction of swap executes.

Registers         Memory
%rdi   0x120      Address   Value
%rsi   0x100      0x120     123
%rax              0x118
%rdx              0x110
                  0x108
                  0x100     456

24 24 Understanding Swap ()
After movq (%rdi), %rax   # t0 = *xp

Registers         Memory
%rdi   0x120      Address   Value
%rsi   0x100      0x120     123
%rax   123        0x118
%rdx              0x110
                  0x108
                  0x100     456

25 25 Understanding Swap ()
After movq (%rsi), %rdx   # t1 = *yp

Registers         Memory
%rdi   0x120      Address   Value
%rsi   0x100      0x120     123
%rax   123        0x118
%rdx   456        0x110
                  0x108
                  0x100     456

26 26 Understanding Swap ()
After movq %rdx, (%rdi)   # *xp = t1

Registers         Memory
%rdi   0x120      Address   Value
%rsi   0x100      0x120     456
%rax   123        0x118
%rdx   456        0x110
                  0x108
                  0x100     456

27 27 Understanding Swap ()
After movq %rax, (%rsi)   # *yp = t0

Registers         Memory
%rdi   0x120      Address   Value
%rsi   0x100      0x120     456
%rax   123        0x118
%rdx   456        0x110
                  0x108
                  0x100     123
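
The same walkthrough can be run from C. The sketch below uses the swap function from the slides unchanged; check_swap is a hypothetical helper, and the variables x and y stand in for the memory words shown at addresses 0x120 and 0x100 (the real addresses are chosen by the compiler and will differ).

```c
#include <assert.h>

/* swap as given on the slides: exchange the two longs that
   xp and yp point to. */
void swap(long *xp, long *yp) {
    long t0 = *xp;  /* movq (%rdi), %rax */
    long t1 = *yp;  /* movq (%rsi), %rdx */
    *xp = t1;       /* movq %rdx, (%rdi) */
    *yp = t0;       /* movq %rax, (%rsi) */
}

void check_swap(void) {
    long x = 123;   /* plays the role of the word at 0x120 */
    long y = 456;   /* plays the role of the word at 0x100 */
    swap(&x, &y);
    assert(x == 456);
    assert(y == 123);
}
```

Both values are loaded into registers before either store happens, which is why the exchange needs no extra temporary in memory.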

28 28 Complete Memory Addressing Modes
- Most General Form
      D(Rb,Ri,S)     Mem[Reg[Rb]+S*Reg[Ri]+D]
  - D: Constant “displacement” 1, 2, or 4 bytes
  - Rb: Base register: Any of 16 integer registers
  - Ri: Index register: Any, except for %rsp
    - Unlikely you’d use %rbp, either
  - S: Scale: 1, 2, 4, or 8 (why these numbers?)
- Special Cases
      (Rb,Ri)        Mem[Reg[Rb]+Reg[Ri]]
      D(Rb,Ri)       Mem[Reg[Rb]+Reg[Ri]+D]
      (Rb,Ri,S)      Mem[Reg[Rb]+S*Reg[Ri]]
      D(Rb,Ri,S)     Mem[Reg[Rb]+S*Reg[Ri]+D]

29 29 Address Computation Examples (exercise)
%rdx = 0xf000, %rcx = 0x100

Expression        Computation        Address
0x8(%rdx)
(%rdx,%rcx)
(%rdx,%rcx,4)
0x80(,%rdx,2)

30 30 Address Computation Examples (answers)
%rdx = 0xf000, %rcx = 0x100

Expression        Computation        Address
0x8(%rdx)         0xf000 + 0x8       0xf008
(%rdx,%rcx)       0xf000 + 0x100     0xf100
(%rdx,%rcx,4)     0xf000 + 4*0x100   0xf400
0x80(,%rdx,2)     2*0xf000 + 0x80    0x1e080
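
The general form D(Rb,Ri,S) is ordinary integer arithmetic, so the four answers can be checked in C. This is a minimal sketch; effective_address is a hypothetical helper, and an omitted base or index register is modeled by passing 0 (with a scale of 1 when no scale is written).

```c
#include <assert.h>
#include <stdint.h>

/* Compute D + Reg[Rb] + S * Reg[Ri] for the form D(Rb,Ri,S). */
uint64_t effective_address(uint64_t d, uint64_t rb, uint64_t ri, uint64_t s) {
    return d + rb + s * ri;
}

void check_address_examples(void) {
    const uint64_t rdx = 0xf000, rcx = 0x100;
    assert(effective_address(0x8, rdx, 0, 1) == 0xf008);     /* 0x8(%rdx)     */
    assert(effective_address(0, rdx, rcx, 1) == 0xf100);     /* (%rdx,%rcx)   */
    assert(effective_address(0, rdx, rcx, 4) == 0xf400);     /* (%rdx,%rcx,4) */
    assert(effective_address(0x80, 0, rdx, 2) == 0x1e080);   /* 0x80(,%rdx,2) */
}
```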

31 31 Address Computation Instruction
- leaq Src,Dest (lea = load effective address)
  - Src is address mode expression
  - Set Dest to address denoted by expression
- Format
  - Looks like a memory access
  - Does not actually access memory
  - Rather, calculates the memory address, then stores it in a register
- Uses
  - Computing addresses without a memory reference
    - E.g., translation of p = &x[i];
  - Computing arithmetic expressions of the form x + k*y
    - k = 1, 2, 4, or 8
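
The p = &x[i] use can be illustrated in C. The sketch below is not from the slides: the function names are hypothetical, and the register assignment in the comment (x in %rdi, i in %rsi, result in %rax) is an assumption based on the usual x86-64 Linux calling convention. It shows that taking the address of an element is pure arithmetic on the base address and never reads the element itself.

```c
#include <assert.h>
#include <stdint.h>

/* p = &x[i]; a compiler would be expected to emit something like
   leaq (%rdi,%rsi,8), %rax when long is 8 bytes. */
long *element_address(long *x, long i) {
    return &x[i];
}

/* The same computation spelled out: base + i * sizeof(long).
   Assumes i is non-negative. */
uintptr_t element_address_by_hand(uintptr_t base, long i) {
    return base + (uintptr_t)i * sizeof(long);
}

void check_element_address(void) {
    long x[4] = {10, 20, 30, 40};
    long *p = element_address(x, 2);
    assert(*p == 30);
    assert((uintptr_t)p == element_address_by_hand((uintptr_t)x, 2));
}
```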

32 32 Example

    long m12(long x)
    {
      return x*12;
    }

Converted to ASM by compiler:

    leaq (%rdi,%rdi,2), %rax   # t <- x+x*2
    salq $2, %rax              # return t<<2, i.e. t * 4
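
The two-instruction sequence can be mirrored step by step in C to see why it equals x*12. In this sketch m12_lea_style is a hypothetical name, and the shift is written as a multiplication by 4 so the C code stays well defined for negative x.

```c
#include <assert.h>

/* Reference version from the slide. */
long m12(long x) {
    return x * 12;
}

/* Mirrors the compiled code: 3x via leaq, then times 4 via salq $2. */
long m12_lea_style(long x) {
    long t = x + x * 2;  /* leaq (%rdi,%rdi,2), %rax : t = 3x */
    return t * 4;        /* salq $2, %rax : shifting left by 2 multiplies by 4 */
}

void check_m12(void) {
    assert(m12(3) == 36);
    assert(m12_lea_style(3) == 36);
    assert(m12_lea_style(-7) == m12(-7));
}
```

The compiler prefers this form because 12 = 3 * 4, and both factors are cheap: 3x fits the x + k*y pattern of leaq with k = 2, and 4 is a power of two.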

33 33 Address Computation Instruction (exercise)
Assume %rax holds value x and %rcx holds value y

Expression                       Result
leaq 6(%rax), %rdx
leaq (%rax, %rcx), %rdx
leaq (%rax, %rcx, 4), %rdx
leaq 7(%rax, %rax, 8), %rdx
leaq 0xA(,%rax, 4), %rdx
leaq 9(%rax,%rcx, 2), %rdx

* Note the leading comma in the next to last entry

34 34 Address Computation Instruction (answers)
Assume %rax holds value x and %rcx holds value y

Expression                       Result
leaq 6(%rax), %rdx               %rdx ← x + 6
leaq (%rax, %rcx), %rdx          %rdx ← x + y
leaq (%rax, %rcx, 4), %rdx       %rdx ← x + (y * 4)
leaq 7(%rax, %rax, 8), %rdx      %rdx ← x + (x * 8) + 7 = (x * 9) + 7
leaq 0xA(,%rax, 4), %rdx         %rdx ← (x * 4) + 10
leaq 9(%rax,%rcx, 2), %rdx       %rdx ← x + (y * 2) + 9

* Note the leading comma in the next to last entry

35 35 Machine Programming I: Summary
- History of Intel processors and architectures
  - Evolutionary design leads to many quirks and artifacts
- C, assembly, machine code
  - New forms of visible state: program counter, registers, …
  - Compiler must transform statements, expressions, procedures into low-level instruction sequences
- Assembly Basics: Registers, operands, move
  - The x86-64 move instructions cover a wide range of data movement forms
- Intro to x86-64
  - A major departure from the style of code seen in IA32

