Machine-Level Programming: X86-64 Topics Registers Stack Function Calls Local Storage X86-64.ppt CS 105 Tour of Black Holes of Computing.

Slides:



Advertisements
Similar presentations
University of Washington Procedures and Stacks II The Hardware/Software Interface CSE351 Winter 2013.
Advertisements

Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web:
Machine-Level Programming V: Miscellaneous Topics Topics Buffer Overflow Linux Memory Layout Understanding Pointers Floating-Point Code CS 105 Tour of.
Machine Programming – Procedures and IA32 Stack CENG334: Introduction to Operating Systems Instructor: Erol Sahin Acknowledgement: Most of the slides are.
University of Washington Last Time For loops  for loop → while loop → do-while loop → goto version  for loop → while loop → goto “jump to middle” version.
Machine-Level Programming III: Procedures Apr. 17, 2006 Topics IA32 stack discipline Register saving conventions Creating pointers to local variables CS213.
Machine-Level Programming III: Procedures Sept. 17, 2007 IA32 stack discipline Register saving conventions Creating pointers to local variablesx86-64 Argument.
– 1 – , F’02 ICS05 Instructor: Peter A. Dinda TA: Bin Lin Recitation 4.
Machine-Level Programming III: Procedures Jan 30, 2003
Machine-Level Programming III: Procedures Sept. 15, 2006 IA32 stack discipline Register saving conventions Creating pointers to local variablesx86-64 Argument.
Stack Activation Records Topics IA32 stack discipline Register saving conventions Creating pointers to local variables February 6, 2003 CSCE 212H Computer.
1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron.
1 1 Machine-Level Programming IV: x86-64 Procedures, Data Andrew Case Slides adapted from Jinyang Li, Randy Bryant & Dave O’Hallaron.
Ithaca College Machine-Level Programming IV: IA32 Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2013 * Modified slides.
64-Bit Architectures Topics 64-bit data New registers and instructions Calling conventions CS 105 “Tour of the Black Holes of Computing!”
Machine-Level Programming III: Switch Statements and IA32 Procedures Seoul National University.
University of Washington Today More on procedures, stack etc. Lab 2 due today!  We hope it was fun! What is a stack?  And how about a stack frame? 1.
Fabián E. Bustamante, Spring 2007 Machine-Level Programming III - Procedures Today IA32 stack discipline Register saving conventions Creating pointers.
Machine-Level Programming III: Procedures Topics IA32 stack discipline Register-saving conventions Creating pointers to local variables CS 105 “Tour of.
Carnegie Mellon 1 Odds and Ends Intro to x86-64 Memory Layout.
Machine-level Programming III: Procedures Topics –IA32 stack discipline –Register saving conventions –Creating pointers to local variables.
1 Procedure Call and Array. 2 Outline Data manipulation Control structure Suggested reading –Chap 3.7, 3.8.
University of Washington x86 Programming I The Hardware/Software Interface CSE351 Winter 2013.
Carnegie Mellon 1 Machine-Level Programming I: Basics Lecture, Feb. 21, 2013 These slides are from website which accompanies the.
F’08 Stack Discipline Aug. 27, 2008 Nathaniel Wesley Filardo Slides stolen from CMU , whence they were stolen from CMU CS 438.
X86_64 programming Tutorial #1 CPSC 261. X86_64 An extension of the IA32 (often called x86 – originated in the Intel 8086 processor) instruction set to.
1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Carnegie Mellon Machine-Level Programming II: Control Carnegie Mellon.
University of Washington x86 Programming II The Hardware/Software Interface CSE351 Winter 2013.
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Carnegie Mellon Instructor: San Skulrattanakulchai Machine-Level Programming.
1 Assembly Language: Function Calls Jennifer Rexford.
IA32 Stack –Region of memory managed with stack discipline –Grows toward lower addresses –Register %esp indicates lowest stack address address of top element.
A job ad at a game programming company
Reading Condition Codes (Cont.)
Machine-Level Programming 2 Control Flow
Instruction Set Architecture
Credits and Disclaimers
Machine-Level Programming I: Basics
Credits and Disclaimers
CSCE 212 Computer Architecture
Conditional Branch Example
Machine-Level Programming III: Procedures
Machine-Level Programming 1 Introduction
Machine-Level Programming IV: x86-64 Procedures, Data
Machine-Level Programming II: Control Flow
Y86 Processor State Program Registers
Instructors: Majd Sakr and Khaled Harras
Machine-Level Programming 4 Procedures
Machine-Level Programming III: Procedures /18-213/14-513/15-513: Introduction to Computer Systems 7th Lecture, September 18, 2018.
Carnegie Mellon Machine-Level Programming III: Procedures : Introduction to Computer Systems October 22, 2015 Instructor: Rabi Mahapatra Authors:
Condition Codes Single Bit Registers
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Machine-Level Programming 2 Control Flow
Machine-Level Programming 2 Control Flow
Machine-Level Programming III: Procedures Sept 18, 2001
Machine-Level Programming 2 Control Flow
Machine Level Representation of Programs (IV)
Machine-Level Programming: Introduction
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Machine Level Representation of Programs (IV)
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Machine-Level Programming II: Control Flow
Machine-Level Representation of Programs (x86-64)
Machine-Level Programming II: Basics Comp 21000: Introduction to Computer Organization & Systems Spring 2016 Instructor: John Barr * Modified slides.
“Way easier than when we were students”
CS201- Lecture 8 IA32 Flow Control
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Credits and Disclaimers
Credits and Disclaimers
Presentation transcript:

Machine-Level Programming: X86-64 Topics Registers Stack Function Calls Local Storage X86-64.ppt CS 105 Tour of Black Holes of Computing

– 2 – 105 Interesting Features of x86-64 Pointers and long Integers are 64 bits Arithmetic ops support 8, 16, 32, and 64-bit ints Set of general purpose registers is 16 Full compatibility with X32 Conditional ops are implemented using conditional move instructions (when possible) – faster than branch Much of pgm state held in registers Up to 6 arguments passed via registers (not stack) Some procedures do not need to access the stack at all?? Floating-point ops implemented using register-oriented instructions rather than stack-based With move to x86-64, gcc updated to take advantage of many architecture changes

– 3 – 105 Data Representations: IA32 + x86-64 Sizes of C Objects (in Bytes) C Data TypeTypical 32-bitIntel IA32x86-64 unsigned444 int444 long int448 char111 short222 float444 double888 long double810/1216 char *448 Or any other pointer x86-64: 64 bit ints and 64 bit pointers

– 4 – 105 %rax %rbx %rcx %rdx %rsi %rdi %rsp %rbp x86-64 Integer Registers - 16 Extend existing registers. Add 8 new ones, word (16 bit and byte (8 bit) registers still available, first 2 bytes available (rax, rbx, rcx, rdx) Make %ebp / %rbp general purpose, note 32 bit names %eax %ebx %ecx %edx %esi %edi %esp %ebp %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 %r8d %r9d %r10d %r11d %r12d %r13d %r14d %r15d

– 5 – 105 Instructions Long word l (4 Bytes) ↔ Quad word q (8 Bytes) New instructions: movl → movq addl → addq sall → salq etc. 32-bit instructions that generate 32-bit results Set higher order bits of destination register to 0 Example: addl

– 6 – 105 Swap in 32-bit Mode void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } swap: pushl %ebp movl %esp,%ebp pushl %ebx movl 12(%ebp),%ecx movl 8(%ebp),%edx movl (%ecx),%eax movl (%edx),%ebx movl %eax,(%edx) movl %ebx,(%ecx) movl -4(%ebp),%ebx movl %ebp,%esp popl %ebp ret Body Setup Finish

– 7 – 105 Swap in 64-bit Mode Operands passed in registers (why useful?) First ( xp ) in %rdi, second ( yp ) in %rsi 64-bit pointers No stack operations required, actually no stack frame 32-bit data Data held in registers %eax and %edx movl operation void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } swap: movl(%rdi), %edx movl(%rsi), %eax movl%eax, (%rdi) movl%edx, (%rsi) retq

– 8 – 105 %rax %rbx %rcx %rdx %rsi %rdi %rsp %rbp x86-64 Integer Registers, 6 used for arguments %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 Callee saved C: Callee saved Callee saved Stack pointer Used for linking Return value Argument #4 Argument #1 Argument #3 Argument #2 Argument #6 Argument #5

– 9 – 105 Interesting Features of Stack Frame Allocate entire frame at once All stack accesses can be relative to %rsp Do so by decrementing stack pointer Can delay allocation, since safe to temporarily use red zone- 128 bytes available below the stack Simple deallocation Increment stack pointer No base/frame pointer needed

– 10 – 105 x86-64 Registers Arguments passed to functions via registers If more than 6 integral parameters, then pass rest on stack These registers can be used as caller-saved as well All references to stack frame via stack pointer Eliminates need to update %ebp/%rbp Other Registers R12-R15, Rbx, Rbp, callee saved 2 or 3 have special uses Rax holds return value for a function Rsp is the stack pointer

– 11 – 105 Reading Condition Codes: x86-64 int gt (long x, long y) { return x > y; } xorl %eax, %eax# eax = 0 cmpq %rsi, %rdi# Compare x and y setg %al# al = x > y Body (same for both) long lgt (long x, long y) { return x > y; } SetX Instructions: Set single byte based on combination of condition codes Does not alter remaining 3 bytes Will disappear Blackboard?

– 12 – 105 Reading Condition Codes: x86-64 int gt (long x, long y) { return x > y; } xorl %eax, %eax# eax = 0 cmpq %rsi, %rdi# Compare x and y setg %al# al = x > y Body (same for both) long lgt (long x, long y) { return x > y; } Is %rax zero? Yes: 32-bit instructions set high order 32 bits to 0! SetX Instructions: Set single byte based on combination of condition codes Does not alter remaining 3 bytes

– 13 – 105 Conditionals: x86-64 absdiff: # x in %edi, y in %esi movl %edi, %eax # eax = x movl %esi, %edx # edx = y subl %esi, %eax # eax = x-y subl %edi, %edx # edx = y-x cmpl %esi, %edi # x:y cmovle %edx, %eax # eax=edx if <= ret int absdiff( int x, int y) { int result; if (x > y) { result = x-y; } else { result = y-x; } return result; } Will disappear Blackboard?

– 14 – 105 Conditionals: x86-64 Conditional move instruction cmovC src, dest Move value from src to dest if condition C holds More efficient than conditional branching (simple control flow) But overhead: both branches are evaluated absdiff: # x in %edi, y in %esi movl %edi, %eax # eax = x movl %esi, %edx # edx = y subl %esi, %eax # eax = x-y subl %edi, %edx # edx = y-x cmpl %esi, %edi # x:y cmovle %edx, %eax # eax=edx if <= ret int absdiff( int x, int y) { int result; if (x > y) { result = x-y; } else { result = y-x; } return result; }

– 15 – 105 x86-64 Procedure Summary Heavy use of registers Parameter passing More temporaries since more registers Minimal use of stack Sometimes none Allocate/deallocate entire block Many tricky optimizations What kind of stack frame to use Calling with jump Various allocation techniques

– 16 – 105 IA32 Example Addresses $esp0xffffbcd0 p3 0x p1 0x p40x1904a110 p20x1904a008 &p20x beyond 0x big_array 0x huge_array 0x main()0x080483c6 useless() 0x final malloc()0x006be166 address range ~2 32 FF 00 Stack Text Data Heap not drawn to scale malloc() is dynamically linked address determined at runtime

– 17 – 105 x86-64 Example Addresses $rsp0x7ffffff8d1f8 p3 0x2aaabaadd010 p1 0x2aaaaaadc010 p40x p20x &p20x a60 beyond 0x a44 big_array 0x a80 huge_array 0x a50 main()0x useless() 0x final malloc()0x00386ae6a170 address range ~ F Stack Text Data Heap not drawn to scale malloc() is dynamically linked address determined at runtime

– 18 – 105 x86-64 Summary Pointers and long integers are 64 bit 16 Registers, with 64, 32, 16, 8, bit uses Minimal use of stack Parameters passed in registers Allocate/deallocate entire block for a procedure Can use 128 bytes passed (beyond) stack pointer (red zone) can use stack without messing with stack pointer No Frame Pointer Many tricky optimizations What kind of stack frame to use Calling with jump Various allocation techniques Compilers need to be smarter

– 19 – 105 x86-64 Locals in the Red Zone Avoiding Stack Pointer Change Can hold all information within small window beyond stack pointer /* Swap, using local array */ void swap_a(long *xp, long *yp) { volatile long loc[2]; loc[0] = *xp; loc[1] = *yp; *xp = loc[1]; *yp = loc[0]; } swap_a: movq (%rdi), %rax movq %rax, -24(%rsp) movq (%rsi), %rax movq %rax, -16(%rsp) movq -16(%rsp), %rax movq %rax, (%rdi) movq -24(%rsp), %rax movq %rax, (%rsi) ret rtn Ptr unused %rsp −8−8 loc[1] loc[0] − 16 − 24

– 20 – 105 x86-64 Stack Frame Example Keeps values of a and i in callee save registers Must set up stack frame to save these registers long sum = 0; /* Swap a[i] & a[i+1] */ void swap_ele_su (long a[], int i) { swap(&a[i], &a[i+1]); sum += a[i]; } swap_ele_su: movq %rbx, -16(%rsp) movslq %esi,%rbx movq %r12, -8(%rsp) movq %rdi, %r12 leaq (%rdi,%rbx,8), %rdi subq $16, %rsp leaq 8(%rdi), %rsi call swap movq (%r12,%rbx,8), %rax addq %rax, sum(%rip) movq (%rsp), %rbx movq 8(%rsp), %r12 addq $16, %rsp ret Blackboard?

– 21 – 105 Understanding x86-64 Stack Frame swap_ele_su: movq %rbx, -16(%rsp) # Save %rbx movslq %esi,%rbx # Extend & save i movq %r12, -8(%rsp) # Save %r12 movq %rdi, %r12 # Save a – array pointer leaq (%rdi,%rbx,8), %rdi # Create &a[i] subq $16, %rsp # Allocate stack frame leaq 8(%rdi), %rsi # &a[i+1] call swap # swap() movq (%r12,%rbx,8), %rax # a[i] addq %rax, sum(%rip) # sum += a[i] movq (%rsp), %rbx # Restore %rbx movq 8(%rsp), %r12 # Restore %r12 addq $16, %rsp # Deallocate stack frame ret

– 22 – 105 Understanding x86-64 Stack Frame swap_ele_su: movq %rbx, -16(%rsp) # Save %rbx movslq %esi,%rbx # Extend & save i movq %r12, -8(%rsp) # Save %r12 movq %rdi, %r12 # Save a leaq (%rdi,%rbx,8), %rdi # &a[i] subq $16, %rsp # Allocate stack frame leaq 8(%rdi), %rsi # &a[i+1] call swap # swap() movq (%r12,%rbx,8), %rax # a[i] addq %rax, sum(%rip) # sum += a[i] movq (%rsp), %rbx # Restore %rbx movq 8(%rsp), %r12 # Restore %r12 addq $16, %rsp # Deallocate stack frame ret rtn addr %r12 %rsp −8−8 %rbx − 16 rtn addr %r12 %rsp +8+8 %rbx

– 23 – 105 Interesting Features of Stack Frame Allocate entire frame at once All stack accesses can be relative to %rsp Do by decrementing stack pointer Can delay allocation, since safe to temporarily use red zone Simple deallocation Increment stack pointer No base/frame pointer needed