Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web:

Slides:



Advertisements
Similar presentations
University of Washington Procedures and Stacks II The Hardware/Software Interface CSE351 Winter 2013.
Advertisements

Introduction to Machine/Assembler Language Noah Mendelsohn Tufts University Web:
Copyright 2014 – Noah Mendelsohn UM Macro Assembler Functions Noah Mendelsohn Tufts University Web:
COMP 2003: Assembly Language and Digital Logic
Lect 3: Instruction Set and Addressing Modes. 386 Instruction Set (3.4) –Basic Instruction Set : 8086/8088 instruction set –Extended Instruction Set :
© 2006 Pearson Education, Upper Saddle River, NJ All Rights Reserved.Brey: The Intel Microprocessors, 7e Chapter 2 The Microprocessor and its Architecture.
PC hardware and x86 3/3/08 Frans Kaashoek MIT
1 ICS 51 Introductory Computer Organization Fall 2006 updated: Oct. 2, 2006.
Lect 4: Instruction Set and Addressing Modes. 386 Instruction Set (3.4)  Basic Instruction Set : 8086/8088 instruction set  Extended Instruction Set.
1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron.
6.828: PC hardware and x86 Frans Kaashoek
64-Bit Architectures Topics 64-bit data New registers and instructions Calling conventions CS 105 “Tour of the Black Holes of Computing!”
University of Washington Today More on procedures, stack etc. Lab 2 due today!  We hope it was fun! What is a stack?  And how about a stack frame? 1.
Machine/Assembler Language Control Flow & Compiling Function Calls Noah Mendelsohn Tufts University Web:
INSTRUCTION SET AND ASSEMBLY LANGUAGE PROGRAMMING
Machine-Level Programming: X86-64 Topics Registers Stack Function Calls Local Storage X86-64.ppt CS 105 Tour of Black Holes of Computing.
Machine-Level Programming III: Procedures Topics IA32 stack discipline Register-saving conventions Creating pointers to local variables CS 105 “Tour of.
Derived from "x86 Assembly Registers and the Stack" by Rodney BeedeRodney Beede x86 Assembly Registers and the Stack Nov 2009.
1 ICS 51 Introductory Computer Organization Fall 2009.
Carnegie Mellon 1 Odds and Ends Intro to x86-64 Memory Layout.
1 Carnegie Mellon Assembly and Bomb Lab : Introduction to Computer Systems Recitation 4, Sept. 17, 2012.
Chapter 2 Parts of a Computer System. 2.1 PC Hardware: Memory.
Machine/Assembler Language Control Flow & Compiling Function Calls Noah Mendelsohn Tufts University Web:
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Carnegie Mellon Instructor: San Skulrattanakulchai Machine-Level Programming.
Microprocessors CSE- 341 Dr. Jia Uddin Assistant Professor, CSE, BRAC University Dr. Jia Uddin, CSE, BRAC University.
Microprocessors CSE- 341 Dr. Jia Uddin Assistant Professor, CSE, BRAC University Dr. Jia Uddin, CSE, BRAC University.
Spring 2016Assembly Review Roadmap 1 car *c = malloc(sizeof(car)); c->miles = 100; c->gals = 17; float mpg = get_mpg(c); free(c); Car c = new Car(); c.setMiles(100);
Precept 7: Introduction to IA-32 Assembly Language Programming
CS 177 Computer Security Lecture 9
Instruction Set Architecture
Credits and Disclaimers
Credits and Disclaimers
CSCE 212 Computer Architecture
Machine-Level Programming IV: Data
Machine-Level Programming IV: Data
Introduction to Compilers Tim Teitelbaum
The Stack & Procedures CSE 351 Spring 2017
Assembly IA-32.
Homework Reading Continue work on mp1
Low level Programming.
Assembly Language Programming V: In-line Assembly Code
Machine-Level Programming III: Procedures
BIC 10503: COMPUTER ARCHITECTURE
Data Addressing Modes • MOV AX,BX; This instruction transfers the word contents of the source-register(BX) into the destination register(AX). • The source.
Machine-Level Programming III: Procedures /18-213/14-513/15-513: Introduction to Computer Systems 7th Lecture, September 18, 2018.
Arrays CSE 351 Winter
Carnegie Mellon Machine-Level Programming III: Procedures : Introduction to Computer Systems October 22, 2015 Instructor: Rabi Mahapatra Authors:
Machine-Level Programming IV: Data
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
8086 Registers Module M14.2 Sections 9.2, 10.1.
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Machine-Level Programming IV: Data
Machine Level Representation of Programs (IV)
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Computer Architecture CST 250
X86 Assembly Review.
Machine-Level Representation of Programs (x86-64)
Machine-Level Programming II: Basics Comp 21000: Introduction to Computer Organization & Systems Instructor: John Barr * Modified slides from the book.
Get To Know Your Compiler
Machine-Level Programming II: Basics Comp 21000: Introduction to Computer Organization & Systems Spring 2016 Instructor: John Barr * Modified slides.
Instructor: Fatma CORUT ERGİN
Machine-Level Programming VIII: Data Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017 Systems book chapter 3* * Modified slides.
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Credits and Disclaimers
The von Neumann Machine
Credits and Disclaimers
Credits and Disclaimers
Computer Architecture and System Programming Laboratory
Low level Programming.
Presentation transcript:

Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web: COMP 40: Machine Structure and Assembly Language Programming

© 2010 Noah Mendelsohn Goals for today  Explore the compilation of realistic programs from C to ASM  Fill in a few more details on things like structure mapping 2

© 2010 Noah Mendelsohn Review

© 2010 Noah Mendelsohn X86-64 / AMD 64 / IA 64 General Purpose Registers %eax %ecx %edx %ebx %esi %edi %esp %ebp %ah $al %ax %ch $cl %cx %dh $dl %dx %bh $bl %bx %si %di %sp %bp %rax %rcx %rdx %rbx %rsi %rdi %rsp %rbp 63 %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r

© 2010 Noah Mendelsohn Machine code (typical)  Simple instructions – each does small unit of work  Stored in memory  Bitpacked into compact binary form  Directly executed by transistor/hardware logic * 5 * We’ll show later that some machines execute user-provided machine code directly, some convert it to an even lower level machine code and execute that.

© 2010 Noah Mendelsohn A simple function in C 6 int times16(int i) { return i * 16; } 0:89 f8 mov %edi,%eax 2:c1 e0 04 shl $0x4,%eax 5:c3 retq REMEMBER: you can see the assembler code for any C program by running gcc with the –S flag. Do it!!

© 2010 Noah Mendelsohn Addressing modes 7 %rax // contents of rax is data (%rax) // data pointed to by rax $123,%rax // Copy 123 into %rax 0x10(%rax) // get *(16 + rax) $0x4089a0(, %rax, 8) // Global array index // of 8-byte things (%ebx, %ecx, 2) // Add base and scaled index 4(%ebx, %ecx, 2) // Add base and scaled // index plus offset 4

© 2010 Noah Mendelsohn movl and leal 8 %rax // contents of rax is data (%rax) // data pointed to by rax $123,%rax // Copy 123 into %rax 0x10(%rax) // get *(16 + rax) $0x4089a0(, %rax, 8) // Global array index // of 8-byte things (%ebx, %ecx, 2) // Add base and scaled index 4(%ebx, %ecx, 2) // Add base and scaled // index plus offset 4 movl (%ebx, %ecx, 1), %edx // edx <- *(ebx + (ecx * 1)) leal (%ebx, %ecx, 1), %edx // edx <- (ebx + (ecx * 1))

© 2010 Noah Mendelsohn movl and leal 9 %rax // contents of rax is data (%rax) // data pointed to by rax $123,%rax // Copy 123 into %rax 0x10(%rax) // get *(16 + rax) $0x4089a0(, %rax, 8) // Global array index // of 8-byte things (%ebx, %ecx, 2) // Add base and scaled index 4(%ebx, %ecx, 2) // Add base and scaled // index plus offset 4 movl (%ebx, %ecx, 1), %edx // edx <- *(ebx + (ecx * 1)) leal (%ebx, %ecx, 1), %edx // edx <- (ebx + (ecx * 1)) movl Moves the data at the address: similar to: char *ebxp; char edx = ebxp[ecx]

© 2010 Noah Mendelsohn movl and leal 10 %rax // contents of rax is data (%rax) // data pointed to by rax $123,%rax // Copy 123 into %rax 0x10(%rax) // get *(16 + rax) $0x4089a0(, %rax, 8) // Global array index // of 8-byte things (%ebx, %ecx, 2) // Add base and scaled index 4(%ebx, %ecx, 2) // Add base and scaled // index plus offset 4 movl (%ebx, %ecx, 1), %edx // edx <- *(ebx + (ecx * 1)) leal (%ebx, %ecx, 1), %edx // edx <- (ebx + (ecx * 1)) leal Moves the address itself: similar to: char *ebxp; char *edxp = &(ebxp[ecx]);

© 2010 Noah Mendelsohn movl and leal 11 %rax // contents of rax is data (%rax) // data pointed to by rax $123,%rax // Copy 123 into %rax 0x10(%rax) // get *(16 + rax) $0x4089a0(, %rax, 8) // Global array index // of 8-byte things (%ebx, %ecx, 2) // Add base and scaled index 4(%ebx, %ecx, 2) // Add base and scaled // index plus offset 4 movl (%ebx, %ecx, 1), %edx // edx <- *(ebx + (ecx * 1)) leal (%ebx, %ecx, 1), %edx // edx <- (ebx + (ecx * 1)) movl (%ebx, %ecx, 4), %edx // edx <- *(ebx + (ecx * 4)) leal (%ebx, %ecx, 4), %edx // edx <- (ebx + (ecx * 4)) scale factors support indexing larger types similar to: int *ebxp; int edxp = ebxp[ecx];

© 2010 Noah Mendelsohn Control flow: simple jumps 12.L4: …code here… j.L4 // jump back to L4

© 2010 Noah Mendelsohn Conditional jumps 13.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4 // conditional: jump iff %r8 != %rdf This technique is the key to compiling if statements, for loops, while loops, etc. …code here…

© 2010 Noah Mendelsohn Challenge: how would this compile? 14 switch (exp) { case 0: SS0; case 1: SS1; case 2: SS2; case 3: SS3: default: SSd; } static code *jump_table[4] = { L0, L1, L2, L3 }; t = exp; if ((unsigned)t < 4) { t = jump_table[t]; goto t; } else goto Ld; L0: SS0; L1: SS1; L2: SS2; L3: SS3: Ld: SSd; Lend:... And break becomes goto Lend

© 2010 Noah Mendelsohn Function calls on Linux/AMD 64  Caller “pushes” return address on stack  Where practical, arguments passed in registers  Exceptions: –Structs, etc. –Too many –What can’t be passed in registers is at known offsets from stack pointer!  Return values –In register, typically %rax for integers and pointers –Exception: structures  Each function gets a stack frame –Leaf functions that make no calls may skip setting one up 15

© 2010 Noah Mendelsohn Arguments/return values in registers 16 Arguments and return values passed in registers when types are suitable and when there aren’t too many Return values usually in %rax, %eax, etc. Callee may change these and some other registers! MMX and FP 87 registers used for floating point Read the specifications for full details! Operand Size Argument Number %rdi%rsi%rdx%rcx%r8%r9 32%edi%esi%edx%ecx%r8d%r9d 16%di%si%dx%cx%r8w%r9w 8%dil%sil%dl%cl%r8b%r9b

© 2010 Noah Mendelsohn The stack – general case 17 Before call ???? %rsp Arguments Return address After callq %rsp Arguments %rsp If callee needs frame ???? Return address Args to next call? Callee vars sub $0x{framesize},%rsp Arguments framesize

© 2010 Noah Mendelsohn A simple function 18 unsigned int times2(unsigned int a) { return a * 2; } times2: leaq(%rdi,%rdi), %rax ret Double the argument …and put it in return value register

© 2010 Noah Mendelsohn The stack – general case 19 Before call ???? %rsp Arguments Return address After callq %rsp Arguments %rsp If callee needs frame ???? Return address Args to next call? Callee vars sub $0x{framesize},%rsp Arguments framesize Must be multiple of 16 when a function is called! But now it’s not a multiple of 16… Here too!

© 2010 Noah Mendelsohn The stack – general case 20 Before call ???? %rsp Arguments Return address After callq %rsp Arguments %rsp If callee needs frame ???? Return address Waste 8 bytes sub $8,%rsp Arguments If we’ll be calling other functions with no arguments, very common to see sub $8,%rsp to re-establish alignment Must be multiple of 16 when a function is called!

© 2010 Noah Mendelsohn Getting the details on function call “linkages”  Bryant and O’Halloran has excellent introduction –Watch for differences between 32 and 64 bit  The official specification: –System V Application Binary Interface: AMD64 Architecture Processor Supplement –Find it at: –See especially sections 3.1 and 3.2 –E.g., You’ll see that %rbp, %rbx, %r12 − %r15 are callee saves. All other GPRs are caller saves. See also Figure

© 2010 Noah Mendelsohn Factorial Revisited 22 int fact(int n) { if (n == 0) return 1; else return n * fact(n - 1); } fact:.LFB2: pushq %rbx.LCFI0: movq %rdi, %rbx movl $1, %eax testq %rdi, %rdi je.L4 leaq -1(%rdi), %rdi call fact imulq %rbx, %rax.L4: popq %rbx ret

© 2010 Noah Mendelsohn An example that integrates what we’ve learned

© 2010 Noah Mendelsohn Sum positive numbers in an array 24 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; }

© 2010 Noah Mendelsohn Sum positive numbers in an array 25 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret

© 2010 Noah Mendelsohn Sum positive numbers in an array 26 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret

© 2010 Noah Mendelsohn Sum positive numbers in an array 27 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret Looks like sum will be in %eax/%rax

© 2010 Noah Mendelsohn Sum positive numbers in an array 28 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret

© 2010 Noah Mendelsohn Sum positive numbers in an array 29 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret If length argument is zero, just leave

© 2010 Noah Mendelsohn Sum positive numbers in an array 30 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret We’re testing the value of %rcx, but why???

© 2010 Noah Mendelsohn Sum positive numbers in an array 31 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret We’re testing the value of %rcx, could that be our arr[i] > 0 test? Yes!

© 2010 Noah Mendelsohn Sum positive numbers in an array 32 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret Hmm. We’re conditionally updating our result (%rax) with $rsi…

© 2010 Noah Mendelsohn Sum positive numbers in an array 33 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret Hmm. We’re conditionally updating our result (%rax) with $rsi… …we must have computed the sum there!

© 2010 Noah Mendelsohn Sum positive numbers in an array 34 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret

© 2010 Noah Mendelsohn Sum positive numbers in an array 35 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret This is looking a lot like our for loop index increment & test.

© 2010 Noah Mendelsohn Sum positive numbers in an array 36 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret But why are we adding 8 to the index variable???

© 2010 Noah Mendelsohn Sum positive numbers in an array 37 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret Address just past array!! (unclear why compiler bothers to subtract one and then add 8)

© 2010 Noah Mendelsohn Sum positive numbers in an array 38 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret But…what is %rdx?

© 2010 Noah Mendelsohn Sum positive numbers in an array 39 unsigned long sum_positives(long arr[], unsigned int length) { unsigned long sum = 0; unsigned int i; for (i = 0; i < length; i++) { if (arr[i] > 0) sum += arr[i]; } return sum; } sum_positives:.LFB0: movl$0, %eax testl%esi, %esi je.L2 subl$1, %esi leaq8(,%rsi,8), %r8 movl$0, %edx.L4: movq(%rdi,%rdx), %rcx leaq(%rax,%rcx), %rsi testq%rcx, %rcx cmovg%rsi, %rax addq$8, %rdx cmpq%r8, %rdx jne.L4.L2: rep ret Offset to the next array element…we never actually compute variable i!

© 2010 Noah Mendelsohn Calling Functions (Review)

© 2010 Noah Mendelsohn How C Language Data is Represented

© 2010 Noah Mendelsohn Sizes and alignments C TypeWidthAlignment char11 short22 int44 float44 pointer, long, long long88 double88 long double

© 2010 Noah Mendelsohn Why align to “natural sizes”?  Memory systems delivers aligned to cache blocks  CPU loads aligned words or blocks from cache  Data that spans boundaries needs two accesses!  Naturally aligned data never unnecessarily crosses a boundary  Intel/AMD alignment rules –Pre AMD 64: unaligned data allowed but slower –AMD 64: data must be aligned as shown on previous slides 43

© 2010 Noah Mendelsohn Alignment of C Language Data  Structures –Wasted space used to “pad” between struct members if needed to maintain alignment –Therefore: when practical, put big members of larger primitive types ahead of smaller ones! –Structures sized for worst-case member –Arrays of structures will have properly aligned members  malloc return values always multiple of 16 – structp = (struct s *)malloc(sizeof(struct s)) is always OK  Argument area on stack is 16 byte aligned –(%rsp - 8) is multiple of 16 on entry to functions –Extra space “wasted” in caller frame as needed –Therefore: called function knows how to maintain alignment 44

© 2010 Noah Mendelsohn Summary  C code compiled to assembler  Data moved to registers for manipulation  Conditional and jump instructions for control flow  Stack used for function calls  Compilers play all sorts of tricks when compiling code  Simple C data types map directly to properly aligned machine types in memory and registers 45